Calling convention

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by CBM (talk | contribs) at 14:45, 16 July 2006 (Standard Exit and Entry Sequences: edit link). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A calling convention is a scheme that defines how a function receives its arguments from the caller and how it hands a result back. When a piece of software is written in multiple languages or modules, all modules must use compatible calling conventions.

Calling conventions differ in the order in which parameters are pushed onto the stack, in the methods used to send data to a function, in how returned data is received from a function, and in their methods of name mangling.

cdecl

The cdecl calling convention is used by many C and C++ systems for the x86 architecture. In cdecl, function parameters are passed on the stack in a right-to-left order. Function return values are returned in the EAX register. Registers EAX, ECX, and EDX are available for use in the function.

For instance, the following C function prototype and function call:

int function(int, int, int);
int a, b, c, x;
...
x = function(a, b, c);

will produce x86 assembly code similar to the following (written in MASM syntax):

push c
push b
push a
call function
add esp, 12 ;Stack clearing
mov x, eax

The calling function cleans the stack after the function call returns.
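
The constant in "add esp, 12" above is simple arithmetic: three arguments of 4 bytes each were pushed. The following is a minimal sketch of that calculation, assuming every argument occupies a whole number of 4-byte stack slots. The helper name cdecl_cleanup_bytes is hypothetical and is not part of any compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Size of one stack slot on 32-bit x86, in bytes. */
#define SLOT_BYTES 4u

/*
 * Hypothetical helper: bytes the caller must remove from the stack after a
 * cdecl call, given the size in bytes of each argument. Each argument is
 * rounded up to a whole number of 4-byte slots (an assumption of this sketch).
 */
size_t cdecl_cleanup_bytes(const size_t *arg_sizes, size_t arg_count)
{
    size_t total = 0;
    assert(arg_sizes != NULL || arg_count == 0);
    for (size_t i = 0; i < arg_count; i++) {
        size_t slots = (arg_sizes[i] + SLOT_BYTES - 1) / SLOT_BYTES;
        total += slots * SLOT_BYTES;
    }
    return total;
}
```

For the call shown earlier, passing the sizes {4, 4, 4} yields 12, which matches the operand of the add instruction.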

The cdecl calling convention is usually the default calling convention for x86 C compilers, although many compilers provide options to change the calling convention used automatically. To explicitly define a function as cdecl, some compilers support the following syntax:

void _cdecl function(params);

The _cdecl modifier must be included in both the function prototype and the function definition to override any other settings that might be in place.

Pascal

The Pascal calling convention is the reverse of the C calling convention. The parameters are pushed on the stack in left-to-right order and the callee is responsible for balancing the stack before return.

The callee balances the stack with the assembly instruction "ret freestack", where freestack is a constant integer giving the number of bytes of arguments to remove.
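
The difference in push order between cdecl and Pascal can be illustrated with a toy model of the stack. This is only a sketch: the arg_stack structure and the function names are hypothetical, and a real stack grows downward in memory rather than through an array index.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 8

/* A toy model of the argument area of the stack; slot[0] is pushed first. */
struct arg_stack {
    int slot[MAX_ARGS];
    size_t depth;
};

static void push(struct arg_stack *s, int value)
{
    assert(s->depth < MAX_ARGS);
    s->slot[s->depth++] = value;
}

/* cdecl order: the last argument is pushed first, so the first ends on top. */
void push_right_to_left(struct arg_stack *s, const int *args, size_t n)
{
    for (size_t i = n; i > 0; i--)
        push(s, args[i - 1]);
}

/* Pascal order: the first argument is pushed first, so the last ends on top. */
void push_left_to_right(struct arg_stack *s, const int *args, size_t n)
{
    for (size_t i = 0; i < n; i++)
        push(s, args[i]);
}

/* The value pushed most recently, i.e. the one the callee finds first. */
int top(const struct arg_stack *s)
{
    assert(s->depth > 0);
    return s->slot[s->depth - 1];
}
```

With the arguments (10, 20, 30), the right-to-left order leaves 10 on top of the stack, while the left-to-right order leaves 30 there.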

Register (fastcall)

The Register or fastcall calling convention is compiler-specific for historical reasons. In general, however, the first few arguments that fit into a processor register (that is, arguments of up to 32 bits on the x86 architecture) are passed in registers instead of being put onto the stack. The remaining arguments are passed right-to-left on the stack, as in cdecl. Return values are passed through the AL, AX, or EAX register. The stack is usually cleaned by the callee unless the function takes a variable number of parameters. Most RTL functions, however, take so few parameters that all of them fit in registers, so they do not have to clean the stack at all.

  • The Microsoft __fastcall[1] convention (also known as __msfastcall) passes the first two arguments in ECX and EDX.
  • The Borland __fastcall convention passes the first three arguments in EAX, EDX, and ECX.
  • The Watcom __fastcall convention passes the first four arguments in EAX, EDX, EBX, and ECX, which offers the largest potential performance gain of the three while still leaving enough spare registers for the function to work with.
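
The three variants listed above can be summarized as a lookup table. The sketch below assumes every argument fits in a 32-bit register. The enum names and the fastcall_location helper are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum fastcall_variant { FASTCALL_MICROSOFT, FASTCALL_BORLAND, FASTCALL_WATCOM };

enum fc_location { FC_EAX, FC_EBX, FC_ECX, FC_EDX, FC_STACK };

struct fc_reg_list {
    size_t count;               /* how many arguments travel in registers */
    enum fc_location regs[4];   /* registers in the order they are assigned */
};

/* Register lists taken from the three bullets above. */
static const struct fc_reg_list fastcall_regs[] = {
    [FASTCALL_MICROSOFT] = { 2, { FC_ECX, FC_EDX } },
    [FASTCALL_BORLAND]   = { 3, { FC_EAX, FC_EDX, FC_ECX } },
    [FASTCALL_WATCOM]    = { 4, { FC_EAX, FC_EDX, FC_EBX, FC_ECX } },
};

/*
 * Hypothetical helper: where the argument at zero-based position `index`
 * is passed. Arguments beyond the register list go on the stack.
 */
enum fc_location fastcall_location(enum fastcall_variant variant, size_t index)
{
    assert(variant <= FASTCALL_WATCOM);
    const struct fc_reg_list *list = &fastcall_regs[variant];
    return index < list->count ? list->regs[index] : FC_STACK;
}
```

For example, the third argument (index 2) goes on the stack under the Microsoft variant, in ECX under the Borland variant, and in EBX under the Watcom variant.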

The Watcom C/C++ compiler also uses the #pragma aux[2] directive that allows you to specify your own calling convention. According to its manual, "Very few users are likely to need this method, but if it is needed, it can be a lifesaver".

stdcall

The stdcall[3] calling convention is the de facto standard calling convention for the Microsoft Windows NT application programming interface. Function parameters are passed on the stack in right-to-left order. Registers EAX, ECX, and EDX are designated for use within the function and need not be preserved by it. Return values are stored in the EAX register. Unlike cdecl, the called function cleans the stack, instead of the calling function. Because the called function must know at compile time how many bytes to remove, stdcall functions cannot support variable-length argument lists.
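
The restriction on variable-length argument lists comes down to which side knows the argument count. A callee that removes its own arguments needs a fixed byte count, whereas a variadic function only learns the count at run time. The following is a minimal sketch in standard C. The function sum_ints is hypothetical, and the comment about stack cleanup assumes an x86 compiler that uses cdecl for variadic functions.

```c
#include <assert.h>
#include <stdarg.h>

/*
 * Hypothetical variadic function: adds up `count` int arguments.
 * Each call site may pass a different number of arguments, and only the
 * caller knows that number at compile time, which is why caller cleanup
 * (as in cdecl) suits this kind of function.
 */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;
    assert(count >= 0);
    va_start(args, count);
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);
    va_end(args);
    return total;
}
```

The calls sum_ints(3, 1, 2, 3) and sum_ints(5, 10, 20, 30, 40, 50) push different amounts of data, so no single constant operand on a ret instruction inside sum_ints could balance the stack for both.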

On a Microsoft Windows system, a function may be declared to be stdcall using the following syntax in both the function prototype and the function definition:

void __stdcall function(params);

Stdcall functions are easy to recognize in assembly code because they unwind the stack as they return. The x86 ret instruction accepts an optional 16-bit immediate operand that specifies the number of bytes to remove from the stack after the return address has been popped. Such code looks like this:

ret 14

safecall

In Borland Delphi on Microsoft Windows, the safecall calling convention encapsulates COM (Component Object Model) error handling, so that exceptions are not leaked out to the caller but are reported in the HRESULT return value, as required by COM/OLE. When calling a safecall function from Delphi code, Delphi also automatically checks the returned HRESULT and raises an exception if necessary. Together with language-level support for COM interfaces and automatic IUnknown handling (implicit AddRef/Release/QueryInterface calls), the safecall calling convention considerably simplifies COM/OLE programming in Delphi.
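
The pattern that safecall automates can be sketched in C. The sketch assumes the commonly described lowering, in which the function returns an HRESULT and delivers its logical result through a trailing output parameter. All names here (hresult_t, HR_OK, HR_FAIL, safe_divide_raw, safe_divide) are hypothetical stand-ins, not Windows or Delphi identifiers.

```c
#include <assert.h>
#include <stddef.h>

typedef long hresult_t;   /* stand-in for the Windows HRESULT type */
#define HR_OK    0L       /* plays the role of S_OK */
#define HR_FAIL  (-1L)    /* negative values signal failure, as with E_FAIL */

/*
 * What the callee looks like at the binary level in this sketch: the status
 * is the return value and the logical result travels through a pointer.
 */
hresult_t safe_divide_raw(int numerator, int denominator, int *result)
{
    assert(result != NULL);
    if (denominator == 0)
        return HR_FAIL;   /* an exception would be converted to a code here */
    *result = numerator / denominator;
    return HR_OK;
}

/*
 * What the caller-side glue does: check the status and surface failures.
 * Delphi raises an exception at this point; the sketch sets *failed instead.
 */
int safe_divide(int numerator, int denominator, int *failed)
{
    int result = 0;
    hresult_t hr = safe_divide_raw(numerator, denominator, &result);
    assert(failed != NULL);
    *failed = (hr < 0);
    return result;
}
```

In Delphi both halves are generated by the compiler, so the programmer writes and calls what looks like an ordinary function returning a value.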

thiscall

This calling convention is used for calling C++ member functions. There are two primary versions of thiscall used depending on the compiler and whether or not the function uses variable arguments.

For the GCC compiler, thiscall is almost identical to cdecl: the calling function cleans the stack, and the parameters are passed in right-to-left order. The difference is the addition of the this pointer, which is pushed onto the stack last, after all the parameters. This is the same method used by Windows thiscall functions that use variable arguments.

Windows functions that do not use variable arguments also have their arguments passed in right-to-left order, but the called function cleans the stack and the this pointer is passed in the ECX register.
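
In both versions the this pointer behaves like a hidden extra argument. The C sketch below writes that argument out explicitly. The structure counter and the function counter_add are hypothetical, and the parameter is named self here only for readability.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class with one data member. */
struct counter {
    int value;
};

/*
 * Roughly what a C++ member function `int counter::add(int amount)` looks
 * like once the hidden object pointer is made visible. Under the GCC
 * convention described above the pointer is pushed last; under the Windows
 * convention it arrives in ECX.
 */
int counter_add(struct counter *self, int amount)
{
    assert(self != NULL);
    self->value += amount;
    return self->value;
}
```

A C++ call such as c.add(3) then corresponds to counter_add(&c, 3) in this model.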

The thiscall calling convention cannot be specified explicitly, because thiscall is not a keyword. Disassemblers such as IDA still need to name it, so IDA uses the keyword __thiscall__ for this purpose.

Intel ABI

The Intel Application Binary Interface is a computer programming standard that most compilers and languages follow. According to the Intel ABI, the EAX, EDX, and ECX registers are free for use within a procedure or function and need not be preserved.

Microsoft x64 calling convention

The x64 calling convention takes advantage of additional register space in the x86-64 / Intel EM64T platform. The registers RCX, RDX, R8, R9 are used for integer and pointer arguments, and XMM0, XMM1, XMM2, XMM3 are used for floating point arguments. Additional arguments are pushed onto the stack. The return value is stored in RAX.
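
The register assignment can be sketched as a small lookup. Note one assumption that the paragraph above does not state: following Microsoft's documentation, the sketch assigns registers by argument position, so a floating point argument in the second position travels in XMM1 rather than XMM0. The enum names and the ms_x64_location helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum arg_kind { ARG_INTEGER, ARG_FLOAT };   /* ARG_INTEGER also covers pointers */

enum x64_location {
    X64_RCX, X64_RDX, X64_R8, X64_R9,
    X64_XMM0, X64_XMM1, X64_XMM2, X64_XMM3,
    X64_STACK
};

/*
 * Hypothetical helper: where the argument at zero-based position `index`
 * is passed. Assumes positional assignment: position i uses entry i of
 * either register list, and the fifth argument onward goes on the stack.
 */
enum x64_location ms_x64_location(size_t index, enum arg_kind kind)
{
    static const enum x64_location int_regs[4] =
        { X64_RCX, X64_RDX, X64_R8, X64_R9 };
    static const enum x64_location fp_regs[4] =
        { X64_XMM0, X64_XMM1, X64_XMM2, X64_XMM3 };

    assert(kind == ARG_INTEGER || kind == ARG_FLOAT);
    if (index >= 4)
        return X64_STACK;
    return kind == ARG_FLOAT ? fp_regs[index] : int_regs[index];
}
```

For a hypothetical function taking (int, double, char *, float, int), the arguments would land in RCX, XMM1, R8, XMM3, and on the stack respectively.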

Standard Exit and Entry Sequences

The standard entry sequence to a function is as follows:

_function:
    push ebp       ;store the old base pointer
    mov ebp, esp   ;make the base pointer point to the current top of the stack;
                   ;the parameters begin at [ebp + 8]
    sub esp, x     ;x is the size, in bytes, of all "automatic variables"
                   ;in the function

This sequence preserves the original base pointer ebp, points ebp at a fixed location in the new stack frame from which the function parameters can be addressed, and creates space for automatic variables on the stack. Local variables are created on the stack with each call to the function and are cleaned up when the function returns. This behavior allows functions to be called recursively. In C and C++, variables declared "automatic" are created in this way.
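
The link between per-call stack frames and recursion can be seen in a short C example. This is only an illustration: the function factorial and its variable names are hypothetical, and the comments describe the usual stack-based implementation rather than anything the C standard requires.

```c
#include <assert.h>

/*
 * Hypothetical recursive function. `mine` and `rest` are automatic
 * variables, so every active call owns a separate copy in its own frame.
 */
unsigned long factorial(unsigned int n)
{
    unsigned long mine = n;    /* set before recursing */
    unsigned long rest;

    assert(n <= 12);           /* 12! still fits in 32 bits */
    if (n <= 1)
        return 1;
    rest = factorial(n - 1);   /* deeper calls write their own `mine` */
    return mine * rest;        /* this call's copy is still intact */
}
```

If the local variables lived at one fixed address instead of in a fresh frame, each nested call would overwrite mine and the final product would be wrong.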

The standard exit sequence is as follows:

   mov esp, ebp   ;reset the stack to "clean" away the local variables
   pop ebp        ;restore the original base pointer
   ret            ;return from the function

The following C function:

int _cdecl MyFunction(int i){
    int k;          /* left uninitialized to keep the example short; its value is indeterminate */
    return i + k;
}

would produce assembly code equivalent to the following:

   ;entry sequence
   push ebp
   mov ebp, esp
   sub esp, 4     ;create space for "int k"

   ;function code
   mov eax, [ebp + 8] 
                  ;move parameter i to accumulator
   add eax, [ebp - 4]
                  ;add k to i
                  ;answer is returned in eax

   ;exit sequence
   mov esp, ebp
   pop ebp
   ret
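
The offsets [ebp + 8] and [ebp - 4] used above follow from the frame layout set up by the entry sequence: [ebp] holds the saved base pointer and [ebp + 4] the return address. The sketch below computes such offsets, assuming 4-byte parameters and locals and assuming that locals are laid out in declaration order, which is a compiler-specific choice. The helper names are hypothetical.

```c
#include <assert.h>

/*
 * Offset from EBP of the zero-based 4-byte parameter `index`.
 * [ebp + 0] holds the saved EBP and [ebp + 4] the return address,
 * so the first parameter sits at [ebp + 8].
 */
int param_offset(int index)
{
    assert(index >= 0);
    return 8 + 4 * index;
}

/*
 * Offset from EBP of the zero-based 4-byte local `index`. Locals sit
 * below EBP, so the first one is at [ebp - 4] (assumed layout).
 */
int local_offset(int index)
{
    assert(index >= 0);
    return -4 * (index + 1);
}
```

For MyFunction, param_offset(0) gives 8 for the parameter i and local_offset(0) gives -4 for the local k, matching the listing.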

See also