  • Calling Convention

    Calling Conventions Demystified

    Visual C++ calling conventions explained

    Introduction

    During the long, hard, yet beautiful process of learning C++ programming for Windows, you have probably been curious about the strange specifiers that sometimes appear in front of function declarations, like __cdecl, __stdcall, __fastcall, WINAPI, etc. After looking through MSDN, or some other reference, you probably found out that these specifiers specify the calling conventions for functions. In this article, I will try to explain the different calling conventions used by Visual C++ (and probably other Windows C/C++ compilers). I emphasize that the above-mentioned specifiers are Microsoft-specific, and that you should not use them if you want to write portable code.

    So, what are calling conventions? When a function is called, arguments are typically passed to it, and a return value is retrieved. A calling convention describes how the arguments are passed and how values are returned by functions. It also specifies how the function names are decorated. Is it really necessary to understand the calling conventions to write good C/C++ programs? Not at all. However, it may be helpful with debugging. Also, it is necessary for linking C/C++ with assembly code.

    To understand this article, you will need to have some very basic knowledge of assembly programming.

    No matter which calling convention is used, the following things will happen:

    1. All arguments are widened to 4 bytes (on Win32, of course), and put into appropriate memory locations. These locations are typically on the stack, but may also be in registers; this is specified by calling conventions.
    2. Program execution jumps to the address of the called function.
    3. Inside the function, registers ESI, EDI, EBX, and EBP are saved on the stack. The part of the code that performs these operations is called the function prolog and is usually generated by the compiler.
    4. The function-specific code is executed, and the return value is placed into the EAX register.
    5. Registers ESI, EDI, EBX, and EBP are restored from the stack. The piece of code that does this is called the function epilog, and as with the function prolog, in most cases the compiler generates it.
    6. Arguments are removed from the stack. This operation is called stack cleanup and may be performed either inside the called function or by the caller, depending on the calling convention used.
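
The six steps above can be pictured with a toy model. The following is only a sketch, not real machine code: it imitates the stack with a std::vector and, for brevity, leaves out the register saves of the prolog and epilog. The names ToyMachine, toySum, and callSum are made up for this illustration, and caller-side cleanup is assumed in step 6.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a 32-bit stack machine; every name here is hypothetical.
struct ToyMachine {
    std::vector<std::int32_t> stack;  // back() is the top of the stack
    std::int32_t eax = 0;             // stands in for the EAX register

    void push(std::int32_t v) { stack.push_back(v); }
    std::int32_t pop() {
        const std::int32_t v = stack.back();
        stack.pop_back();
        return v;
    }
};

// Callee: finds its arguments relative to the top of the stack.
// Layout on entry, top first: return address, a, b.
inline void toySum(ToyMachine& m) {
    const std::size_t top = m.stack.size() - 1;
    const std::int32_t a = m.stack[top - 1];
    const std::int32_t b = m.stack[top - 2];
    m.eax = a + b;  // step 4: the result goes to EAX
}

// Caller, assuming caller-side stack cleanup.
inline std::int32_t callSum(ToyMachine& m, std::int32_t a, std::int32_t b) {
    m.push(b);   // step 1: arguments are pushed right to left
    m.push(a);
    m.push(0);   // step 2: the call pushes a (dummy) return address
    toySum(m);   // steps 3 to 5 happen inside the callee
    m.pop();     // the return pops the return address
    m.pop();     // step 6: the caller removes both arguments
    m.pop();
    return m.eax;
}
```

After callSum(m, 2, 3) the model's eax holds 5 and the stack is empty again, which is the balance that every convention has to restore one way or another.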

    As an example for the calling conventions (except for thiscall), we are going to use a simple function:

    int sumExample (int a, int b)
    {
        return a + b;
    }

    The call to this function will look like this:

    int c = sumExample (2, 3);
    

    For the __cdecl, __stdcall, and __fastcall calling conventions, I compiled the example code as C (not C++). The function name decorations, mentioned later in the article, apply to the C decoration scheme. C++ name decorations are beyond the scope of this article.

    C calling convention (__cdecl)

    This convention is the default for C/C++ programs (compiler option /Gd). If a project is set to use some other calling convention, we can still declare a function to use __cdecl:

    int __cdecl sumExample (int a, int b);

    The main characteristics of __cdecl calling convention are:

    1. Arguments are passed from right to left, and placed on the stack.
    2. Stack cleanup is performed by the caller.
    3. Function name is decorated by prefixing it with an underscore character '_'.

    Now, take a look at an example of a __cdecl call:

    ; // push arguments to the stack, from right to left
    push        3    
    push        2    
    
    ; // call the function
    call        _sumExample 
    
    ; // clean up the stack by adding the size of the arguments to the ESP register
    add         esp,8 
    
    ; // copy the return value from EAX to a local variable (int c)
    mov         dword ptr [c],eax

    The called function is shown below:

    ; // function prolog
      push        ebp  
      mov         ebp,esp 
      sub         esp,0C0h 
      push        ebx  
      push        esi  
      push        edi  
      lea         edi,[ebp-0C0h] 
      mov         ecx,30h 
      mov         eax,0CCCCCCCCh 
      rep stos    dword ptr [edi] 
      
    ; //    return a + b;
      mov         eax,dword ptr [a] 
      add         eax,dword ptr [b] 
    
    ; // function epilog
      pop         edi  
      pop         esi  
      pop         ebx  
      mov         esp,ebp 
      pop         ebp  
      ret

    Standard calling convention (__stdcall)

    This convention is usually used to call Win32 API functions. In fact, WINAPI is nothing but another name for __stdcall:

    #define WINAPI __stdcall

    We can explicitly declare a function to use the __stdcall convention:

    int __stdcall sumExample (int a, int b);

    Also, we can use the compiler option /Gz to specify __stdcall for all functions not explicitly declared with some other calling convention.
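
Since the convention is part of a function's type, it has to appear on pointers to the function as well. A minimal sketch follows; EXAMPLE_STDCALL, SumCallback, and invoke are hypothetical names, and the macro expands to nothing on targets where __stdcall has no meaning, so the same source should also build outside Visual C++.

```cpp
// Hypothetical wrapper macro: the Microsoft keyword where it is understood,
// the GCC/Clang attribute on 32-bit x86, and nothing elsewhere.
#if defined(_MSC_VER)
#  define EXAMPLE_STDCALL __stdcall
#elif (defined(__GNUC__) || defined(__clang__)) && defined(__i386__)
#  define EXAMPLE_STDCALL __attribute__((stdcall))
#else
#  define EXAMPLE_STDCALL
#endif

// The convention appears in the declaration, in the definition,
// and in any pointer type that refers to the function.
int EXAMPLE_STDCALL sumExample(int a, int b);

typedef int (EXAMPLE_STDCALL *SumCallback)(int, int);

int EXAMPLE_STDCALL sumExample(int a, int b) {
    return a + b;
}

// A call through the pointer uses the convention named in the pointer type.
inline int invoke(SumCallback cb, int a, int b) {
    return cb(a, b);
}
```

With this in place, invoke(sumExample, 2, 3) yields 5. On 32-bit Visual C++, assigning a __cdecl function to SumCallback should be rejected at compile time, which is the kind of mismatch the explicit pointer type is meant to catch.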

    The main characteristics of __stdcall calling convention are:

    1. Arguments are passed from right to left, and placed on the stack.
    2. Stack cleanup is performed by the called function.
    3. Function name is decorated by prepending an underscore character and appending a '@' character and the number of bytes of stack space required.

    The example follows:

    ; // push arguments to the stack, from right to left
      push        3    
      push        2    
      
    ; // call the function
      call        _sumExample@8
    
    ; // copy the return value from EAX to a local variable (int c)  
      mov         dword ptr [c],eax

    The function code is shown below:

    ; // function prolog goes here (the same code as in the __cdecl example)
    
    ; //    return a + b;
      mov         eax,dword ptr [a] 
      add         eax,dword ptr [b] 
    
    ; // function epilog goes here (the same code as in the __cdecl example)
    
    ; // clean up the stack and return
      ret         8

    Because the stack is cleaned by the called function, the __stdcall calling convention creates smaller executables than __cdecl, in which the stack cleanup code must be generated at every call site. On the other hand, functions with a variable number of arguments (like printf()) must use __cdecl, because only the caller knows the number of arguments in each function call; therefore only the caller can perform the stack cleanup.
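
The variadic case can be made concrete with a small example. In the sketch below (sumAll is a hypothetical function, not part of the original example), the callee is compiled once, yet each call site passes a different number of bytes, so a fixed ret N inside the callee could not be right for all callers.

```cpp
#include <cstdarg>

// Hypothetical variadic function: the callee learns the argument count only
// from its first parameter, not from anything the calling convention provides.
int sumAll(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);  // read the next int argument
    }
    va_end(args);
    return total;
}
```

The calls sumAll(2, 10, 20) and sumAll(3, 1, 2, 3) reach the same machine code with different amounts of argument data, which is why the cleanup is left to the caller.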

    Fast calling convention (__fastcall)

    Fast calling convention indicates that the arguments should be placed in registers, rather than on the stack, whenever possible. This reduces the cost of a function call, because operations with registers are faster than with the stack.

    We can explicitly declare a function to use the __fastcall convention as shown:

    int __fastcall sumExample (int a, int b);

    We can also use the compiler option /Gr to specify __fastcall for all functions not explicitly declared with some other calling convention.

    The main characteristics of __fastcall calling convention are:

    1. The first two function arguments that require 32 bits or less are placed into registers ECX and EDX. The rest of them are pushed on the stack from right to left.
    2. Arguments are popped from the stack by the called function.
    3. Function name is decorated by prepending a '@' character and appending a '@' and the number of bytes (decimal) of space required by the arguments.

    Note: Microsoft have reserved the right to change the registers for passing the arguments in future compiler versions.

    Here goes an example:

    ; // put the arguments in the registers EDX and ECX
      mov         edx,3 
      mov         ecx,2 
      
    ; // call the function
      call        @sumExample@8
      
    ; // copy the return value from EAX to a local variable (int c)  
      mov         dword ptr [c],eax

    Function code:

    ; // function prolog
    
      push        ebp  
      mov         ebp,esp 
      sub         esp,0D8h 
      push        ebx  
      push        esi  
      push        edi  
      push        ecx  
      lea         edi,[ebp-0D8h] 
      mov         ecx,36h 
      mov         eax,0CCCCCCCCh 
      rep stos    dword ptr [edi] 
      pop         ecx  
      mov         dword ptr [ebp-14h],edx 
      mov         dword ptr [ebp-8],ecx 
    ; // return a + b;
      mov         eax,dword ptr [a] 
      add         eax,dword ptr [b] 
    ; // function epilog
      pop         edi  
      pop         esi  
      pop         ebx  
      mov         esp,ebp 
      pop         ebp  
      ret

    How fast is this calling convention, compared to __cdecl and __stdcall? Find out for yourselves. Set the compiler option /Gr, and compare the execution time. I didn't find __fastcall to be any faster than the other calling conventions, but you may come to different conclusions.
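
One way to run that comparison without changing /Gr for a whole project is to time two otherwise identical functions. This is a rough sketch under several assumptions: EXAMPLE_FASTCALL, EXAMPLE_NOINLINE, sumDefault, sumFast, Timing, and timeCalls are invented names, the fastcall macro is empty on targets that have no such convention, and a loop this simple measures little more than call overhead.

```cpp
#include <chrono>

#if defined(_MSC_VER)
#  define EXAMPLE_FASTCALL __fastcall
#  define EXAMPLE_NOINLINE __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#  if defined(__i386__)
#    define EXAMPLE_FASTCALL __attribute__((fastcall))
#  else
#    define EXAMPLE_FASTCALL
#  endif
#  define EXAMPLE_NOINLINE __attribute__((noinline))
#else
#  define EXAMPLE_FASTCALL
#  define EXAMPLE_NOINLINE
#endif

// noinline asks the compiler to keep the call we want to measure.
EXAMPLE_NOINLINE int sumDefault(int a, int b) { return a + b; }
EXAMPLE_NOINLINE int EXAMPLE_FASTCALL sumFast(int a, int b) { return a + b; }

struct Timing {
    long long checksum;   // accumulated results, so the calls are not dead code
    double milliseconds;  // wall-clock time spent in the loop
};

// Calls fn(i, 1) for i in [0, iterations) and reports the elapsed time.
template <typename Fn>
Timing timeCalls(Fn fn, int iterations) {
    const auto start = std::chrono::steady_clock::now();
    long long checksum = 0;
    for (int i = 0; i < iterations; ++i) {
        checksum += fn(i, 1);
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return {checksum, elapsed.count()};
}
```

Comparing timeCalls(sumDefault, n).milliseconds with timeCalls(sumFast, n).milliseconds for a large n gives a first impression. An optimizing build may still rearrange the loop, so the numbers deserve some caution.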

    Thiscall

    Thiscall is the default calling convention for calling member functions of C++ classes (except for those with a variable number of arguments).

    The main characteristics of thiscall calling convention are:

    1. Arguments are passed from right to left, and placed on the stack. The this pointer is placed in ECX.
    2. Stack cleanup is performed by the called function.

    The example for this calling convention had to be a little different. First, the code is compiled as C++, and not C. Second, we have a struct with a member function, instead of a global function.

    struct CSum
    {
        int sum ( int a, int b) {return a+b;}
    };

    The assembly code for the function call looks like this:

    push        3
    push        2
    lea         ecx,[sumObj]
    call        ?sum@CSum@@QAEHHH@Z            ; CSum::sum
    mov         dword ptr [s4],eax
    

    The function itself is given below:

    push        ebp
    mov         ebp,esp
    sub         esp,0CCh
    push        ebx
    push        esi
    push        edi
    push        ecx
    lea         edi,[ebp-0CCh]
    mov         ecx,33h
    mov         eax,0CCCCCCCCh
    rep stos    dword ptr [edi]
    pop         ecx
    mov         dword ptr [ebp-8],ecx
    mov         eax,dword ptr [a]
    add         eax,dword ptr [b]
    pop         edi
    pop         esi
    pop         ebx
    mov         esp,ebp
    pop         ebp
    ret         8
    

    Now, what happens if we have a member function with a variable number of arguments? In that case, __cdecl is used, and the this pointer is pushed onto the stack last.
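
The hidden this argument can be made visible by writing the same function twice, once as a member and once as a free function that takes the object pointer explicitly. This is only an analogy for what the compiler passes in ECX, not a way to call a member function. BiasedSum, its bias member, and BiasedSum_sum are additions made for this sketch, so that the object pointer actually matters.

```cpp
// Hypothetical class with one data member, so that `this` is really used.
struct BiasedSum {
    int bias = 0;
    int sum(int a, int b) { return bias + a + b; }  // `this` is implicit here
};

// Roughly how the member function looks to the machine: the object pointer
// is an extra first argument (kept in ECX under thiscall on 32-bit x86).
int BiasedSum_sum(BiasedSum* self, int a, int b) {
    return self->bias + a + b;
}
```

For an object whose bias is 10, both obj.sum(2, 3) and BiasedSum_sum(&obj, 2, 3) give 15; the difference lies only in how the object pointer reaches the function.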

    Conclusion

    To cut a long story short, we'll outline the main differences between the calling conventions:

    • __cdecl is the default calling convention for C and C++ programs. The advantage of this calling convention is that it allows functions with a variable number of arguments to be used. The disadvantage is that it creates larger executables.
    • __stdcall is used to call Win32 API functions. It does not allow functions to have a variable number of arguments.
    • __fastcall attempts to put arguments in registers, rather than on the stack, thus making function calls faster.
    • Thiscall calling convention is the default calling convention used by C++ member functions that do not use variable arguments.

    In most cases, this is all you'll ever need to know about the calling conventions.

    The following calling conventions are supported by the Visual C/C++ compiler.

    Table 1
    Keyword      | Stack cleanup | Parameter passing
    -------------+---------------+---------------------------------------------------------------------------
    __cdecl      | Caller        | Pushes parameters on the stack, in reverse order (right to left)
    __clrcall    | n/a           | Load parameters onto CLR expression stack in order (left to right)
    __stdcall    | Callee        | Pushes parameters on the stack, in reverse order (right to left)
    __fastcall   | Callee        | Stored in registers, then pushed on stack
    __thiscall   | Callee        | Pushed on stack; this pointer stored in ECX
    __vectorcall | Callee        | Stored in registers, then pushed on stack in reverse order (right to left)

    Okay, here we go: The 32-bit x86 calling conventions. (By the way, in case people didn’t get it: I’m only talking in the context of calling conventions you’re likely to encounter when doing Windows programming or which are used by Microsoft compilers. I do not intend to cover calling conventions for other operating systems or that are specific to a particular language or compiler vendor.) Remember: If a calling convention is used for a C++ member function, then there is a hidden “this” parameter that is the implicit first parameter to the function.

    All
    The 32-bit x86 calling conventions all preserve the EDI, ESI, EBP, and EBX registers, using the EDX:EAX pair for return values.
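
For a 64-bit integer result, the pair means that EDX carries the high half and EAX the low half. A small sketch of that split follows; RegisterPair, splitReturnValue, and joinReturnValue are hypothetical helpers that only mimic the register contents in ordinary variables.

```cpp
#include <cstdint>

// Hypothetical helper type: the two 32-bit halves of a 64-bit return value.
struct RegisterPair {
    std::uint32_t edx;  // high 32 bits
    std::uint32_t eax;  // low 32 bits
};

// Splits a 64-bit value the way it would be spread over EDX:EAX.
inline RegisterPair splitReturnValue(std::uint64_t value) {
    return { static_cast<std::uint32_t>(value >> 32),
             static_cast<std::uint32_t>(value & 0xFFFFFFFFu) };
}

// Reassembles the value as the caller would after the call returns.
inline std::uint64_t joinReturnValue(RegisterPair r) {
    return (static_cast<std::uint64_t>(r.edx) << 32) | r.eax;
}
```

Results of 32 bits or less use EAX alone, as in the sumExample listings earlier.
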
    C (__cdecl)

    The same constraints apply to the 32-bit world as in the 16-bit world. The parameters are pushed from right to left (so that the first parameter is nearest to top-of-stack), and the caller cleans the parameters. Function names are decorated by a leading underscore.

    __stdcall

    This is the calling convention used for Win32, with exceptions for variadic functions (which necessarily use __cdecl) and a very few functions that use __fastcall. Parameters are pushed from right to left and the callee cleans the stack. Function names are decorated by a leading underscore and a trailing @-sign followed by the number of bytes of parameters taken by the function.

    __fastcall

    The first two parameters are passed in ECX and EDX, with the remainder passed on the stack as in __stdcall. Again, the callee cleans the stack. Function names are decorated by a leading @-sign and a trailing @-sign followed by the number of bytes of parameters taken by the function (including the register parameters).
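
The three C decoration rules described so far are mechanical enough to write down as a function. The sketch below uses invented names (Convention, decorateCName) and covers only the C scheme for __cdecl, __stdcall, and __fastcall; C++ decoration is far more involved and is not attempted.

```cpp
#include <string>

enum class Convention { Cdecl, Stdcall, Fastcall };

// Builds the C (not C++) decorated name for a function.
// `argBytes` is the total size of the parameters in bytes, including any
// passed in registers; it is ignored for __cdecl.
inline std::string decorateCName(const std::string& name, Convention conv,
                                 unsigned argBytes) {
    switch (conv) {
    case Convention::Cdecl:
        return "_" + name;
    case Convention::Stdcall:
        return "_" + name + "@" + std::to_string(argBytes);
    case Convention::Fastcall:
        return "@" + name + "@" + std::to_string(argBytes);
    }
    return name;  // not reached for valid enumerators
}
```

For the two-int sumExample this gives _sumExample, _sumExample@8, and @sumExample@8, matching the call targets in the assembly listings.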

    thiscall

    The first parameter (which is the “this” parameter) is passed in ECX, with the remainder passed on the stack as in __stdcall. Once again, the callee cleans the stack. Function names are decorated by the C++ compiler in an extraordinarily complicated mechanism that encodes the types of each of the parameters, among other things. This is necessary because C++ permits function overloading, so a complex decoration scheme must be used so that the various overloads have different decorated names.
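
The need for type-encoding names is easy to see from the source side. In this sketch (add and c_add are hypothetical functions), two overloads share one name in the source and therefore need two distinct names at link time, while the extern "C" function keeps the simple C decoration and so cannot be overloaded with a second extern "C" function of the same name.

```cpp
// Two overloads with one source-level name: the linker has to tell them
// apart, so the decorated names must encode the parameter types.
int add(int a, int b) { return a + b; }
double add(double a, double b) { return a + b; }

// extern "C" requests C linkage, and with it the plain C decoration.
extern "C" int c_add(int a, int b) { return a + b; }
```

The exact decorated names are compiler-specific, so they are best inspected with the toolchain's own tools rather than guessed.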

    Here is a quick overview of common calling conventions. Note that the calling conventions are usually more complex than represented here (for instance, how is a large struct returned? How about a struct that fits in two registers? How about va_list's?). Look up the specifications if you want to be certain. It may be useful to write a test function and use gcc -S to see how the compiler generates code, which may give a hint of how the calling convention specification should be interpreted.

    Platform            | Return Value | Parameter Registers        | Additional Parameters     | Stack Alignment     | Scratch Registers                         | Preserved Registers                         | Call List
    --------------------+--------------+----------------------------+---------------------------+---------------------+-------------------------------------------+---------------------------------------------+----------
    System V i386       | eax, edx     | none                       | stack (right to left) [1] |                     | eax, ecx, edx                             | ebx, esi, edi, ebp, esp                     | ebp
    System V X86_64 [2] | rax, rdx     | rdi, rsi, rdx, rcx, r8, r9 | stack (right to left) [1] | 16-byte at call [3] | rax, rdi, rsi, rdx, rcx, r8, r9, r10, r11 | rbx, rsp, rbp, r12, r13, r14, r15           | rbp
    Microsoft x64       | rax          | rcx, rdx, r8, r9           | stack (right to left) [1] | 16-byte at call [3] | rax, rcx, rdx, r8, r9, r10, r11           | rbx, rdi, rsi, rsp, rbp, r12, r13, r14, r15 | rbp
    ARM                 | r0, r1       | r0, r1, r2, r3             | stack                     | 8 byte [4]          | r0, r1, r2, r3, r12                       | r4, r5, r6, r7, r8, r9, r10, r11, r13, r14  |

    Note 1: The called function is allowed to modify the arguments on the stack and the caller must not assume the stack parameters are preserved. The caller should clean up the stack.

    Note 2: There is a 128-byte area below the stack pointer called the 'red zone', which leaf functions may use without adjusting %rsp. This requires the kernel to skip an additional 128 bytes below %rsp when delivering signals to user space. The CPU does not do this: if interrupts use the current stack (as with kernel code), and the red zone is enabled (the default), then interrupts will silently corrupt the stack. Always pass -mno-red-zone when compiling kernel code (even support libraries such as a libc embedded in the kernel) if interrupts don't respect the red zone.

    Note 3: Stack is 16 byte aligned at time of call. The call pushes %rip, so the stack is 16-byte aligned again if the callee pushes %rbp.

    Note 4: Stack is 8 byte aligned at all times outside of prologue/epilogue of function.
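
The arithmetic behind Note 3 can be checked with plain integers. The sketch below does not touch any real register; afterCall, afterPushRbp, and isAligned16 are hypothetical helpers operating on a pretend %rsp value, assuming 8-byte pushes as on x86-64.

```cpp
#include <cstdint>

// The call instruction pushes the 8-byte return address.
inline std::uint64_t afterCall(std::uint64_t rsp) { return rsp - 8; }

// A callee that begins with "push rbp" moves the stack pointer 8 more bytes.
inline std::uint64_t afterPushRbp(std::uint64_t rsp) { return rsp - 8; }

// True when the pretend stack pointer is a multiple of 16.
inline bool isAligned16(std::uint64_t rsp) { return rsp % 16 == 0; }
```

Starting from an aligned value, afterCall leaves the pointer 8 bytes off a 16-byte boundary, and afterPushRbp brings it back onto one, which is the sequence Note 3 describes.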

    https://wiki.osdev.org/Calling_Conventions

    https://devblogs.microsoft.com/oldnewthing/20040108-00/?p=41163

    https://www.codeproject.com/articles/1388/calling-conventions-demystified

    https://docs.microsoft.com/en-us/cpp/cpp/argument-passing-and-naming-conventions?view=msvc-160

    https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-160

    https://stackoverflow.com/questions/3404372/stdcall-and-cdecl

  • Original article: https://www.cnblogs.com/Searchor/p/14023898.html