  • GCC Inline Assembly

    The format of basic inline assembly is straightforward. Its basic form is:

    asm("assembly code");

    Example.


    asm("movl %ecx, %eax");       /* moves the contents of ecx to eax */
    __asm__("movb %bh, (%eax)");  /* moves the byte from bh to the memory pointed to by eax */

    You might have noticed that I used both asm and __asm__ here. Both are valid; we can use __asm__ if the keyword asm conflicts with something in our program. If we have more than one instruction, we write one per line in double quotes and suffix each instruction with '\n' and '\t'. This is because gcc sends each instruction as a string to as (GAS), and the newline/tab makes sure the assembler receives correctly formatted lines.

    Example.


    __asm__ ("movl %eax, %ebx\n\t"
             "movl $56, %esi\n\t"
             "movl %ecx, label(%edx,%ebx,4)\n\t"
             "movb %ah, (%ebx)");

    If our code touches some registers (i.e., changes their contents) and returns from asm without restoring them, something bad is going to happen. GCC has no idea that the register contents changed, and this leads to trouble, especially when the compiler optimizes. It will assume that a register still holds the value of some variable that we have in fact overwritten, and carry on as if nothing happened. We can restrict ourselves to instructions with no side effects, restore everything before we leave, or wait for something to crash. This is where we want some extended functionality, and extended asm provides it.


    5. Extended Asm.

    In basic inline assembly, we had only instructions. In extended assembly, we can also specify the operands. It allows us to specify the input registers, the output registers and a list of clobbered registers. It is not mandatory to name the registers to use; we can leave that headache to GCC, which probably fits GCC's optimization scheme better. The basic format is:


    asm ( assembler template
        : output operands                  /* optional */
        : input operands                   /* optional */
        : list of clobbered registers      /* optional */
        );

    The assembler template consists of assembly instructions. Each operand is described by an operand-constraint string followed by the C expression in parentheses. A colon separates the assembler template from the first output operand, and another separates the last output operand from the first input, if any. Commas separate the operands within each group. The total number of operands is limited to ten or to the maximum number of operands in any instruction pattern in the machine description, whichever is greater (current GCC documentation gives the limit as 30).

    If there are no output operands but there are input operands, we must still write both colons: the two consecutive colons enclose the empty place where the output operands would go.

    Example:


    asm ("cld\n\t"
         "rep\n\t"
         "stosl"
         : /* no output registers */
         : "c" (count), "a" (fill_value), "D" (dest)
         : "%ecx", "%edi"
         );

    Now, what does this code do? The inline above stores fill_value count times, starting at the location pointed to by the register edi. It also tells gcc that the contents of the registers ecx and edi are no longer valid. Note that recent GCC releases reject a clobber that names a register already bound to an operand; there, registers that the instructions modify are declared as read-write operands instead. Let us look at one more example to make things clearer.


    int a=10, b;
    asm ("movl %1, %%eax;\n\t"
         "movl %%eax, %0;"
         :"=r"(b)        /* output */
         :"r"(a)         /* input */
         :"%eax"         /* clobbered register */
         );

    Here we made the value of 'b' equal to that of 'a' using assembly instructions. Some points of interest are:

    • "b" is the output operand, referred to by %0, and "a" is the input operand, referred to by %1.
    • "r" is a constraint on the operands. We'll see constraints in detail later. For the time being, "r" tells GCC to use any register for storing the operand. An output operand constraint should have the constraint modifier "=", which says that it is an output operand and is write-only.
    • There are two %'s prefixed to the register name. This helps GCC distinguish between operands and registers; operands have a single % as prefix.
    • The clobbered register %eax after the third colon tells GCC that the value of %eax is modified inside "asm", so GCC won't use this register to store any other value.

    When the execution of "asm" is complete, "b" reflects the updated value, because it is specified as an output operand. In other words, the change made to "b" inside "asm" is visible outside the "asm".

    Now we can look at each field in detail.

    5.1 Assembler Template.

    The assembler template contains the set of assembly instructions that gets inserted inside the C program. Either each instruction is enclosed in double quotes, or the entire group of instructions is within one pair of double quotes. Each instruction should also end with a delimiter. The valid delimiters are newline (\n) and semicolon (;). '\n' may be followed by a tab (\t). We already know the reason for the newline/tab: the template is handed to the assembler as text. Operands corresponding to the C expressions are represented by %0, %1 ... etc.

    5.2 Operands.

    C expressions serve as operands for the assembly instructions inside "asm". Each operand is written as an operand constraint in double quotes, followed by the C expression in parentheses that stands for the operand. For output operands, a constraint modifier also appears within the quotes. That is,

    "constraint" (C expression) is the general form, with an additional modifier for output operands. Constraints are primarily used to decide the addressing modes for operands. They are also used to specify the registers to be used.

    If we use more than one operand, they are separated by commas.

    In the assembler template, each operand is referenced by number. Numbering is done as follows. If there are a total of n operands (inputs and outputs together), then the first output operand is numbered 0, continuing in increasing order, and the last input operand is numbered n-1. The maximum number of operands is as we saw in the previous section.

    Output operand expressions must be lvalues. Input operands are not restricted like this; they may be expressions. The extended asm feature is most often used for machine instructions the compiler itself does not know exist ;-). If the output expression cannot be directly addressed (for example, it is a bit-field), our constraint must allow a register. In that case, GCC will use the register as the output of the asm and then store that register's contents into the output.

    As stated above, ordinary output operands must be write-only; GCC will assume that the values in these operands before the instruction are dead and need not be generated. Extended asm also supports input-output or read-write operands.
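The operand numbering and the read-write operands mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the original examples: the helper name add_in_place is hypothetical, it relies on the "+" constraint modifier that GCC documents for operands that are both read and written, and the plain C branch is only there so the sketch still builds on non-x86 targets.

```c
#include <assert.h>

/* Hypothetical helper (not from the original text): adds delta to acc.
 * Operand %0 is the first (and only) output, operand %1 the first input. */
static int add_in_place(int acc, int delta)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("addl %1, %0"
             : "+r" (acc)     /* %0: read first, then overwritten ("+") */
             : "r" (delta)    /* %1: read only */
             : "cc");         /* addl changes the condition codes */
#else
    acc += delta;             /* portable fallback for non-x86 targets */
#endif
    return acc;
}
```

Calling add_in_place(10, 5) should yield 15. Because acc is declared with "+r", GCC loads its old value into the chosen register before the instruction runs, which a plain "=r" output would not guarantee.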

    Now let us concentrate on some examples. We want to multiply a number by 5. For that we use the instruction lea: leal (r1,r2,4), r3 computes r1 + r2*4 and stores it in r3, so using x for both r1 and r2 yields 5*x in a single instruction.


    asm ("leal (%1,%1,4), %0"
         : "=r" (five_times_x)
         : "r" (x)
         );

    Here our input is in 'x'. We didn't specify the register to be used. GCC will choose some register for input, one for output, and do what we asked. If we want the input and output to reside in the same register, we can instruct GCC to do so by specifying the proper constraint: the input below is tied to output operand 0.


    asm ("leal (%0,%0,4), %0"
         : "=r" (five_times_x)
         : "0" (x)
         );

    Now the input and output operands are in the same register, but we don't know which register. If we want to specify that as well, there is a way: the constraint "c" selects the register ecx.


    asm ("leal (%%ecx,%%ecx,4), %%ecx"
         : "=c" (x)
         : "c" (x)
         );

    In all three examples above, we didn't put any register in the clobber list. Why? In the first two examples, GCC decides the registers and knows what changes happen. In the last one, we don't have to put ecx on the clobber list, because gcc knows it goes into x. Since GCC knows what ecx holds after the asm, it isn't considered clobbered.
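To try the three lea variants side by side, they can be wrapped in small functions. This is only a sketch: the function names times5_any, times5_same and times5_ecx are hypothetical, and the non-x86 branch is a plain C fallback added so the code compiles everywhere.

```c
#include <assert.h>

#if defined(__i386__) || defined(__x86_64__)
/* Variant 1: GCC picks one register for the input and one for the output. */
static int times5_any(int x)
{
    int five_times_x;
    __asm__ ("leal (%1,%1,4), %0" : "=r" (five_times_x) : "r" (x));
    return five_times_x;
}

/* Variant 2: the matching constraint "0" ties the input to output operand 0. */
static int times5_same(int x)
{
    __asm__ ("leal (%0,%0,4), %0" : "=r" (x) : "0" (x));
    return x;
}

/* Variant 3: the constraint "c" pins both input and output to ecx. */
static int times5_ecx(int x)
{
    __asm__ ("leal (%%ecx,%%ecx,4), %%ecx" : "=c" (x) : "c" (x));
    return x;
}
#else
/* Portable fallbacks for non-x86 targets (assumption of this sketch). */
static int times5_any(int x)  { return 5 * x; }
static int times5_same(int x) { return 5 * x; }
static int times5_ecx(int x)  { return 5 * x; }
#endif
```

All three should return 35 for an input of 7. None of them needs a clobber list: lea does not touch the flags, and every register involved is already described by an operand.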

    5.3 Clobber List.

    Some instructions clobber some hardware registers. We have to list those registers in the clobber list, i.e. the field after the third ':' in the asm statement. This informs gcc that we will use and modify them ourselves, so gcc will not assume that the values it loaded into these registers are still valid. We shouldn't list the input and output registers here, because gcc already knows that "asm" uses them (they are specified explicitly as constraints). If the instructions use any other registers, implicitly or explicitly (registers that appear neither in the input nor in the output constraint list), then those registers have to be specified in the clobber list.

    If our instruction can alter the condition code register, we have to add "cc" to the list of clobbered registers.

    If our instruction modifies memory in an unpredictable fashion, add "memory" to the list of clobbered registers. This causes GCC not to keep memory values cached in registers across the assembler instruction. We also have to add the volatile keyword if the memory affected is not listed in the inputs or outputs of the asm.
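A common use of the "memory" clobber on its own is a compiler-level barrier: an asm with an empty template that emits no instruction but still forces GCC to write cached values back to memory and reload them afterwards. The sketch below is an illustration only; the names compiler_barrier, shared_flag and publish are hypothetical, and the barrier constrains the compiler, not the CPU.

```c
#include <assert.h>

static int shared_flag;   /* hypothetical variable, for illustration only */

/* Empty template plus "memory" clobber: no instruction is emitted, but GCC
 * may not keep memory values cached in registers across this statement. */
static void compiler_barrier(void)
{
    __asm__ __volatile__ ("" : : : "memory");
}

/* Hypothetical helper: the store is issued before the barrier, and the
 * value is read back from memory after it. */
static int publish(int value)
{
    shared_flag = value;
    compiler_barrier();
    return shared_flag;
}
```

The volatile keyword matters here: the asm has no outputs and no instructions, so nothing else marks it as having an effect worth keeping in place.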

    We can read and write the clobbered registers as many times as we like. Consider the example of multiple instructions in a template; it assumes the subroutine _foo accepts arguments in the registers eax and ecx. Their contents can no longer be relied on afterwards, so both go in the clobber list.


    asm ("movl %0,%%eax;\n\t"
         "movl %1,%%ecx;\n\t"
         "call _foo"
         : /* no outputs */
         : "g" (from), "g" (to)
         : "eax", "ecx"
         );

    5.4 Volatile ...?

    If you are familiar with kernel sources or some beautiful code like that, you must have seen many asm statements marked volatile or __volatile__, the keyword following asm or __asm__. I mentioned the keywords asm and __asm__ earlier. So what is this volatile?

    If our assembly statement must execute where we put it (i.e. it must not be moved out of a loop as an optimization), put the keyword volatile after asm and before the ()'s. So to keep it from being moved, deleted and so on, we declare it as

    asm volatile ( ... : ... : ... : ...);

    Use __volatile__ when we have to be very careful.

    If our assembly just does some calculations and doesn't have any side effects, it's better not to use the keyword volatile. Avoiding it helps gcc optimize the code and make it more beautiful.

    In the section Some Useful Recipes, I have provided many examples of inline asm functions. There we can see the clobber list in detail.

     


    6. More about constraints.

    By this time, you might have understood that constraints have a lot to do with inline assembly, but we've said little about them. Constraints can say whether an operand may be in a register, and which kinds of register; whether the operand can be a memory reference, and which kinds of address; whether the operand may be an immediate constant, and which possible values (i.e. range of values) it may have, and so on.

    6.1 Commonly used constraints.

    There are a number of constraints, of which only a few are used frequently. We'll have a look at those.

        • Register operand constraint(r)

          When operands are specified using this constraint, they get stored in General Purpose Registers (GPR). Take the following example:

          asm ("movl %%eax, %0\n" : "=r" (myval));

          Here the variable myval is kept in a register, the value in register eax is copied into that register, and the value of myval is updated in memory from this register. When the "r" constraint is specified, gcc may keep the variable in any of the available GPRs. To pick a particular register, you must name it directly by using the specific register constraints. They are:

          +------------+--------------------+
          | Constraint |    Register(s)     |
          +------------+--------------------+
          |     a      |   %eax, %ax, %al   |
          |     b      |   %ebx, %bx, %bl   |
          |     c      |   %ecx, %cx, %cl   |
          |     d      |   %edx, %dx, %dl   |
          |     S      |   %esi, %si        |
          |     D      |   %edi, %di        |
          +------------+--------------------+

        • Memory operand constraint(m)

          When the operands are in memory, any operations performed on them occur directly in the memory location, as opposed to register constraints, which first store the value in a register to be modified and then write it back to the memory location. Register constraints are usually used only when they are absolutely necessary for an instruction or when they significantly speed up the process. Memory constraints are most useful when a C variable needs to be updated inside "asm" and you really don't want to use a register to hold its value. For example, the value of idtr is stored in the memory location loc:

          asm ("sidt %0\n" : "=m" (loc));   /* sidt writes to loc, so it is an output */

           

        • Matching(Digit) constraints

          In some cases, a single variable may serve as both the input and the output operand. Such cases may be specified in "asm" by using matching constraints.

          asm ("incl %0" :"=a"(var):"0"(var));

          We saw similar examples in the operands subsection. In this example of matching constraints, the register %eax is used as both the input and the output variable. The input var is read into %eax, and the updated %eax is stored in var again after the increment. "0" here specifies the same constraint as the 0th output variable. That is, it specifies that the input instance of var must also live in %eax. This constraint can be used:

          • In cases where input is read from a variable, or the variable is modified and the modification is written back to the same variable.
          • In cases where separate instances of input and output operands are not necessary.

          The most important effect of using matching constraints is that they lead to efficient use of the available registers.

          Some other constraints used are:

          1. "m" : A memory operand is allowed, with any kind of address that the machine supports in general.
          2. "o" : A memory operand is allowed, but only if the address is offsettable, i.e. adding a small offset to the address gives a valid address.
          3. "V" : A memory operand that is not offsettable. In other words, anything that would fit the `m' constraint but not the `o' constraint.
          4. "i" : An immediate integer operand (one with constant value) is allowed. This includes symbolic constants whose values will be known only at assembly time.
          5. "n" : An immediate integer operand with a known numeric value is allowed. Many systems cannot support assembly-time constants for operands less than a word wide. Constraints for these operands should use 'n' rather than 'i'.
          6. "g" : Any register, memory or immediate integer operand is allowed, except for registers that are not general registers.

          The following constraints are x86 specific.

          1. "r" : Register operand constraint, see the table given above.
          2. "q" : Registers a, b, c or d.
          3. "I" : Constant in range 0 to 31 (for 32-bit shifts).
          4. "J" : Constant in range 0 to 63 (for 64-bit shifts).
          5. "K" : 0xff.
          6. "L" : 0xffff.
          7. "M" : 0, 1, 2, or 3 (shifts for lea instruction).
          8. "N" : Constant in range 0 to 255 (for out instruction).
          9. "f" : Floating point register
          10. "t" : First (top of stack) floating point register
          11. "u" : Second floating point register
          12. "A" : Specifies the `a’ or `d’ registers. This is primarily useful for 64-bit integer values intended to be returned with the `d’ register holding the most significant bits and the `a’ register holding the least significant bits.

          6.2 Constraint Modifiers.

          For more precise control over the effects of constraints, GCC provides us with constraint modifiers. The most commonly used constraint modifiers are:

          1. "=" : Means that this operand is write-only for this instruction; the previous value is discarded and replaced by output data.
          2. "&" : Means that this operand is an earlyclobber operand, which is modified before the instruction is finished using the input operands. Therefore, this operand may not lie in a register that is used as an input operand or as part of any memory address. An input operand can be tied to an earlyclobber operand if its only use as an input occurs before the early result is written.

            The list and explanation of constraints is by no means complete. Examples give a better understanding of the use of inline asm. In the next section we'll see some examples, where we'll find more about clobber lists and constraints.
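The earlyclobber modifier is easiest to see in a two-instruction template where the output is written before the last input is read. The following is a minimal sketch, assuming an x86 target; the function name add_early is hypothetical and the C branch is only a portability fallback.

```c
#include <assert.h>

/* Hypothetical helper: computes a + b in two steps.
 * The first instruction already writes %0 while %2 is still needed, so %0
 * must not share a register with an input. "&" tells GCC exactly that. */
static int add_early(int a, int b)
{
    int sum;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("movl %1, %0\n\t"    /* sum = a  (output written early) */
             "addl %2, %0"        /* sum += b (input b read afterwards) */
             : "=&r" (sum)
             : "r" (a), "r" (b)
             : "cc");
#else
    sum = a + b;                  /* portable fallback for non-x86 targets */
#endif
    return sum;
}
```

Without the "&", GCC would be free to place sum in the same register as b, and the movl would then destroy b before the addl reads it.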


          7. Some Useful Recipes.

          Now that we have covered the basic theory of GCC inline assembly, we shall concentrate on some simple examples. It is always handy to write inline asm functions as macros. We can see many asm functions in the kernel code (/usr/src/linux/include/asm/*.h).

          1. First we start with a simple example. We'll write a program to add two numbers.

            #include <stdio.h>

            int main(void)
            {
                    int foo = 10, bar = 15;
                    __asm__ __volatile__("addl  %%ebx,%%eax"
                                         :"=a"(foo)
                                         :"a"(foo), "b"(bar)
                                         );
                    printf("foo+bar=%d\n", foo);
                    return 0;
            }

            Here we insist that GCC store foo in %eax and bar in %ebx, and we also want the result in %eax. The '=' sign shows that it is an output register. Now let us add an integer to a variable in another way.


            __asm__ __volatile__(
                                 "   lock       ;\n"
                                 "   addl %1,%0 ;\n"
                                 : "=m"  (my_var)
                                 : "ir"  (my_int), "m" (my_var)
                                 :                                 /* no clobber-list */
                                 );

            This is an atomic addition. We can remove the instruction 'lock' to remove the atomicity. In the output field, "=m" says that my_var is an output and that it is in memory. Similarly, "ir" says that my_int is an integer that may be given either as an immediate constant ("i") or in some register ("r"; recall the table we saw above). No registers are in the clobber list.
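The atomic addition above can be wrapped in a function for experimenting. This is a sketch under stated assumptions rather than the kernel's implementation: the name atomic_add is hypothetical, the memory operand is declared read-write with "+m" instead of the "=m"/"m" pair, "cc" is listed because addl changes the flags, and the C fallback for non-x86 targets is not atomic.

```c
#include <assert.h>

/* Hypothetical helper: atomically adds amount to *target on x86. */
static void atomic_add(int *target, int amount)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__ ("lock; addl %1, %0"
                          : "+m" (*target)   /* read-modify-write in memory */
                          : "ir" (amount)    /* immediate or register */
                          : "cc");
#else
    *target += amount;   /* fallback for other targets: NOT atomic */
#endif
}
```

After `int counter = 5; atomic_add(&counter, 3);` the variable counter should hold 8.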

          2. Now we'll perform some action on some registers/variables and compare the value.


            __asm__ __volatile__(  "decl %0; sete %1"
                                 : "=m" (my_var), "=q" (cond)
                                 : "m" (my_var)
                                 : "memory"
                                 );

            Here, the value of my_var is decremented by one, and if the resulting value is 0, the variable cond is set. We can add atomicity by adding the instruction "lock;\n\t" as the first instruction in the assembler template.

            In a similar way we can use "incl %0" instead of "decl %0", so as to increment my_var.

            Points to note here are that (i) my_var is a variable residing in memory. (ii) cond is in any of the registers eax, ebx, ecx and edx. The constraint "=q" guarantees it. (iii) memory is in the clobber list, i.e. the code is changing the contents of memory.
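The decrement-and-test recipe can be tried as a small function. This is a sketch only: the name dec_and_test is hypothetical, the counter is passed as a "+m" read-write operand instead of the separate "=m"/"m" pair, and the condition is returned through an unsigned char because sete writes a single byte.

```c
#include <assert.h>

/* Hypothetical helper: decrements *counter and returns 1 if it reached 0. */
static int dec_and_test(int *counter)
{
    unsigned char is_zero;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__ ("decl %0\n\t"
                          "sete %1"
                          : "+m" (*counter),  /* updated directly in memory */
                            "=q" (is_zero)    /* byte-addressable register */
                          :
                          : "cc");            /* decl sets the flags sete reads */
#else
    is_zero = (unsigned char)(--*counter == 0);  /* portable fallback */
#endif
    return is_zero;
}
```

Starting from a counter of 2, the first call should return 0 and the second call 1.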

          3. How do we set or clear a bit in a register? That is our next recipe.


            __asm__ __volatile__(   "btsl %1,%0"
                                  : "=m" (ADDR)
                                  : "Ir" (pos)
                                  : "cc"
                                  );

            Here, the bit at position 'pos' of the variable at ADDR (a memory variable) is set to 1. We can use 'btrl' instead of 'btsl' to clear the bit. The constraint "Ir" on pos says that pos is either in a register or an immediate constant in the range 0-31 (an x86-dependent constraint), i.e. we can set/clear any bit from the 0th to the 31st of the variable at ADDR. As the condition codes will be changed, we add "cc" to the clobber list.
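As a usage sketch of the bit-set recipe, the statement can be wrapped in a function. The name set_bit is hypothetical here, the word is passed as a "+m" operand because btsl both reads and writes it, and callers are assumed to keep pos within 0-31.

```c
#include <assert.h>

/* Hypothetical helper: sets bit pos (assumed 0..31) of *word. */
static void set_bit(unsigned int *word, int pos)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__ ("btsl %1, %0"
                          : "+m" (*word)   /* word is read and written in place */
                          : "Ir" (pos)     /* constant 0..31 or a register */
                          : "cc");         /* btsl changes the carry flag */
#else
    *word |= 1u << pos;                    /* portable fallback */
#endif
}
```

Setting bit 3 of a zeroed word should leave it equal to 8; replacing btsl with btrl in the template would clear the bit instead.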

          4. Now we look at a more complicated but useful function: string copy.

            static inline char * strcpy(char * dest,const char *src)
            {
            int d0, d1, d2;
            __asm__ __volatile__(  "1:\tlodsb\n\t"
                                   "stosb\n\t"
                                   "testb %%al,%%al\n\t"
                                   "jne 1b"
                                 : "=&S" (d0), "=&D" (d1), "=&a" (d2)
                                 : "0" (src),"1" (dest)
                                 : "memory");
            return dest;
            }

            The source address is stored in esi and the destination in edi, and then the copy starts; when we reach the terminating 0, copying is complete. The constraints "&S", "&D", "&a" say that the registers esi, edi and eax are earlyclobber registers, i.e. their contents will change before the completion of the function. It is also clear here why memory is in the clobber list.

            We can see a similar function which moves a block of double words. Notice that the function is declared as a macro.


            #define mov_blk(src, dest, numwords)  \
            __asm__ __volatile__ (                                          \
                                   "cld\n\t"                                \
                                   "rep\n\t"                                \
                                   "movsl"                                  \
                                   :                                        \
                                   : "S" (src), "D" (dest), "c" (numwords)  \
                                   : "%ecx", "%esi", "%edi"                 \
                                   )

            Here we have no outputs, so the changes that happen to the contents of the registers ecx, esi and edi are side effects of the block movement. That is why they are added to the clobber list. (Recent GCC releases do not accept clobbers that overlap the input operands, so there the modified registers have to be declared as read-write operands instead.)
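For compilers that reject clobbers overlapping the inputs, the block move can be expressed with read-write operands. What follows is a sketch under stated assumptions, not the original macro: the function name copy_dwords is hypothetical, esi/edi/ecx are declared with "+S", "+D" and "+c" so GCC knows they change, and "memory" is added because the destination is written through a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies numwords 32-bit words from src to dest. */
static void copy_dwords(uint32_t *dest, const uint32_t *src, size_t numwords)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__ ("cld\n\t"
                          "rep movsl"
                          : "+S" (src), "+D" (dest), "+c" (numwords)
                          : /* no pure inputs */
                          : "memory", "cc");
#else
    while (numwords--)            /* portable fallback for non-x86 targets */
        *dest++ = *src++;
#endif
}
```

Copying three words from {1, 2, 3} into a four-word buffer should leave the fourth word untouched.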

          5. In Linux, system calls are implemented using GCC inline assembly. Let us look at how a system call is implemented. All the system calls are written as macros (linux/unistd.h). For example, a system call with three arguments is defined as a macro as shown below.

            #define _syscall3(type,name,type1,arg1,type2,arg2,type3,arg3) \
            type name(type1 arg1,type2 arg2,type3 arg3) \
            { \
            long __res; \
            __asm__ volatile (  "int $0x80" \
                              : "=a" (__res) \
                              : "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
                                "d" ((long)(arg3))); \
            __syscall_return(type,__res); \
            }


            Whenever a system call with three arguments is made, the macro shown above is used to make the call. The syscall number is placed in eax, then the parameters in ebx, ecx and edx. Finally, "int $0x80" is the instruction which makes the system call work. The return value can be collected from eax.

            All system calls are implemented in a similar way. Exit is a single-parameter syscall; let's see what its code looks like. It is as shown below.


            {
                    /* basic asm has no operands, so register names take a single % */
                    asm("movl $1,%eax\n\t"      /* SYS_exit is 1 */
                        "xorl %ebx,%ebx\n\t"    /* Argument is in ebx, it is 0 */
                        "int  $0x80"            /* Enter kernel mode */
                        );
            }

            The number of exit is "1", and here its parameter is 0. So we arrange for eax to contain 1 and ebx to contain 0, and by int $0x80 the exit(0) is executed. This is how exit works.


          8. Concluding Remarks.

          This document has gone through the basics of GCC inline assembly. Once you have understood the basic concepts, it is not difficult to take further steps on your own. We saw some examples which are helpful in understanding the frequently used features of GCC inline assembly.

          GCC inline assembly is a vast subject, and this article is by no means complete. More details about the syntax we discussed are available in the official documentation for the GNU Assembler. Similarly, for a complete list of the constraints, refer to the official documentation of GCC.

          And of course, the Linux kernel uses GCC inline assembly on a large scale, so we can find many examples of various kinds in the kernel sources. They can help us a lot.

          If you have found any glaring typos or outdated info in this document, please let us know.

  • Original article: https://www.cnblogs.com/pengdonglin137/p/3477259.html