    Linux assemblers: A comparison of GAS and NASM

    A side-by-side look at GNU Assembler (GAS) and Netwide Assembler (NASM)

    developerWorks

    Ram Narayan (rnaraya2@in.ibm.com), Software Engineer, IBM

    17 Oct 2007

    This article explains some of the more important syntactic and semantic differences between two of the most popular assemblers for Linux®, GNU Assembler (GAS) and Netwide Assembler (NASM), including differences in basic syntax, variables and memory access, macro handling, functions and external routines, stack handling, and techniques for easily repeating blocks of code.

    Introduction

    Unlike programming in higher-level languages, assembly programming requires an understanding of the processor architecture of the machine being programmed. Assembly programs are not portable, are often cumbersome to maintain and understand, and can run to a large number of lines of code. In exchange for these limitations, you gain control over the speed and size of the binary that executes on that machine.

    Though much information is already available on assembly-level programming on Linux, this article aims to show the differences between the syntaxes in a way that helps you convert more easily from one flavor of assembly to the other. The article evolved from my own quest to improve at this conversion.

    This article uses a series of program examples. Each program illustrates some feature and is followed by a discussion and comparison of the syntaxes. Although it's not possible to cover every difference that exists between NASM and GAS, I do try to cover the main points and provide a foundation for further investigation. If you are already familiar with both NASM and GAS, you might still find something useful here, such as the treatment of macros.

    This article assumes you have at least a basic understanding of assembly terminology and have programmed with an assembler using Intel® syntax, perhaps using NASM on Linux or Windows. This article does not teach how to type code into an editor or how to assemble and link (but see the sidebar for a quick refresher). You should be familiar with the Linux operating system (any Linux distribution will do; I used Red Hat and Slackware) and basic GNU tools such as gcc and ld, and you should be programming on an x86 machine.

    Building the examples

    Assembling:
    GAS:
    as -o program.o program.s

    NASM:
    nasm -f elf -o program.o program.asm

    Linking (common to both kinds of assembler):
    ld -o program program.o

    Linking when an external C library is to be used:
    ld --dynamic-linker /lib/ld-linux.so.2 -lc -o program program.o

    Now I'll describe what this article does and does not cover.

    This article covers:

    • Basic syntactical differences between NASM and GAS
    • Common assembly level constructs such as variables, loops, labels, and macros
    • A bit about calling external C routines and using functions
    • Assembly mnemonic differences and usage
    • Memory addressing methods

    This article does not cover:

    • The processor instruction set
    • Various forms of macros and other constructs particular to an assembler
    • Assembler directives peculiar to either NASM or GAS
    • Features that are not commonly used or are found only in one assembler but not in the other

    For more information, refer to the official assembler manuals (see Resources for links), as those are the most complete sources of information.





    Basic structure

    Listing 1 shows a very simple program that exits with an exit code of 2. This little program illustrates the basic structure of an assembly program for both GAS and NASM.

    Listing 1. A program that exits with an exit code of 2
    NASM:

    ; Text segment begins
    section .text
    global _start
    ; Program entry point
    _start:
    ; Put the code number for system call
        mov   eax, 1
    ; Return value
        mov   ebx, 2
    ; Call the OS
        int   80h

    GAS:

    # Text segment begins
    .section .text
    .globl _start
    # Program entry point
    _start:
    # Put the code number for system call
        movl  $1, %eax
    /* Return value */
        movl  $2, %ebx
    # Call the OS
        int   $0x80

    Now for a bit of explanation.

    One of the biggest differences between NASM and GAS is the syntax. GAS uses the AT&T syntax, a relatively archaic syntax that is specific to GAS and some older assemblers, whereas NASM uses the Intel syntax, supported by a majority of assemblers such as TASM and MASM. (Modern versions of GAS do support a directive called .intel_syntax, which allows the use of Intel syntax with GAS.)

    The following are some of the major differences summarized from the GAS manual:

    • AT&T and Intel syntax use the opposite order for source and destination operands. For example:
      • Intel: mov eax, 4
      • AT&T: movl $4, %eax
    • In AT&T syntax, immediate operands are preceded by $; in Intel syntax, immediate operands are not. For example:
      • Intel: push 4
      • AT&T: pushl $4
    • In AT&T syntax, register operands are preceded by %; in Intel syntax, they are not.
    • In AT&T syntax, the size of a memory operand is determined from the last character of the instruction mnemonic. Suffixes of b, w, and l specify byte (8-bit), word (16-bit), and long (32-bit) memory references. Intel syntax accomplishes this by prefixing memory operands (not the mnemonics themselves) with byte ptr, word ptr, and dword ptr. (This is the MASM-style form quoted in the GAS manual; NASM drops ptr and brackets the address, as in mov al, byte [foo].) Thus:
      • Intel: mov al, byte ptr foo
      • AT&T: movb foo, %al
    • Immediate form long jumps and calls are lcall/ljmp $section, $offset in AT&T syntax; the Intel syntax is call/jmp far section:offset. The far return instruction is lret $stack-adjust in AT&T syntax, whereas Intel uses ret far stack-adjust.
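
To make the operand-order and prefix rules above concrete, here is a minimal Python sketch (not part of either assembler) that rewrites a few simple 32-bit Intel-syntax instructions in AT&T form. The function name intel_to_att, the register list, and the set of mnemonics that receive the l suffix are my own assumptions for illustration; memory operands, hex literals, and far jumps are deliberately left out.

```python
# Hypothetical helper for illustration only: it handles register and
# decimal-immediate operands of 32-bit instructions, nothing else.
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}
SIZED_MNEMONICS = {"mov", "push", "pop", "cmp", "add", "sub"}


def intel_to_att(line: str) -> str:
    """Rewrite one simple Intel-syntax instruction in AT&T syntax."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",") if op.strip()]

    def convert(op: str) -> str:
        if op.lower() in REGISTERS:
            return "%" + op.lower()   # registers get a % prefix
        return "$" + op               # everything else is treated as an immediate

    # Intel lists the destination first; AT&T lists the source first.
    att_operands = [convert(op) for op in reversed(operands)]
    if mnemonic in SIZED_MNEMONICS:
        mnemonic += "l"               # l suffix marks a 32-bit (long) operation
    return (mnemonic + " " + ", ".join(att_operands)).rstrip()


for intel in ("mov eax, 4", "push 4", "mov ebx, ecx"):
    print(f"{intel:<14} -> {intel_to_att(intel)}")
```

The sketch reverses the operand list before adding prefixes, which mirrors the order in which the rules are usually applied by hand when converting a listing.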

    In both assemblers, the names of registers remain the same, but the syntax for using them is different, as is the syntax for addressing modes. In addition, assembler directives in GAS begin with a ".", but not in NASM.

    The .text section holds the program's code, and it is where the processor begins execution. The global (also .globl or .global in GAS) keyword is used to make a symbol visible to the linker and available to other linking object modules. On the NASM side of Listing 1, global _start marks the symbol _start as a visible identifier so the linker knows where to jump into the program and begin execution. As with NASM, GAS looks for this _start label as the default entry point of a program. Code labels end with a colon in both listings; GAS requires the colon, whereas NASM treats it as optional.

    Interrupts are a way to inform the OS that its services are required. The int instruction at the end of Listing 1 does this job in our program. Both GAS and NASM use the same mnemonic for interrupts. GAS uses the 0x prefix to specify a hex number, whereas the NASM listing uses the h suffix (NASM accepts the 0x prefix as well). Because immediate operands are prefixed with $ in GAS, 80 hex is $0x80.
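
The two hex notations denote the same number, which a few lines of Python can confirm. This is only a sketch of the notational conversion; the function name nasm_hex_to_gas_immediate is hypothetical and does not correspond to anything in NASM or GAS.

```python
def nasm_hex_to_gas_immediate(literal: str) -> str:
    """Convert an h-suffixed hex literal (e.g. '80h') to a GAS immediate."""
    if not literal.lower().endswith("h"):
        raise ValueError(f"not an h-suffixed literal: {literal!r}")
    value = int(literal[:-1], 16)   # parse the digits as base 16
    return f"$0x{value:x}"          # $ marks an immediate, 0x marks hex


print(nasm_hex_to_gas_immediate("80h"))
print(int("80", 16))                # the same value in decimal
```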

    int $0x80 (or 80h in NASM) is used to invoke Linux and request a service. The service code is present in the EAX register. A value of 1 (for the Linux exit system call) is stored in EAX to request that the program exit. Register EBX contains the exit code (2, in our case), a number that is returned to the OS. (You can track this number by typing echo $? at the command prompt.)

    Finally, a word about comments. GAS supports C style (/* */), C++ style (//), and shell style (#) comments. NASM supports single-line comments that begin with the ";" character.





    Variables and accessing memory

    This section begins with an example program that finds the largest of three numbers.

    Listing 2. A program that finds the maximum of three numbers
    NASM:

    ; Data section begins
    section .data
    var1 dd 40
    var2 dd 20
    var3 dd 30
    section .text
    global _start
    _start:
    ; Move the contents of variables
        mov   ecx, [var1]
        cmp   ecx, [var2]
        jg    check_third_var
        mov   ecx, [var2]
    check_third_var:
        cmp   ecx, [var3]
        jg    _exit
        mov   ecx, [var3]
    _exit:
        mov   eax, 1
        mov   ebx, ecx
        int   80h

    GAS:

    // Data section begins
    .section .data
    var1:
        .int 40
    var2:
        .int 20
    var3:
        .int 30
    .section .text
    .globl _start
    _start:
    # move the contents of variables
        movl  (var1), %ecx
        cmpl  (var2), %ecx
        jg    check_third_var
        movl  (var2), %ecx
    check_third_var:
        cmpl  (var3), %ecx
        jg    _exit
        movl  (var3), %ecx
    _exit:
        movl  $1, %eax
        movl  %ecx, %ebx
        int   $0x80

    You can see several differences above in the declaration of memory variables. NASM uses the dd, dw, and db directives to declare 32-, 16-, and 8-bit numbers, respectively, whereas GAS uses .long (or .int, as in Listing 2), .word, and .byte for the same purpose. GAS has other directives too, such as .ascii, .asciz, and .string. In GAS, you declare variables just like other labels (using a colon), but in NASM you simply type a variable name (without the colon) before the memory allocation directive (dd, dw, etc.), followed by the value of the variable.
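
As a rough illustration of what these directives emit, the following Python sketch packs values the way an x86 assembler would lay them out in the data section: little-endian, with the width chosen by the directive. The emit function and the DIRECTIVE_FORMATS table are assumptions made for this example, and the table assumes that .int is 32 bits wide on x86, which matches its use with movl in Listing 2.

```python
import struct

# Directive -> struct format. "<" means little-endian, as on x86.
# Assumption for this sketch: .int and .long are both 32-bit on x86.
DIRECTIVE_FORMATS = {
    "dd": "<I", ".long": "<I", ".int": "<I",   # 32-bit
    "dw": "<H",                                # 16-bit
    "db": "<B", ".byte": "<B",                 # 8-bit
}


def emit(directive: str, value: int) -> bytes:
    """Return the bytes a data directive would place in the section."""
    return struct.pack(DIRECTIVE_FORMATS[directive], value)


# The data section of Listing 2: var1, var2, var3
data = emit("dd", 40) + emit("dd", 20) + emit("dd", 30)
print(len(data), data.hex())

# Reading var2 back: it starts 4 bytes after var1
var2 = struct.unpack_from("<I", data, 4)[0]
print(var2)
```

Each dd occupies four bytes, so var2 sits at offset 4 and var3 at offset 8 from the start of the section.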

    The mov ecx, [var1] instruction in Listing 2 (movl (var1), %ecx in GAS) reads the value stored in a memory location. NASM uses square brackets to dereference the address named by a label: [var1]. GAS uses parentheses to dereference the same value: (var1). (In GAS, a bare var1 also refers to the memory contents, and $var1 refers to the address itself.) The use of other addressing modes is covered later in this article.





    Using macros

    Listing 3 illustrates the concepts of this section; it accepts the user's name as input and returns a greeting.

    Listing 3. A program to read a string and display a greeting to the user
    NASM:

    section .data
    prompt_str  db   'Enter your name: '
    ; $ is the location counter
    STR_SIZE  equ  $ - prompt_str
    greet_str  db  'Hello '
    GSTR_SIZE  equ  $ - greet_str
    section .bss
    ; Reserve 32 bytes of memory
    buff  resb  32
    ; A macro with two parameters
    ; Implements the write system call
    %macro write 2
        mov   eax, 4
        mov   ebx, 1
        mov   ecx, %1
        mov   edx, %2
        int   80h
    %endmacro
    ; Implements the read system call
    %macro read 2
        mov   eax, 3
        mov   ebx, 0
        mov   ecx, %1
        mov   edx, %2
        int   80h
    %endmacro
    section .text
    global _start
    _start:
        write prompt_str, STR_SIZE
        read  buff, 32
    ; Read returns the length in eax
        push  eax
    ; Print the hello text
        write greet_str, GSTR_SIZE
        pop   edx
    ; edx  = length returned by read
        write buff, edx
    _exit:
        mov   eax, 1
        mov   ebx, 0
        int   80h

    GAS:

    .section .data
    prompt_str:
        .ascii "Enter Your Name: "
    pstr_end:
        .set STR_SIZE, pstr_end - prompt_str
    greet_str:
        .ascii "Hello "
    gstr_end:
        .set GSTR_SIZE, gstr_end - greet_str
    .section .bss
    // Reserve 32 bytes of memory
        .lcomm  buff, 32
    // A macro with two parameters
    //  implements the write system call
    .macro write str, str_size
        movl  $4, %eax
        movl  $1, %ebx
        movl  \str, %ecx
        movl  \str_size, %edx
        int   $0x80
    .endm
    // Implements the read system call
    .macro read buff, buff_size
        movl  $3, %eax
        movl  $0, %ebx
        movl  \buff, %ecx
        movl  \buff_size, %edx
        int   $0x80
    .endm
    .section .text
    .globl _start
    _start:
        write $prompt_str, $STR_SIZE
        read  $buff, $32
    // Read returns the length in eax
        pushl %eax
    // Print the hello text
        write $greet_str, $GSTR_SIZE
        popl  %edx
    // edx = length returned by read
        write $buff, %edx
    _exit:
        movl  $1, %eax
        movl  $0, %ebx
        int   $0x80

    The heading for this section promises a discussion of macros, and both NASM and GAS certainly support them. But before we get into macros, a few other features are worth comparing.

    Listing 3 illustrates the concept of uninitialized memory, defined using the .bss section directive (section .bss in NASM, .section .bss in GAS). BSS stands for "block storage segment" (originally, "block started by symbol"), and the memory reserved in the BSS section is initialized to zero when the program starts. Objects in the BSS section have only a name and a size, and no value. Because there are no initial values to store, variables declared in the BSS section take no space in the object file, unlike those in the data segment.

    NASM uses the resb, resw, and resd keywords to allocate byte, word, and dword space in the BSS section. GAS, on the other hand, uses the .lcomm keyword to allocate byte-level space. Notice the way the variable name is declared in both versions of the program. In NASM the variable name precedes the resb (or resw or resd) keyword, followed by the amount of space to be reserved, whereas in GAS the variable name follows the .lcomm keyword, which is then followed by a comma and then the amount of space to be reserved. This shows the difference:

    NASM: varname resb size

    GAS: .lcomm varname, size

    Listing 3 also introduces the concept of a location counter (the $ in the STR_SIZE definition). NASM provides two special tokens for reading the location counter: $ is the current position, and $$ is the start of the current section. GAS exposes the current location as the special symbol "." (dot), but Listing 3 takes the more explicit route of placing a label at the next storage location (data, instruction, etc.) and subtracting the starting label from it.

    For example, to calculate the length of a string, you would use the following idiom in NASM:

    prompt_str db 'Enter your name: '
    STR_SIZE equ $ - prompt_str     ; $ is the location counter

    The $ gives the current value of the location counter, and subtracting the value of the label (all variable names are labels) from this location counter gives the number of bytes present between the declaration of the label and the current location. The equ directive is used to set the value of the variable STR_SIZE to the expression following it. A similar idiom in GAS looks like this:

    prompt_str:
         .ascii "Enter Your Name: "

    pstr_end:
         .set STR_SIZE, pstr_end - prompt_str

    The end label (pstr_end) gives the next location address, and subtracting the starting label address gives the size. Also note the use of .set to assign the symbol STR_SIZE the value of the expression following the comma. A corresponding .equ can also be used. Unlike a NASM equ constant, a symbol defined with .set can be given a new value later in the file; NASM's closest counterpart for that is the preprocessor directive %assign.
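
Both idioms reduce to the same arithmetic on the location counter. The Python sketch below models a data section as a list of labeled byte strings and tracks the counter as each item is laid out. The layout function and its data structure are assumptions made for this illustration, not how either assembler is implemented.

```python
def layout(items):
    """Assign an offset to each label.

    items is a list of (label, payload_bytes) pairs in source order.
    Returns the symbol table and the final location counter.
    """
    symbols = {}
    counter = 0                      # offset of the next byte to be emitted
    for label, payload in items:
        symbols[label] = counter     # a label records the current counter
        counter += len(payload)      # emitting data advances the counter
    return symbols, counter


section = [
    ("prompt_str", b"Enter your name: "),
    ("pstr_end", b""),               # GAS-style end label: emits no bytes
    ("greet_str", b"Hello "),
    ("gstr_end", b""),
]
symbols, end = layout(section)

# GAS idiom: end label minus start label
STR_SIZE = symbols["pstr_end"] - symbols["prompt_str"]
GSTR_SIZE = symbols["gstr_end"] - symbols["greet_str"]
print(STR_SIZE, GSTR_SIZE)
```

The NASM form, $ - prompt_str, computes the same difference without the extra label, because $ evaluates to the counter at the point where the equ line appears.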

    As I mentioned, Listing 3 uses macros (the write and read macros). Different macro techniques exist in NASM and GAS, including single-line macros and macro overloading, but I only deal with the basic type here. A common use of macros in assembly is clarity. Instead of typing the same piece of code again and again, you can create reusable macros that both avoid this repetition and enhance the look and readability of the code by reducing clutter.

    NASM users might be familiar with declaring macros using the %macro directive and ending them with an %endmacro directive, as Listing 3 does. A %macro directive is followed by the macro name. After the macro name comes a count, the number of macro arguments the macro is supposed to have. In NASM, macro arguments are numbered sequentially starting with 1. That is, the first argument to a macro is %1, the second is %2, the third is %3, and so on. For example:

    %macro macroname 2
         mov eax, %1
         mov ebx, %2
    %endmacro

    This creates a macro with two arguments, the first being %1 and the second being %2. Thus, a call to the above macro would look something like this:

    macroname 5, 6

    Macros can also be created without arguments, in which case the parameter count is given as 0.

    Now let's take a look at how GAS uses macros. GAS provides the .macro and .endm directives to create macros. A .macro directive is followed by a macro name, which may or may not have arguments. In GAS, macro arguments are given by name. For example:

    .macro macroname arg1, arg2
         movl \arg1, %eax
         movl \arg2, %ebx
    .endm

    A backslash precedes the name of each argument of the macro when the name is actually used inside a macro. If this is not done, the assembler treats the names as ordinary labels rather than as arguments, and the link typically fails with an undefined-symbol error.
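
The difference between the two argument styles can be sketched as plain text substitution. The Python below is a simplified model under my own assumptions (the function names expand_nasm and expand_gas are hypothetical, and real macro processors do considerably more): NASM looks arguments up by position, GAS by declared name.

```python
import re


def expand_nasm(body, args):
    """Replace %1, %2, ... with arguments chosen by position (1-based)."""
    return [
        re.sub(r"%(\d+)", lambda m: args[int(m.group(1)) - 1], line)
        for line in body
    ]


def expand_gas(body, params, args):
    """Replace \\name with the argument bound to that parameter name."""
    bindings = dict(zip(params, args))
    return [
        re.sub(r"\\(\w+)", lambda m: bindings[m.group(1)], line)
        for line in body
    ]


nasm_body = ["mov eax, %1", "mov ebx, %2"]
gas_body = [r"movl \arg1, %eax", r"movl \arg2, %ebx"]

# macroname 5, 6 in NASM; macroname $5, $6 in GAS
print(expand_nasm(nasm_body, ["5", "6"]))
print(expand_gas(gas_body, ["arg1", "arg2"], ["$5", "$6"]))
```

Note that in the GAS call the $ prefix travels with the argument, which is why Listing 3 invokes its macros as write $prompt_str, $STR_SIZE.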





    Functions, external routines, and the stack

    The example program for this section implements a selection sort on an array of integers.

    Listing 4. Implementation of selection sort on an integer array
    Line NASM GAS

    section .data
                                        array db
                                        89, 10, 67, 1, 4, 27, 12, 34,
                                        86, 3
                                        ARRAY_SIZE equ $ - array
                                        array_fmt db "  %d", 0
                                        usort_str db "unsorted array:", 0
                                        sort_str db "sorted array:", 0
                                        newline db 10, 0
                                        section .text
                                        extern puts
                                        global _start
                                        _start:
                                        push  usort_str
                                        call  puts
                                        add   esp, 4
                                        push  ARRAY_SIZE
                                        push  array
                                        push  array_fmt
                                        call  print_array10
                                        add   esp, 12
                                        push  ARRAY_SIZE
                                        push  array
                                        call  sort_routine20
                                        ; Adjust the stack pointer
                                        add   esp, 8
                                        push  sort_str
                                        call  puts
                                        add   esp, 4
                                        push  ARRAY_SIZE
                                        push  array
                                        push  array_fmt
                                        call  print_array10
                                        add   esp, 12
                                        jmp   _exit
                                        extern printf
                                        print_array10:
                                        push  ebp
                                        mov   ebp, esp
                                        sub   esp, 4
                                        mov   edx, [ebp + 8]
                                        mov   ebx, [ebp + 12]
                                        mov   ecx, [ebp + 16]
                                        mov   esi, 0
                                        push_loop:
                                        mov   [ebp - 4], ecx
                                        mov   edx, [ebp + 8]
                                        xor   eax, eax
                                        mov   al, byte [ebx + esi]
                                        push  eax
                                        push  edx
                                        call  printf
                                        add   esp, 8
                                        mov   ecx, [ebp - 4]
                                        inc   esi
                                        loop  push_loop
                                        push  newline
                                        call  printf
                                        add   esp, 4
                                        mov   esp, ebp
                                        pop   ebp
                                        ret
                                        sort_routine20:
                                        push  ebp
                                        mov   ebp, esp
                                        ; Allocate a word of space in stack
                                        sub   esp, 4
                                        ; Get the address of the array
                                        mov   ebx, [ebp + 8]
                                        ; Store array size
                                        mov   ecx, [ebp + 12]
                                        dec   ecx
                                        ; Prepare for outer loop here
                                        xor   esi, esi
                                        outer_loop:
                                        ; This stores the min index
                                        mov   [ebp - 4], esi
                                        mov   edi, esi
                                        inc   edi
                                        inner_loop:
                                        cmp   edi, ARRAY_SIZE
                                        jge   swap_vars
                                        xor   al, al
                                        mov   edx, [ebp - 4]
                                        mov   al, byte [ebx + edx]
                                        cmp   byte [ebx + edi], al
                                        jge   check_next
                                        mov   [ebp - 4], edi
                                        check_next:
                                        inc   edi
                                        jmp   inner_loop
                                        swap_vars:
                                        mov   edi, [ebp - 4]
                                        mov   dl, byte [ebx + edi]
                                        mov   al, byte [ebx + esi]
                                        mov   byte [ebx + esi], dl
                                        mov   byte [ebx + edi], al
                                        inc   esi
                                        loop  outer_loop
                                        mov   esp, ebp
                                        pop   ebp
                                        ret
                                        _exit:
                                        mov   eax, 1
                                        mov   ebx, 0
                                        int   80h
                                        

    .section .data
                                        array:
                                        .byte  89, 10, 67, 1, 4, 27, 12,
                                        34, 86, 3
                                        array_end:
                                        .equ ARRAY_SIZE, array_end - array
                                        array_fmt:
                                        .asciz "  %d"
                                        usort_str:
                                        .asciz "unsorted array:"
                                        sort_str:
                                        .asciz "sorted array:"
                                        newline:
                                        .asciz "\n"
                                        .section .text
                                        .globl _start
                                        _start:
                                        pushl $usort_str
                                        call  puts
                                        addl  $4, %esp
                                        pushl $ARRAY_SIZE
                                        pushl $array
                                        pushl $array_fmt
                                        call  print_array10
                                        addl  $12, %esp
                                        pushl $ARRAY_SIZE
                                        pushl $array
                                        call  sort_routine20
                                        # Adjust the stack pointer
                                        addl  $8, %esp
                                        pushl $sort_str
                                        call  puts
                                        addl  $4, %esp
                                        pushl $ARRAY_SIZE
                                        pushl $array
                                        pushl $array_fmt
                                        call  print_array10
                                        addl  $12, %esp
                                        jmp   _exit
                                        print_array10:
                                        pushl %ebp
                                        movl  %esp, %ebp
                                        subl  $4, %esp
                                        movl  8(%ebp), %edx
                                        movl  12(%ebp), %ebx
                                        movl  16(%ebp), %ecx
                                        movl  $0, %esi
                                        push_loop:
                                        movl  %ecx, -4(%ebp)
                                        movl  8(%ebp), %edx
                                        xorl  %eax, %eax
                                        movb  (%ebx, %esi, 1), %al
                                        pushl %eax
                                        pushl %edx
                                        call  printf
                                        addl  $8, %esp
                                        movl  -4(%ebp), %ecx
                                        incl  %esi
                                        loop  push_loop
                                        pushl $newline
                                        call  printf
                                        addl  $4, %esp
                                        movl  %ebp, %esp
                                        popl  %ebp
                                        ret
                                        sort_routine20:
                                        pushl %ebp
                                        movl  %esp, %ebp
                                        # Allocate a word of space in stack
                                        subl  $4, %esp
                                        # Get the address of the array
                                        movl  8(%ebp), %ebx
                                        # Store array size
                                        movl  12(%ebp), %ecx
                                        decl  %ecx
                                        # Prepare for outer loop here
                                        xorl  %esi, %esi
                                        outer_loop:
                                        # This stores the min index
                                        movl  %esi, -4(%ebp)
                                        movl  %esi, %edi
                                        incl  %edi
                                        inner_loop:
                                        cmpl  $ARRAY_SIZE, %edi
                                        jge   swap_vars
                                        xorb  %al, %al
                                        movl  -4(%ebp), %edx
                                        movb  (%ebx, %edx, 1), %al
                                        cmpb  %al, (%ebx, %edi, 1)
                                        jge   check_next
                                        movl  %edi, -4(%ebp)
                                        check_next:
                                        incl  %edi
                                        jmp   inner_loop
                                        swap_vars:
                                        movl  -4(%ebp), %edi
                                        movb  (%ebx, %edi, 1), %dl
                                        movb  (%ebx, %esi, 1), %al
                                        movb  %dl, (%ebx, %esi, 1)
                                        movb  %al, (%ebx,  %edi, 1)
                                        incl  %esi
                                        loop  outer_loop
                                        movl  %ebp, %esp
                                        popl  %ebp
                                        ret
                                        _exit:
                                        movl  $1, %eax
                                        movl  $0, %ebx
                                        int   $0x80
                                        

    Listing 4 might look overwhelming at first, but in fact it's very simple. The listing introduces the concept of functions, various memory addressing schemes, the stack and the use of a library function. The program sorts an array of 10 numbers and uses the external C library functions puts and printf to print out the entire contents of the unsorted and sorted array. For modularity and to introduce the concept of functions, the sort routine itself is implemented as a separate procedure along with the array print routine. Let's deal with them one by one.
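
    Before going through the listing piece by piece, it may help to see the algorithm that sort_routine20 implements, a selection sort over an array of bytes, written in C. This is only an illustrative sketch and not part of the original listing: the name selection_sort_bytes is made up here, and signed char is an assumption chosen to mirror the signed comparison (jge) in the assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C rendering of sort_routine20 (selection sort on bytes).
   signed char mirrors the signed comparison (jge) used in the listing. */
void selection_sort_bytes(signed char *array, size_t size)
{
    assert(array != NULL || size == 0);
    for (size_t i = 0; i + 1 < size; i++) {      /* outer_loop: index in esi */
        size_t min = i;                          /* min index, kept at [ebp - 4] */
        for (size_t j = i + 1; j < size; j++) {  /* inner_loop: index in edi */
            if (array[j] < array[min])
                min = j;
        }
        /* swap_vars: exchange array[i] and array[min] */
        signed char tmp = array[i];
        array[i] = array[min];
        array[min] = tmp;
    }
}
```

    The outer loop runs size - 1 times, which corresponds to the dec ecx before outer_loop in the assembly.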

    After the data declarations, the program execution begins with a call to puts (line 31). The puts function displays a string on the console. Its only argument is the address of the string to be displayed, which is passed by pushing that address onto the stack (line 30).

    In NASM, any label that is not part of our program and needs to be resolved at link time must be declared in advance, which is the function of the extern keyword (line 24). GAS has no such requirement, because it treats any undefined symbol as external. After this, the address of the string usort_str is pushed onto the stack (line 30). In NASM, a memory variable such as usort_str represents the address of the memory location itself, and thus an instruction such as push usort_str actually pushes the address onto the top of the stack. In GAS, on the other hand, the variable usort_str must be prefixed with $, so that it is treated as an immediate address. If it's not prefixed with $, the dword stored at that memory location is pushed onto the stack instead of the address.
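
    The difference between pushing the address and pushing the contents is the same as the difference between a pointer and the data it points to. The following C sketch is only an analogy, assuming a 4-byte dword; the function names pushed_address and pushed_contents are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static const char usort_str[] = "unsorted array:";

/* What NASM `push usort_str` and GAS `pushl $usort_str` place on the
   stack: the address of the string. */
const char *pushed_address(void)
{
    return usort_str;
}

/* What GAS `pushl usort_str` (no $) places on the stack: the first
   four bytes stored at that address, read as one dword. */
uint32_t pushed_contents(void)
{
    uint32_t dword;
    assert(sizeof usort_str >= sizeof dword);
    memcpy(&dword, usort_str, sizeof dword);
    return dword;
}
```

    Passing the second value to puts would make it treat the characters "unso" as an address, which is why the $ prefix matters in the GAS version.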

    Since pushing a variable essentially moves the stack pointer by a dword, the stack pointer is adjusted by adding 4 (the size of a dword) to it (line 32).

    Three arguments are now pushed onto the stack, and the print_array10 function is called (line 37). Functions are declared the same way in both NASM and GAS. They are nothing but labels, which are invoked using the call instruction.

    Immediately after a call instruction, ESP points to the return address at the top of the stack. Once the function has pushed the caller's ebp and copied esp to ebp, as both routines in Listing 4 do, ebp + 4 holds the return address and ebp + 8 holds the first argument to the function. All subsequent arguments are accessed by adding the size of a dword to that offset (that is, ebp + 12, ebp + 16, and so on).

    Once inside a function, a local stack frame is created by saving the caller's ebp and then copying esp to ebp (line 62). You can also allocate space for local variables, as the program does (line 63), by subtracting the number of bytes required from esp. The local variable then lives at ebp - 4, that is, in the 4 bytes just below the frame pointer, and you can keep allocating this way as long as there is enough space in the stack to accommodate your local variables.

    Listing 4 illustrates the base indirect addressing mode (line 64), so called because you start with a base address and add an offset to it to arrive at a final address. On the NASM side of the listing, [ebp + 8] is one such example, as is [ebp - 4] (line 71). In GAS, the addressing is a bit more terse: 8(%ebp) and -4(%ebp), respectively.
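
    The frame layout can be checked with a toy model. The C sketch below simulates a downward-growing stack of dwords and replays what happens for a two-argument call followed by the prologue used in Listing 4 (push ebp, mov ebp, esp, sub esp, 4). Everything here (ToyStack, enter_function, at_ebp) is a made-up illustration, and array indices stand in for addresses divided by 4.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 16 };

/* Toy model of a downward-growing stack of dwords. Indices stand in for
   addresses divided by 4, so "ebp + 8" becomes index ebp + 2. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    int esp;   /* index of the current top of stack */
    int ebp;   /* frame pointer, set by the prologue */
} ToyStack;

static void push(ToyStack *s, uint32_t value)
{
    assert(s->esp > 0);
    s->mem[--s->esp] = value;
}

/* Simulates: push arg2; push arg1; call f; followed by f's prologue. */
ToyStack enter_function(uint32_t arg1, uint32_t arg2,
                        uint32_t return_address, uint32_t caller_ebp)
{
    ToyStack s = { .esp = STACK_WORDS, .ebp = 0 };
    push(&s, arg2);            /* arguments are pushed right to left */
    push(&s, arg1);
    push(&s, return_address);  /* done implicitly by call */
    push(&s, caller_ebp);      /* push ebp */
    s.ebp = s.esp;             /* mov ebp, esp */
    s.esp -= 1;                /* sub esp, 4: one dword of local space */
    return s;
}

/* Reads the dword at [ebp + byte_offset]; offset must be a multiple of 4. */
uint32_t at_ebp(const ToyStack *s, int byte_offset)
{
    assert(byte_offset % 4 == 0);
    return s->mem[s->ebp + byte_offset / 4];
}
```

    In this model, at_ebp(&s, 4) yields the return address, at_ebp(&s, 8) and at_ebp(&s, 12) yield the first and second arguments, and at_ebp(&s, -4) is the slot reserved for the local variable.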

    In the print_array10 routine, you can see another kind of addressing mode being used after the push_loop label (line 74). The line is represented in NASM and GAS, respectively, like so:

    NASM: mov al, byte [ebx + esi]

    GAS: movb (%ebx, %esi, 1), %al

    This addressing mode is the base indexed addressing mode. Here, there are three entities: the base address, the index register, and the multiplier. Because the address alone does not say how many bytes are to be accessed, the operand size has to be specified some other way. NASM uses the byte operator to tell the assembler that a byte of data is to be moved. In GAS, the size comes from the b, w, or l suffix on the mnemonic (for example, movb); the multiplier only scales the index and has no effect on how many bytes are transferred. The syntax of GAS can seem somewhat complex when first encountered.

    The general form of base indexed addressing in GAS is as follows:

    %segment:ADDRESS (, index, multiplier)

    or

    %segment:(offset, index, multiplier)

    or

    %segment:ADDRESS(base, index, multiplier)

    The final address is calculated using this formula:

    ADDRESS or offset + base + index * multiplier.

    Thus, to step through an array of bytes, a multiplier of 1 is used; for an array of words, 2; and for an array of dwords, 4. Of course, NASM uses a simpler syntax. Thus, the above in NASM would be represented like so:

    [segment:ADDRESS or offset + base + index * multiplier]

    A prefix of byte, word, or dword is used before this memory address to access 1, 2, or 4 bytes of memory, respectively.
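
    The address formula is the same for both assemblers, and it can be written down as a small C function. This is a sketch for checking your own arithmetic rather than anything either assembler provides; the name effective_address and the sample addresses in the note that follows are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Computes displacement + base + index * scale, the formula behind
   GAS disp(base, index, scale) and NASM [disp + base + index * scale].
   x86 only allows a scale of 1, 2, 4, or 8. */
uint32_t effective_address(uint32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return disp + base + index * scale;
}
```

    For example, if ebx held 0x1000 and esi held 3, then mov al, byte [ebx + esi] would read the byte at effective_address(0, 0x1000, 3, 1), which is 0x1003. An 8(%ebp) access with ebp equal to 0x3000 corresponds to effective_address(8, 0x3000, 0, 1), which is 0x3008.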





    Leftovers

    Listing 5 reads a list of command line arguments, stores them in memory, and then prints them.

    Listing 5. A program that reads command line arguments, stores them in memory, and prints them
    Line NASM GAS
    001
                                        002
                                        003
                                        004
                                        005
                                        006
                                        007
                                        008
                                        009
                                        010
                                        011
                                        012
                                        013
                                        014
                                        015
                                        016
                                        017
                                        018
                                        019
                                        020
                                        021
                                        022
                                        023
                                        024
                                        025
                                        026
                                        027
                                        028
                                        029
                                        030
                                        031
                                        032
                                        033
                                        034
                                        035
                                        036
                                        037
                                        038
                                        039
                                        040
                                        041
                                        042
                                        043
                                        044
                                        045
                                        046
                                        047
                                        048
                                        049
                                        050
                                        051
                                        052
                                        053
                                        054
                                        055
                                        056
                                        057
                                        058
                                        059
                                        060
                                        061
                                        

    section .data
                                        ; Command table to store at most
                                        ;  10 command line arguments
                                        cmd_tbl:
                                        %rep 10
                                        dd 0
                                        %endrep
                                        section .text
                                        global _start
                                        _start:
                                        ; Set up the stack frame
                                        mov   ebp, esp
                                        ; Top of stack contains the
                                        ;  number of command line arguments.
                                        ; The default value is 1
                                        mov   ecx, [ebp]
                                        ; Exit if arguments are more than 10
                                        cmp   ecx, 10
                                        jg    _exit
                                        mov   esi, 1
                                        mov   edi, 0
                                        ; Store the command line arguments
                                        ;  in the command table
                                        store_loop:
                                        mov   eax, [ebp + esi * 4]
                                        mov   [cmd_tbl + edi * 4], eax
                                        inc   esi
                                        inc   edi
                                        loop  store_loop
                                        mov   ecx, edi
                                        mov   esi, 0
                                        extern puts
                                        print_loop:
                                        ; Make some local space
                                        sub   esp, 4
                                        ; puts function corrupts ecx
                                        mov   [ebp - 4], ecx
                                        mov   eax, [cmd_tbl + esi * 4]
                                        push  eax
                                        call  puts
                                        add   esp, 4
                                        mov   ecx, [ebp - 4]
                                        inc   esi
                                        loop  print_loop
                                        jmp   _exit
                                        _exit:
                                        mov   eax, 1
                                        mov   ebx, 0
                                        int   80h
                                        

    .section .data
                                        // Command table to store at most
                                        //  10 command line arguments
                                        cmd_tbl:
                                        .rept 10
                                        .long 0
                                        .endr
                                        .section .text
                                        .globl _start
                                        _start:
                                        // Set up the stack frame
                                        movl  %esp, %ebp
                                        // Top of stack contains the
                                        //  number of command line arguments.
                                        // The default value is 1
                                        movl  (%ebp), %ecx
                                        // Exit if arguments are more than 10
                                        cmpl  $10, %ecx
                                        jg    _exit
                                        movl  $1, %esi
                                        movl  $0, %edi
                                        // Store the command line arguments
                                        //  in the command table
                                        store_loop:
                                        movl  (%ebp, %esi, 4), %eax
                                        movl  %eax, cmd_tbl( , %edi, 4)
                                        incl  %esi
                                        incl  %edi
                                        loop  store_loop
                                        movl  %edi, %ecx
                                        movl  $0, %esi
                                        print_loop:
                                        // Make some local space
                                        subl  $4, %esp
                                        // puts function corrupts ecx
                                        movl  %ecx, -4(%ebp)
                                        movl  cmd_tbl( , %esi, 4), %eax
                                        pushl %eax
                                        call  puts
                                        addl  $4, %esp
                                        movl  -4(%ebp), %ecx
                                        incl  %esi
                                        loop  print_loop
                                        jmp   _exit
                                        _exit:
                                        movl  $1, %eax
                                        movl  $0, %ebx
                                        int   $0x80
                                        

    Listing 5 shows a construct that repeats instructions in assembly. Naturally enough, it's called the repeat construct. In GAS, the repeat construct is started using the .rept directive (line 6). This directive has to be closed using an .endr directive (line 8). .rept is followed by a count in GAS that specifies the number of times the expression enclosed inside the .rept/.endr construct is to be repeated. Any instruction placed inside this construct is equivalent to writing that instruction count number of times, each on a separate line.

    For example, for a count of 3:

    .rept 3
         movl $2, %eax
    .endr

    This is equivalent to:

    movl $2, %eax
    movl $2, %eax
    movl $2, %eax

    In NASM, a similar construct is used at the preprocessor level. It begins with the %rep directive and ends with %endrep. The %rep directive is followed by an expression (unlike in GAS where the .rept directive is followed by a count):

    %rep <expression>
         nop
    %endrep

    There is also an alternative in NASM, the times directive. Unlike %rep, which is expanded by the preprocessor, times works at the assembler level; like %rep, it is followed by an expression. For example, the above %rep construct is equivalent to this:

    times <expression> nop

    And this:

    %rep 3
         mov eax, 2
    %endrep

    is equivalent to this:

    times 3 mov eax, 2

    and both are equivalent to this:

    mov eax, 2
    mov eax, 2
    mov eax, 2

    In Listing 5, the .rept (or %rep) directive is used to create a memory data area for 10 double words. The command line arguments are then accessed one by one from the stack and stored in the memory area until the command table gets full.

    As for command line arguments, they are accessed similarly with both assemblers. On program entry, the top of the stack (the dword at esp) stores the number of command line arguments supplied to the program, which is 1 by default (for no command line arguments). esp + 4 stores a pointer to the first command line argument, which is always the name of the program as it was invoked from the command line. esp + 8, esp + 12, and so on store pointers to the subsequent command line arguments.

    Also watch the way the command table in memory is accessed on both sides in Listing 5. Here, memory indirect addressing mode (line 33) is used to access the command table, with an index in ESI (or EDI) and a multiplier of 4, the size of each dword entry. Thus, [cmd_tbl + esi * 4] in NASM is equivalent to cmd_tbl(, %esi, 4) in GAS.
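
    In C terms, the dword at esp is argc and the dwords that follow are the argv pointers. The sketch below is a hypothetical C rendering of store_loop from Listing 5, written only to make the data flow explicit; the names store_args and CMD_TBL_SIZE are not from the listing, and returning -1 stands in for the jump to _exit.

```c
#include <assert.h>
#include <stddef.h>

enum { CMD_TBL_SIZE = 10 };

/* Hypothetical C rendering of store_loop in Listing 5. Copies the argv
   pointers (not the strings themselves) into cmd_tbl and returns how
   many were stored, or -1 if there are more than CMD_TBL_SIZE. */
int store_args(int argc, char *argv[], const char *cmd_tbl[CMD_TBL_SIZE])
{
    assert(argv != NULL && cmd_tbl != NULL);
    if (argc > CMD_TBL_SIZE)          /* cmp ecx, 10 / jg _exit */
        return -1;
    for (int i = 0; i < argc; i++)
        cmd_tbl[i] = argv[i];         /* mov [cmd_tbl + edi * 4], eax */
    return argc;
}
```

    As in the assembly, only the 4-byte pointers are copied into the table; the argument strings stay where the kernel placed them.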





    Conclusion

    Even though the differences between these two assemblers are substantial, it's not that difficult to convert from one form to another. You might find that the AT&T syntax seems at first difficult to understand, but once mastered, it's as simple as the Intel syntax.



    About the author

    Ram holds a post graduate degree in computer science and is working as a software engineer in IBM's India Software Labs, Rational Division, developing and adding features to Rational ClearCase. He has worked on various flavors of Linux, UNIX, and Windows, as well as real-time mobile-based operating systems such as Symbian and Windows Mobile. In his spare time he hacks Linux and reads books.

  • Original article: https://www.cnblogs.com/adylee/p/1334108.html