  • Linking(2)

    Hiding variable and function names with static 

    C programmers use the static attribute to hide variable and function declarations inside modules, much as you would use public and private declarations in Java and C++.  
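    As a minimal sketch of this idea, consider a hypothetical module counter.c (the file name and every identifier below are made up for illustration). The two static names stay private to the file, while next_ticket has no static attribute and so is visible to other modules.

```c
/* counter.c: a hypothetical module; all names are illustrative. */

/* 'static' at file scope: private to this module, like a private member. */
static int call_count = 0;

static int bump(void)
{
    return ++call_count;
}

/* No 'static': a global symbol that other modules may reference. */
int next_ticket(void)
{
    return 100 + bump();
}
```

    Another module could declare `int next_ticket(void);` and call it, but a reference to call_count or bump from outside this file would fail to resolve at link time.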

    Symbol tables are built by assemblers, using symbols exported by the compiler into the assembly-language .s file.

    An ELF symbol table is contained in the .symtab section. It contains an array of entries. Figure 7.4 shows the format of each entry.

    The name is a byte offset into the string table that points to the null-terminated string name of the symbol. 

    The value is the symbol’s address.

    For relocatable modules, the value is an offset from the beginning of the section where the object is defined.

    For executable object files, the value is an absolute run-time address.

    The size is the size (in bytes) of the object.  

    The type is usually either data or function.  

    The binding field indicates whether the symbol is local or global.

    Each symbol is associated with some section of the object file, denoted by the section field, which is an index into the section header table.

    There are three special pseudo sections that don’t have entries in the section header table:

    ABS is for symbols that should not be relocated.

    UNDEF is for undefined symbols, that is, symbols that are referenced in this object module but defined elsewhere.

    COMMON is for uninitialized data objects that are not yet allocated. For COMMON symbols, the value field gives the alignment requirement, and size gives the minimum size.
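    The fields described above can be pictured as a C struct. This is a simplified sketch, not the exact ELF layout: the type name SymEntry, the field widths, and the helper functions are assumptions made for illustration. The three pseudo-section codes reuse the values that ELF assigns to SHN_UNDEF, SHN_ABS, and SHN_COMMON.

```c
#include <stdint.h>

/* Pseudo-section codes (same values as ELF's SHN_UNDEF, SHN_ABS, SHN_COMMON). */
enum { SEC_UNDEF = 0, SEC_ABS = 0xfff1, SEC_COMMON = 0xfff2 };

/* Simplified symbol table entry; names and widths are illustrative. */
typedef struct {
    uint32_t name;     /* byte offset into the string table */
    uint32_t value;    /* section offset, absolute address, or alignment (COMMON) */
    uint32_t size;     /* object size in bytes */
    uint8_t  type;     /* data or function */
    uint8_t  binding;  /* local or global */
    uint16_t section;  /* section header index, or a pseudo-section code */
} SymEntry;

/* The name field is an offset, so the string is found by pointer arithmetic. */
const char *sym_name(const char *strtab, const SymEntry *sym)
{
    return strtab + sym->name;
}

/* True when the symbol is referenced here but defined in another module. */
int sym_is_undefined(const SymEntry *sym)
{
    return sym->section == SEC_UNDEF;
}
```

    With a string table such as "\0buf\0main\0swap", an entry whose name field is 1 resolves to "buf" and one whose name field is 10 resolves to "swap".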

    For example, here are the last three entries in the symbol table for main.o, as displayed by the GNU readelf tool.

    The first eight entries, which are not shown, are local symbols that the linker uses internally.

    In this example, we see an entry for the definition of global symbol buf, an 8-byte object located at an offset (i.e., value) of zero in the .data section.

    This is followed by the definition of the global symbol main, a 17-byte function located at an offset of zero in the .text section.

    The last entry comes from the reference to the external symbol swap.

    Readelf identifies each section by an integer index. Ndx=1 denotes the .text section, and Ndx=3 denotes the .data section. 

    Symbol Resolution 

    The linker resolves symbol references by associating each reference with exactly one symbol definition from the symbol tables of its input relocatable object files.

    Symbol resolution is straightforward for references to local symbols that are defined in the same module as the reference.

    The compiler allows only one definition of each local symbol per module.

    The compiler also ensures that static local variables, which get local linker symbols, have unique names.
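    A small sketch of the static local case: both functions below declare a static variable named x, yet each keeps its own storage. The function names and initial values are illustrative. The compiler typically gives the two variables distinct assembler names (something like x.1 and x.2, though the exact spelling depends on the compiler).

```c
/* Each function owns a separate static local named x. */
int f(void)
{
    static int x = 0;    /* one local linker symbol */
    return ++x;
}

int g(void)
{
    static int x = 100;  /* a different local linker symbol */
    return ++x;
}
```

    Calling f twice and then g once yields 1, 2, and 101, which shows that the two variables do not share storage.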

    Symbol resolution procedure for global symbols:

    When the compiler encounters a symbol (either a variable or function name) that is not defined in the current module, it assumes that it is defined in some other module, generates a linker symbol table entry, and leaves it for the linker to handle.

    If the linker is unable to find a definition for the referenced symbol in any of its input modules, it prints an (often cryptic) error message and terminates.

    Multiply defined symbols:

    Symbol resolution for global symbols is also tricky because the same symbol might be defined by multiple object files. In this case, the linker must either flag an error or somehow choose one of the definitions and discard the rest.

    The approach adopted by Unix systems involves cooperation between the compiler, assembler, and linker.

    How Linkers Resolve Multiply Defined Global Symbols

    At compile time, the compiler exports each global symbol to the assembler as either strong or weak, and the assembler encodes this information implicitly in the symbol table of the relocatable object file.  

    Functions and initialized global variables get strong symbols. Uninitialized global variables get weak symbols.  
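    The classification can be annotated on a few declarations. The identifiers below are hypothetical. One assumption to keep in mind: the weak treatment of uninitialized globals corresponds to compiling with -fcommon; GCC 10 and later default to -fno-common, under which such a variable is treated as an ordinary definition.

```c
int initialized = 15213;  /* initialized global variable: strong symbol */
int uninitialized;        /* uninitialized global: weak symbol (assuming -fcommon) */

int get_sum(void)         /* function: strong symbol */
{
    /* File-scope variables without an initializer start at zero. */
    return initialized + uninitialized;
}
```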

    Unix linkers use the following rules for dealing with multiply defined symbols: 

    • Rule 1: Multiple strong symbols are not allowed.

    • Rule 2: Given a strong symbol and multiple weak symbols, choose the strong symbol.

    • Rule 3: Given multiple weak symbols, choose any of the weak symbols.
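    The three rules can be sketched as a small decision function. This is an illustration of the rules only, not how any real linker is implemented; the Strength type and choose_definition are hypothetical names, and picking the first weak definition is just one valid reading of "any" in rule 3.

```c
#include <stddef.h>

typedef enum { WEAK, STRONG } Strength;

/* Given the strength of each definition of one symbol, return the index
   of the definition to keep, or -1 if there is none or rule 1 is violated. */
int choose_definition(const Strength *defs, size_t n)
{
    int chosen = -1;
    size_t strong_count = 0;

    for (size_t i = 0; i < n; i++) {
        if (defs[i] == STRONG) {
            if (++strong_count > 1)
                return -1;      /* rule 1: multiple strong symbols are an error */
            chosen = (int)i;    /* rule 2: a strong symbol beats any weak ones */
        } else if (chosen < 0) {
            chosen = (int)i;    /* rule 3: any weak one will do; take the first */
        }
    }
    return chosen;
}
```

    For the inputs {WEAK, STRONG, WEAK}, {STRONG, STRONG}, and {WEAK, WEAK}, the function returns 1, -1, and 0 respectively.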

    Rule 1:

    If two modules each define main, the linker will generate an error message because the strong symbol main is defined multiple times (rule 1).

    Similarly, the linker will generate an error message if two modules each define and initialize a global variable x, because the strong symbol x is defined twice (rule 1).

    Rule 2:

    However, if x is uninitialized in one module, then the linker will quietly choose the strong symbol defined in the other (rule 2).

    Notice that the linker normally gives no indication that it has detected multiple definitions of x.

    Rule 3:

    The same thing can happen if there are two weak definitions of x (rule 3).

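    Rule 2 and Rule 3:

    Rules 2 and 3 can introduce subtle run-time bugs when the duplicate definitions have different types, as when x is an int in foo5.c and a double in bar5.c. On an IA32/Linux machine, doubles are 8 bytes and ints are 4 bytes. Thus, the assignment x = -0.0 in line 6 of bar5.c will overwrite the memory locations for x and y (lines 5 and 6 in foo5.c) with the double-precision floating-point representation of negative zero!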

    linux> gcc -o foobar5 foo5.c bar5.c 
    linux> ./foobar5
    x = 0x0 y = 0x80000000
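    The output above can be reproduced within a single file by simulating the overwrite. This sketch is not the foo5.c/bar5.c source; it assumes a little-endian machine with 4-byte ints and 8-byte IEEE 754 doubles, and it uses a struct (a hypothetical stand-in) to place x and y next to each other the way the linker laid them out in the example.

```c
#include <string.h>

/* Stand-in for the adjacent 4-byte globals x and y. */
struct int_pair { int x; int y; };

_Static_assert(sizeof(struct int_pair) == sizeof(double),
               "sketch assumes 4-byte int and 8-byte double");

/* Mimics 'x = d' as executed by a module that believes x is a double:
   all 8 bytes of d are written starting at the address of x. */
void store_double_at_x(struct int_pair *p, double d)
{
    memcpy(p, &d, sizeof d);
}
```

    Starting from arbitrary values in x and y, storing -0.0 (bit pattern 0x8000000000000000) leaves x equal to 0x0 and y equal to 0x80000000 under the stated assumptions: the low four bytes land in x and the high four bytes, which hold the sign bit, land in y.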

    When in doubt, build with a flag such as gcc's -fno-common, which triggers an error if the linker encounters multiply defined global symbols. (GCC 10 and later enable -fno-common by default.)

  • Original article: https://www.cnblogs.com/geeklove01/p/9210759.html