zoukankan      html  css  js  c++  java
  • 什么是字节对齐,为什么要对齐?

    Computer Systems: A Programmer's Perspective:

    3.9.3 Data Alignment

    Many computer systems place restrictions on the allowable addresses for the primitive data types, requiring that the address for some type of object must be a multiple of some value K (typically 2, 4, or 8). Such alignment restrictions simplify the design of the hardware forming the interface between the processor and the memory system. For example, suppose a processor always fetches 8 bytes from memory with an address that must be a multiple of 8. If we can guarantee that any double will be aligned to have its address be a multiple of 8, then the value can be read or written with a single memory operation. Otherwise, we may need to perform two memory accesses, since the object might be split across two 8-byte memory blocks.

    The IA32 hardware will work correctly regardless of the alignment of data. However, Intel recommends that data be aligned to improve memory system performance. Linux follows an alignment policy where 2-byte data types (e.g., short) must have an address that is a multiple of 2, while any larger data types (e.g., int, int *, float, and double) must have an address that is a multiple of 4. Note that this requirement means that the least significant bit of the address of an object of type short must equal zero. Similarly, any object of type int, or any pointer, must be at an address having the low-order 2 bits equal to zero.

    Aside: A case of mandatory alignment

    For most IA32 instructions, keeping data aligned improves efficiency, but it does not affect program behavior. On the other hand, some of the SSE instructions for implementing multimedia operations will not work correctly with unaligned data. These instructions operate on 16-byte blocks of data, and the instructions that transfer data between the SSE unit and memory require the memory addresses to be multiples of 16. Any attempt to access memory with an address that does not satisfy this alignment will lead to an exception, with the default behavior for the program to terminate.

    This is the motivation behind the IA32 convention of making sure that every stack frame is a multiple of 16 bytes long (see the aside of page 226). The compiler can allocate storage within a stack frame in such a way that a block can be stored with a 16-byte alignment.

    Aside: Alignment with Microsoft Windows

    Microsoft Windows imposes a stronger alignment requirement—any primitive object of K bytes, for K = 2, 4, or 8, must have an address that is a multiple of K. In particular, it requires that the address of a double or a long long be a multiple of 8. This requirement enhances the memory performance at the expense of some wasted space. The Linux convention, where 8-byte values are aligned on 4-byte boundaries was probably good for the i386, back when memory was scarce and memory interfaces were only 4 bytes wide. With modern processors, Microsoft’s alignment is a better design decision. Data type long double, for which gcc generates IA32 code allocating 12 bytes (even though the actual data type requires only 10 bytes) has a 4-byte alignment requirement with both Windows and Linux.
     
    Alignment is enforced by making sure that every data type is organized and allocated in such a way that every object within the type satisfies its alignment restrictions. The compiler places directives in the assembly code indicating the desired alignment for global data. For example, the assembly-code declaration of the jump table beginning on page 217 contains the following directive on line 2:

    .align 4

    This ensures that the data following it (in this case the start of the jump table) will start with an address that is a multiple of 4. Since each table entry is 4 bytes long, the successive elements will obey the 4-byte alignment restriction.

    Library routines that allocate memory, such as malloc, must be designed so that they return a pointer that satisfies the worst-case alignment restriction
    for the machine it is running on, typically 4 or 8. For code involving structures, the compiler may need to insert gaps in the field allocation to ensure that each structure element satisfies its alignment requirement. The structure then has some required alignment for its starting address.

    For example, consider the following structure declaration:
    struct S1 {
      int  i;
      char c;
      int  j;
    };

    Suppose the compiler used the minimal 9-byte allocation, diagrammed as follows:
    Then it would be impossible to satisfy the 4-byte alignment requirement for both fields i (offset 0) and j (offset 5). Instead, the compiler inserts a 3-byte gap (shown
    here as shaded in blue) between fields c and j:
    As a result, j has offset 8, and the overall structure size is 12 bytes. Furthermore, the compiler must ensure that any pointer p of type struct S1* satisfies
    a 4-byte alignment. Using our earlier notation, let pointer p have value xp. Then xp must be a multiple of 4. This guarantees that both p->i (address xp)
    and p->j (address xp + 8) will satisfy their 4-byte alignment requirements.

    In addition, the compiler may need to add padding to the end of the structure so that each element in an array of structures will satisfy its alignment requirement.
    For example, consider the following structure declaration:
    struct S2 {
      int  i;
      int  j;
      char c;
    };
    If we pack this structure into 9 bytes, we can still satisfy the alignment requirements for fields i and j by making sure that the starting address of the structure satisfies
    a 4-byte alignment requirement. Consider, however, the following declaration:

    struct S2 d[4];

    With the 9-byte allocation, it is not possible to satisfy the alignment requirement for each element of d, because these elements will have addresses xd, xd + 9,
    xd + 18, and xd + 27. Instead, the compiler allocates 12 bytes for structure S2, with the final 3 bytes being wasted space:
    That way the elements of d will have addresses xd, xd + 12, xd + 24, and xd + 36. As long as xd is a multiple of 4, all of the alignment restrictions will be satisfied.
     
  • 相关阅读:
    人脸识别-常用的数据库Face Databases From Other Research Groups
    447. Number of Boomerangs
    356. Line Reflection
    149. Max Points on a Line
    279. Perfect Squares
    264. Ugly Number II
    204. Count Primes
    263. Ugly Number
    202. Happy Number
    4. Median of Two Sorted Arrays
  • 原文地址:https://www.cnblogs.com/skullboyer/p/7762710.html
Copyright © 2011-2022 走看看