
ext2 Metadata Structures

Overview

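This post describes the typical metadata structures of the ext2 file system. These include file-system-level metadata, such as the superblock and the block group descriptors, and file-level metadata, such as directory entries and inodes.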

The ext2 superblock

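The superblock discussed here is the one ext2 stores on disk. The distinction matters because every file system keeps a superblock structure in memory as well as on disk; the in-memory one is essentially the on-disk superblock plus some extra management information. Here we focus on the data structure of the on-disk superblock.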

The ext2 on-disk superblock is defined as follows:

/*
 * Structure of the super block
 */
struct ext2_super_block {
    __le32    s_inodes_count;        /* Inodes count (total number of inodes) */
    __le32    s_blocks_count;        /* Blocks count (file system size in blocks) */
    __le32    s_r_blocks_count;    /* Reserved blocks count */
    __le32    s_free_blocks_count;    /* Free blocks count */
    __le32    s_free_inodes_count;    /* Free inodes count */
    __le32    s_first_data_block;    /* First Data Block */
    __le32    s_log_block_size;    /* Block size */
    __le32    s_log_frag_size;    /* Fragment size */
    __le32    s_blocks_per_group;    /* # Blocks per group (blocks in each block group) */
    __le32    s_frags_per_group;    /* # Fragments per group */
    __le32    s_inodes_per_group;    /* # Inodes per group (inodes in each block group) */
    __le32    s_mtime;        /* Mount time */
    __le32    s_wtime;        /* Write time */
    __le16    s_mnt_count;        /* Mount count */
    __le16    s_max_mnt_count;    /* Maximal mount count */
    __le16    s_magic;        /* Magic signature */
    __le16    s_state;        /* File system state */
    __le16    s_errors;        /* Behaviour when detecting errors */
    __le16    s_minor_rev_level;     /* minor revision level */
    __le32    s_lastcheck;        /* time of last check */
    __le32    s_checkinterval;    /* max. time between checks */
    __le32    s_creator_os;        /* OS */
    __le32    s_rev_level;        /* Revision level */
    __le16    s_def_resuid;        /* Default uid for reserved blocks */
    __le16    s_def_resgid;        /* Default gid for reserved blocks */
    /*
     * These fields are for EXT2_DYNAMIC_REV superblocks only.
     *
     * Note: the difference between the compatible feature set and
     * the incompatible feature set is that if there is a bit set
     * in the incompatible feature set that the kernel doesn't
     * know about, it should refuse to mount the filesystem.
     *
     * e2fsck's requirements are more strict; if it doesn't know
     * about a feature in either the compatible or incompatible
     * feature set, it must abort and not try to meddle with
     * things it doesn't understand...
     */
    __le32    s_first_ino;         /* First non-reserved inode */
    __le16    s_inode_size;         /* size of inode structure */
    __le16    s_block_group_nr;     /* block group # of this superblock */
    __le32    s_feature_compat;     /* compatible feature set */
    __le32    s_feature_incompat;     /* incompatible feature set */
    __le32    s_feature_ro_compat;     /* readonly-compatible feature set */
    __u8    s_uuid[16];        /* 128-bit uuid for volume */
    char    s_volume_name[16];     /* volume name */
    char    s_last_mounted[64];     /* directory where last mounted */
    __le32    s_algorithm_usage_bitmap; /* For compression */
    /*
     * Performance hints.  Directory preallocation should only
     * happen if the EXT2_COMPAT_PREALLOC flag is on.
     */
    __u8    s_prealloc_blocks;    /* Nr of blocks to try to preallocate*/
    __u8    s_prealloc_dir_blocks;    /* Nr to preallocate for dirs */
    __u16    s_padding1;
    /*
     * Journaling support valid if EXT3_FEATURE_COMPAT_HAS_JOURNAL set.
     */
    __u8    s_journal_uuid[16];    /* uuid of journal superblock */
    __u32    s_journal_inum;        /* inode number of journal file */
    __u32    s_journal_dev;        /* device number of journal file */
    __u32    s_last_orphan;        /* start of list of inodes to delete */
    __u32    s_hash_seed[4];        /* HTREE hash seed */
    __u8    s_def_hash_version;    /* Default hash version to use */
    __u8    s_reserved_char_pad;
    __u16    s_reserved_word_pad;
    __le32    s_default_mount_opts;
    __le32    s_first_meta_bg;     /* First metablock block group */
    __u32    s_reserved[190];    /* Padding to the end of the block */
};

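As you can see, most of the on-disk superblock describes the file system as a whole: the total number of inodes and blocks, how many blocks and inodes each block group holds, and so on. The number of block groups is not stored directly; it is derived from these fields. The code comments make the meaning of most members fairly clear, and we will run into many of them again in later posts. The last group of members (the journal fields) exists for compatibility with ext3, probably to make upgrading from ext2 to ext3 easier, though that is only my guess.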
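A few derived values are not stored in the superblock and have to be computed from its fields. The following is a minimal sketch, not kernel code: the function names are made up for illustration, the arguments are assumed to be already converted from little-endian to host byte order, and the bound on s_log_block_size is a sanity limit chosen for this sketch. The magic value 0xEF53 and the byte offset 56 of s_magic follow from the structure layout above.

```c
#include <assert.h>
#include <stdint.h>

#define EXT2_SUPER_MAGIC 0xEF53u
#define S_MAGIC_OFFSET   56u   /* 13 __le32 fields + 2 __le16 fields precede s_magic */

/* s_log_block_size stores log2(block size) - 10, so 0 means 1 KiB blocks. */
uint32_t ext2_block_size(uint32_t s_log_block_size)
{
    assert(s_log_block_size <= 6);   /* sanity bound assumed for this sketch */
    return 1024u << s_log_block_size;
}

/* Number of block groups: blocks from s_first_data_block onward,
 * divided by s_blocks_per_group and rounded up (the last group may be short). */
uint32_t ext2_group_count(uint32_t s_blocks_count,
                          uint32_t s_first_data_block,
                          uint32_t s_blocks_per_group)
{
    assert(s_blocks_per_group != 0);
    assert(s_blocks_count > s_first_data_block);
    uint32_t blocks = s_blocks_count - s_first_data_block;
    return blocks / s_blocks_per_group + (blocks % s_blocks_per_group != 0);
}

/* Check the magic signature in a raw superblock buffer
 * (the 1024 bytes that start at byte offset 1024 of the partition). */
int ext2_magic_ok(const uint8_t *sb)
{
    uint16_t magic = (uint16_t)(sb[S_MAGIC_OFFSET] | (sb[S_MAGIC_OFFSET + 1] << 8));
    return magic == EXT2_SUPER_MAGIC;
}
```

For example, a file system with 1 KiB blocks, s_first_data_block = 1, 8192 blocks per group and 8194 blocks in total has two block groups, the second holding a single block.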

The ext2 block group descriptor

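As described previously, ext2 divides the disk (partition) into block groups of equal size to improve the locality of file access. Each block group contains an inode table, an inode bitmap, a data block bitmap and other information, so each group needs a descriptor to manage it. In ext2 that structure is defined as follows: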

/*
 * Structure of a blocks group descriptor
 */
struct ext2_group_desc
{
    __le32    bg_block_bitmap;        /* Blocks bitmap block */
    __le32    bg_inode_bitmap;        /* Inodes bitmap block */
    __le32    bg_inode_table;        /* Inodes table block */
    __le16    bg_free_blocks_count;    /* Free blocks count */
    __le16    bg_free_inodes_count;    /* Free inodes count */
    __le16    bg_used_dirs_count;    /* Directories count */
    __le16    bg_pad;
    __le32    bg_reserved[3];
};

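By comparison, the block group descriptor is much simpler. It records the block numbers of the group's data block bitmap, inode bitmap and inode table. These are absolute, file-system-wide block numbers, not offsets relative to the start of the group. It also records the free block and free inode counts and the number of directories created in the group. The directory count is kept because it is taken into account when a new directory is created: the aim is to spread directories across all block groups, so that some groups do not fill up while others stay mostly empty. This is a fairly simple strategy, and it may not work all that well.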
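To make the role of the descriptor concrete, here is a hedged sketch of how an inode number can be turned into a disk location. It assumes the classic layout without the meta_bg feature, where the descriptor table starts in the block right after the superblock, and it treats bg_inode_table as a file-system-wide block number. The function names are hypothetical, and all inputs are assumed to be in host byte order.

```c
#include <assert.h>
#include <stdint.h>

#define EXT2_GROUP_DESC_SIZE 32u   /* sizeof(struct ext2_group_desc) */

/* Block holding the descriptor of `group` (descriptor table assumed to
 * start at s_first_data_block + 1, i.e. no meta_bg). */
uint32_t ext2_desc_block(uint32_t group, uint32_t s_first_data_block,
                         uint32_t block_size)
{
    uint32_t per_block = block_size / EXT2_GROUP_DESC_SIZE;
    assert(per_block != 0);
    return s_first_data_block + 1 + group / per_block;
}

/* Inode numbers start at 1, so inode `ino` is slot (ino - 1) overall. */
uint32_t ext2_inode_group(uint32_t ino, uint32_t s_inodes_per_group)
{
    assert(ino >= 1 && s_inodes_per_group != 0);
    return (ino - 1) / s_inodes_per_group;
}

/* Slot of the inode inside its group's inode table. */
uint32_t ext2_inode_index(uint32_t ino, uint32_t s_inodes_per_group)
{
    assert(ino >= 1 && s_inodes_per_group != 0);
    return (ino - 1) % s_inodes_per_group;
}

/* Block that contains the inode, given the group's bg_inode_table. */
uint32_t ext2_inode_block(uint32_t bg_inode_table, uint32_t index,
                          uint32_t inode_size, uint32_t block_size)
{
    assert(block_size != 0);
    return bg_inode_table + (uint32_t)(((uint64_t)index * inode_size) / block_size);
}

/* Byte offset of the inode inside that block. */
uint32_t ext2_inode_offset(uint32_t index, uint32_t inode_size,
                           uint32_t block_size)
{
    assert(block_size != 0);
    return (uint32_t)(((uint64_t)index * inode_size) % block_size);
}
```

With 128-byte inodes and 1 KiB blocks, the root directory (inode 2) sits in group 0, slot 1, which is byte offset 128 of the first inode table block.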

The ext2 directory entry

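Anyone familiar with file systems knows that Linux locates a file by walking the directory hierarchy one level at a time. Files are organized under directories, so to find a file we first have to locate the directory that contains it. In Linux everything is a file: a directory is a file too, with data blocks of its own, and those data blocks hold the directory entries of all files and subdirectories in it. Looking up a file therefore means reading the directory's data blocks, finding the directory entry of the file of interest, and then using it to reach the file's more detailed information.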

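So for an ext2 file, the first piece of metadata is its directory entry. It is stored on disk as well, although in the data blocks of the parent directory, which makes it no less important. The ext2 directory entry structure is as follows: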

struct ext2_dir_entry_2 {
    __le32    inode;            /* Inode number inode编号 */
    __le16    rec_len;        /* Directory entry length */
    __u8    name_len;        /* Name length */
    __u8    file_type;
    char    name[EXT2_NAME_LEN];    /* File name */
};

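A directory entry mainly stores the mapping from a file name to the file's inode. Looking up a name in the parent directory's data blocks yields the file's inode number, from which all other information about the file can be obtained.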

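In this structure, inode is the file's inode number and rec_len is the size of this directory entry. Why is rec_len needed, when a struct definition already fixes the total length? Because the last member, name, is not fixed-length on disk: a name can be up to 255 bytes long (EXT2_NAME_LEN), so a length field is needed to record how long the current entry is and where the next one begins. name_len is the length of the file name. With rec_len already present, is a separate name length redundant? No, because of padding. For efficiency, every struct ext2_dir_entry_2 is padded to a multiple of 4 bytes; when the 8-byte header plus the name is not a multiple of 4, '\0' bytes are appended after the name. name_len therefore records the length of the valid file name in name[], excluding the padding. Consider the example shown in the figure, a directory containing ".", "..", music, src and test.txt: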

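1. The names "." and ".." are followed by '\0' padding so that the total entry length is a multiple of 4;

2. The music and src entries (files or directories) are likewise padded with '\0' so that the total entry length is a multiple of 4;

3. test.txt needs no padding, because its entry is already 16 bytes long (an 8-byte header plus an 8-byte name).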
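The padding rule and the name lookup can be sketched in C as follows. This is an illustrative sketch rather than the kernel's implementation: dirent_min_rec_len, dirent_put and dirent_lookup are hypothetical helpers, the inode numbers used with them are invented, and the code works on a raw block buffer with explicit little-endian reads instead of the struct above. An entry whose inode field is 0 is treated as unused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIRENT_HEADER 8u   /* inode (4) + rec_len (2) + name_len (1) + file_type (1) */

/* Smallest valid rec_len: header plus name, rounded up to a multiple of 4. */
uint16_t dirent_min_rec_len(uint8_t name_len)
{
    return (uint16_t)((DIRENT_HEADER + name_len + 3u) & ~3u);
}

/* Write one little-endian entry at dst, zero-padding after the name.
 * Returns the rec_len that was written. */
uint16_t dirent_put(uint8_t *dst, uint32_t ino, const char *name, uint16_t rec_len)
{
    size_t n = strlen(name);
    assert(n >= 1 && n <= 255);
    assert(rec_len % 4 == 0 && rec_len >= dirent_min_rec_len((uint8_t)n));
    memset(dst, 0, rec_len);                 /* padding bytes become '\0' */
    dst[0] = (uint8_t)ino;
    dst[1] = (uint8_t)(ino >> 8);
    dst[2] = (uint8_t)(ino >> 16);
    dst[3] = (uint8_t)(ino >> 24);
    dst[4] = (uint8_t)rec_len;
    dst[5] = (uint8_t)(rec_len >> 8);
    dst[6] = (uint8_t)n;                     /* name_len excludes the padding */
    dst[7] = 0;                              /* file_type left as 0 in this sketch */
    memcpy(dst + DIRENT_HEADER, name, n);
    return rec_len;
}

/* Walk one directory block, hopping from entry to entry by rec_len.
 * Returns the inode number for `name`, or 0 if absent or the block is malformed. */
uint32_t dirent_lookup(const uint8_t *block, size_t block_size, const char *name)
{
    size_t want = strlen(name);
    size_t off = 0;
    while (off + DIRENT_HEADER <= block_size) {
        const uint8_t *de = block + off;
        uint32_t ino = (uint32_t)de[0] | ((uint32_t)de[1] << 8) |
                       ((uint32_t)de[2] << 16) | ((uint32_t)de[3] << 24);
        uint16_t rec_len = (uint16_t)(de[4] | (de[5] << 8));
        uint8_t name_len = de[6];
        if (rec_len < DIRENT_HEADER || rec_len % 4 != 0 ||
            off + rec_len > block_size || DIRENT_HEADER + name_len > rec_len)
            return 0;                        /* malformed entry: stop */
        if (ino != 0 && name_len == want &&
            memcmp(de + DIRENT_HEADER, name, want) == 0)
            return ino;
        off += rec_len;                      /* rec_len points at the next entry */
    }
    return 0;
}
```

For the five names in the example, dirent_min_rec_len gives 12, 12, 16, 12 and 16 bytes. In a real directory block the last entry's rec_len is normally stretched to reach the end of the block, which is why rec_len can be larger than the minimum.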

The ext2 inode

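This is probably the most important piece of metadata for a file, because every attribute that describes the file is stored here (except the file name). Inodes are also stored persistently on disk; each block group has a dedicated inode table that holds them. The ext2 inode structure is as follows: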

/*
 * Structure of an inode on the disk
 */
struct ext2_inode {
    __le16    i_mode;        /* File mode */
    __le16    i_uid;        /* Low 16 bits of Owner Uid */
    __le32    i_size;        /* Size in bytes */
    __le32    i_atime;    /* Access time */
    __le32    i_ctime;    /* Inode change time (labelled "Creation time" in the kernel header) */
    __le32    i_mtime;    /* Modification time */
    __le32    i_dtime;    /* Deletion Time */
    __le16    i_gid;        /* Low 16 bits of Group Id */
    __le16    i_links_count;    /* Links count */
    __le32    i_blocks;    /* Blocks count */
    __le32    i_flags;    /* File flags */
    union {
        struct {
            __le32  l_i_reserved1;
        } linux1;
        struct {
            __le32  h_i_translator;
        } hurd1;
        struct {
            __le32  m_i_reserved1;
        } masix1;
    } osd1;                /* OS dependent 1 */
    __le32    i_block[EXT2_N_BLOCKS];/* Pointers to blocks */
    __le32    i_generation;    /* File version (for NFS) */
    __le32    i_file_acl;    /* File ACL */
    __le32    i_dir_acl;    /* Directory ACL */
    __le32    i_faddr;    /* Fragment address */
    union {
        struct {
            __u8    l_i_frag;    /* Fragment number */
            __u8    l_i_fsize;    /* Fragment size */
            __u16    i_pad1;
            __le16    l_i_uid_high;    /* these 2 fields    */
            __le16    l_i_gid_high;    /* were reserved2[0] */
            __u32    l_i_reserved2;
        } linux2;
        struct {
            __u8    h_i_frag;    /* Fragment number */
            __u8    h_i_fsize;    /* Fragment size */
            __le16    h_i_mode_high;
            __le16    h_i_uid_high;
            __le16    h_i_gid_high;
            __le32    h_i_author;
        } hurd2;
        struct {
            __u8    m_i_frag;    /* Fragment number */
            __u8    m_i_fsize;    /* Fragment size */
            __u16    m_pad1;
            __u32    m_i_reserved2[2];
        } masix2;
    } osd2;                /* OS dependent 2 */
};

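The inode records the file's attributes: its size, mode, timestamps, the location of its data blocks, and so on. The most important of these is the location of the data blocks. ext2 records them in a clever way that balances efficiency and space usage, as shown in the figure below: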

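ext2 organizes a file's data block index as an array. The inode contains i_block[], an array of 15 entries (EXT2_N_BLOCKS), each holding a physical disk block number. The first 12 entries are direct pointers: each one records the address of a file data block, so for a file of at most 12 data blocks a single lookup yields the block's location. The 13th entry is a single-indirect pointer: the block it refers to is not a file data block but an index block, which in turn stores the block numbers of file data blocks, so reading a somewhat larger file may take two lookups. The 14th and 15th entries are double-indirect and triple-indirect pointers, which add one and two further levels of index blocks. This is how ext2 organizes file data; a later post covers the scheme in detail, so I will leave it at that here.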
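The mapping from a logical block index within a file to a slot in this index tree can be sketched as below. The function is loosely modelled on the idea behind the kernel's block-to-path translation, but it is a simplified illustration with hypothetical names; it assumes 4-byte block numbers, so one index block holds block_size / 4 pointers.

```c
#include <assert.h>
#include <stdint.h>

#define EXT2_NDIR_BLOCKS 12u   /* i_block[0..11]: direct pointers */
#define EXT2_IND_BLOCK   12u   /* i_block[12]: single indirect */
#define EXT2_DIND_BLOCK  13u   /* i_block[13]: double indirect */
#define EXT2_TIND_BLOCK  14u   /* i_block[14]: triple indirect */

struct block_path {
    int      depth;    /* number of lookups needed; 0 if n is out of range */
    uint32_t idx[4];   /* idx[0] indexes i_block[], the rest index blocks */
};

/* Translate logical block number n of a file into a path of indices. */
struct block_path ext2_block_to_path(uint64_t n, uint32_t block_size)
{
    struct block_path p = { 0, { 0, 0, 0, 0 } };
    uint64_t ptrs = block_size / 4;          /* pointers per index block */
    assert(ptrs != 0);

    if (n < EXT2_NDIR_BLOCKS) {              /* direct */
        p.depth = 1;
        p.idx[0] = (uint32_t)n;
        return p;
    }
    n -= EXT2_NDIR_BLOCKS;
    if (n < ptrs) {                          /* single indirect */
        p.depth = 2;
        p.idx[0] = EXT2_IND_BLOCK;
        p.idx[1] = (uint32_t)n;
        return p;
    }
    n -= ptrs;
    if (n < ptrs * ptrs) {                   /* double indirect */
        p.depth = 3;
        p.idx[0] = EXT2_DIND_BLOCK;
        p.idx[1] = (uint32_t)(n / ptrs);
        p.idx[2] = (uint32_t)(n % ptrs);
        return p;
    }
    n -= ptrs * ptrs;
    if (n < ptrs * ptrs * ptrs) {            /* triple indirect */
        p.depth = 4;
        p.idx[0] = EXT2_TIND_BLOCK;
        p.idx[1] = (uint32_t)(n / (ptrs * ptrs));
        p.idx[2] = (uint32_t)((n / ptrs) % ptrs);
        p.idx[3] = (uint32_t)(n % ptrs);
        return p;
    }
    return p;                                /* beyond what 15 pointers can address */
}

/* Upper bound on data blocks reachable through i_block[]. */
uint64_t ext2_max_mapped_blocks(uint32_t block_size)
{
    uint64_t ptrs = block_size / 4;
    return EXT2_NDIR_BLOCKS + ptrs + ptrs * ptrs + ptrs * ptrs * ptrs;
}
```

With 1 KiB blocks an index block holds 256 pointers, so logical blocks 0 to 11 are direct, 12 to 267 go through the single-indirect block, and block 268 is the first one reached through the double-indirect block.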

Original article: https://www.cnblogs.com/wuchanming/p/3862210.html