  • DEX File Parsing, Part 1: Parsing the DEX File Header


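    I. The DEX file

        A DEX file is one of the executable file types on the Android platform. Its overall layout is summarized in the figure below:
    Figure: DEX file format
        The DEX file header is normally a fixed 0x70 bytes long. It holds the magic identifier, the version number, the checksum, the SHA-1 signature, and the counts and offsets of the strings, types, methods, classes, and other sections, as shown in the figure below:
    Figure: DEX file header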
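
        As a quick illustration of what the header is good for, the sketch below checks whether a byte string starts with a plausible DEX header. The helper names looks_like_dex and dex_version are hypothetical and do not come from the original code. The check relies only on the layout of the magic field (the bytes "dex\n", three ASCII version digits, one NUL byte) and on the 0x70-byte header size mentioned above.

```python
HEADER_SIZE = 0x70  # usual DEX header size, as stated above


def looks_like_dex(data: bytes) -> bool:
    """Return True if `data` begins with a plausible DEX magic (hypothetical helper)."""
    if len(data) < HEADER_SIZE:
        return False
    magic = data[:8]
    # Magic layout: b"dex\n", three ASCII digits (the version), one NUL byte.
    return (
        magic[:4] == b"dex\n"
        and magic[4:7].isdigit()
        and magic[7:8] == b"\x00"
    )


def dex_version(data: bytes) -> str:
    """Return the version digits stored in the magic field."""
    return data[4:7].decode("ascii")


if __name__ == "__main__":
    # Synthetic header for demonstration only, not a real DEX file.
    sample = b"dex\n035\x00" + bytes(HEADER_SIZE - 8)
    print(looks_like_dex(sample), dex_version(sample))
```

        A check like this is only a sanity test before parsing; it says nothing about whether the rest of the file is valid.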


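    II. Fields of the DEX file header

        The DEX file header contains the following fields:

    1. magic: the DEX file identifier followed by the format version; starts at 0x00, 8 bytes long
    2. checksum: Adler-32 checksum of the DEX file; offset 0x08, 4 bytes long
    3. signature: SHA-1 signature of the DEX file; offset 0x0c, 20 bytes long
    4. file_size: size of the DEX file; offset 0x20, 4 bytes long
    5. header_size: size of the DEX file header; offset 0x24, 4 bytes long; normally 0x70
    6. endian_tag: byte-order tag that tells whether the file is byte-swapped; offset 0x28, 4 bytes long; normally the constant 0x12345678, which is stored little-endian and therefore appears in the file as the bytes 78 56 34 12
    7. link_size: size of the link section; 0 means the file is not statically linked; offset 0x2c, 4 bytes long
    8. link_off: file offset of the link section; offset 0x30, 4 bytes long
    9. map_off: file offset of the map list in the data section; offset 0x34, 4 bytes long
    10. string_ids_size: number of string identifiers in the file; offset 0x38, 4 bytes long
    11. string_ids_off: file offset where the string identifier list starts; offset 0x3c, 4 bytes long
    12. type_ids_size: number of type identifiers in the file; offset 0x40, 4 bytes long
    13. type_ids_off: file offset of the type identifier list; offset 0x44, 4 bytes long
    14. proto_ids_size: number of method prototypes in the file; offset 0x48, 4 bytes long
    15. proto_ids_off: file offset of the method prototype list; offset 0x4c, 4 bytes long
    16. field_ids_size: number of fields in the file; offset 0x50, 4 bytes long
    17. field_ids_off: file offset of the field list; offset 0x54, 4 bytes long
    18. method_ids_size: number of methods in the file; offset 0x58, 4 bytes long
    19. method_ids_off: file offset of the method list; offset 0x5c, 4 bytes long
    20. class_defs_size: number of class definitions in the file; offset 0x60, 4 bytes long
    21. class_defs_off: file offset of the class definition list; offset 0x64, 4 bytes long
    22. data_size: size of the data section; offset 0x68, 4 bytes long
    23. data_off: file offset of the data section; offset 0x6c, 4 bytes long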
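
        The field list above maps directly onto a single struct format string. The following is a minimal sketch, not the article's implementation: HEADER_FORMAT, FIELD_NAMES, and parse_header are names introduced here for illustration, and the code assumes the usual little-endian layout.

```python
import struct

# '<' = little-endian without padding:
# magic (8s), checksum (I), signature (20s), then 20 unsigned 32-bit fields.
HEADER_FORMAT = "<8sI20s20I"
FIELD_NAMES = (
    "magic", "checksum", "signature",
    "file_size", "header_size", "endian_tag",
    "link_size", "link_off", "map_off",
    "string_ids_size", "string_ids_off",
    "type_ids_size", "type_ids_off",
    "proto_ids_size", "proto_ids_off",
    "field_ids_size", "field_ids_off",
    "method_ids_size", "method_ids_off",
    "class_defs_size", "class_defs_off",
    "data_size", "data_off",
)


def parse_header(data: bytes) -> dict:
    """Unpack the first 0x70 bytes of `data` into a field-name -> value dict."""
    size = struct.calcsize(HEADER_FORMAT)  # 8 + 4 + 20 + 20 * 4 = 0x70
    values = struct.unpack(HEADER_FORMAT, data[:size])
    return dict(zip(FIELD_NAMES, values))


if __name__ == "__main__":
    # Synthetic header for demonstration only; real values come from a .dex file.
    fake = struct.pack(HEADER_FORMAT, b"dex\n035\x00", 0, bytes(20),
                       0x70, 0x70, 0x12345678, *([0] * 17))
    header = parse_header(fake)
    print(hex(header["header_size"]), hex(header["endian_tag"]))
```

        Unpacking everything in one call keeps the offsets implicit in the format string, at the cost of making a single wrong field width shift every later field.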

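    III. Parsing the DEX file header in code (Python)

        Open the DEX file in binary mode with the open function, then use seek to move the file pointer to a field's offset and read that field's length in bytes. For example, magic starts at f.seek(0x00), and the version number is read with f.seek(0x04) followed by f.read(4). Multi-byte integer fields are stored little-endian, so their bytes are reversed before being converted to a number. After that, print the value. The DEX header is simple and involves no string encoding, so parsing it is straightforward. The full code is listed below and is also available on GitHub. A screenshot of the script's output:
    Figure: output of the script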
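
        The seek-then-read pattern described above can be tried without a real DEX file. The sketch below uses an in-memory io.BytesIO buffer filled with made-up bytes, and read_u32_le is a helper name introduced for this example. int.from_bytes(..., "little") performs the byte reversal that is otherwise done by hand.

```python
import io


def read_u32_le(f, offset: int) -> int:
    """Seek to `offset` and decode 4 bytes as a little-endian unsigned integer."""
    f.seek(offset)
    raw = f.read(4)
    if len(raw) != 4:
        raise ValueError("not enough bytes at offset %#x" % offset)
    return int.from_bytes(raw, "little")


if __name__ == "__main__":
    # Made-up buffer: 0x24 filler bytes, then 70 00 00 00 where header_size lives.
    buf = io.BytesIO(bytes(0x24) + bytes([0x70, 0x00, 0x00, 0x00]))
    print(hex(read_u32_le(buf, 0x24)))  # the bytes 70 00 00 00 decode to 0x70
```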


    IV. DEX file header parsing implementation (Python)

    import binascii


    def read_hex(f, offset, length):
        """Return `length` bytes at `offset` as a hex string, in file order."""
        f.seek(offset)
        return binascii.b2a_hex(f.read(length)).decode('utf-8')


    def read_u32(f, offset):
        """Return the 4-byte little-endian unsigned integer at `offset`."""
        f.seek(offset)
        return int.from_bytes(f.read(4), 'little')


    def parserHeader(f):
        # Raw byte fields are printed as hex in the order they appear in the file.
        print('Magic: ' + read_hex(f, 0x00, 4))
        print('Version: ' + read_hex(f, 0x04, 4))
        print('Checksum: ' + read_hex(f, 0x08, 4))
        print('SHA-1 signature: ' + read_hex(f, 0x0c, 20))

        print('File size: %d byte' % read_u32(f, 0x20))
        print('Header size: %d byte' % read_u32(f, 0x24))
        print('Endian tag: ' + read_hex(f, 0x28, 4))
        print('Link section size: %d byte' % read_u32(f, 0x2c))
        print('Link section offset: ' + hex(read_u32(f, 0x30)))
        print('Map offset: ' + hex(read_u32(f, 0x34)))

        # The remaining fields come in (size, offset) pairs: the size is stored
        # at `offset`, the file offset of the section at `offset + 4`.
        pairs = [
            (0x38, 'String count', 'String offset'),
            (0x40, 'Type count', 'Type offset'),
            (0x48, 'Method prototype count', 'Method prototype offset'),
            (0x50, 'Field count', 'Field offset'),
            (0x58, 'Method count', 'Method offset'),
            (0x60, 'Class definition count', 'Class definition offset'),
            (0x68, 'Data section size', 'Data section offset'),
        ]
        for offset, size_label, off_label in pairs:
            size = read_u32(f, offset)
            print('%s: %d(%s)' % (size_label, size, hex(size)))
            print('%s: %s' % (off_label, hex(read_u32(f, offset + 4))))


    if __name__ == '__main__':
        # Raw string: otherwise "\U" in the Windows path is a syntax error in Python 3.
        with open(r"C:\Users\admin\Desktop\android_nx\classes.dex", 'rb') as f:
            parserHeader(f)
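
        The script prints checksum and signature without checking them. As a hedged sketch of how they could be verified: according to the DEX format documentation, checksum is the Adler-32 of everything after the checksum field (offset 0x0c onward), and signature is the SHA-1 of everything after the signature field (offset 0x20 onward). The function name verify_dex and the synthetic test image are assumptions made for this example.

```python
import hashlib
import struct
import zlib


def verify_dex(data: bytes) -> dict:
    """Recompute the checksum and signature of a DEX image held in memory."""
    stored_checksum = struct.unpack_from("<I", data, 0x08)[0]
    stored_signature = data[0x0c:0x20]
    # Adler-32 covers everything except magic and the checksum itself.
    computed_checksum = zlib.adler32(data[0x0c:]) & 0xFFFFFFFF
    # SHA-1 covers everything except magic, checksum, and the signature itself.
    computed_signature = hashlib.sha1(data[0x20:]).digest()
    return {
        "checksum_ok": computed_checksum == stored_checksum,
        "signature_ok": computed_signature == stored_signature,
    }


if __name__ == "__main__":
    # Synthetic image with consistent fields; real input would be a .dex file's bytes.
    body = bytes(0x50)  # stands in for the rest of the header and the data
    signature = hashlib.sha1(body).digest()
    checksum = zlib.adler32(signature + body) & 0xFFFFFFFF
    image = b"dex\n035\x00" + struct.pack("<I", checksum) + signature + body
    print(verify_dex(image))
```

        Note that the stored checksum is a little-endian integer, so it has to be decoded as such before comparison, whereas the signature is compared byte for byte.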
    

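    V. Related links

      Reference link

      Author's GitHub repository (related attachments available for download): https://github.com/windy-purple/parserDex

      PS: Some images were taken from the internet and will be removed on request if they infringe any rights.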

  • Original article: https://www.cnblogs.com/aWxvdmVseXc0/p/11879093.html