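I wrote an earlier post about lazy binding in dynamically linked libraries. I am very fond of that post, but I had only just worked the topic out when I wrote it, so it came out rather disorganized. Since then I have read Ulrich Drepper's "How To Write Shared Libraries" and several other articles on dynamic linking, and I have reorganized the material into this post.
One question arises when we call a function that lives in a shared library: how do we know the function's address? In fact, until the function is called for the first time, we do not know it. This mechanism is called lazy binding. A program has many branches and not all of them are ever taken. Think of error handling: a library function that is called only on an error path may never run. Resolving every referenced library function at startup is therefore wasteful: it costs startup time and is often unnecessary.
Let's look at lazy binding in action. I wrote a program that sleeps for 15 seconds and then creates a thread with pthread_create. We use LD_DEBUG to observe symbol resolution.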
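Before looking at the real thing, the idea can be modeled in a few lines of C. This is only a sketch of the concept, not what the dynamic linker actually does: every name here (`slot`, `lazy_stub`, `real_square`, `call_square`, `get_resolve_count`) is made up for illustration. Calls go through a function pointer that initially points at a stub; the stub "resolves" the target once, patches the pointer, and forwards the call.

```c
/* Hypothetical names throughout: this models the idea of lazy binding
 * and does not involve the real dynamic linker. */
typedef int (*func_t)(int);

static int resolve_count = 0;          /* how many times the resolver ran */

/* Stands in for a function that lives in a shared library. */
static int real_square(int x) { return x * x; }

static int lazy_stub(int x);           /* forward declaration */

/* A one-entry table of function addresses. It starts out pointing at
 * the stub rather than at the real function. */
static func_t slot = lazy_stub;

/* The first call lands here: "resolve" the target, patch the slot, then
 * forward the call so the caller still gets its result. */
static int lazy_stub(int x)
{
    resolve_count++;
    slot = real_square;
    return slot(x);
}

/* Callers never name real_square directly; they always go through the slot. */
int call_square(int x) { return slot(x); }

int get_resolve_count(void) { return resolve_count; }
```

Calling `call_square(3)` and then `call_square(4)` runs the resolver only once; the second call goes straight through the patched slot. A function that is never called never pays for resolution at all.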
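- #include <stdio.h>
- #include <stdlib.h>
- #include <unistd.h>   /* sleep() */
- #include <pthread.h>
- /* Thread body: sleep in a loop so that the thread never exits. */
- void *myfunc(void *arg)
- {
-     (void)arg;
-     while (1)
-     {
-         sleep(10);
-     }
-     return NULL;
- }
- int main()
- {
-     /* Pause first, so the late lookup of pthread_create is easy to spot. */
-     sleep(15);
-     pthread_t tid = 0;
-     int ret = pthread_create(&tid, NULL, myfunc, NULL);
-     if (ret)
-     {
-         /* %m is a glibc extension that prints strerror(errno); note that
-            pthread functions report their error code in ret. */
-         fprintf(stderr, "pthread create failed %m\n");
-         return -1;
-     }
-     ret = pthread_join(tid, NULL);
-     if (ret)
-     {
-         fprintf(stderr, "pthread join failed %m\n");
-         return -2;
-     }
-     return 0;
- }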
- root@libin:~/program/C/plt_got# LD_DEBUG=symbols ./test
- 2849: symbol=_res; lookup in file=./test [0]
- 2849: symbol=_res; lookup in file=/lib/tls/i686/cmov/libpthread.so.0 [0]
- 2849: symbol=_res; lookup in file=/lib/tls/i686/cmov/libc.so.6 [0]
- 2849: symbol=_IO_file_close; lookup in file=./test [0]
- 2849: symbol=_IO_file_close; lookup in file=/lib/tls/i686/cmov/libpthread.so.0 [0]
- 2849: symbol=_IO_file_close; lookup in file=/lib/tls/i686/cmov/libc.so.6 [0]
- 2849: symbol=rpc_createerr; lookup in file=./test [0]
- 2849: symbol=rpc_createerr; lookup in file=/lib/tls/i686/cmov/libpthread.so.0 [0]
- 2849: symbol=rpc_createerr; lookup in file=/lib/tls/i686/cmov/libc.so.6 [0]
...................
2849: transferring control: ./test
2849:
2849: symbol=sleep; lookup in file=./test [0]
2849: symbol=sleep; lookup in file=/lib/tls/i686/cmov/libpthread.so.0 [0]
2849: symbol=sleep; lookup in file=/lib/tls/i686/cmov/libc.so.6 [0]
===================================================================================
- 2849: symbol=pthread_create; lookup in file=./test [0]
- 2849: symbol=pthread_create; lookup in file=/lib/tls/i686/cmov/libpthread.so.0 [0]
- 2849: symbol=__getpagesize; lookup in file=./test [0]
- 2849: symbol=__getpagesize; lookup in file=/lib/tls/i686/cmov/libpthread.so.0 [0]
- 2849: symbol=__getpagesize; lookup in file=/lib/tls/i686/cmov/libc.so.6 [0]
- 2849: symbol=mmap; lookup in file=./test [0]
- 2849: symbol=mmap; lookup in file=/lib/tls/i686/cmov/libpthread.so.0 [0]
- 2849: symbol=mmap; lookup in file=/lib/tls/i686/cmov/libc.so.6 [0]
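The address of a library function is actually resolved at the first call; later calls to the same function simply reuse the result of that first resolution. It follows naturally that the address must be stored somewhere, otherwise the second call could not benefit from the first lookup. The place where these addresses are stored is called the GOT, the Global Offset Table. So if my program uses 6 functions from shared libraries, we would expect the GOT to hold 6 entries, each containing the address of the corresponding function. And that is indeed the case: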
- root@libin:~/program/C/plt_got# readelf -r test
- Relocation section '.rel.dyn' at offset 0x394 contains 2 entries:
- Offset Info Type Sym.Value Sym. Name
- 08049ff0 00000206 R_386_GLOB_DAT 00000000 __gmon_start__
- 0804a020 00000905 R_386_COPY 0804a020 stderr
- Relocation section '.rel.plt' at offset 0x3a4 contains 6 entries:
- Offset Info Type Sym.Value Sym. Name
- 0804a000 00000107 R_386_JUMP_SLOT 00000000 pthread_join
- 0804a004 00000207 R_386_JUMP_SLOT 00000000 __gmon_start__
- 0804a008 00000407 R_386_JUMP_SLOT 00000000 __libc_start_main
- 0804a00c 00000507 R_386_JUMP_SLOT 00000000 fprintf
- 0804a010 00000607 R_386_JUMP_SLOT 00000000 pthread_create
- 0804a014 00000707 R_386_JUMP_SLOT 00000000 sleep
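The Info column in the readelf output above packs two fields into one 32-bit word. As a small sketch (the helper names `rel_sym_index` and `rel_type` and the `REL_386_*` constants are mine; the system header `<elf.h>` provides its own macros for the same purpose), it can be decoded like this for 32-bit ELF:

```c
#include <stdint.h>

/* Relocation type numbers defined by the i386 ELF ABI.
 * The names are local to this sketch. */
enum {
    REL_386_COPY      = 5,
    REL_386_GLOB_DAT  = 6,
    REL_386_JUMP_SLOT = 7
};

/* In a 32-bit ELF relocation, the upper 24 bits of r_info hold the index
 * of the symbol in the dynamic symbol table. */
uint32_t rel_sym_index(uint32_t info) { return info >> 8; }

/* The low 8 bits of r_info hold the relocation type. */
uint32_t rel_type(uint32_t info) { return info & 0xffu; }
```

For the pthread_create row, Info is 00000607, so `rel_sym_index` gives 6 and `rel_type` gives 7, which is R_386_JUMP_SLOT. The Offset column of each JUMP_SLOT row is the address of the GOT entry that will receive the function's address.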
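The .got.plt section starts at 0x8049ff4 and its size is 0x24 = 36 bytes. Yet we have only 6 functions whose addresses need resolving, and 6 function pointers take just 4*6 = 24 bytes. The extra 12 bytes are three reserved entries: the address of the .dynamic section, the module ID (in glibc, a pointer to this object's link_map), and the address of _dl_runtime_resolve, as shown in the figure below.
OK, let's take a look: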
- (gdb) b main
- Breakpoint 1 at 0x8048551: file test.c, line 19.
- (gdb) r
- Starting program: /home/libin/program/C/plt_got/test
- [Thread debugging using libthread_db enabled]
- Breakpoint 1, main () at test.c:19
- 19 sleep(15);
- (gdb) x/24x 0x8049ff4
- 0x8049ff4 <_GLOBAL_OFFSET_TABLE_>: 0x08049f18 0x0012c8f8 0x00123270 0x0804841a
- 0x804a004 <_GLOBAL_OFFSET_TABLE_+16>: 0x0804842a 0x0015daf0 0x0804844a 0x0804845a
- 0x804a014 <_GLOBAL_OFFSET_TABLE_+32>: 0x0804846a 0x00000000 0x00000000 0x0029c580
- 0x804a024 : 0x00000000 0x00000000 0x00000000 0x00000000
- 0x804a034: 0x00000000 0x00000000 0x00000000 0x00000000
- 0x804a044: 0x00000000 0x00000000 0x00000000 0x00000000
- [21] .dynamic DYNAMIC 08049f18 000f18 0000d8 08 WA 7 0 4
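The layout seen in the dump above can be checked with a little arithmetic. The following is a sketch for 32-bit x86 only, with helper names (`got_plt_size`, `got_slot_addr`) that I made up; the index is the function's position in the .rel.plt table.

```c
#include <stdint.h>

enum {
    GOT_ENTRY_SIZE = 4,  /* one 32-bit pointer per entry on i386 */
    GOT_RESERVED   = 3   /* .dynamic address, module ID, resolver address */
};

/* Total size of .got.plt for a given number of lazily bound functions. */
uint32_t got_plt_size(uint32_t nfuncs)
{
    return (GOT_RESERVED + nfuncs) * GOT_ENTRY_SIZE;
}

/* Address of the GOT entry for the function at position `index`
 * (0-based) in .rel.plt, skipping the three reserved entries. */
uint32_t got_slot_addr(uint32_t got_base, uint32_t index)
{
    return got_base + (GOT_RESERVED + index) * GOT_ENTRY_SIZE;
}
```

With the numbers from this binary, `got_plt_size(6)` is 0x24, and `got_slot_addr(0x8049ff4, 4)` is 0x804a010, which is exactly the Offset that readelf printed for pthread_create. The first word of the dump, 0x08049f18, matches the address of .dynamic in the section header line.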
Next, let's analyze the relationship between the PLT and the GOT.
- (gdb) disas main
- ....
- 0x0804857e <+54>: lea 0x1c(%esp),%eax
- 0x08048582 <+58>: mov %eax,(%esp)
- 0x08048585 <+61>: call 0x8048454<pthread_create@plt>
- 0x0804858a <+66>: mov %eax,0x18(%esp)
- 0x0804858e <+70>: cmpl $0x0,0x18(%esp)
- .....
To call pthread_create, execution first jumps into the PLT.
- libin@libin:~/program/C/plt_got$ objdump -dj .plt test
- test: file format elf32-i386
- Disassembly of section .plt:
- 08048404 <pthread_join@plt-0x10>:
- 8048404: ff 35 f8 9f 04 08 pushl 0x8049ff8
- 804840a: ff 25 fc 9f 04 08 jmp *0x8049ffc
- 8048410: 00 00 add %al,(%eax)
- ...
- 08048414 <pthread_join@plt>:
- 8048414: ff 25 00 a0 04 08 jmp *0x804a000
- 804841a: 68 00 00 00 00 push $0x0
- 804841f: e9 e0 ff ff ff jmp 8048404 <_init+0x30>
- 08048424 <__gmon_start__@plt>:
- 8048424: ff 25 04 a0 04 08 jmp *0x804a004
- 804842a: 68 08 00 00 00 push $0x8
- 804842f: e9 d0 ff ff ff jmp 8048404 <_init+0x30>
- 08048434 <__libc_start_main@plt>:
- 8048434: ff 25 08 a0 04 08 jmp *0x804a008
- 804843a: 68 10 00 00 00 push $0x10
- 804843f: e9 c0 ff ff ff jmp 8048404 <_init+0x30>
- 08048444 <fprintf@plt>:
- 8048444: ff 25 0c a0 04 08 jmp *0x804a00c
- 804844a: 68 18 00 00 00 push $0x18
- 804844f: e9 b0 ff ff ff jmp 8048404 <_init+0x30>
- 08048454 <pthread_create@plt>:
- 8048454: ff 25 10 a0 04 08 jmp *0x804a010
- 804845a: 68 20 00 00 00 push $0x20
- 804845f: e9 a0 ff ff ff jmp 8048404 <_init+0x30>
- 08048464 <sleep@plt>:
- 8048464: ff 25 14 a0 04 08 jmp *0x804a014
- 804846a: 68 28 00 00 00 push $0x28
- 804846f: e9 90 ff ff ff jmp 8048404 <_init+0x30>
- (gdb) x/10i 0x8048454
- 0x8048454 <pthread_create@plt>: jmp *0x804a010
- 0x804845a <pthread_create@plt+6>: push $0x20
- 0x804845f <pthread_create@plt+11>: jmp 0x8048404
- 0x8048464 <sleep@plt>: jmp *0x804a014
- 0x804846a <sleep@plt+6>: push $0x28
- 0x804846f <sleep@plt+11>: jmp 0x8048404
- 0x8048474: add %al,(%eax)
- 0x8048476: add %al,(%eax)
- 0x8048478: add %al,(%eax)
- 0x804847a: add %al,(%eax)
- (gdb) x/10x 0x804a010
- 0x804a010 <_GLOBAL_OFFSET_TABLE_+28>: 0x0804845a 0x0804846a 0x00000000 0x00000000
- 0x804a020 <stderr@@glibc_2.0>: 0x0029c580 0x00000000 0x00000000 0x00000000
- 0x804a030: 0x00000000 0x00000000
- 08048454 <pthread_create@plt>:
- 8048454: ff 25 10 a0 04 08 jmp *0x804a010
- 804845a: 68 20 00 00 00 push $0x20
- 804845f: e9 a0 ff ff ff jmp 8048404 <_init+0x30>
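The numbers in the listings above follow a regular pattern, which the sketch below reproduces. It assumes the i386 PLT layout visible in this objdump output (16-byte entries, with the first one shared by all functions) and 8-byte Elf32_Rel relocation records; the function names are hypothetical.

```c
#include <stdint.h>

enum {
    PLT_ENTRY_SIZE = 16, /* the shared first entry and each function entry */
    REL_ENTRY_SIZE = 8,  /* size of one Elf32_Rel record (r_offset, r_info) */
    PLT_JMP_SIZE   = 6   /* length of the leading "jmp *slot" instruction */
};

/* Address of the PLT entry for the function at position `index`
 * (0-based) in .rel.plt; plt0 is the address of the shared first entry. */
uint32_t plt_entry_addr(uint32_t plt0, uint32_t index)
{
    return plt0 + PLT_ENTRY_SIZE * (index + 1);
}

/* Operand of the "push" in a PLT entry: the byte offset of the
 * function's relocation record inside .rel.plt. */
uint32_t plt_push_operand(uint32_t index)
{
    return index * REL_ENTRY_SIZE;
}

/* Value stored in the GOT entry before the first call: the address of
 * the "push" that follows the jmp in the same PLT entry. */
uint32_t got_initial_value(uint32_t plt0, uint32_t index)
{
    return plt_entry_addr(plt0, index) + PLT_JMP_SIZE;
}
```

pthread_create is at position 4 in .rel.plt, so its PLT entry is at 0x8048454, it pushes 0x20, and its GOT entry initially holds 0x804845a. In other words, before the first call the `jmp *0x804a010` simply lands on the next instruction, which pushes the relocation offset and jumps to the shared entry at 0x8048404.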
Next, let's see which instructions are stored at 0x8048404:
- (gdb) x/10i 0x8048404
- 0x8048404: pushl 0x8049ff8
- 0x804840a: jmp *0x8049ffc
- 0x8048410: add %al,(%eax)
- 0x8048412: add %al,(%eax)
- 0x8048414 <pthread_join@plt>: jmp *0x804a000
- 0x804841a <pthread_join@plt+6>: push $0x0
- 0x804841f <pthread_join@plt+11>: jmp 0x8048404
- 0x8048424 <__gmon_start__@plt>: jmp *0x804a004
- 0x804842a <__gmon_start__@plt+6>: push $0x8
- 0x804842f <__gmon_start__@plt+11>: jmp 0x8048404
- (gdb) x/10x 0x8049ffc
- 0x8049ffc <_GLOBAL_OFFSET_TABLE_+8>: 0x00123270 0x0804841a 0x0804842a 0x0015daf0
- 0x804a00c <_GLOBAL_OFFSET_TABLE_+24>: 0x0804844a 0x0804845a 0x0804846a 0x00000000
- 0x804a01c <__dso_handle>: 0x00000000 0x0029c580
- (gdb) x/10i 0x00123270
- 0x123270 <_dl_runtime_resolve>: push %eax
- 0x123271 <_dl_runtime_resolve+1>: push %ecx
- 0x123272 <_dl_runtime_resolve+2>: push %edx
- 0x123273 <_dl_runtime_resolve+3>: mov 0x10(%esp),%edx
- 0x123277 <_dl_runtime_resolve+7>: mov 0xc(%esp),%eax
- 0x12327b <_dl_runtime_resolve+11>: call 0x11d5a0 <_dl_fixup>
- 0x123280 <_dl_runtime_resolve+16>: pop %edx
- 0x123281 <_dl_runtime_resolve+17>: mov (%esp),%ecx
- 0x123284 <_dl_runtime_resolve+20>: mov %eax,(%esp)
- 0x123287 <_dl_runtime_resolve+23>: mov 0x4(%esp),%eax
Let's set a watchpoint on the GOT entry for pthread_create and see when it changes:
- (gdb) b main
- Breakpoint 1 at 0x8048551: file test.c, line 19.
- (gdb) r
- Starting program: /home/libin/program/C/plt_got/test
- [Thread debugging using libthread_db enabled]
- Breakpoint 1, main () at test.c:19
- 19 sleep(15);
- (gdb) watch *0x804a010
- Hardware watchpoint 2: *0x804a010
- (gdb) c
- Continuing.
- Hardware watchpoint 2: *0x804a010
- Old value = 134513754
- New value = 1260912
- _dl_fixup (l=<value optimized out>, reloc_arg=<value optimized out>) at dl-runtime.c:155
- 155 dl-runtime.c: 没有那个文件或目录.
- in dl-runtime.c
- (gdb) bt
- #0 _dl_fixup (l=<value optimized out>, reloc_arg=<value optimized out>) at dl-runtime.c:155
- #1 0x00123280 in _dl_runtime_resolve () at ../sysdeps/i386/dl-trampoline.S:37
- #2 0x0804858a in main () at test.c:21
- (gdb)
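There it is: _dl_runtime_resolve called _dl_fixup, which modified the corresponding GOT entry. The new value, 1260912 (0x133d70), is the address of pthread_create: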
- (gdb) x/10i 1260912
- 0x133d70 <__pthread_create_2_1>: push %ebp
- 0x133d71 <__pthread_create_2_1+1>: mov %esp,%ebp
- 0x133d73 <__pthread_create_2_1+3>: push %edi
- 0x133d74 <__pthread_create_2_1+4>: push %esi
- 0x133d75 <__pthread_create_2_1+5>: push %ebx
- 0x133d76 <__pthread_create_2_1+6>: call 0x132340 <__i686.get_pc_thunk.bx>
- 0x133d7b <__pthread_create_2_1+11>: add $0x10279,%ebx
- 0x133d81 <__pthread_create_2_1+17>: sub $0x4c,%esp
- 0x133d84 <__pthread_create_2_1+20>: mov 0xc(%ebp),%edx
- 0x133d87 <__pthread_create_2_1+23>: test %edx,%edx
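That was the first call. The second call is much simpler, because the GOT entry already holds the address of pthread_create, so the jmp at the start of the PLT entry goes straight to the function.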
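gdb printed the watchpoint values in decimal: 134513754 is 0x0804845a, which points back into the PLT, and 1260912 is 0x133d70, which lies inside libpthread. A rough way to tell the two states apart is to test whether a GOT value still falls inside the PLT. This is a heuristic sketch that holds for this particular binary only; the PLT bounds are hardcoded from the objdump listing, and `slot_is_unresolved` is a name I made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* PLT bounds for this binary, taken from the objdump listing: the shared
 * first entry at 0x8048404 plus six function entries, 16 bytes each. */
static const uint32_t PLT_START = 0x8048404u;
static const uint32_t PLT_END   = 0x8048404u + 7u * 16u;

/* A GOT entry that still points into the PLT has not been resolved yet. */
bool slot_is_unresolved(uint32_t got_value)
{
    return got_value >= PLT_START && got_value < PLT_END;
}
```

Applied to the GOT dump taken at the breakpoint in main, the entry for __libc_start_main (0x0015daf0) already counts as resolved, since that function had been called before main started, while the entries for pthread_join, fprintf, pthread_create and sleep still point into the PLT.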
The two PLT figures in this post come from http://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries/. That blog is excellent and I learned a lot from it.
References:
2 Ulrich Drepper, How To Write Shared Libraries.
Source: http://blog.chinaunix.net/uid-24774106-id-3349549.html