  • image update to ubuntu18.04

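    I have recently been upgrading the Apollo Docker image to nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04. It has been a real headache, but the process has taught me a lot about what to do better and how to make the image easier to maintain and upgrade.

    The plan has three stages, and I am currently in the second:

    1. Build the image: the biggest lesson in this stage concerns build speed, which mostly comes down to the cache. Use the cache mechanism properly, but watch out for bad (stale) cache.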

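    • Put the stable, system-level steps that rarely change near the top of the Dockerfile.
    • The old image-building scripts call apt-get update everywhere; nearly every apt command is preceded by an update, which is very time-consuming. It is worth learning what this command actually does and when it is really needed.
    • As for bad cache, I still need to study the cache mechanism further. Because of the image's size, most of the Dockerfile consists of RUN xxx.sh steps, and caching seems to behave a little oddly for them.
    • A small trick: a full build takes a long time, so when it fails, start a container from the last layer that built successfully and debug there. It is very convenient.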

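    2. Compile code in the new image: the new image ultimately hosts the product code, and the product code interacts with the image mainly through header files and library files.

    • Once built, the new image may still be incompatible with the product code in many places. The OS upgrade in particular automatically upgrades many default libraries, such as boost and vtk (vtk.xx ==> vtk.xx), which brings many header and library changes as well as interface changes. These often have knock-on effects.
    • The old image also relied on many prebuilt, packaged .so files. They were built in the old environment and depend on it at run time; for example, a .so packaged against boost.54.xx, which the new environment has most likely upgraded. This makes large upgrade jumps difficult, so if the compile time is tolerable, avoid downloading prebuilt .so files.

    3. Test the code: the work above only operates at the symbol level, so it is still necessary to verify that the symbol changes produce equivalent behavior.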
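
    The stage 1 note about running apt-get update before every install could be handled with a small guard like the one below. This is only a sketch: run, apt_update_once, apt_install, DRY_RUN and APT_INDEX_FRESH are hypothetical names, not part of the Apollo scripts, and the guard only holds within one shell and its child processes (that is, within one RUN step).

```shell
#!/bin/bash
# Hypothetical helper (not part of the Apollo scripts): refresh the apt index
# at most once per shell, then install packages.

# With DRY_RUN set, print the command instead of executing it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run "apt-get update" only if this shell (or a parent) has not done so yet.
apt_update_once() {
  if [ -z "${APT_INDEX_FRESH:-}" ]; then
    run apt-get update && export APT_INDEX_FRESH=1
  fi
}

# Install the given packages, refreshing the index first if needed.
apt_install() {
  apt_update_once && run apt-get install -y --no-install-recommends "$@"
}

# Dry-run demo: the index is refreshed before the first install only.
DRY_RUN=1
apt_install libboost-dev
apt_install cmake
```

    Keeping the update and the installs inside the same RUN step also avoids one kind of bad cache, where a cached layer holds an outdated package index that later install layers then rely on.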

    References:

    https://www.joyfulbikeshedding.com/blog/2019-08-27-debugging-docker-builds.html

    https://vsupalov.com/debug-docker-container/

    $ docker run -it --entrypoint /bin/bash $IMAGE_NAME -s

    Docker private registry: http://192.168.1.101:5000/v2/mooncar/mooncar/tags/list

    Some handy tricks:

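    #!/bin/bash
    # Search C/C++ headers under directory $1 for pattern $2.
    hgrep()
    {
      sudo find "$1" \( -name "*.h" -o -name "*.hpp" \) -print0 | xargs -0 grep -n -- "$2"
    }
    # Search C/C++ sources under directory $1 for pattern $2.
    cgrep()
    {
      sudo find "$1" \( -name "*.c" -o -name "*.cpp" \) -print0 | xargs -0 grep -n -- "$2"
    }
    # Search CMake build files under directory $1 for pattern $2.
    cmgrep()
    {
      sudo find "$1" -name "CMakeLists.txt" -print0 | xargs -0 grep -n -- "$2"
    }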

    Source this file from .bashrc; it is quick and convenient.
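
    One detail worth knowing when extending helpers like these: as soon as an explicit action such as -print0 is added (to cope with file names containing spaces), the -o alternatives must be grouped in parentheses, otherwise the action applies only to the last -name test. A minimal demonstration on a throwaway directory (the file names are made up for illustration):

```shell
#!/bin/bash
# Throwaway tree with one .h and one .hpp file that both mention "Widget".
demo=$(mktemp -d)
echo 'struct Widget;' > "$demo/a.h"
echo 'class Widget {};' > "$demo/b.hpp"

# Without parentheses, -print0 binds only to the second -name test,
# so a.h is matched but never printed.
ungrouped=$(find "$demo" -name "*.h" -o -name "*.hpp" -print0 | xargs -0 grep -l "Widget" | wc -l)

# With parentheses, files matching either pattern reach -print0.
grouped=$(find "$demo" \( -name "*.h" -o -name "*.hpp" \) -print0 | xargs -0 grep -l "Widget" | wc -l)

echo "ungrouped=$ungrouped grouped=$grouped"
rm -rf "$demo"
```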

    aptitude show libboost-dev        # show the package version

    apt-cache madison libboost-dev    # list candidate versions and their sources

    patch -p0 < xx.patch

    To prepend a string such as "HEAD" to every line: sed 's/^/HEAD&/g' test.file

    To append a string such as "TAIL" to every line: sed 's/$/&TAIL/g' test.file
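
    A quick way to check these two sed commands is to run them on a scratch file. In the sketch below the "&" is dropped, since "^" and "$" match an empty string and "&" therefore expands to nothing; the file contents are made up for illustration.

```shell
#!/bin/bash
demo=$(mktemp)
printf 'one\ntwo\n' > "$demo"

# Prefix every line with HEAD.
prefixed=$(sed 's/^/HEAD/' "$demo")
# Suffix every line with TAIL.
suffixed=$(sed 's/$/TAIL/' "$demo")
# Both in one pass; without -i the file on disk is left untouched.
both=$(sed -e 's/^/HEAD/' -e 's/$/TAIL/' "$demo")

printf '%s\n' "$prefixed" "$suffixed" "$both"
rm -f "$demo"
```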

    ldd -r    https://blog.csdn.net/xihuanzhi1854/article/details/89523247

    nm -C xxx.so |grep "yyy" --color=auto

     grep "Werror" . -R |grep Makefile.in |awk -F ":" '{print $1}'|xargs sed -i "s/ -Werror//g"
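
    The same cleanup can be tried safely on a scratch tree first. The sketch below assumes GNU grep and GNU sed and uses grep -rl with --include instead of the grep/awk chain, so each Makefile.in is listed exactly once; the directory layout and flags are made up for illustration.

```shell
#!/bin/bash
demo=$(mktemp -d)
mkdir -p "$demo/sub"
echo 'CFLAGS = -O2 -Wall -Werror' > "$demo/Makefile.in"
echo 'CXXFLAGS = -g -Werror -fPIC' > "$demo/sub/Makefile.in"
echo 'keep -Werror here' > "$demo/notes.txt"

# -l prints each matching file once; --include limits the search to Makefile.in.
grep -rl --include=Makefile.in -e '-Werror' "$demo" | xargs sed -i 's/ -Werror//g'

cat "$demo/Makefile.in" "$demo/sub/Makefile.in"

# Only the file that is not a Makefile.in should still mention -Werror.
remaining=$(grep -rl -e '-Werror' "$demo")
echo "still contains -Werror: $remaining"
rm -rf "$demo"
```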

    readelf -d libadolc.so |grep SONAME

    objdump -TC libleveldb.so

    Look up symbols in object files:

    moonx@moonx:/usr/download/apue/ttt$ g++ -c main.c
    moonx@moonx:/usr/download/apue/ttt$ nm -C main.o 

    0000000000000000 T main
                                    U hello(char const*)
    moonx@moonx:/usr/download/apue/ttt$ gcc -c hello.c
    moonx@moonx:/usr/download/apue/ttt$ nm -C hello.o
    0000000000000000 T hello
                                    U printf
    moonx@moonx:/usr/download/apue/ttt$ g++ -c hello.c
    moonx@moonx:/usr/download/apue/ttt$ nm -C hello.o
                                    U printf
    0000000000000000 T hello(char const*)
    moonx@moonx:/usr/download/apue/ttt$ gcc -c main.c
    moonx@moonx:/usr/download/apue/ttt$ nm -C main.o
                                     U hello
    0000000000000000 T main

    make static lib and dynamic lib

    https://blog.csdn.net/thinkerABC/article/details/621817?utm_medium=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromMachineLearnPai2-1.edu_weight&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromMachineLearnPai2-1.edu_weight

    ABI     http://litaotju.github.io/2019/02/24/Why-we-need-D_GLIBCXX_USE_CXX11_ABI=0/

    revise:http://litaotju.github.io/c++/2019/02/24/Why-we-need-D_GLIBCXX_USE_CXX11_ABI=0/

    objdump -T -C libfoo.so
    
    • -T prints the dynamic symbol table
    • -C demangles C++ symbol names into a human-readable form

    apollo@in_dev_docker:/apollo/ttt/abi$ objdump -TC libmy.so |grep print
    0000000000000945 g DF .text 0000000000000068 Base print_string(std::string const&)
    apollo@in_dev_docker:/apollo/ttt/abi$ g++ -fPIC mylib.cpp -shared -o libmy.so -D_GLIBCXX_USE_CXX11_ABI=0
    apollo@in_dev_docker:/apollo/ttt/abi$ objdump -TC libmy.so |grep print
    0000000000000945 g DF .text 0000000000000068 Base print_string(std::string const&)
    apollo@in_dev_docker:/apollo/ttt/abi$ g++ -fPIC mylib.cpp -shared -o libmy.so
    apollo@in_dev_docker:/apollo/ttt/abi$ objdump -TC libmy.so |grep print
    00000000000009b5 g DF .text 0000000000000068 Base print_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
    apollo@in_dev_docker:/apollo/ttt/abi$ objdump -TC libmy.so |grep string
    0000000000000000 DF *UND* 0000000000000000 GLIBCXX_3.4.21 std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
    00000000000009b5 g DF .text 0000000000000068 Base print_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)

    apollo@in_dev_docker:/apollo/ttt/abi$ nm libmy.so |grep print_string
    00000000000009b5 T _Z12print_stringRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
    apollo@in_dev_docker:/apollo/ttt/abi$ nm /usr/lib/x86_64-linux-gnu/libleveldb.so |grep open
    nm: /usr/lib/x86_64-linux-gnu/libleveldb.so: no symbols
    apollo@in_dev_docker:/apollo/ttt/abi$ nm -D /usr/lib/x86_64-linux-gnu/libleveldb.so |grep open
    U fopen
    0000000000013080 T leveldb_open
    0000000000013f60 T leveldb_options_set_max_open_files
    U open
    U opendir
    apollo@in_dev_docker:/apollo/ttt/abi$ nm -DC /usr/lib/x86_64-linux-gnu/libleveldb.so |grep open
    U fopen
    0000000000013080 T leveldb_open
    0000000000013f60 T leveldb_options_set_max_open_files
    U open
    U opendir
    apollo@in_dev_docker:/apollo/ttt/abi$ g++ -fPIC mylib.cpp -shared -o libmy.so -D_GLIBCXX_USE_CXX11_ABI=0
    apollo@in_dev_docker:/apollo/ttt/abi$ g++ myapp.cpp -lmy -L./ -o myapp
    /tmp/ccU6YIvW.o: In function `main':
    myapp.cpp:(.text+0x43): undefined reference to `print_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
    collect2: error: ld returned 1 exit status
    apollo@in_dev_docker:/apollo/ttt/abi$ nm libmy.so |grep print_string
    0000000000000945 T _Z12print_stringRKSs

    Linker related:

    $ dpkg -l|grep boost

    echo "/usr/local/mysql/lib" >> /etc/ld.so.conf

    sudo ldconfig -v | grep mysql # Check whether the mysql libraries are found.

    What apt leaves behind: /var/lib/dpkg/info

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/mysql/lib

    LIBRARY_PATH is used by gcc before compilation to search directories containing static and shared libraries that need to be linked to your program.

    LD_LIBRARY_PATH is used by your program to search directories containing shared libraries after it has been successfully compiled and linked.

    Libraries can be static or shared. If a library is static, its code is copied into your program and there is no need to search for the library after the program is compiled and linked. If a library is shared, it has to be dynamically linked to your program at run time, and that is when LD_LIBRARY_PATH comes into play.

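    # Where the shell looks for executables
    export PATH=$PATH:$HOME/bin
    # Where the dynamic linker looks for shared libraries
    LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/MyLib
    export LD_LIBRARY_PATH
    # Where gcc looks for libraries (static and shared) at link time
    LIBRARY_PATH=$LIBRARY_PATH:/MyLib
    export LIBRARY_PATH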
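
    One caveat with the VAR=$VAR:/MyLib pattern: when the variable is empty or unset, the result is ":/MyLib", which starts with an empty element, and for LD_LIBRARY_PATH a zero-length entry means the current working directory. A minimal sketch of an append that avoids this, where append_path is a hypothetical helper name:

```shell
#!/bin/bash
# Append a directory to a colon-separated list without creating an empty
# element when the list is empty.
# $1: current value of the variable, $2: directory to add
append_path() {
  printf '%s\n' "${1:+$1:}$2"
}

empty_case=$(append_path "" /MyLib)
set_case=$(append_path "/usr/local/mysql/lib" /MyLib)

echo "$empty_case"
echo "$set_case"

# Typical use:
#   LD_LIBRARY_PATH=$(append_path "$LD_LIBRARY_PATH" /MyLib)
#   export LD_LIBRARY_PATH
```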

    gcc -L / -l option flags

    gcc -l links with a library file.

    gcc -L looks in directory for library files.

    https://alex.dzyoba.com/blog/gdb-source-path/

    Header file related:

    # Where gcc looks for header files
    C_INCLUDE_PATH=$C_INCLUDE_PATH:/usr/include/libxml2:/MyLib
    export C_INCLUDE_PATH
    # Where g++ looks for header files
    CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/usr/include/libxml2:/MyLib
    export CPLUS_INCLUDE_PATH

    cpp -Iheaders -v    like  gcc -Iheaders source.c 

    cpp -iquote hdr1 -v


    How to compare the contents of two directories with diff:

    diff -ruNa s1 s2 >> s1.patch

    patch -p0 < s1.patch    # apply the patch to s1

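    cp -a

    Copies files while preserving their attributes.

     cp -r dirname destdir 

    Copying a directory with cp -r changes file attributes such as timestamps.
    To make the copy identical to the original directory, use cp -a dirname destdir.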
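
    The difference is easy to observe with modification times. The sketch below assumes GNU coreutils (touch -d, stat -c) and uses a scratch directory with made-up names.

```shell
#!/bin/bash
demo=$(mktemp -d)
mkdir "$demo/src"
echo data > "$demo/src/file.txt"
# Give the file an old, easily recognizable modification time.
touch -d '2001-01-01 00:00:00' "$demo/src/file.txt"

cp -r "$demo/src" "$demo/copy_r"
cp -a "$demo/src" "$demo/copy_a"

# Modification times in seconds since the epoch.
orig_mtime=$(stat -c %Y "$demo/src/file.txt")
r_mtime=$(stat -c %Y "$demo/copy_r/file.txt")
a_mtime=$(stat -c %Y "$demo/copy_a/file.txt")

echo "original: $orig_mtime"
echo "cp -r:    $r_mtime"
echo "cp -a:    $a_mtime"
rm -rf "$demo"
```

    The cp -a copy keeps the 2001 timestamp, while the cp -r copy gets the time at which the copy was made.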

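    A dynamic link library (a DLL on Windows) ends in .so on Linux and is called a shared object library. It is a kind of ELF (Executable and Linkable Format) file with two symbol tables, ".symtab" and ".dynsym". ".dynsym" keeps only the global symbols from ".symtab". The strip command can remove ".symtab" from an ELF file but does not remove ".dynsym". Running nm on the .so files under /lib reports "no symbols" because they have been stripped, so inspect the dynamic symbol table ".dynsym" instead by adding -D: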

    usr@usrpc:~$nm -Do /lib/*.so.*  

    Similar commands:

    readelf --symbols *.so.* 

    objdump -TC *.so.*

    $ mount -o remount,rw /

    update nvidia-driver http://www.linuxandubuntu.com/home/how-to-install-latest-nvidia-drivers-in-linux

    1. sudo apt-get purge nvidia*

    2. Add the graphics-drivers PPA: sudo add-apt-repository ppa:graphics-drivers

    3. Update the package index: sudo apt-get update

    4. Install (and activate) the latest Nvidia graphics driver supported by your graphics card, for example: sudo apt-get install nvidia-370

    https://codepyre.com/2019/01/installing-nvidia-docker2-on-ubuntu-18.0.4/  install nvidia-docker2

    sudo apt update
    apt-cache search linux|grep linux- |grep 4.15.0-128
    sudo apt install linux-headers-4.15.0-128 linux-headers-4.15.0-128-generic linux-image-4.15.0-128-generic linux-modules-4.15.0-128-generic

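    Switch kernel:

    1. Run: sudo gedit /etc/default/grub

    2. Find hidden_timeout, change its value to 10, and save.

    3. The line below it holds a boolean setting; change it to false.

    4. In a terminal, run: sudo update-grub

    1. grep menuentry /boot/grub/grub.cfg
    This command lists the kernels in menu order, for example: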

    menuentry 'Ubuntu, with Linux 3.2.17experimental' --class ubuntu --class gnu-linux --class gnu --class os {
    menuentry 'Ubuntu, with Linux 3.2.17experimental (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os {
    menuentry 'Ubuntu, with Linux 3.2.17-chipsee' --class ubuntu --class gnu-linux --class gnu --class os {
    menuentry 'Ubuntu, with Linux 3.2.17-chipsee (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os {
    menuentry 'Ubuntu, with Linux 3.2.0-23-generic' --class ubuntu --class gnu-linux --class gnu --class os {
    menuentry 'Ubuntu, with Linux 3.2.0-23-generic (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os {
    menuentry "Memory test (memtest86+)" {
    menuentry "Memory test (memtest86+, serial console 115200)"
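    2. To boot the 3.2.17 kernel, change GRUB_DEFAULT=0 to GRUB_DEFAULT=2 in /etc/default/grub and save.
    Entries are counted from 0, so index 2 is the 3.2.17-chipsee entry in the listing above.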

    18.04 

    3. Then run sudo update-grub.
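
    Counting menu entries by eye is error-prone, so the index for GRUB_DEFAULT can be printed next to each title. The sketch below runs on sample lines abbreviated from the listing above; on a real system the input would be /boot/grub/grub.cfg. It assumes a flat menu: when entries sit inside a submenu, as on 18.04, a plain number is not enough and the "Advanced options for Ubuntu>..." form mentioned in the update-grub warning further down is needed.

```shell
#!/bin/bash
# Sample input; the --class options are shortened for readability.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
menuentry 'Ubuntu, with Linux 3.2.17experimental' --class ubuntu {
menuentry 'Ubuntu, with Linux 3.2.17experimental (recovery mode)' --class ubuntu {
menuentry 'Ubuntu, with Linux 3.2.17-chipsee' --class ubuntu {
menuentry "Memory test (memtest86+)" {
EOF

# Extract the quoted title of each menuentry, then number the titles from 0.
entries=$(sed -n "s/^[[:space:]]*menuentry ['\"]\([^'\"]*\)['\"].*/\1/p" "$cfg" \
  | awk '{ printf "%d: %s\n", NR - 1, $0 }')

echo "$entries"
rm -f "$cfg"
```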

    sudo apt-get install --reinstall nvidia-410

    sudo apt-get install --reinstall nvidia-410
    vi /var/lib/dkms/nvidia-410/410.78/build/Kbuild
    sudo vi /var/lib/dkms/nvidia-410/410.78/build/Kbuild
    cp /var/lib/dkms/nvidia-410/410.78/build/Kbuild .
    cp Kbuild /var/lib/dkms/nvidia-410/410.78/build/Kbuild
    sudo cp Kbuild /var/lib/dkms/nvidia-410/410.78/build/Kbuild
    dpkg -l | grep nvidia
    nvidia-smi

    sder@sder-kvm-yangpeng:~$ sudo update-grub
    Generating grub configuration file ...
    Found linux image: /boot/vmlinuz-4.15.0-129-generic
    Found initrd image: /boot/initrd.img-4.15.0-129-generic
    Found linux image: /boot/vmlinuz-4.15.0-128-generic
    Found initrd image: /boot/initrd.img-4.15.0-128-generic
    Warning: Please don't use old title `Ubuntu, with Linux 4.15.0-128-generic' for GRUB_DEFAULT, use `Advanced options for Ubuntu>Ubuntu, with Linux 4.15.0-128-generic' (for versions before 2.00) or `gnulinux-advanced-648cad3b-7e55-4d6d-b38b-9247483aecb4>gnulinux-4.15.0-128-generic-advanced-648cad3b-7e55-4d6d-b38b-9247483aecb4' (for 2.00 or later)
    Found memtest86+ image: /boot/memtest86+.elf
    Found memtest86+ image: /boot/memtest86+.bin
    done
    sder@sder-kvm-yangpeng:~$ sudo vi /etc/default/grub
    sder@sder-kvm-yangpeng:~$ sudo update-grub
    Generating grub configuration file ...
    Found linux image: /boot/vmlinuz-4.15.0-129-generic
    Found initrd image: /boot/initrd.img-4.15.0-129-generic
    Found linux image: /boot/vmlinuz-4.15.0-128-generic
    Found initrd image: /boot/initrd.img-4.15.0-128-generic
    Found memtest86+ image: /boot/memtest86+.elf
    Found memtest86+ image: /boot/memtest86+.bin
    done
    sder@sder-kvm-yangpeng:~$ cat /etc/default/grub
    # If you change this file, run 'update-grub' afterwards to update
    # /boot/grub/grub.cfg.
    # For full documentation of the options in this file, see:
    # info -f grub -n 'Simple configuration'

    GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 4.15.0-128-generic"
    #GRUB_HIDDEN_TIMEOUT=0
    #GRUB_HIDDEN_TIMEOUT_QUIET=true
    GRUB_TIMEOUT=10
    GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
    GRUB_CMDLINE_LINUX=""

    # Uncomment to enable BadRAM filtering, modify to suit your needs
    # This works with Linux (no patch required) and with any kernel that obtains
    # the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
    #GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

    # Uncomment to disable graphical terminal (grub-pc only)
    #GRUB_TERMINAL=console

    # The resolution used on graphical terminal
    # note that you can use only modes which your graphic card supports via VBE
    # you can see them in real GRUB with the command `vbeinfo'
    #GRUB_GFXMODE=640x480

    # Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
    #GRUB_DISABLE_LINUX_UUID=true

    # Uncomment to disable generation of recovery mode menu entries
    #GRUB_DISABLE_RECOVERY="true"

    # Uncomment to get a beep at grub start
    #GRUB_INIT_TUNE="480 440 1"

    List the services enabled at boot: systemctl list-unit-files --type=service  | grep enable

    samba: 

    [sambashare]
    comment = Samba on Ubuntu
    path = /home/corenoc/leax
    read only = no
    browsable = yes
    guest ok = yes

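    Disable automatic kernel updates on Ubuntu 18.04

    Ubuntu updates the kernel automatically by default. To avoid a reboot after which the system fails to come up, we can disable kernel updates and stay on the current kernel.

    Run: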

    root@linux:~# sudo apt-mark hold linux-image-generic linux-headers-generic 
    linux-image-generic set on hold.
    linux-headers-generic set on hold.

    To re-enable kernel updates:

    root@linux:~# sudo apt-mark unhold linux-image-generic linux-headers-generic

    https://askubuntu.com/questions/540937/what-does-apt-get-install-do-under-the-hood 

    apollo@in_dev_docker:/apollo/bazel-bin/third_party/portable_file_dialogs$ update-alternatives --config gcc
    There are 2 choices for the alternative gcc (providing /usr/bin/gcc).

    Selection Path Priority Status
    ------------------------------------------------------------
    * 0 /usr/bin/gcc-8 60 auto mode
    1 /usr/bin/gcc-4.8 10 manual mode
    2 /usr/bin/gcc-8 60 manual mode

    Press <enter> to keep the current choice[*], or type selection number: q 

    W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64 InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 6ED91CA3AC1160CD

    A: import the missing key, passing the key ID reported after NO_PUBKEY to --recv-keys, for example: sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 40976EAF437D05B5

    yangpeng@mx:/etc$ cat ./systemd/system/docker.service.d/override.conf
    [Service]
    ExecStart=
    ExecStart=/usr/bin/dockerd --host=fd:// --add-runtime=nvidia=/usr/bin/nvidia-container-runtime

  • Original article: https://www.cnblogs.com/cjyp/p/11198377.html