  • Building and Installing Ne10

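    Introduction

    NEON, also known as "ARM Advanced SIMD", is the advanced single-instruction, multiple-data (SIMD) extension that ARM has provided since ARMv7. It is a hybrid 64/128-bit SIMD architecture. Material on NEON is fairly scarce online, which makes it hard going for newcomers. After some digging I finally found a ready-made NEON library on GitHub, Ne10, which implements a number of basic operations in assembly internally.

    Git repository

    Installation guide

    Prerequisites

    First install the arm-linux cross compilers: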

    sudo apt-get install gcc-arm-linux-gnueabihf

    sudo apt-get install g++-arm-linux-gnueabihf

    Otherwise the build fails with errors like these:

    cc: error: unrecognized command line option ‘-mthumb-interwork’
    cc: error: unrecognized command line option ‘-mthumb’
    cc: error: unrecognized command line option ‘-mfpu=vfp3’
    
    

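    As a beginner I had no idea what was going on and was stuck for a long time, until I read GNUlinux_config.cmake in the repository root and it finally clicked: the build expects the ARM cross compilers, and the host cc does not understand ARM-only options such as -mthumb.
    For background on ABIs, see this blog post:
    Cross compilers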

    Building

    Native compilation on *nix platforms

    Build commands

    cd $NE10_PATH                       # Change directory to the location of the Ne10 source
    mkdir build && cd build             # Create the `build` directory and navigate into it
    export NE10_LINUX_TARGET_ARCH=armv7 # Set the target architecture (can also be "aarch64")
    cmake -DGNULINUX_PLATFORM=ON ..     # Run CMake to generate the build files
    make                                # Build the project
    

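    This step always failed for me: CMake picked up the host gcc/g++ instead of the gcc-arm-linux-gnueabihf/g++-arm-linux-gnueabihf cross compilers. That is to be expected on an x86 host, since native compilation is meant to be run on the ARM device itself.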

    -- Found assembler: /usr/bin/cc
    -- Check for working C compiler: /usr/bin/cc
    -- Check for working C compiler: /usr/bin/cc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working CXX compiler: /usr/bin/c++
    -- Check for working CXX compiler: /usr/bin/c++ -- works
    
    

    ...Cross compilation on *nix platforms...

    Build commands

    cd $NE10_PATH
    mkdir build && cd build
    export NE10_LINUX_TARGET_ARCH=armv7  # Can also be "aarch64"
    cmake -DCMAKE_TOOLCHAIN_FILE=../GNUlinux_config.cmake ..
    make
    

    This step works. CMake finds the right compilers:

    -- Found assembler: /usr/bin/arm-linux-gnueabihf-as
    -- Check for working C compiler: /usr/bin/arm-linux-gnueabihf-gcc
    -- Check for working C compiler: /usr/bin/arm-linux-gnueabihf-gcc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working CXX compiler: /usr/bin/arm-linux-gnueabihf-g++
    -- Check for working CXX compiler: /usr/bin/arm-linux-gnueabihf-g++ -- works
    
    

    The build succeeds:

    Linking C static library libNE10.a
    [ 94%] Built target NE10
    Scanning dependencies of target NE10_test_static
    [ 95%] Building C object samples/CMakeFiles/NE10_test_static.dir/NE10_sample_intro.c.o
    [ 96%] Building C object samples/CMakeFiles/NE10_test_static.dir/NE10_sample_matrix_multiply.c.o
    [ 97%] Building C object samples/CMakeFiles/NE10_test_static.dir/NE10_sample_complex_fft.c.o
    [ 98%] Building C object samples/CMakeFiles/NE10_test_static.dir/NE10_sample_fir.c.o
    [100%] Building C object samples/CMakeFiles/NE10_test_static.dir/NE10_samples.c.o
    Linking CXX executable NE10_test_static
    [100%] Built target NE10_test_static
    
    

    The Ne10 authors also note:

    Additionally, for systems without hardware floating point support, the appropriate compilation options should be added to the CMAKE_C_FLAGS and CMAKE_ASM_FLAGS variables in the root CMakeLists.txt file. For example, -mfloat-abi=softfp -mfpu=neon.

    Running /build/samples/NE10_test_static fails with this error:

    bash: ./NE10_test_static: cannot execute binary file: Exec format error
    

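    My first guess was that a 32-bit program cannot run on a 64-bit system. The more likely cause is an architecture mismatch: the binary was cross-compiled for ARM, while the build machine is an x86 host (its cc rejected the ARM-only flags earlier), so the program has to be run on an ARM device or under an emulator.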

    64-bit build

    Building for 64-bit (changing armv7 to aarch64) runs into a problem:

    In file included from /home/XXX/Ne10-master/modules/imgproc/NE10_resize.neon.c:28:0:
    /home/XXX/Ne10-master/modules/imgproc/NE10_resize.neon.c: In function ‘ne10_img_vresize_linear_neon’:
    /home/XXX/Ne10-master/modules/imgproc/NE10_resize.neon.c:174:19: error: incompatible types when initializing type ‘int32x4_t’ using type ‘int32x2_t’
             qT_0123 = vmlaq_lane_s32 (qT_0123, qS1_0123, dBeta, 1);
    
    

    I have not found the cause yet.

    ...for Android

    Build commands

    cd $NE10_PATH
    mkdir build && cd build
    export ANDROID_NDK=/absolute/path/of/android-ndk  # Change to your local ndk path
    export NE10_ANDROID_TARGET_ARCH=armv7  # Can also be "aarch64"
    cmake -DCMAKE_TOOLCHAIN_FILE=../android/android_config.cmake ..
    make
    

    The toolchain is found:

    -- Found assembler: /home/XXX/android-ndk-r13b//toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-as
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - failed
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - failed
    -- Target architecture: armv7
    -- Building type: RELEASE
    -- Loaded toolchain:
        /home/XXX/android-ndk-r13b/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-gcc
        /home/XXX/android-ndk-r13b/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-g++
        /home/XXX/android-ndk-r13b/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-as
    
    

    The build succeeds:

    Linking C static library libNE10.a
    Scanning dependencies of target NE10_test_static
    Linking CXX executable NE10_test_static
    Scanning dependencies of target NE10_test_demo
    Linking CXX shared library libNE10_test_demo.so
    [100%] Built target NE10_test_demo
    
    

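    Results on Android

    Each operation was repeated 100,000 times. A proper analysis will have to wait until I understand the library in more depth.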

    # Introduction
    ne10_addc_float: 0.610000
    ne10_addc_float_c: 1.863000
    ne10_addc_float_neon: 0.652000
    
    # Matrix Multiply
    ne10_mulmat_3x3f: 4.211000
    ne10_mulmat_3x3f_c: 7.352000
    ne10_mulmat_3x3f_neon: 4.246000
    
  • Original post: https://www.cnblogs.com/luyb/p/6492283.html