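  • Building and Installing Ne10

    Introduction

    NEON, also known as "ARM Advanced SIMD", is the advanced single-instruction, multiple-data (SIMD) extension that ARM has provided since ARMv7. It is a hybrid 64/128-bit SIMD architecture. There is relatively little material about NEON online, which makes it unfriendly to beginners. After some digging I finally found Ne10 on GitHub: a ready-made NEON library that implements a number of basic operations in assembly internally.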
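    On an ARM Linux device, one quick way to see whether the CPU advertises NEON is to look at the feature flags in /proc/cpuinfo. The sketch below assumes the usual Linux flag names ("neon" on 32-bit ARM, "asimd" on AArch64); `has_simd` is a hypothetical helper name, not something provided by Ne10.

```shell
#!/bin/sh
# has_simd is a hypothetical helper. It scans a cpuinfo-style file for the
# "neon" (32-bit ARM) or "asimd" (AArch64) feature flag.
# Defaults to /proc/cpuinfo when no file is given.
has_simd() {
    grep -Eqw 'neon|asimd' "${1:-/proc/cpuinfo}" 2>/dev/null
}

if has_simd; then
    echo "NEON/Advanced SIMD reported by this CPU"
else
    echo "no NEON flag found (expected on an x86 build host)"
fi
```

    Run on the x86 build machine this reports that no flag was found, which is one reason the library has to be cross-compiled there.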

    Git repository

    Installation guide

    Prerequisites

    First install the arm-linux cross compilers:

    sudo apt-get install gcc-arm-linux-gnueabihf

    sudo apt-get install g++-arm-linux-gnueabihf

    Otherwise the build fails with errors like these:

    cc: error: unrecognized command line option ‘-mthumb-interwork’
    cc: error: unrecognized command line option ‘-mthumb’
    cc: error: unrecognized command line option ‘-mfpu=vfp3’
    
    

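    As a beginner I had no idea what was going on and was stuck for a long time, until I looked at GNUlinux_config.cmake in the repository root and it finally clicked.
    For an introduction to the ABI, see this blog post:
    Cross compilers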
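    Before running CMake it can help to confirm that the cross toolchain is actually on the PATH. A minimal sketch, assuming the binary names that appear in the CMake output for the cross build (arm-linux-gnueabihf-gcc and arm-linux-gnueabihf-g++); `check_tool` is a hypothetical helper, not part of Ne10.

```shell
#!/bin/sh
# check_tool is a hypothetical helper, not part of Ne10.
# Prints where a tool was found, or reports it missing and returns 1.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1 -> $(command -v "$1")"
    else
        echo "missing: $1"
        return 1
    fi
}

# The compiler binaries installed by the two apt packages above.
for tool in arm-linux-gnueabihf-gcc arm-linux-gnueabihf-g++; do
    check_tool "$tool" || echo "hint: install the matching apt package"
done
```

    If either tool is reported missing, the host cc gets used instead and rejects the ARM-only flags, which matches the errors shown above.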

    Building

    Native compilation on *nix platforms

    Build commands

    cd $NE10_PATH                       # Change directory to the location of the Ne10 source
    mkdir build && cd build             # Create the `build` directory and navigate into it
    export NE10_LINUX_TARGET_ARCH=armv7 # Set the target architecture (can also be "aarch64")
    cmake -DGNULINUX_PLATFORM=ON ..     # Run CMake to generate the build files
    make                                # Build the project
    

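    This step always failed for me: the compilers CMake found were the host gcc and g++, not arm-linux-gnueabihf-gcc and arm-linux-gnueabihf-g++. Native compilation is meant to be run on the ARM device itself, so on an x86 machine it picks up the host toolchain.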

    -- Found assembler: /usr/bin/cc
    -- Check for working C compiler: /usr/bin/cc
    -- Check for working C compiler: /usr/bin/cc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working CXX compiler: /usr/bin/c++
    -- Check for working CXX compiler: /usr/bin/c++ -- works
    
    

    ...Cross compilation on *nix platforms...

    Build commands

    cd $NE10_PATH
    mkdir build && cd build
    export NE10_LINUX_TARGET_ARCH=armv7  # Can also be "aarch64"
    cmake -DCMAKE_TOOLCHAIN_FILE=../GNUlinux_config.cmake ..
    make
    

    This step works: the right compilers are found.

    -- Found assembler: /usr/bin/arm-linux-gnueabihf-as
    -- Check for working C compiler: /usr/bin/arm-linux-gnueabihf-gcc
    -- Check for working C compiler: /usr/bin/arm-linux-gnueabihf-gcc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working CXX compiler: /usr/bin/arm-linux-gnueabihf-g++
    -- Check for working CXX compiler: /usr/bin/arm-linux-gnueabihf-g++ -- works
    
    

    The build succeeds.

    Linking C static library libNE10.a
    [ 94%] Built target NE10
    Scanning dependencies of target NE10_test_static
    [ 95%] Building C object samples/CMakeFiles/NE10_test_static.dir/NE10_sample_intro.c.o
    [ 96%] Building C object samples/CMakeFiles/NE10_test_static.dir/NE10_sample_matrix_multiply.c.o
    [ 97%] Building C object samples/CMakeFiles/NE10_test_static.dir/NE10_sample_complex_fft.c.o
    [ 98%] Building C object samples/CMakeFiles/NE10_test_static.dir/NE10_sample_fir.c.o
    [100%] Building C object samples/CMakeFiles/NE10_test_static.dir/NE10_samples.c.o
    Linking CXX executable NE10_test_static
    [100%] Built target NE10_test_static
    
    

    The authors also note:

    Additionally, for systems without hardware floating point support, the appropriate compilation options should be added to the CMAKE_C_FLAGS and CMAKE_ASM_FLAGS variables in the root CMakeLists.txt file. For example, -mfloat-abi=softfp -mfpu=neon.

    Running build/samples/NE10_test_static fails with:

    bash: ./NE10_test_static: cannot execute binary file: Exec format error
    

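    My first guess was that a 32-bit program cannot run on a 64-bit system. More likely the real cause is that the binary was built for ARM, so it cannot be executed directly on the x86 build host; it has to be copied to an ARM device or run under an emulator.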
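    An "Exec format error" usually means the kernel does not recognise the target architecture of the binary. As a rough check, the sketch below reads the machine field from the ELF header and prints it next to the host architecture. `elf_machine` is a hypothetical helper; it assumes a little-endian ELF file and only knows a handful of machine codes.

```shell
#!/bin/sh
# elf_machine is a hypothetical helper: prints the architecture recorded
# in an ELF header. It reads byte 18, the low byte of the e_machine field
# (assumes a little-endian ELF file, which covers ARM Linux and x86).
elf_machine() {
    code=$(od -An -tu1 -j18 -N1 "$1" | tr -d ' \n')
    case "$code" in
        3)   echo "x86" ;;
        40)  echo "arm" ;;
        62)  echo "x86-64" ;;
        183) echo "aarch64" ;;
        *)   echo "unknown ($code)" ;;
    esac
}

# Compare the sample binary (if it exists here) with the host CPU.
bin=./NE10_test_static
if [ -f "$bin" ]; then
    echo "binary: $(elf_machine "$bin"), host: $(uname -m)"
fi
```

    If the binary reports arm while the host reports x86_64, the program needs an ARM device or a user-mode emulator such as QEMU to run.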

    64-bit build

    Building the 64-bit version (changing armv7 to aarch64) runs into a problem:

    In file included from /home/XXX/Ne10-master/modules/imgproc/NE10_resize.neon.c:28:0:
    /home/XXX/Ne10-master/modules/imgproc/NE10_resize.neon.c: In function ‘ne10_img_vresize_linear_neon’:
    /home/XXX/Ne10-master/modules/imgproc/NE10_resize.neon.c:174:19: error: incompatible types when initializing type ‘int32x4_t’ using type ‘int32x2_t’
             qT_0123 = vmlaq_lane_s32 (qT_0123, qS1_0123, dBeta, 1);
    
    

    I have not tracked down the cause yet.

    ...for Android

    Build commands

    cd $NE10_PATH
    mkdir build && cd build
    export ANDROID_NDK=/absolute/path/of/android-ndk  # Change to your local ndk path
    export NE10_ANDROID_TARGET_ARCH=armv7  # Can also be "aarch64"
    cmake -DCMAKE_TOOLCHAIN_FILE=../android/android_config.cmake ..
    make
    

    The compilers are found:

    -- Found assembler: /home/XXX/android-ndk-r13b//toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-as
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - failed
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - failed
    -- Target architecture: armv7
    -- Building type: RELEASE
    -- Loaded toolchain:
        /home/XXX/android-ndk-r13b/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-gcc
        /home/XXX/android-ndk-r13b/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-g++
        /home/XXX/android-ndk-r13b/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-as
    
    

    The build succeeds:

    Linking C static library libNE10.a
    Scanning dependencies of target NE10_test_static
    Linking CXX executable NE10_test_static
    Scanning dependencies of target NE10_test_demo
    Linking CXX shared library libNE10_test_demo.so
    [100%] Built target NE10_test_demo
    
    

    Results on Android

    Each operation is repeated 100,000 times. A proper analysis will have to wait until I understand the library better.

    # Introduction
    ne10_addc_float: 0.610000
    ne10_addc_float_c: 1.863000
    ne10_addc_float_neon: 0.652000
    
    # Matrix Multiply
    ne10_mulmat_3x3f: 4.211000
    ne10_mulmat_3x3f_c: 7.352000
    ne10_mulmat_3x3f_neon: 4.246000
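
    To put the figures above in perspective, the ratio of the plain C time to the NEON time can be computed directly. This is only a sketch: it assumes the printed values are elapsed times (lower is better), and `speedup` is a hypothetical helper name.

```shell
#!/bin/sh
# speedup is a hypothetical helper: ratio of the C time to the NEON time,
# printed with two decimal places.
speedup() {
    awk -v c="$1" -v n="$2" 'BEGIN { printf "%.2f\n", c / n }'
}

# Values copied from the output above (_c time first, _neon time second).
echo "addc_float:  $(speedup 1.863000 0.652000)x"
echo "mulmat_3x3f: $(speedup 7.352000 4.246000)x"
```

    Under that assumption the NEON versions come out roughly 2.86 times faster for the add-constant sample and 1.73 times faster for the 3x3 matrix multiply.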
    
  • Original article: https://www.cnblogs.com/luyb/p/6492283.html