  • [Original] Big Data Basics: Doris (1) Building, Installing, and Starting

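    I Build

    Doris can be built in two ways: inside Docker or directly on bare metal. The Docker build is recommended because it avoids a large number of environment dependency problems.

    Building with Docker

    1 Install Docker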

    yum install docker
    systemctl start docker
    systemctl enable docker
    docker pull apachedoris/doris-dev:build-env-1.2

    2 Download the source

    wget https://mirrors.bfsu.edu.cn/apache/incubator/doris/0.13.0-incubating/apache-doris-0.13.0-incubating-src.tar.gz
    tar xvf apache-doris-0.13.0-incubating-src.tar.gz

    3 Start the container

    docker run -it -v /root/.m2:/root/.m2 -v /path/to/apache-doris-0.13.0-incubating-src:/root/apache-doris-0.13.0-incubating-src apachedoris/doris-dev:build-env-1.2

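    Two directories need to be mounted: the Maven repository directory and the Doris source directory. This way, downloaded dependencies and build results are not lost if the container dies.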
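
    As a minimal sketch of the two mounts described above, the snippet below assembles the docker run command from variables and prints it for review instead of executing it. SRC_DIR and M2_DIR are hypothetical variable names introduced here, and /path/to is the same placeholder used in the command above.

```shell
#!/bin/sh
# Hypothetical variables (not from the original post); adjust to your host.
SRC_NAME="apache-doris-0.13.0-incubating-src"
SRC_DIR="/path/to/$SRC_NAME"   # host directory holding the unpacked source
M2_DIR="/root/.m2"             # host Maven repository, reused across containers
IMAGE="apachedoris/doris-dev:build-env-1.2"

# Print the command rather than running it, so the mounts can be checked first.
build_cmd() {
    echo "docker run -it -v $M2_DIR:/root/.m2 -v $SRC_DIR:/root/$SRC_NAME $IMAGE"
}

build_cmd
```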

    4 Build Doris

    cd /root/apache-doris-0.13.0-incubating-src
    sh -x build.sh

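    The build writes its results to the output directory, which has three subdirectories: be, fe, and udf. To deploy, just copy the output directory to the other servers.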
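
    Before copying, it can help to confirm that all three subdirectories were actually produced. The following is a small sketch of such a check; check_output is a hypothetical helper name, not part of Doris.

```shell
#!/bin/sh
# check_output DIR: succeed only if DIR contains the be, fe and udf subdirectories.
check_output() {
    for sub in be fe udf; do
        if [ ! -d "$1/$sub" ]; then
            echo "missing: $1/$sub" >&2
            return 1
        fi
    done
    echo "ok: $1"
}
```

    A possible use is `check_output output && tar czf doris-output.tar.gz output`, after which the archive (the file name is only an example) can be copied to each server.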

    Error during the build

    [ERROR] Plugin net.sourceforge.czt.dev:cup-maven-plugin:1.6-cdh or one of its dependencies could not be resolved: Failed to read artifact descriptor for net.sourceforge.czt.dev:cup-maven-plugin:jar:1.6-cdh: Could not transfer artifact net.sourceforge.czt.dev:cup-maven-plugin:pom:1.6-cdh from/to spring-plugins (https://repo.spring.io/plugins-release/): Authentication failed for https://repo.spring.io/plugins-release/net/sourceforge/czt/dev/cup-maven-plugin/1.6-cdh/cup-maven-plugin-1.6-cdh.pom 401 Unauthorized -> [Help 1

    Fix: change the repositories in the Maven pom.xml as follows:

                     <!-- for java-cup -->
                     <repository>
                     <!--
                         <id>cloudera-thirdparty</id>
                         <url>https://repository.cloudera.com/content/repositories/third-party/</url>
                         -->
                         <id>cloudera-public</id>
                         <url>https://repository.cloudera.com/artifactory/public/</url>
                     </repository>
    
    
                     <!-- for cup-maven-plugin -->
                     <pluginRepository>
                     <!--
                         <id>spring-plugins</id>
                         <url>https://repo.spring.io/plugins-release/</url>
                         -->
                         <id>cloudera-public</id>
                         <url>https://repository.cloudera.com/artifactory/public/</url>
                     </pluginRepository>
    

    5 Build the broker

    cd fs_brokers/apache_hdfs_broker
    sh -x build.sh

    6 Build the Spark connector

    cd extension/spark-doris-connector
    sh -x build.sh

    Building on bare metal

    1 Prerequisites

    jdk8+
    maven

    sudo yum groupinstall 'Development Tools' && sudo yum install maven cmake byacc flex automake libtool bison binutils-devel zip unzip ncurses-devel curl git wget python2 glibc-static libstdc++-static java-1.8.0-openjdk

    Note: on CentOS 7 the default GCC is 4.8.5 and the default CMake is 2.8.12, so both are upgraded in the next two steps.
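
    To decide whether a machine needs the upgrades, the installed versions can be compared against a target. The sketch below uses GNU sort -V; version_ge is a hypothetical helper, and the targets 7.3.0 and 3.6.2 are simply the versions installed later in this post, not official minimum requirements.

```shell
#!/bin/sh
# version_ge A B: succeed if version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the CentOS 7 defaults mentioned above with the versions built below.
for pair in "4.8.5 7.3.0" "2.8.12 3.6.2"; do
    set -- $pair
    if version_ge "$1" "$2"; then
        echo "$1 >= $2: ok"
    else
        echo "$1 < $2: upgrade needed"
    fi
done
```

    On a real host the first argument would come from the tool itself, for example `version_ge "$(gcc -dumpversion)" 7.3.0`.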

    2 Upgrade GCC

    yum install gcc-c++
    wget http://ftp.tsukuba.wide.ad.jp/software/gcc/releases/gcc-7.3.0/gcc-7.3.0.tar.gz
    tar zxvf gcc-7.3.0.tar.gz
    cd gcc-7.3.0
    yum install lbzip2
    ./contrib/download_prerequisites
    mkdir build
    cd build/
    ../configure --enable-checking=release --enable-languages=c,c++ --disable-multilib
    make
    make install

    3 Upgrade CMake

    wget https://cmake.org/files/v3.6/cmake-3.6.2.tar.gz
    tar xvf cmake-3.6.2.tar.gz && cd cmake-3.6.2/
    ./bootstrap
    gmake
    gmake install
    mv /usr/bin/cmake /usr/bin/cmake.bak
    ln -s /usr/local/bin/cmake /usr/bin/

    4 Build

    sh -x build.sh

    Troubleshooting

    Error 1

    Downloading libevent-20180622-24236aed01798303745470e6c498bf606e88724a.zip from https://doris-incubating-repo.bj.bcebos.com/thirdparty/libevent-20180622-24236aed01798303745470e6c498bf606e88724a.zip to /usr/local/app/doris/thirdparty/src
    --2021-01-11 09:32:59-- https://doris-incubating-repo.bj.bcebos.com/thirdparty/libevent-20180622-24236aed01798303745470e6c498bf606e88724a.zip
    Resolving doris-incubating-repo.bj.bcebos.com (doris-incubating-repo.bj.bcebos.com)... 2409:8c00:6c21:10ad:0:ff:b00e:67d, 220.181.33.44, 220.181.33.43
    Connecting to doris-incubating-repo.bj.bcebos.com (doris-incubating-repo.bj.bcebos.com)|2409:8c00:6c21:10ad:0:ff:b00e:67d|:443... connected.
    HTTP request sent, awaiting response... 404 Not Found
    2021-01-11 09:32:59 ERROR 404: Not Found.

    The download URL has changed; see the latest version of the file on GitHub:
    https://github.com/apache/incubator-doris/blob/master/thirdparty/vars.sh

    Fix: change the URL in thirdparty/vars.sh to

    LIBEVENT_DOWNLOAD="https://doris-thirdparty-repo.bj.bcebos.com/thirdparty/libevent-20180622-24236aed01798303745470e6c498bf606e88724a.zip"
    

    Reference: https://github.com/apache/incubator-doris/issues/5519
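
    The same change can be scripted. This is a sketch that only touches the LIBEVENT_DOWNLOAD line, since that is the only URL this post confirms as moved; fix_libevent_url is a hypothetical helper name.

```shell
#!/bin/sh
# fix_libevent_url FILE: point LIBEVENT_DOWNLOAD at the new host, in place.
# A copy of the original is kept as FILE.bak.
fix_libevent_url() {
    sed -i.bak \
        '/^LIBEVENT_DOWNLOAD=/s#doris-incubating-repo\.bj\.bcebos\.com#doris-thirdparty-repo.bj.bcebos.com#' \
        "$1"
}
```

    It would be called as `fix_libevent_url thirdparty/vars.sh` from the source root.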

    Error 2

    Downloading the third-party DataTables package fails with 403. Opening the URL directly
    https://datatables.net/download/builder?bs-3.3.7/jq-3.3.1/dt-1.10.22
    shows

    Error: Only libraries of the current release version can be used by the. package builder. The library DataTables's current release version is 1.10.24. Please reload the download builder page to have it use the latest libraries.

    Fix: edit thirdparty/vars.sh, change the version to the latest release, 1.10.24, and update the MD5 checksum to match:

    # datatables, bootstrap 3 and jQuery 3
    #DATATABLES_DOWNLOAD="https://datatables.net/download/builder?bs-3.3.7/jq-3.3.1/dt-1.10.22"
    #DATATABLES_NAME="DataTables.zip"
    #DATATABLES_SOURCE="DataTables-1.10.22"
    #DATATABLES_MD5SUM="62558846fc6a6db1428e7816a2a351f7"
    DATATABLES_DOWNLOAD="https://datatables.net/download/builder?bs-3.3.7/jq-3.3.1/dt-1.10.24"
    DATATABLES_NAME="DataTables.zip"
    DATATABLES_SOURCE="DataTables-1.10.24"
    DATATABLES_MD5SUM="22404292d02cf3c5f4cd9f5a02d4b42c"
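
    The new checksum has to be computed from the archive that is actually downloaded. A minimal sketch of checking a file against an expected MD5 value follows; md5_matches is a hypothetical helper, and the thirdparty/src/DataTables.zip location mentioned afterwards is an assumption based on the download log in Error 1.

```shell
#!/bin/sh
# md5_matches FILE EXPECTED: succeed if FILE's MD5 digest equals EXPECTED.
md5_matches() {
    actual=$(md5sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}
```

    For example, `md5sum thirdparty/src/DataTables.zip` prints the value to put in DATATABLES_MD5SUM, and `md5_matches thirdparty/src/DataTables.zip "$DATATABLES_MD5SUM"` confirms that the two agree.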
    
    Error 3

    checking how to run the C preprocessor... /usr/lib64/ccache/../bin/cpp
    configure: error: in `/data/app/apache-doris-0.13.0-incubating-src/thirdparty/src/unixODBC-2.3.7':
    configure: error: C preprocessor "/usr/lib64/ccache/../bin/cpp" fails sanity check

    Fix:

    ln -s /usr/local/bin/cpp /usr/lib64/bin/cpp
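
    The link target follows from the path in the error message: /usr/lib64/ccache/../bin/cpp normalises to /usr/lib64/bin/cpp. A quick way to see this, assuming GNU coreutils realpath is available:

```shell
#!/bin/sh
# Normalise the path from the configure error purely lexically
# (-m: components need not exist, -s: do not follow symlinks).
realpath -m -s /usr/lib64/ccache/../bin/cpp
```

    If /usr/lib64/bin does not exist yet, it presumably has to be created first (for example with mkdir -p) before the ln -s above can succeed; the original post does not mention this step.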

    Error 4

    ./comp_err: error while loading shared libraries: libatomic.so.1: cannot open shared object file: No such file or directory

    Fix:

    yum -y install libatomic

    Error 5

    cp: cannot stat ‘./zstd_ep-install/lib/libzstd.a’: No such file or directory

    Fix:

    cd thirdparty/src/zstd-1.3.7/tests
    make zstd-staticLib
    mkdir -p /path/to/apache-doris-0.13.0-incubating-src/thirdparty/src/arrow-apache-arrow-0.15.1/cpp/release/zstd_ep-install/lib64
    cp ../lib/libzstd.a /path/to/apache-doris-0.13.0-incubating-src/thirdparty/src/arrow-apache-arrow-0.15.1/cpp/release/zstd_ep-install/lib64

    II Startup

    1 Start the FE

    cd output/fe
    mkdir doris-meta
    bin/start_fe.sh --daemon

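    Logs are written to the log directory.

    Notes:

    • The default port 8030 may conflict with the YARN ResourceManager.
    • After startup, check that the FE has bound to the correct IP. If it binds to the wrong one (for example, it picks up the Docker interface's IP after Docker is installed), the BE cannot connect to the FE. In that case, set priority_networks in fe.conf to the correct subnet: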

    priority_networks=192.168.1.0/24
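
    To check whether a given address would fall inside the configured subnet, the CIDR match can be reproduced with shell arithmetic. This is an illustrative sketch only, not how Doris itself evaluates priority_networks; ip_to_int and ip_in_cidr are hypothetical helpers, and 172.17.0.1 stands in for a typical Docker bridge address.

```shell
#!/bin/sh
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# ip_in_cidr IP NET/BITS: succeed if IP lies inside the CIDR block.
ip_in_cidr() {
    bits=${2#*/}
    # Netmask: the top BITS bits set, the remaining bits cleared.
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "${2%/*}") & mask )) ]
}

for ip in 192.168.1.25 172.17.0.1; do
    if ip_in_cidr "$ip" 192.168.1.0/24; then
        echo "$ip: inside 192.168.1.0/24"
    else
        echo "$ip: outside 192.168.1.0/24"
    fi
done
```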

    2 Start the BE

    cd output/be
    mkdir /path/to/storage
    vim conf/be.conf
    bin/start_be.sh --daemon
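
    The post does not say what is edited in conf/be.conf. A likely candidate is storage_root_path, pointing the BE at the storage directory created above; treat that as an assumption. The sketch below sets a key non-interactively instead of using vim; set_conf is a hypothetical helper, and it assumes the value contains no # character because # is used as the sed delimiter.

```shell
#!/bin/sh
# set_conf FILE KEY VALUE: replace an existing uncommented "KEY = ..." line,
# or append a new one if the key is absent or only present as a comment.
set_conf() {
    if grep -q "^[[:space:]]*$2[[:space:]]*=" "$1"; then
        sed -i.bak "s#^[[:space:]]*$2[[:space:]]*=.*#$2 = $3#" "$1"
    else
        echo "$2 = $3" >> "$1"
    fi
}
```

    For instance, `set_conf conf/be.conf storage_root_path /path/to/storage` would be run before bin/start_be.sh.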

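    Logs are written to the log directory.

    Notes:

    • The default port 8040 may conflict with the YARN NodeManager.
    • If the BE fails to start, there are usually two likely causes: a port is already in use, or the open-file limit is too low. Check the log to tell which. For example, the error
      Doris Be http service did not start correctly. exiting.
      means the web service failed to start because its port is occupied.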
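
    A port conflict can be confirmed before starting the BE. The sketch below filters the output of ss -ltn for a given local port; port_listening is a hypothetical helper, and it assumes the usual ss column layout in which the fourth column is the local address and port.

```shell
#!/bin/sh
# port_listening PORT: read `ss -ltn` output on stdin and succeed if some
# socket is listening on local port PORT. The header line is skipped.
port_listening() {
    awk -v p="$1" '
        NR > 1 { n = split($4, a, ":"); if (a[n] == p) found = 1 }
        END    { exit !found }
    '
}
```

    Typical use would be `ss -ltn | port_listening 8040 && echo "port 8040 is already taken"`.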

    Raising the limit

    Temporary change

    ulimit -n 65535

    Permanent change

    vim /etc/security/limits.conf
    *               hard    nofile             65535
    *               soft    nofile             65535
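
    Whether the change took effect can be checked from the shell that will start the BE. This is a small sketch; nofile_ok is a hypothetical helper, and 65535 is the value used above.

```shell
#!/bin/sh
# nofile_ok REQUIRED: succeed if the current soft limit on open files
# is "unlimited" or at least REQUIRED.
nofile_ok() {
    cur=$(ulimit -n)
    [ "$cur" = "unlimited" ] || [ "$cur" -ge "$1" ]
}

if nofile_ok 65535; then
    echo "open-file limit is sufficient"
else
    echo "open-file limit is $(ulimit -n), below 65535"
fi
```

    Changes to limits.conf normally apply only to new login sessions, so the check should be run after logging in again.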
    

    3 Start the Broker

    cd fs_brokers/apache_hdfs_broker/output/apache_hdfs_broker
    bin/start_broker.sh --daemon

    III Usage

    1 FE web access

    http://fe_server:8030

    Notes:

    • The port is http_port.
    • The default account is root, with an empty password.

    2 FE command-line access

    mysql -P9030 -uroot
    show proc '/frontends';

    Notes:

    • The port is query_port.

    3 Add or remove an FE

    alter system add follower '$host:$port';
    show proc '/frontends';

    4 Add or remove a BE

    alter system add backend '$host:$port';
    alter system dropp backend '$host:$port';
    show proc '/backends';

    Notes:

    • The port defaults to 9050, i.e. heartbeat_service_port.
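
    With several BE nodes, the statements can be generated rather than typed by hand. The sketch below only prints the SQL; the host names be1, be2 and be3 are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical host names; replace with your BE nodes.
BE_HOSTS="be1 be2 be3"
HEARTBEAT_PORT=9050   # heartbeat_service_port default noted above

# Print one ADD BACKEND statement per host.
gen_add_backends() {
    for h in $BE_HOSTS; do
        printf "alter system add backend '%s:%s';\n" "$h" "$HEARTBEAT_PORT"
    done
}

gen_add_backends
```

    The output could then be piped into the MySQL client shown earlier, for example `gen_add_backends | mysql -h fe_server -P9030 -uroot`, followed by show proc '/backends'; to verify.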

    5 Add or remove a Broker

    alter system add broker $broker_name '$host:$port';
    show proc '/brokers';

    Notes:

    • The port defaults to 8000, i.e. the broker's broker_ipc_port.

    IV Data import

    Importing Hive data

    create database test;
    CREATE TABLE test.test_user_doris
    (
      `id` varchar(128) , 
      `name` varchar(128) , 
      `country` varchar(128) , 
      `province` varchar(128) , 
      `city` varchar(128) ,  
      `order_count` int SUM
    )
    AGGREGATE KEY(id, name, country, province, city)
    DISTRIBUTED BY HASH(id) BUCKETS 10
    PROPERTIES("replication_num" = "3");
    
    LOAD LABEL test.load_test_user_doris
    (
        DATA INFILE("hdfs://nameservice1/user/hive/warehouse/test.db/test_user/*")
        INTO TABLE `test_user_doris`
        FORMAT as "parquet"
    )
    WITH BROKER $broker_name 
    (
        "dfs.nameservices" = "nameservice1",
        "dfs.ha.namenodes.nameservice1" = "namenode1,namenode2",
        "dfs.namenode.rpc-address.nameservice1.namenode1" = "nn1:8020",
        "dfs.namenode.rpc-address.nameservice1.namenode2" = "nn2:8020"
    )
    PROPERTIES
    (
        "timeout"="36000",
        "max_filter_ratio"="0.1"
    );
    
    show load;
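
    The max_filter_ratio of 0.1 above is the tolerated share of rows that may be filtered out as malformed. Assuming the ratio is computed as filtered / (filtered + loaded), which is an assumption about Doris's behaviour and not something stated in this post, the threshold can be illustrated like this; filter_ratio_ok is a hypothetical helper.

```shell
#!/bin/sh
# filter_ratio_ok FILTERED LOADED MAX: succeed if
# FILTERED / (FILTERED + LOADED) <= MAX. An empty load counts as ok.
filter_ratio_ok() {
    awk -v bad="$1" -v good="$2" -v max="$3" 'BEGIN {
        total = bad + good
        if (total == 0) exit 0
        exit !(bad / total <= max)
    }'
}

# 5 filtered rows out of 100 stay under 0.1; 20 out of 100 do not.
filter_ratio_ok 5 95 0.1  && echo "5/100 filtered: within limit"
filter_ratio_ok 20 80 0.1 || echo "20/100 filtered: over limit"
```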
    

    V Configuration parameters

    show variables;

    Same as MySQL.

  • Original article: https://www.cnblogs.com/barneywill/p/14808288.html