  • Sqoop deployment

    Download the installation package

    sqoop-1.99.3-bin-hadoop200.tar.gz

    Extract the archive

    tar zxvf sqoop-1.99.3-bin-hadoop200.tar.gz

    Create a symlink named sqoop

    ln -s sqoop-1.99.3-bin-hadoop200 sqoop

    Edit the Sqoop configuration

    cd sqoop

    vi server/conf/catalina.properties

    Make the following change:
    Find the common.loader line and replace /usr/lib/hadoop/lib/*.jar with your own Hadoop jar directories, for example (the complete line is sketched after this list):
    /home/hadoop/hadoop/share/hadoop/yarn/lib/*.jar,
    /home/hadoop/hadoop/share/hadoop/yarn/*.jar,
    /home/hadoop/hadoop/share/hadoop/hdfs/*.jar,
    /home/hadoop/hadoop/share/hadoop/hdfs/lib/*.jar,
    /home/hadoop/hadoop/share/hadoop/mapreduce/*.jar,
    /home/hadoop/hadoop/share/hadoop/mapreduce/lib/*.jar,
    /home/hadoop/hadoop/share/hadoop/common/lib/*.jar,
    /home/hadoop/hadoop/share/hadoop/common/*.jar
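
    Put together, the edited common.loader entry might look like the following (a sketch assuming Hadoop is installed under /home/hadoop/hadoop; keep the ${catalina.base}/lib and ${catalina.home}/lib entries that are already on the line):

    common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,/home/hadoop/hadoop/share/hadoop/common/*.jar,/home/hadoop/hadoop/share/hadoop/common/lib/*.jar,/home/hadoop/hadoop/share/hadoop/hdfs/*.jar,/home/hadoop/hadoop/share/hadoop/hdfs/lib/*.jar,/home/hadoop/hadoop/share/hadoop/mapreduce/*.jar,/home/hadoop/hadoop/share/hadoop/mapreduce/lib/*.jar,/home/hadoop/hadoop/share/hadoop/yarn/*.jar,/home/hadoop/hadoop/share/hadoop/yarn/lib/*.jar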

     

    vi server/conf/sqoop.properties
    Find the mapreduce.configuration.directory line and set its value to your Hadoop configuration directory,
    e.g. /home/hadoop/hadoop/etc/hadoop/
    Then replace every @LOGDIR@ and @BASEDIR@ placeholder, for example with these vi commands:
    :0,$ s/@LOGDIR@/logs/g
    :0,$ s/@BASEDIR@/base/g
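
    After these edits, the configuration-directory line should read roughly as below (a sketch; the exact key prefix may differ slightly, the part to look for is mapreduce.configuration.directory), and the @LOGDIR@/@BASEDIR@ placeholders elsewhere in the file become relative paths such as logs/... and base/...:

    org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/home/hadoop/hadoop/etc/hadoop/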

     

    Then copy your database's JDBC driver jar into the sqoop/lib directory, creating the directory if it does not exist (a sketch follows below).
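
    A minimal sketch for MySQL, run from the sqoop directory (the connector jar name and path are only examples; use whichever driver matches your database):

    mkdir -p lib
    cp /path/to/mysql-connector-java-5.1.30-bin.jar lib/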

    Set environment variables

    vi /etc/profile

    Add the following lines:

    export SQOOP_HOME=/home/hadoop/sqoop

    export PATH=$PATH:$SQOOP_HOME/bin

    export CATALINA_BASE=$SQOOP_HOME/server

    export LOGDIR=$SQOOP_HOME/logs/

    Apply the changes

    source /etc/profile
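
    A quick check that the variables took effect:

    echo $SQOOP_HOME
    echo $CATALINA_BASE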

    Start the server

    ./bin/sqoop.sh server start

    Test with the client

    bin/sqoop.sh client
    By default the Sqoop server listens on ports 12000 and 12001.
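
    To verify the server is up, you can check that those ports are listening (the netstat flags assume a typical Linux net-tools install):

    netstat -tlnp | grep -E '12000|12001'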

    Stop the server

    ./bin/sqoop.sh server stop

    Configure client to use your Sqoop server:

    sqoop:000> set server --host your.host.com --port 12000 --webapp sqoop
    Show the version: show version --all
    Show connectors: show connector --all
    Create a connection: create connection --cid 1
    Creating connection for connector with id 1
    Please fill following values to create new connection object
    Name: First connection
    
    Connection configuration
    JDBC Driver Class: com.mysql.jdbc.Driver
    JDBC Connection String: jdbc:mysql://mysql.server/database
    Username: sqoop
    Password: *****
    JDBC Connection Properties:
    There are currently 0 values in the map:
    entry#
    
    Security related configuration options
    Max connections: 0
    New connection was successfully created with validation status FINE and persistent id 1
    Show connections: show connection
    Create a job: create job --xid 1 --type import
    sqoop:000> create job --xid 1 --type import
    Creating job for connection with id 1
    Please fill following values to create new job object
    Name: First job
    
    Database configuration
    Table name: users
    Table SQL statement:
    Table column names:
    Partition column name:
    Boundary query:
    
    Output configuration
    Storage type:
      0 : HDFS
    Choose: 0
    Output directory: /user/jarcec/users
    New job was successfully created with validation status FINE and persistent id 1

     Throttling resources
        Extractors: 20
        Loaders: 10
    Note: during job creation you are prompted for Extractors and Loaders, which correspond to the number of map and reduce tasks respectively.
    Start a job: start job --jid 1
    Start a job synchronously (wait for it to finish): start job --jid 1 -s
    Show job status: status job --jid 1
    Show all jobs: show job -a
    Stop a job: stop job --jid 1
    Clone a connection: clone connection --xid 1
    Clone a job: clone job --jid 1
     
    Running wordcount failed with: Application application_1396260476774_0001 failed 2 times due to AM Container for appattempt_1396260476774_0001_000002 exited with exitCode: 1 due to: Exception from container-launch
    Check the container log:
    hadoop/logs/userlogs/application_1386683368281_0001/container_1386683368281_0001_01_000001/stderr
     
    After fixing the YARN configuration, wordcount ran normally but Sqoop still reported "Exception from container-launch"; restarting the Sqoop server resolved it.
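
    Restart it with the same scripts used above:

    ./bin/sqoop.sh server stop
    ./bin/sqoop.sh server start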
     
    Exporting data raised this exception:
    is running beyond physical memory limits. Current usage: 1.1 GB of 1 GB physical memory used; 1.6 GB of 6 GB virtual memory used. Killing container.
    Edit mapred-site.xml:
    <property>
      <name>mapred.map.child.java.opts</name>
      <value>-Xmx8000m</value>
    </property>
    and yarn-site.xml:
    <property>
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>8</value>
    </property>

    <property>
      <name>yarn.app.mapreduce.am.resource.mb</name>
      <value>2046</value>
    </property>
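
    For context, yarn.nodemanager.vmem-pmem-ratio caps how much virtual memory a container may use per unit of physical memory; with a ratio of 8, a container granted 1 GB of physical memory may use up to 8 GB of virtual memory, comfortably above the 1.6 GB reported in the error.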
     
    When importing data with Sqoop, once the data volume grows the map/reduce tasks may fail with a java heap space error. There are two ways to deal with this:
    1. Increase the JVM heap size of each child task.
    Edit mapred-site.xml and add the following properties:
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx8000m</value>
    </property>
    <property>
      <name>mapred.reduce.child.java.opts</name>
      <value>-Xmx8000m</value>
    </property>
    <property>
      <name>mapred.map.child.java.opts</name>
      <value>-Xmx8000m</value>
    </property>
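
    These are the old (pre-YARN) property names; Hadoop 2 still accepts them, but the newer equivalents are mapreduce.map.java.opts and mapreduce.reduce.java.opts, for example:

    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx8000m</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx8000m</value>
    </property>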
     
    2. Increase the number of map tasks by raising the Extractors (and Loaders) values in the Sqoop job, as sketched below.
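
    For an existing job, the throttling values can be changed with the client's update command (a sketch; the prompts are the same ones shown during job creation):

    sqoop:000> update job --jid 1
    ...
    Throttling resources
    Extractors: 20
    Loaders: 10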
     