Sqoop Installation and Usage

    I. Cluster environment

    Hostname   IP              Hadoop version   Role / services        OS
    node1      192.168.1.151   0.20.2           namenode, hive+sqoop   RHEL 5.4 x86
    node2      192.168.1.152   0.20.2           datanode, mysql        RHEL 5.4 x86
    node3      192.168.1.153   0.20.2           datanode               RHEL 5.4 x86
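
    All three nodes need to resolve each other by hostname. The original post does not show the name resolution setup; a minimal sketch of the /etc/hosts entries implied by the table above (the exact file contents are an assumption):

    [root@node1 ~]# cat >> /etc/hosts <<EOF
    192.168.1.151 node1
    192.168.1.152 node2
    192.168.1.153 node3
    EOF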

    II. Installing Sqoop

    1. Download the packages and extract them (see the extraction sketch after the listing below).

    The packages are sqoop-1.2.0-CDH3B4.tar.gz, hadoop-0.20.2-CDH3B4.tar.gz, and the MySQL JDBC driver mysql-connector-java-5.1.10-bin.jar.

    [root@node1 ~]# ll
    drwxr-xr-x 15 root  root      4096 Feb 22  2011 hadoop-0.20.2-CDH3B4
    -rw-r--r--  1 root  root    724225 Sep 15 06:46 mysql-connector-java-5.1.10-bin.jar
    drwxr-xr-x 11 root  root      4096 Feb 22  2011 sqoop-1.2.0-CDH3B4
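
    The extraction step itself is not shown in the listing above; a minimal sketch, assuming the tarballs were downloaded to /root:

    [root@node1 ~]# tar -zxvf sqoop-1.2.0-CDH3B4.tar.gz
    [root@node1 ~]# tar -zxvf hadoop-0.20.2-CDH3B4.tar.gz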

    2. Copy the MySQL JDBC driver and hadoop-core-0.20.2-CDH3B4.jar (from hadoop-0.20.2-CDH3B4) into sqoop-1.2.0-CDH3B4/lib, change the owner to hadoop, and move the Sqoop directory to /home/hadoop.

    [root@node1 ~]# cp mysql-connector-java-5.1.10-bin.jar sqoop-1.2.0-CDH3B4/lib
    [root@node1 ~]# cp hadoop-0.20.2-CDH3B4/hadoop-core-0.20.2-CDH3B4.jar sqoop-1.2.0-CDH3B4/lib
    [root@node1 ~]# chown -R hadoop:hadoop sqoop-1.2.0-CDH3B4
    [root@node1 ~]# mv sqoop-1.2.0-CDH3B4 /home/hadoop
    [root@node1 ~]# ll /home/hadoop
    total 35748
    -rw-rw-r--  1 hadoop hadoop      343 Sep 15 05:13 derby.log
    drwxr-xr-x 13 hadoop hadoop     4096 Sep 14 16:16 hadoop-0.20.2
    drwxr-xr-x  9 hadoop hadoop     4096 Sep 14 20:21 hive-0.10.0
    -rw-r--r--  1 hadoop hadoop 36524032 Sep 14 20:20 hive-0.10.0.tar.gz
    drwxr-xr-x  8 hadoop hadoop     4096 Sep 25  2012 jdk1.7
    drwxr-xr-x 12 hadoop hadoop     4096 Sep 15 00:25 mahout-distribution-0.7
    drwxrwxr-x  5 hadoop hadoop     4096 Sep 15 05:13 metastore_db
    -rw-rw-r--  1 hadoop hadoop      406 Sep 14 16:02 scp.sh
    drwxr-xr-x 11 hadoop hadoop     4096 Feb 22  2011 sqoop-1.2.0-CDH3B4
    drwxrwxr-x  3 hadoop hadoop     4096 Sep 14 16:17 temp
    drwxrwxr-x  3 hadoop hadoop     4096 Sep 14 15:59 user
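
    To confirm both jars actually landed in Sqoop's lib directory, a quick check (a sketch, not part of the original steps):

    [root@node1 ~]# ls /home/hadoop/sqoop-1.2.0-CDH3B4/lib | grep -E 'mysql-connector|hadoop-core'
    # should list hadoop-core-0.20.2-CDH3B4.jar and mysql-connector-java-5.1.10-bin.jar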

    3. Edit configure-sqoop and comment out the checks for HBase and ZooKeeper (neither is installed on this cluster).

    [root@node1 bin]# pwd
    /home/hadoop/sqoop-1.2.0-CDH3B4/bin
    [root@node1 bin]# vi configure-sqoop 
    
    #!/bin/bash
    #
    # Licensed to Cloudera, Inc. under one or more
    # contributor license agreements.  See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    .
    .
    .
    # Check: If we can't find our dependencies, give up here.
    if [ ! -d "${HADOOP_HOME}" ]; then
      echo "Error: $HADOOP_HOME does not exist!"
      echo 'Please set $HADOOP_HOME to the root of your Hadoop installation.'
      exit 1
    fi
    #if [ ! -d "${HBASE_HOME}" ]; then
    #  echo "Error: $HBASE_HOME does not exist!"
    #  echo 'Please set $HBASE_HOME to the root of your HBase installation.'
    #  exit 1
    #fi
    #if [ ! -d "${ZOOKEEPER_HOME}" ]; then
    #  echo "Error: $ZOOKEEPER_HOME does not exist!"
    #  echo 'Please set $ZOOKEEPER_HOME to the root of your ZooKeeper installation.'
    #  exit 1
    #fi
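
    After saving the change, Sqoop should no longer abort on the missing HBase and ZooKeeper directories; a quick sanity check (a sketch, assuming HADOOP_HOME is already exported as in step 4 below):

    [hadoop@node1 ~]$ cd /home/hadoop/sqoop-1.2.0-CDH3B4/bin
    [hadoop@node1 bin]$ ./sqoop help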

    4. Edit /etc/profile and .bash_profile to add HADOOP_HOME and adjust PATH.

    [hadoop@node1 ~]$ vi .bash_profile 
    
    # .bash_profile
    
    # Get the aliases and functions
    if [ -f ~/.bashrc ]; then
            . ~/.bashrc
    fi
    
    # User specific environment and startup programs
    
    HADOOP_HOME=/home/hadoop/hadoop-0.20.2
    PATH=$HADOOP_HOME/bin:$PATH:$HOME/bin
    export HIVE_HOME=/home/hadoop/hive-0.10.0
    export MAHOUT_HOME=/home/hadoop/mahout-distribution-0.7
    export PATH HADOOP_HOME
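
    The new variables only take effect in a fresh login shell; a minimal sketch of applying and verifying them in the current session (expected output shown, given the profile above):

    [hadoop@node1 ~]$ source ~/.bash_profile
    [hadoop@node1 ~]$ echo $HADOOP_HOME
    /home/hadoop/hadoop-0.20.2
    [hadoop@node1 ~]$ which hadoop
    /home/hadoop/hadoop-0.20.2/bin/hadoop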

    III. Testing Sqoop

    1. List the databases in MySQL:

    [hadoop@node1 bin]$ ./sqoop list-databases --connect jdbc:mysql://192.168.1.152:3306/ --username sqoop --password sqoop
    13/09/15 07:17:16 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    13/09/15 07:17:17 INFO manager.MySQLManager: Executing SQL statement: SHOW DATABASES
    information_schema
    mysql
    performance_schema
    sqoop
    test
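
    The command above assumes MySQL on node2 already has a sqoop account (password sqoop) that may connect from node1, plus a sqoop database holding the test table. A minimal sketch of creating that account (run on node2 as the MySQL root user; the grant scope and host wildcard are assumptions):

    [root@node2 ~]# mysql -u root -p <<'EOF'
    CREATE DATABASE IF NOT EXISTS sqoop;
    GRANT ALL PRIVILEGES ON sqoop.* TO 'sqoop'@'%' IDENTIFIED BY 'sqoop';
    FLUSH PRIVILEGES;
    EOF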

    2. Import a MySQL table into Hive:

    [hadoop@node1 bin]$ ./sqoop import --connect jdbc:mysql://192.168.1.152:3306/sqoop --username sqoop --password sqoop --table test --hive-import -m 1
    13/09/15 08:15:01 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    13/09/15 08:15:01 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
    13/09/15 08:15:01 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
    13/09/15 08:15:01 INFO tool.CodeGenTool: Beginning code generation
    13/09/15 08:15:01 INFO manager.MySQLManager: Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
    13/09/15 08:15:02 INFO manager.MySQLManager: Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
    13/09/15 08:15:02 INFO orm.CompilationManager: HADOOP_HOME is /home/hadoop/hadoop-0.20.2/bin/..
    13/09/15 08:15:02 INFO orm.CompilationManager: Found hadoop core jar at: /home/hadoop/hadoop-0.20.2/bin/../hadoop-0.20.2-core.jar
    13/09/15 08:15:03 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/a71936fd2bb45ea6757df22751a320e3/test.jar
    13/09/15 08:15:03 WARN manager.MySQLManager: It looks like you are importing from mysql.
    13/09/15 08:15:03 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
    13/09/15 08:15:03 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
    13/09/15 08:15:03 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
    13/09/15 08:15:03 INFO mapreduce.ImportJobBase: Beginning import of test
    13/09/15 08:15:04 INFO manager.MySQLManager: Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
    13/09/15 08:15:05 INFO mapred.JobClient: Running job: job_201309150505_0009
    13/09/15 08:15:06 INFO mapred.JobClient:  map 0% reduce 0%
    13/09/15 08:15:34 INFO mapred.JobClient:  map 100% reduce 0%
    13/09/15 08:15:36 INFO mapred.JobClient: Job complete: job_201309150505_0009
    13/09/15 08:15:36 INFO mapred.JobClient: Counters: 5
    13/09/15 08:15:36 INFO mapred.JobClient:   Job Counters 
    13/09/15 08:15:36 INFO mapred.JobClient:     Launched map tasks=1
    13/09/15 08:15:36 INFO mapred.JobClient:   FileSystemCounters
    13/09/15 08:15:36 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=583323
    13/09/15 08:15:36 INFO mapred.JobClient:   Map-Reduce Framework
    13/09/15 08:15:36 INFO mapred.JobClient:     Map input records=65536
    13/09/15 08:15:36 INFO mapred.JobClient:     Spilled Records=0
    13/09/15 08:15:36 INFO mapred.JobClient:     Map output records=65536
    13/09/15 08:15:36 INFO mapreduce.ImportJobBase: Transferred 569.6514 KB in 32.0312 seconds (17.7842 KB/sec)
    13/09/15 08:15:36 INFO mapreduce.ImportJobBase: Retrieved 65536 records.
    13/09/15 08:15:36 INFO hive.HiveImport: Removing temporary files from import process: test/_logs
    13/09/15 08:15:36 INFO hive.HiveImport: Loading uploaded data into Hive
    13/09/15 08:15:36 INFO manager.MySQLManager: Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
    13/09/15 08:15:36 INFO manager.MySQLManager: Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
    13/09/15 08:15:41 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/home/hadoop/hive-0.10.0/lib/hive-common-0.10.0.jar!/hive-log4j.properties
    13/09/15 08:15:41 INFO hive.HiveImport: Hive history file=/tmp/hadoop/hive_job_log_hadoop_201309150815_1877092059.txt
    13/09/15 08:16:10 INFO hive.HiveImport: OK
    13/09/15 08:16:10 INFO hive.HiveImport: Time taken: 28.791 seconds
    13/09/15 08:16:11 INFO hive.HiveImport: Loading data to table default.test
    13/09/15 08:16:12 INFO hive.HiveImport: Table default.test stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 583323, raw_data_size: 0]
    13/09/15 08:16:12 INFO hive.HiveImport: OK
    13/09/15 08:16:12 INFO hive.HiveImport: Time taken: 1.704 seconds
    13/09/15 08:16:12 INFO hive.HiveImport: Hive import complete.
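
    Once the import finishes, the result can be checked from the Hive side; a minimal sketch (the Sqoop log above reports 65536 retrieved records, so a clean import should return the same count):

    [hadoop@node1 ~]$ $HIVE_HOME/bin/hive -e "SELECT COUNT(*) FROM test;"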
    Original post: https://www.cnblogs.com/Richardzhu/p/3322635.html