  • Steps to install Hadoop on Ubuntu (single-node mode)

    1. Install the JDK:
    $ sudo apt-get install openjdk-6-jdk
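
    A quick way to confirm the JDK is installed and on the PATH:
    $ java -version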

    2. Configure SSH:
    Install the SSH server:
    $ sudo apt-get install openssh-server

    Generate an SSH key for the user that will run Hadoop:
    $ ssh-keygen -t rsa -P ""

    Enable key-based login to the local machine (appending rather than copying, so any existing authorized keys survive):
    $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
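
    Before moving on, it is worth confirming that key-based login actually works; the first connection will prompt you to accept the host key:
    $ ssh localhost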

    3. Install Hadoop:
    Download the Hadoop tar.gz package and extract it:
    $ tar -zxvf hadoop-2.2.0.tar.gz
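
    The environment variables in the next step assume Hadoop lives under /usr/local/hadoop, so move the extracted directory there and give your user ownership of it (directory names here follow the tarball above):
    $ sudo mv hadoop-2.2.0 /usr/local/hadoop
    $ sudo chown -R $USER:$USER /usr/local/hadoop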

    4. Configuration:
    - Add the following to ~/.bashrc:
    export HADOOP_HOME=/usr/local/hadoop
    export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64
    export PATH=$PATH:$HADOOP_HOME/bin
    Save the file; the environment variables take effect the next time you log in.
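
    To apply the variables in the current shell without logging out again, source the file and sanity-check the result:
    $ source ~/.bashrc
    $ hadoop version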

    - Configure hadoop-env.sh (in Hadoop 2.x the configuration files live under $HADOOP_HOME/etc/hadoop/):
    export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64


    - Configure core-site.xml (hadoop.tmp.dir and fs.default.name are core settings, not HDFS-specific ones, so they belong here rather than in hdfs-site.xml):
    <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
    </property>

    <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
    <description>The name of the default file system. A URI whose scheme and
    authority determine the FileSystem implementation. The uri's scheme
    determines the config property (fs.SCHEME.impl) naming the FileSystem
    implementation class. The uri's authority is used to determine the host,
    port, etc. for a filesystem.</description>
    </property>
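
    hadoop.tmp.dir points at /app/hadoop/tmp, which does not exist by default; create it and hand ownership to the user that runs Hadoop before formatting HDFS (the chown target assumes you run Hadoop as your own user):
    $ sudo mkdir -p /app/hadoop/tmp
    $ sudo chown $USER:$USER /app/hadoop/tmp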

    - Configure mapred-site.xml:
    <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
    <description>The host and port that the MapReduce job tracker runs
    at. If "local", then jobs are run in-process as a single map
    and reduce task.
    </description>
    </property>
    Note that mapred.job.tracker is a Hadoop 1.x setting; Hadoop 2.x runs MapReduce on YARN, where mapreduce.framework.name supersedes it.
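
    The Hadoop 2.2.0 tarball ships only a template for this file; if that matches your layout, create mapred-site.xml from it before editing:
    $ cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml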

    - Configure hdfs-site.xml:
    <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
    </property>

    5. Format the HDFS filesystem via the NameNode:
    $ /usr/local/hadoop/bin/hadoop namenode -format
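
    In Hadoop 2.x the hadoop namenode form is deprecated in favor of the hdfs command; either works, but format only once, as reformatting wipes all HDFS metadata:
    $ /usr/local/hadoop/bin/hdfs namenode -format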

    6. Run Hadoop:
    $ /usr/local/hadoop/sbin/start-all.sh
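
    start-all.sh still works in Hadoop 2.x but is deprecated; the equivalent split form starts HDFS and YARN separately:
    $ /usr/local/hadoop/sbin/start-dfs.sh
    $ /usr/local/hadoop/sbin/start-yarn.sh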

    7. Check that Hadoop is running:
    - Use jps to list the Hadoop daemon processes:
    $ jps
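
    On a healthy Hadoop 2.x single-node setup the output typically lists the five daemons below plus jps itself (the PIDs here are illustrative):
    12487 NameNode
    12614 DataNode
    12787 SecondaryNameNode
    12946 ResourceManager
    13058 NodeManager
    13329 Jps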

    - Use netstat to check the ports Hadoop is listening on:
    $ sudo netstat -plten | grep java
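
    With the configuration above, a java process should be listening on port 9000 (fs.default.name). The stock web UIs are another quick check, assuming their default ports were not overridden:
    NameNode UI:        http://localhost:50070
    ResourceManager UI: http://localhost:8088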

    8. Stop Hadoop:
    $ /usr/local/hadoop/sbin/stop-all.sh

  • Original article: https://www.cnblogs.com/tiantianbyconan/p/3552711.html