1/ Add a hadoop user:
sudo useradd -m hadoop -s /bin/bash
This creates a hadoop user that can log in, with /bin/bash as its shell.
Set the password: sudo passwd hadoop
2/ Optionally, grant the hadoop user administrator privileges:
sudo adduser hadoop sudo
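Running useradd again for an account that already exists fails, so a setup script may want to guard the call. A minimal sketch, assuming a POSIX shell and the id utility; the function names user_exists and ensure_user are illustrative and not part of any tool:

```shell
# Hypothetical helper: succeed if the named account exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Hypothetical helper: create the account only when it is missing.
# Uses the same useradd options as above and therefore needs sudo rights.
ensure_user() {
    if user_exists "$1"; then
        echo "user $1 already exists"
    else
        sudo useradd -m "$1" -s /bin/bash
    fi
}
```

Usage would be: ensure_user hadoop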
Update the package sources:
sudo apt update
3/ Install the SSH server and configure passwordless login:
sudo apt-get install openssh-server
ssh localhost    # the first login creates ~/.ssh; exit afterwards
cd ~/.ssh/
ssh-keygen -t rsa
cat ./id_rsa.pub >> ./authorized_keys
Test:
ssh localhost
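Repeating the cat ... >> step appends the same key again each time. A minimal sketch of an idempotent version, assuming a POSIX shell; add_key_once is a hypothetical name, and the chmod reflects the usual sshd expectation that authorized_keys is not writable by other users:

```shell
# Hypothetical helper: append the public key in PUB to the AUTH file
# only if that exact line is not already there, then restrict permissions.
add_key_once() {
    pub=$1     # e.g. ~/.ssh/id_rsa.pub
    auth=$2    # e.g. ~/.ssh/authorized_keys
    touch "$auth"
    # -F fixed string, -x whole line, -f read the pattern from the key file
    if ! grep -qxF -f "$pub" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    chmod 600 "$auth"
}
```

Usage would be: add_key_once ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys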
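It is generally best to download the latest stable release, that is, the file named hadoop-2.x.y.tar.gz under "stable". This is the precompiled build; the other archive, whose name contains src, is the Hadoop source code and must be compiled before it can be used.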
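Verify the downloaded file:
cat hadoop-2.7.2.tar.gz.mds | grep 'MD5'    # print the published MD5 value
head -n 6 hadoop-2.7.2.tar.gz.mds
md5sum hadoop-2.7.2.tar.gz | tr "a-z" "A-Z"    # compute the MD5 in upper case and compare it with the value above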
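Comparing two long digests by eye is error-prone. A minimal sketch of an automatic comparison, assuming the expected value is copied by hand from the checksum file; the helper names are hypothetical. Since a published digest may be printed in upper case with spaces between groups, the sketch strips whitespace and folds case before comparing:

```shell
# Hypothetical helper: drop whitespace and upper-case a hex digest.
normalize_digest() {
    printf '%s' "$1" | tr -d ' \t\n' | tr 'a-f' 'A-F'
}

# Hypothetical helper: succeed if the MD5 of FILE ($1) equals EXPECTED ($2).
md5_matches() {
    expected=$(normalize_digest "$2")
    actual=$(md5sum "$1" | cut -d ' ' -f 1)
    actual=$(normalize_digest "$actual")
    [ "$actual" = "$expected" ]
}
```

Usage would be: md5_matches hadoop-2.7.2.tar.gz "<digest copied from the .mds file>" && echo OK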
Install Hadoop into /usr/local/:
sudo tar -zxf ~/packages/hadoop-2.7.2.tar.gz -C /usr/local    # extract into /usr/local
cd /usr/local/
sudo mv ./hadoop-2.7.2/ ./hadoop    # rename the directory to hadoop
sudo chown -R hadoop:hadoop ./hadoop    # give the hadoop user ownership of the files
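The extract and rename steps can be wrapped so that the version number is not hard-coded. A minimal sketch, assuming the archive contains a single top-level directory (as the mv command above implies for hadoop-2.7.2.tar.gz); install_hadoop is a hypothetical name, and the chown step is left out:

```shell
# Hypothetical helper: unpack TARBALL ($1) under PREFIX ($2) and rename
# its top-level directory to "hadoop". Refuses to overwrite an existing copy.
install_hadoop() {
    tarball=$1
    prefix=$2
    # Name of the top-level directory inside the archive.
    top=$(tar -tzf "$tarball" | head -n 1 | cut -d / -f 1)
    [ -n "$top" ] || return 1
    if [ -e "$prefix/hadoop" ]; then
        echo "$prefix/hadoop already exists" >&2
        return 1
    fi
    tar -zxf "$tarball" -C "$prefix" && mv "$prefix/$top" "$prefix/hadoop"
}
```

Writing to /usr/local needs root, so for that prefix the function would have to run in a root shell; for a prefix the current user owns it works as is.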
Test:
cd hadoop
./bin/hadoop version
hadoop@Athena:/usr/local/hadoop$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount README.txt ~/RESULT
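Note: by default Hadoop does not overwrite existing results, so running the example above a second time fails with an error. Delete the output directory first (~/RESULT in the command above).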
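A small guard can remove the old output directory before each run. A minimal sketch, assuming the job writes to the local filesystem as in the command above; clear_output is a hypothetical name:

```shell
# Hypothetical helper: remove a previous output directory, if any.
# Refuses an empty path and "/" to avoid deleting the wrong thing.
clear_output() {
    case "$1" in
        ""|"/")
            echo "refusing to remove '$1'" >&2
            return 1
            ;;
    esac
    rm -rf -- "$1"
}
```

Usage would be: clear_output ~/RESULT && ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount README.txt ~/RESULT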