1. Testing a MapReduce Job
1.1 Upload a file to the HDFS file system
$ jps
15520 Jps
13426 SecondaryNameNode
14003 JobHistoryServer
13211 NameNode
13612 ResourceManager
$ jps > infile
$ hadoop fs -mkdir /inputdir
$ hadoop fs -put infile /inputdir
$ hadoop fs -ls /inputdir
Found 1 items
-rw-r--r-- 3 hduser supergroup 94 2017-09-01 11:02 /inputdir/infile
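The upload commands above fail on a second run, because `/inputdir` and `infile` already exist in HDFS. A minimal sketch of a rerunnable variant follows. The function name `upload_to_hdfs` is hypothetical, and the sketch assumes the `hadoop` command is on the PATH; it relies only on the `-mkdir -p` and `-put -f` options of `hadoop fs`.

```shell
# Hypothetical helper: upload a local file to an HDFS directory, safe to rerun.
upload_to_hdfs() {
    local_file=$1
    hdfs_dir=$2
    # -p: create missing parents and do not fail if the directory exists
    hadoop fs -mkdir -p "$hdfs_dir" || return 1
    # -f: overwrite the destination file if it already exists
    hadoop fs -put -f "$local_file" "$hdfs_dir" || return 1
    # List the directory to confirm the upload
    hadoop fs -ls "$hdfs_dir"
}

# Usage on a running cluster:
#   jps > infile
#   upload_to_hdfs infile /inputdir
```

Note that `-put -f` silently replaces the existing file, so this variant suits test data only.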
1.2 Run the word count job
$ hadoop jar /usr/local/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /inputdir /outputdir
17/09/01 11:04:37 INFO client.RMProxy: Connecting to ResourceManager at /
17/09/01 11:04:39 INFO input.FileInputFormat: Total input paths to process : 1
17/09/01 11:04:39 INFO mapreduce.JobSubmitter: number of splits:1
17/09/01 11:04:40 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1504106569900_0001
17/09/01 11:04:41 INFO impl.YarnClientImpl: Submitted application application_1504106569900_0001
17/09/01 11:04:41 INFO mapreduce.Job: The url to track the job: http://sht-sgmhadoopnn-01:8088/proxy/application_1504106569900_0001/
17/09/01 11:04:41 INFO mapreduce.Job: Running job: job_1504106569900_0001
17/09/01 11:04:58 INFO mapreduce.Job: Job job_1504106569900_0001 running in uber mode : false
17/09/01 11:04:58 INFO mapreduce.Job: map 0% reduce 0%
17/09/01 11:05:06 INFO mapreduce.Job: map 100% reduce 0%
17/09/01 11:05:15 INFO mapreduce.Job: map 100% reduce 100%
17/09/01 11:05:16 INFO mapreduce.Job: Job job_1504106569900_0001 completed successfully
17/09/01 11:05:16 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=160
        FILE: Number of bytes written=238465
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=200
        HDFS: Number of bytes written=114
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=5960
        Total time spent by all reduces in occupied slots (ms)=6543
        Total time spent by all map tasks (ms)=5960
        Total time spent by all reduce tasks (ms)=6543
        Total vcore-milliseconds taken by all map tasks=5960
        Total vcore-milliseconds taken by all reduce tasks=6543
        Total megabyte-milliseconds taken by all map tasks=6103040
        Total megabyte-milliseconds taken by all reduce tasks=6700032
    Map-Reduce Framework
        Map input records=5
        Map output records=10
        Map output bytes=134
        Map output materialized bytes=160
        Input split bytes=106
        Combine input records=10
        Combine output records=10
        Reduce input groups=10
        Reduce shuffle bytes=160
        Reduce input records=10
        Reduce output records=10
        Spilled Records=20
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=223
        CPU time spent (ms)=2280
        Physical memory (bytes) snapshot=426209280
        Virtual memory (bytes) snapshot=4179288064
        Total committed heap usage (bytes)=315097088
    Shuffle Errors
    File Input Format Counters
        Bytes Read=94
    File Output Format Counters
        Bytes Written=114
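The time counters above can be cross-checked with simple arithmetic. The megabyte-milliseconds counters are the memory allocated to the task containers multiplied by the task run time, so dividing them by the millisecond counters gives the container memory. The sketch below does that division; the resulting 1024 MB per container is inferred from the counters, not stated in the job output, and is consistent with the default value of `mapreduce.map.memory.mb` and `mapreduce.reduce.memory.mb`.

```shell
# Values copied from the job counters above.
map_ms=5960
reduce_ms=6543
map_mb_ms=6103040
reduce_mb_ms=6700032

# megabyte-milliseconds / milliseconds = memory per container in MB
map_container_mb=$((map_mb_ms / map_ms))
reduce_container_mb=$((reduce_mb_ms / reduce_ms))

echo "map container memory:    ${map_container_mb} MB"
echo "reduce container memory: ${reduce_container_mb} MB"
```

Both lines should report 1024 MB. The vcore-milliseconds counters equal the plain millisecond counters, which suggests one vcore per container.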
1.3 View the wordcount result
$ hadoop fs -ls /outputdir
Found 2 items
-rw-r--r-- 3 hduser supergroup 0 2017-09-01 11:05 /outputdir/_SUCCESS
-rw-r--r-- 3 hduser supergroup 114 2017-09-01 11:05 /outputdir/part-r-00000
$ hadoop fs -cat /outputdir/part-r-00000
13211 1
13426 1
13612 1
14003 1
15541 1
JobHistoryServer 1
Jps 1
NameNode 1
ResourceManager 1
SecondaryNameNode 1
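The result can be sanity-checked without a cluster. The sketch below rebuilds a local copy of `infile` and counts words with standard tools. The file content is an assumption reconstructed from the result above: the PID 15541 belongs to the second `jps` run that wrote `infile`, which is why it differs from the 15520 shown in step 1.1, and the line order is a guess. The pipeline is a local stand-in for the wordcount example, not how Hadoop computes it.

```shell
# Rebuild the input locally (content inferred from the job result above).
workdir=$(mktemp -d)
printf '%s\n' \
    '15541 Jps' \
    '13426 SecondaryNameNode' \
    '14003 JobHistoryServer' \
    '13211 NameNode' \
    '13612 ResourceManager' > "$workdir/infile"

# Split on whitespace, sort in byte order, count duplicates,
# then print "word<TAB>count" in the layout of part-r-00000.
LC_ALL=C tr -s ' \t' '\n' < "$workdir/infile" \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ printf "%s\t%s\n", $2, $1 }' > "$workdir/part-local"

cat "$workdir/part-local"
wc -c < "$workdir/infile"      # input size in bytes
wc -c < "$workdir/part-local"  # output size in bytes
```

If the reconstruction is right, the two sizes match the `Bytes Read=94` and `Bytes Written=114` counters of the job, and the ten output lines appear in the same order as in `part-r-00000`.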
2. Testing HDFS distributed storage
2.1 Upload a test file
$ ls -lh hadoop-2.7.3.tar.gz
-rw-r--r-- 1 root root 205M May 5 09:01 hadoop-2.7.3.tar.gz
$ hadoop fs -put hadoop-2.7.3.tar.gz /inputdir
$ hadoop fs -ls -h /inputdir
Found 2 items
-rw-r--r-- 3 hduser supergroup 204.2 M 2017-09-01 11:09 /inputdir/hadoop-2.7.3.tar.gz
-rw-r--r-- 3 hduser supergroup 94 2017-09-01 11:02 /inputdir/infile
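Before looking at the DataNodes, it helps to estimate how HDFS will split this file. The sketch below is plain arithmetic under stated assumptions: a file size of 204.2 MiB (taken from the listing above, so the byte count is approximate), the Hadoop 2.x default `dfs.blocksize` of 128 MiB, and the replication factor 3 shown in the second column of the listing. A cluster with a different block size will give different numbers.

```shell
# Assumed inputs: 204.2 MiB file, 128 MiB block size, replication factor 3.
size_bytes=$((2042 * 1024 * 1024 / 10))
block_size=$((128 * 1024 * 1024))
replication=3

# Ceiling division: a partially filled last block still counts as one block.
blocks=$(( (size_bytes + block_size - 1) / block_size ))
replicas=$((blocks * replication))
last_block_bytes=$((size_bytes - (blocks - 1) * block_size))

echo "blocks: $blocks"
echo "block replicas across DataNodes: $replicas"
echo "last block size (bytes): $last_block_bytes"

# On a live cluster the actual layout can be checked with:
#   hdfs fsck /inputdir/hadoop-2.7.3.tar.gz -files -blocks -locations
```

Under these assumptions the file occupies 2 blocks, one full and one partial, for 6 block replicas in total.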
2.2 View DataNode replica information