A small Mahout case study: running the k-means algorithm.
Environment: OS: CentOS 6.5 x64; software: Hadoop 1.2.1 and Mahout 0.9
1. Download the test data
[huser@master hadoop]$ wget http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
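Before copying the file to HDFS, it can be worth confirming that the download is complete. The following is a minimal sketch, assuming the file is whitespace-separated numeric text; `data_shape` is a hypothetical helper name, not part of Hadoop or Mahout. It prints the row count and the number of columns in the first row.

```shell
# Print "<rows> <columns>" for a whitespace-separated data file.
# data_shape is a hypothetical helper, not a Hadoop or Mahout command.
data_shape() {
  # Remember the field count of the first row, then report the total row count.
  awk 'NR == 1 { cols = NF } END { print NR, cols }' "$1"
}

# Usage (after the wget above):
#   data_shape synthetic_control.data
```

The UCI description of this dataset lists 600 series of 60 values each, so a result of 600 rows and 60 columns would suggest the download is intact.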
2. Copy the data to HDFS
[huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -mkdir ./testdata
Warning: $HADOOP_HOME is deprecated.
[huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -put ./synthetic_control.data ./testdata
Warning: $HADOOP_HOME is deprecated.
[huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -ls ./testdata
Warning: $HADOOP_HOME is deprecated.
Found 1 items
-rw-r--r-- 1 huser supergroup 288374 2014-04-17 14:02 /user/huser/testdata/synthetic_control.data
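3. Run a k-means clustering test (run without arguments, the job reads the testdata directory created above and writes its results to output)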
[huser@master hadoop]$ mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
4. Inspect the output
[huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -ls ./output
Warning: $HADOOP_HOME is deprecated.
Found 15 items
-rw-r--r-- 1 huser supergroup 194 2014-04-17 14:18 /user/huser/output/_policy
drwxr-xr-x - huser supergroup 0 2014-04-17 14:19 /user/huser/output/clusteredPoints
drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/clusters-0
drwxr-xr-x - huser supergroup 0 2014-04-17 14:13 /user/huser/output/clusters-1
drwxr-xr-x - huser supergroup 0 2014-04-17 14:18 /user/huser/output/clusters-10-final
drwxr-xr-x - huser supergroup 0 2014-04-17 14:14 /user/huser/output/clusters-2
drwxr-xr-x - huser supergroup 0 2014-04-17 14:14 /user/huser/output/clusters-3
drwxr-xr-x - huser supergroup 0 2014-04-17 14:15 /user/huser/output/clusters-4
drwxr-xr-x - huser supergroup 0 2014-04-17 14:15 /user/huser/output/clusters-5
drwxr-xr-x - huser supergroup 0 2014-04-17 14:16 /user/huser/output/clusters-6
drwxr-xr-x - huser supergroup 0 2014-04-17 14:17 /user/huser/output/clusters-7
drwxr-xr-x - huser supergroup 0 2014-04-17 14:17 /user/huser/output/clusters-8
drwxr-xr-x - huser supergroup 0 2014-04-17 14:18 /user/huser/output/clusters-9
drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/data
drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/random-seeds
[huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -ls ./output/data
Warning: $HADOOP_HOME is deprecated.
Found 3 items
-rw-r--r-- 1 huser supergroup 0 2014-04-17 14:10 /user/huser/output/data/_SUCCESS
drwxr-xr-x - huser supergroup 0 2014-04-17 14:07 /user/huser/output/data/_logs
-rw-r--r-- 1 huser supergroup 335470 2014-04-17 14:10 /user/huser/output/data/part-m-00000
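The files under the cluster directories are Hadoop sequence files, so they are not readable as plain text. As a possible next step, the sketch below picks out the final iteration's directory (the one whose name ends in -final) from an `hadoop fs -ls` listing. `final_clusters_dir` is a hypothetical helper name. The commented `mahout clusterdump` call and its `-i`, `-p` and `-o` flags are written from memory of Mahout 0.9 and are an assumption; check them with `mahout clusterdump --help` before relying on them.

```shell
# Read an `hadoop fs -ls` listing on stdin and print the path of the
# directory whose name ends in "-final" (the last k-means iteration).
# final_clusters_dir is a hypothetical helper, not a Hadoop or Mahout command.
final_clusters_dir() {
  # The path is the last field of each listing line.
  awk '$NF ~ /clusters-[0-9]+-final$/ { print $NF }'
}

# Usage (assumed flags; ./clusters.txt is an arbitrary local output file):
#   dir=$(hadoop-1.2.1/bin/hadoop fs -ls ./output | final_clusters_dir)
#   mahout clusterdump -i "$dir" -p ./output/clusteredPoints -o ./clusters.txt
```

Reading the directory name from the listing avoids hard-coding the iteration number, which changes if the job converges before its iteration limit.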