  • Learning Mahout

    A small Mahout case study: running k-means clustering.

    Environment: CentOS 6.5 x64 (OS); Hadoop 1.2.1 and Mahout 0.9 (software)

    1. Download the test data

    [huser@master hadoop]$ wget http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
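
As published by UCI, each line of synthetic_control.data is one time series: 60 whitespace-separated numbers, 600 lines in total. A minimal sketch of reading that layout into vectors, useful for a quick look at the data before handing it to Mahout. The helper name `parse_series` and the sample values are illustrative assumptions, not part of Mahout or of the real file:

```python
def parse_series(text):
    """Parse whitespace-separated numbers, one time series per non-empty line."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            rows.append([float(f) for f in fields])
    return rows


# Made-up sample in the same layout (the real file has 60 values per line).
sample = "28.78 34.46 31.33\n24.89 25.74 27.55\n"
vectors = parse_series(sample)
print(len(vectors), len(vectors[0]))
```

To check a real download, read the file with `open("synthetic_control.data").read()` and pass the text to `parse_series`.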

    2. Copy the data to HDFS

    [huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -mkdir ./testdata
    Warning: $HADOOP_HOME is deprecated.
    [huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -put ./synthetic_control.data ./testdata
    Warning: $HADOOP_HOME is deprecated.
    [huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -ls ./testdata
    Warning: $HADOOP_HOME is deprecated.
    Found 1 items
    -rw-r--r-- 1 huser supergroup 288374 2014-04-17 14:02 /user/huser/testdata/synthetic_control.data

    3. Run a k-means clustering test

    [huser@master hadoop]$ mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
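
Run without arguments, as here, the job reads ./testdata and writes ./output. To make clear what the job is iterating on, here is a minimal single-machine sketch of k-means (Lloyd's algorithm) in plain Python. It is not Mahout's implementation: the function names, the parameters `max_iter`, `delta`, `init` and `seed`, and the stopping rule (largest center movement at or below `delta`) are assumptions chosen for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, k, max_iter=10, delta=0.5, init=None, seed=0):
    """Return k cluster centers after at most max_iter iterations."""
    if init is not None:
        centers = [list(c) for c in init]
    else:
        # Pick k input points at random as the starting centers.
        centers = random.Random(seed).sample(points, k)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: euclidean(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its points
        # (an empty cluster keeps its previous center).
        new_centers = [mean(g) if g else centers[i] for i, g in enumerate(groups)]
        moved = max(euclidean(c, n) for c, n in zip(centers, new_centers))
        centers = new_centers
        if moved <= delta:  # centers barely moved: treat as converged
            break
    return centers


# Two well-separated toy blobs (made-up data).
points = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]]
print(kmeans(points, k=2, delta=0.001))
```

Each pass through the loop corresponds to one iteration of the clustering; a distributed version performs the same assignment and update steps as MapReduce passes over the data.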

    4. Inspect the output

    [huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -ls ./output
    Warning: $HADOOP_HOME is deprecated.
    Found 15 items
    -rw-r--r-- 1 huser supergroup 194 2014-04-17 14:18 /user/huser/output/_policy
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:19 /user/huser/output/clusteredPoints
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/clusters-0
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:13 /user/huser/output/clusters-1
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:18 /user/huser/output/clusters-10-final
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:14 /user/huser/output/clusters-2
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:14 /user/huser/output/clusters-3
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:15 /user/huser/output/clusters-4
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:15 /user/huser/output/clusters-5
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:16 /user/huser/output/clusters-6
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:17 /user/huser/output/clusters-7
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:17 /user/huser/output/clusters-8
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:18 /user/huser/output/clusters-9
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/data
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/random-seeds
    [huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -ls ./output/data
    Warning: $HADOOP_HOME is deprecated.
    Found 3 items
    -rw-r--r-- 1 huser supergroup 0 2014-04-17 14:10 /user/huser/output/data/_SUCCESS
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:07 /user/huser/output/data/_logs
    -rw-r--r-- 1 huser supergroup 335470 2014-04-17 14:10 /user/huser/output/data/part-m-00000
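
In the ./output listing, the names are sorted lexicographically, which is why clusters-10-final appears before clusters-2. Each clusters-N directory holds the clusters from iteration N, and the one with the -final suffix is the last iteration. A small sketch for ordering such names by iteration number and picking out the final one; the helper names are hypothetical, and the naming pattern is inferred only from the listing in this post:

```python
import re

# Matches names such as "clusters-3" or "clusters-10-final".
_PATTERN = re.compile(r"clusters-(\d+)(-final)?$")


def iteration_of(name):
    """Return the iteration number in a clusters-N[-final] name, or None."""
    m = _PATTERN.search(name)
    return int(m.group(1)) if m else None


def final_clusters_dir(names):
    """Return the first name carrying the -final suffix, or None if absent."""
    finals = [n for n in names if _PATTERN.search(n) and n.endswith("-final")]
    return finals[0] if finals else None


names = ["clusters-0", "clusters-1", "clusters-10-final", "clusters-2", "clusters-9"]
ordered = sorted(names, key=iteration_of)
print(ordered)
print(final_clusters_dir(names))
```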
  • Original article: https://www.cnblogs.com/guarder/p/3705357.html