  • Learning Mahout

    A small Mahout exercise: running the k-means algorithm.

    Environment: OS: CentOS 6.5 x64; software: Hadoop 1.2.1 and Mahout 0.9

    1. Download the test data

    [huser@master hadoop]$ wget http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
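
Before copying the file to HDFS, it can help to confirm that the download is complete. The helper below is a minimal sketch; the name `data_shape` is made up for this example and is not part of Hadoop or Mahout. It prints the row count and the smallest and largest number of columns per row.

```shell
# data_shape FILE (hypothetical helper, not a Hadoop or Mahout command):
# print "<rows> <min columns> <max columns>" for a whitespace-separated file.
data_shape() {
  awk 'NR == 1 { min = NF; max = NF }   # first row sets the initial bounds
       NF < min { min = NF }            # track the shortest row
       NF > max { max = NF }            # track the longest row
       END { print NR, min, max }' "$1"
}
```

Running `data_shape synthetic_control.data` should print `600 60 60`, assuming the file matches the UCI description of 600 control charts with 60 points each. A different row count or unequal column counts would suggest a truncated or corrupted download.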

    2. Copy the data to HDFS

    [huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -mkdir ./testdata
    Warning: $HADOOP_HOME is deprecated.
    [huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -put ./synthetic_control.data ./testdata
    Warning: $HADOOP_HOME is deprecated.
    [huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -ls ./testdata
    Warning: $HADOOP_HOME is deprecated.
    Found 1 items
    -rw-r--r-- 1 huser supergroup 288374 2014-04-17 14:02 /user/huser/testdata/synthetic_control.data

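    3. Run a k-means clustering test (run with no arguments, as here, the example job reads ./testdata and writes ./output under the HDFS home directory)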

    [huser@master hadoop]$ mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
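
To get a feel for what the job computes, here is a rough local sketch of the k-means loop in awk. It is not Mahout's implementation. The function name `kmeans_sketch` is invented for this example, the first K rows are used as seeds (an assumption made so that runs are repeatable, whereas the `random-seeds` directory in the step 4 listing suggests the job picks its seeds randomly), and squared Euclidean distance is assumed as the measure.

```shell
# kmeans_sketch FILE K ITERS (hypothetical helper, not Mahout code):
# plain k-means over whitespace-separated numeric rows, seeded with the first K rows.
kmeans_sketch() {
  awk -v k="$2" -v iters="$3" '
    # load every row into x[row, column]
    { n = NR; d = NF; for (j = 1; j <= NF; j++) x[NR, j] = $j }
    END {
      # seed: the first k rows become the initial centers
      for (c = 1; c <= k; c++)
        for (j = 1; j <= d; j++) cen[c, j] = x[c, j]
      for (it = 1; it <= iters; it++) {
        for (c = 1; c <= k; c++) {
          cnt[c] = 0
          for (j = 1; j <= d; j++) sum[c, j] = 0
        }
        # assignment step: each row goes to its nearest center
        for (i = 1; i <= n; i++) {
          best = 1; bestd = -1
          for (c = 1; c <= k; c++) {
            dist = 0
            for (j = 1; j <= d; j++) { diff = x[i, j] - cen[c, j]; dist += diff * diff }
            if (bestd < 0 || dist < bestd) { bestd = dist; best = c }
          }
          cnt[best]++
          for (j = 1; j <= d; j++) sum[best, j] += x[i, j]
        }
        # update step: move each non-empty center to the mean of its rows
        for (c = 1; c <= k; c++)
          if (cnt[c] > 0)
            for (j = 1; j <= d; j++) cen[c, j] = sum[c, j] / cnt[c]
      }
      # report cluster sizes from the final assignment step
      for (c = 1; c <= k; c++) printf "cluster %d size %d\n", c, cnt[c]
    }' "$1"
}
```

For example, `kmeans_sketch synthetic_control.data 6 10` runs ten iterations with six clusters on the local copy of the data. The values 6 and 10 are only illustrative here, and because the seeding differs, the cluster sizes will not necessarily match what the Mahout job produces.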

    4. Inspect the output

    [huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -ls ./output
    Warning: $HADOOP_HOME is deprecated.
    Found 15 items
    -rw-r--r-- 1 huser supergroup 194 2014-04-17 14:18 /user/huser/output/_policy
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:19 /user/huser/output/clusteredPoints
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/clusters-0
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:13 /user/huser/output/clusters-1
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:18 /user/huser/output/clusters-10-final
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:14 /user/huser/output/clusters-2
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:14 /user/huser/output/clusters-3
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:15 /user/huser/output/clusters-4
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:15 /user/huser/output/clusters-5
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:16 /user/huser/output/clusters-6
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:17 /user/huser/output/clusters-7
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:17 /user/huser/output/clusters-8
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:18 /user/huser/output/clusters-9
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/data
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/random-seeds
    [huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -ls ./output/data
    Warning: $HADOOP_HOME is deprecated.
    Found 3 items
    -rw-r--r-- 1 huser supergroup 0 2014-04-17 14:10 /user/huser/output/data/_SUCCESS
    drwxr-xr-x - huser supergroup 0 2014-04-17 14:07 /user/huser/output/data/_logs
    -rw-r--r-- 1 huser supergroup 335470 2014-04-17 14:10 /user/huser/output/data/part-m-00000
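
In the listing of ./output above, the directory from the last iteration carries a `-final` suffix (`clusters-10-final`). When scripting follow-up steps it is handy to pick that path out automatically rather than hard-coding the iteration number. The filter below is a small sketch. The name `final_clusters_dir` is made up for this example, and it assumes the path is the last field of each listing line, as it is in the output shown here.

```shell
# final_clusters_dir (hypothetical helper): read `hadoop fs -ls` output on stdin
# and print any path whose last component looks like clusters-<N>-final.
final_clusters_dir() {
  awk '$NF ~ /\/clusters-[0-9]+-final$/ { print $NF }'
}
```

It would be used as `hadoop-1.2.1/bin/hadoop fs -ls ./output | final_clusters_dir`, which for the listing above prints `/user/huser/output/clusters-10-final`. That path is what one would then hand to a tool such as Mahout's `clusterdump` to read the final centers.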
  • Original article: https://www.cnblogs.com/guarder/p/3705357.html