
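    1 - The Iris dataset

    The Iris dataset is a widely used benchmark for machine-learning classification experiments. It is very small, so models train quickly.
    It contains 150 samples in 3 classes, 50 samples per class, and each sample has 4 attributes:

    • Sepal.Length (sepal length), in cm
    • Sepal.Width (sepal width), in cm
    • Petal.Length (petal length), in cm
    • Petal.Width (petal width), in cm

    These 4 attributes are used to predict which of the following three species an iris flower belongs to:

    • Iris Setosa
    • Iris Versicolor
    • Iris Virginica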
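
As a rough illustration of what predicting the species from four attributes means, here is a minimal nearest-neighbour sketch. It is not the deep-learning approach used in the next section, and the six sample rows are typed in by hand purely for illustration (treat the exact values, and the helper name `nearest_species`, as assumptions).

```python
import math

# Hand-typed illustrative samples: (sepal length, sepal width, petal length, petal width) in cm.
# These values are assumptions for the sketch, not a copy of the real data file.
SAMPLES = [
    ((5.1, 3.5, 1.4, 0.2), "Iris-setosa"),
    ((4.9, 3.0, 1.4, 0.2), "Iris-setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Iris-versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "Iris-versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "Iris-virginica"),
    ((5.8, 2.7, 5.1, 1.9), "Iris-virginica"),
]


def nearest_species(features, samples=SAMPLES):
    """Return the label of the sample closest to `features` (Euclidean distance)."""
    closest = min(samples, key=lambda sample: math.dist(sample[0], features))
    return closest[1]


print(nearest_species((5.0, 3.4, 1.5, 0.2)))
print(nearest_species((6.0, 2.9, 4.5, 1.5)))
print(nearest_species((6.5, 3.0, 5.8, 2.2)))
```

A real classifier would of course be trained on all 150 rows rather than on six; the point here is only the shape of the problem: four numbers in, one of three labels out.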

    2 - Deep learning on the Iris dataset in Python

    2.1 - Code

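    # coding=utf-8
    import h2o
    from h2o.estimators.deeplearning import H2ODeepLearningEstimator

    h2o.init()  # By default the H2O instance may use all cores and typically takes about 25% of system memory

    # Prepare the data
    datasets = "https://raw.githubusercontent.com/DarrenCook/h2o/bk/datasets/"
    data = h2o.import_file(datasets + "iris_wheader.csv")  # Load the input data
    y = "class"  # Name of the column to learn (the response); not needed for unsupervised learning
    x = data.names  # Names of the columns to learn from; here, all the other columns
    x.remove(y)
    train, test = data.split_frame([0.8])  # Split into training and test data: about 80% for training, the rest for testing

    # Train the model
    m = H2ODeepLearningEstimator()  # Create the algorithm object with default settings
    m.train(x, y, train)  # Start training on the training frame
    print("# MSE:", m.mse())  # Show the MSE (mean squared error)
    print("# Confusion Matrix: \n", m.confusion_matrix(train))  # Confusion matrix: correct counts per class, and which class was chosen when wrong

    # Use the model to make predictions
    p = m.predict(test)
    print("# Predict: \n", p)  # Only the first 10 rows are shown by default
    print("# as_data_frame : \n", p.as_data_frame())  # Show all rows
    print("# mean: ", (p["predict"] == test["class"]).mean())  # Fraction of correct predictions
    print("# cbind: \n", p["predict"].cbind(test["class"]).as_data_frame())  # Predicted and actual class side by side

    # Naming conventions used here
    # - y: the column H2O should predict (not needed for unsupervised learning)
    # - x: the columns to learn from, either some or all of the other columns
    # - data: the complete dataset
    # - train: the subset used for training
    # - valid: the subset used for validation
    # - test: the subset used for testing
    # In real projects, clearer and more meaningful names are recommended.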
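
One detail of `data.split_frame([0.8])` is worth spelling out: the split is approximate, not exact. In the run shown in section 2.2 the training frame has 117 rows and the test frame 33, not 120 and 30. The sketch below imitates that behaviour in plain Python by assigning each row at random; it is an illustration of the idea under that assumption, not H2O's actual implementation, and `probabilistic_split` is a hypothetical helper name.

```python
import random


def probabilistic_split(rows, ratio, seed=None):
    """Send each row to the first part with probability `ratio`, otherwise to the second."""
    rng = random.Random(seed)
    first, second = [], []
    for row in rows:
        if rng.random() < ratio:
            first.append(row)
        else:
            second.append(row)
    return first, second


rows = list(range(150))  # stand-in for the 150 Iris rows
train, test = probabilistic_split(rows, 0.8, seed=42)
print(len(train), len(test))  # close to 120 / 30, but usually not exactly
```

If a reproducible split is needed in H2O itself, `split_frame` also accepts a `seed` argument.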
    

    2.2 - Output

    D:\Temp\Anaconda3\envs\h2o\python.exe D:/Anliven/Anliven-Code/PycharmProjects/TempTest/TempTest_1.py
    Checking whether there is an H2O instance running at http://localhost:54321 ..... not found.
    Attempting to start a local H2O server...
    ; Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
      Starting server from D:\Temp\Anaconda3\envs\h2o\lib\site-packages\h2o\backend\bin\h2o.jar
      Ice root: C:\Users\anliven\AppData\Local\Temp\tmptafn6xd_
      JVM stdout: C:\Users\anliven\AppData\Local\Temp\tmptafn6xd_\h2o_anliven_started_from_python.out
      JVM stderr: C:\Users\anliven\AppData\Local\Temp\tmptafn6xd_\h2o_anliven_started_from_python.err
      Server is running at http://127.0.0.1:54321
    Connecting to H2O server at http://127.0.0.1:54321 ... successful.
    --------------------------  ------------------------------------------
    H2O cluster uptime:         02 secs
    H2O cluster timezone:       +08:00
    H2O data parsing timezone:  UTC
    H2O cluster version:        3.24.0.5
    H2O cluster version age:    6 days
    H2O cluster name:           H2O_from_python_anliven_be1ik6
    H2O cluster total nodes:    1
    H2O cluster free memory:    10.64 Gb
    H2O cluster total cores:    8
    H2O cluster allowed cores:  8
    H2O cluster status:         accepting new members, healthy
    H2O connection url:         http://127.0.0.1:54321
    H2O connection proxy:
    H2O internal security:      False
    H2O API Extensions:         Amazon S3, Algos, AutoML, Core V3, Core V4
    Python version:             3.6.2 final
    --------------------------  ------------------------------------------
    Parse progress: |█████████████████████████████████████████████████████████| 100%
    deeplearning Model Build progress: |██████████████████████████████████████| 100%
    # MSE: 0.039118900961189924
    # Confusion Matrix: 
     Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
    
    Iris-setosa    Iris-versicolor    Iris-virginica    Error     Rate
    -------------  -----------------  ----------------  --------  -------
    40             0                  0                 0         0 / 40
    0              34                 5                 0.128205  5 / 39
    0              0                  38                0         0 / 38
    40             34                 43                0.042735  5 / 117
    
    deeplearning prediction progress: |███████████████████████████████████████| 100%
    # Predict: 
     predict        Iris-setosa    Iris-versicolor    Iris-virginica
    -----------  -------------  -----------------  ----------------
    Iris-setosa       0.999995        5.26512e-06       1.22522e-23
    Iris-setosa       0.999998        2.10502e-06       2.36894e-24
    Iris-setosa       0.999996        4.30403e-06       1.68815e-23
    Iris-setosa       0.99995         5.0415e-05        4.90541e-23
    Iris-setosa       0.999999        1.23285e-06       4.16845e-24
    Iris-setosa       0.999997        3.05992e-06       4.10819e-23
    Iris-setosa       0.999946        5.44824e-05       5.15226e-22
    Iris-setosa       0.999999        8.97722e-07       2.31546e-23
    Iris-setosa       0.99999         9.56155e-06       1.59912e-23
    Iris-setosa       1               3.44765e-07       4.95222e-24
    
    [33 rows x 4 columns]
    
    # as_data_frame : 
                 predict   Iris-setosa  Iris-versicolor  Iris-virginica
    0       Iris-setosa  9.999947e-01     5.265116e-06    1.225220e-23
    1       Iris-setosa  9.999979e-01     2.105018e-06    2.368935e-24
    2       Iris-setosa  9.999957e-01     4.304033e-06    1.688151e-23
    3       Iris-setosa  9.999496e-01     5.041504e-05    4.905406e-23
    4       Iris-setosa  9.999988e-01     1.232852e-06    4.168452e-24
    5       Iris-setosa  9.999969e-01     3.059924e-06    4.108188e-23
    6       Iris-setosa  9.999455e-01     5.448235e-05    5.152261e-22
    7       Iris-setosa  9.999991e-01     8.977222e-07    2.315463e-23
    8       Iris-setosa  9.999904e-01     9.561553e-06    1.599121e-23
    9       Iris-setosa  9.999997e-01     3.447651e-07    4.952222e-24
    10  Iris-versicolor  1.285173e-07     9.774696e-01    2.253031e-02
    11  Iris-versicolor  8.456613e-05     9.979772e-01    1.938266e-03
    12  Iris-versicolor  4.829308e-02     9.517061e-01    8.497348e-07
    13  Iris-versicolor  4.169988e-07     9.999681e-01    3.150848e-05
    14  Iris-versicolor  1.805217e-06     9.998308e-01    1.673994e-04
    15  Iris-versicolor  8.759536e-05     9.999115e-01    8.606799e-07
    16  Iris-versicolor  2.206746e-05     9.999167e-01    6.120105e-05
    17  Iris-versicolor  3.302204e-06     9.998997e-01    9.695184e-05
    18  Iris-versicolor  3.622209e-08     9.389008e-01    6.109913e-02
    19  Iris-versicolor  9.407188e-03     9.905912e-01    1.631313e-06
    20  Iris-versicolor  1.332645e-03     9.986596e-01    7.739634e-06
    21   Iris-virginica  5.299107e-16     7.827116e-07    9.999992e-01
    22   Iris-virginica  9.149237e-16     4.476949e-09    1.000000e+00
    23   Iris-virginica  4.123180e-13     1.779434e-07    9.999998e-01
    24   Iris-virginica  7.280032e-08     6.898109e-03    9.931018e-01
    25   Iris-virginica  5.853220e-17     9.229382e-07    9.999991e-01
    26   Iris-virginica  1.171212e-12     2.643036e-04    9.997357e-01
    27   Iris-virginica  2.345086e-16     2.944686e-09    1.000000e+00
    28   Iris-virginica  8.742579e-08     2.479772e-01    7.520227e-01
    29   Iris-virginica  1.258946e-09     1.586186e-02    9.841381e-01
    30   Iris-virginica  2.918212e-07     1.127815e-02    9.887216e-01
    31   Iris-virginica  1.635366e-13     3.913354e-06    9.999961e-01
    32   Iris-virginica  1.160129e-11     2.099658e-07    9.999998e-01
    # mean:  [1.0]
    # cbind: 
                 predict            class
    0       Iris-setosa      Iris-setosa
    1       Iris-setosa      Iris-setosa
    2       Iris-setosa      Iris-setosa
    3       Iris-setosa      Iris-setosa
    4       Iris-setosa      Iris-setosa
    5       Iris-setosa      Iris-setosa
    6       Iris-setosa      Iris-setosa
    7       Iris-setosa      Iris-setosa
    8       Iris-setosa      Iris-setosa
    9       Iris-setosa      Iris-setosa
    10  Iris-versicolor  Iris-versicolor
    11  Iris-versicolor  Iris-versicolor
    12  Iris-versicolor  Iris-versicolor
    13  Iris-versicolor  Iris-versicolor
    14  Iris-versicolor  Iris-versicolor
    15  Iris-versicolor  Iris-versicolor
    16  Iris-versicolor  Iris-versicolor
    17  Iris-versicolor  Iris-versicolor
    18  Iris-versicolor  Iris-versicolor
    19  Iris-versicolor  Iris-versicolor
    20  Iris-versicolor  Iris-versicolor
    21   Iris-virginica   Iris-virginica
    22   Iris-virginica   Iris-virginica
    23   Iris-virginica   Iris-virginica
    24   Iris-virginica   Iris-virginica
    25   Iris-virginica   Iris-virginica
    26   Iris-virginica   Iris-virginica
    27   Iris-virginica   Iris-virginica
    28   Iris-virginica   Iris-virginica
    29   Iris-virginica   Iris-virginica
    30   Iris-virginica   Iris-virginica
    31   Iris-virginica   Iris-virginica
    32   Iris-virginica   Iris-virginica
    H2O session _sid_aa65 closed.
    
    Process finished with exit code 0
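
The Error and Rate columns of the confusion matrix above can be recomputed by hand, which is a useful check on how to read it: each row is an actual class, each column a predicted class, and everything off the diagonal is a mistake. A minimal sketch using the counts from the output (the function name `error_rates` is just for this example):

```python
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# Rows: actual class; columns: predicted class (counts copied from the output above).
matrix = [
    [40, 0, 0],
    [0, 34, 5],
    [0, 0, 38],
]


def error_rates(matrix):
    """Return per-class (wrong, total) pairs and the overall (wrong, total) pair."""
    per_class = []
    for i, row in enumerate(matrix):
        total = sum(row)
        wrong = total - row[i]  # everything off the diagonal is a misclassification
        per_class.append((wrong, total))
    all_wrong = sum(wrong for wrong, _ in per_class)
    all_total = sum(total for _, total in per_class)
    return per_class, (all_wrong, all_total)


per_class, overall = error_rates(matrix)
for label, (wrong, total) in zip(labels, per_class):
    print(f"{label}: {wrong} / {total} = {wrong / total:.6f}")
print(f"overall: {overall[0]} / {overall[1]} = {overall[0] / overall[1]:.6f}")
```

Note that this matrix is computed on the training frame (117 rows), while the `# mean:  [1.0]` line is the fraction of correct predictions on the 33 test rows.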
    

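    3 - Deep learning on the Iris dataset in Flow

    Flow: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/flow.html#
    Flow is the web interface that ships as part of H2O (no extra installation step is needed). It can be used to:

    • View data uploaded through a client
    • Upload data directly
    • View models created through a client (including models still being built)
    • Create models directly
    • View predictions generated through a client
    • Make predictions directly

    3.1 - Startup

    Run the jar file directly to start H2O Flow: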

    [Anliven@localhost Downloads]$ pwd
    /home/Anliven/Downloads
    [Anliven@localhost Downloads]$ ls -l
    total 402984
    drwxr-xr-x 5 Anliven Anliven        60 Jun 19 08:19 h2o-3.24.0.5
    -rw-rw-r-- 1 Anliven Anliven 368257676 Jun 19 21:57 h2o-3.24.0.5.zip
    drwxr-xr-x 5 Anliven Anliven        84 Dec 22  2017 h2o-bk
    -rw-rw-rw- 1 Anliven Anliven  44392957 Jun 23 22:25 基于H2O的机器学习实用方法.zip
    [Anliven@localhost Downloads]$ 
    [Anliven@localhost Downloads]$ cd h2o-3.24.0.5/
    [Anliven@localhost h2o-3.24.0.5]$ java -jar h2o.jar -ip 192.168.16.101 -port 54321
    06-27 22:32:49.845 192.168.16.101:54321  3486   main      INFO: ----- H2O started  -----
    06-27 22:32:49.864 192.168.16.101:54321  3486   main      INFO: Build git branch: rel-yates
    06-27 22:32:49.864 192.168.16.101:54321  3486   main      INFO: Build git hash: b9cd4d5bcd44a4949ca8c677c5e54c10ee72c968
    06-27 22:32:49.864 192.168.16.101:54321  3486   main      INFO: Build git describe: jenkins-3.24.0.4-66-gb9cd4d5
    06-27 22:32:49.864 192.168.16.101:54321  3486   main      INFO: Build project version: 3.24.0.5
    06-27 22:32:49.864 192.168.16.101:54321  3486   main      INFO: Build age: 8 days
    06-27 22:32:49.865 192.168.16.101:54321  3486   main      INFO: Built by: 'jenkins'
    06-27 22:32:49.865 192.168.16.101:54321  3486   main      INFO: Built on: '2019-06-18 23:52:14'
    06-27 22:32:49.865 192.168.16.101:54321  3486   main      INFO: Found H2O Core extensions: [Watchdog, XGBoost, KrbStandalone]
    06-27 22:32:49.865 192.168.16.101:54321  3486   main      INFO: Processed H2O arguments: [-ip, 192.168.16.101, -port, 54321]
    06-27 22:32:49.865 192.168.16.101:54321  3486   main      INFO: Java availableProcessors: 2
    06-27 22:32:49.865 192.168.16.101:54321  3486   main      INFO: Java heap totalMemory: 240.0 MB
    06-27 22:32:49.865 192.168.16.101:54321  3486   main      INFO: Java heap maxMemory: 3.45 GB
    06-27 22:32:49.866 192.168.16.101:54321  3486   main      INFO: Java version: Java 1.8.0_161 (from Oracle Corporation)
    06-27 22:32:49.866 192.168.16.101:54321  3486   main      INFO: JVM launch parameters: []
    06-27 22:32:49.866 192.168.16.101:54321  3486   main      INFO: OS version: Linux 3.10.0-957.el7.x86_64 (amd64)
    06-27 22:32:49.866 192.168.16.101:54321  3486   main      INFO: Machine physical memory: 15.51 GB
    06-27 22:32:49.866 192.168.16.101:54321  3486   main      INFO: Machine locale: en_US
    06-27 22:32:49.866 192.168.16.101:54321  3486   main      INFO: X-h2o-cluster-id: 1561645969069
    06-27 22:32:49.866 192.168.16.101:54321  3486   main      INFO: User name: 'Anliven'
    06-27 22:32:49.866 192.168.16.101:54321  3486   main      INFO: IPv6 stack selected: false
    06-27 22:32:49.867 192.168.16.101:54321  3486   main      INFO: Network interface is down: name:virbr0 (virbr0)
    06-27 22:32:49.867 192.168.16.101:54321  3486   main      INFO: Possible IP Address: enp0s8 (enp0s8), fe80:0:0:0:cfdd:6281:f738:fba%enp0s8
    06-27 22:32:49.867 192.168.16.101:54321  3486   main      INFO: Possible IP Address: enp0s8 (enp0s8), 192.168.16.101
    06-27 22:32:49.867 192.168.16.101:54321  3486   main      INFO: Possible IP Address: enp0s3 (enp0s3), fe80:0:0:0:c48f:c289:276:2308%enp0s3
    06-27 22:32:49.867 192.168.16.101:54321  3486   main      INFO: Possible IP Address: enp0s3 (enp0s3), 10.0.2.15
    06-27 22:32:49.867 192.168.16.101:54321  3486   main      INFO: Possible IP Address: lo (lo), 0:0:0:0:0:0:0:1%lo
    06-27 22:32:49.868 192.168.16.101:54321  3486   main      INFO: Possible IP Address: lo (lo), 127.0.0.1
    06-27 22:32:49.868 192.168.16.101:54321  3486   main      INFO: H2O node running in unencrypted mode.
    06-27 22:32:49.869 192.168.16.101:54321  3486   main      INFO: Internal communication uses port: 54322
    06-27 22:32:49.869 192.168.16.101:54321  3486   main      INFO: Listening for HTTP and REST traffic on http://192.168.16.101:54321/
    06-27 22:32:49.870 192.168.16.101:54321  3486   main      INFO: H2O cloud name: 'Anliven' on /192.168.16.101:54321, static configuration based on -flatfile null
    06-27 22:32:49.870 192.168.16.101:54321  3486   main      INFO: If you have trouble connecting, try SSH tunneling from your local machine (e.g., via port 55555):
    06-27 22:32:49.870 192.168.16.101:54321  3486   main      INFO:   1. Open a terminal and run 'ssh -L 55555:localhost:54321 Anliven@192.168.16.101'
    06-27 22:32:49.870 192.168.16.101:54321  3486   main      INFO:   2. Point your browser to http://localhost:55555
    06-27 22:32:50.627 192.168.16.101:54321  3486   main      INFO: Log dir: '/tmp/h2o-Anliven/h2ologs'
    06-27 22:32:50.627 192.168.16.101:54321  3486   main      INFO: Cur dir: '/home/Anliven/Downloads/h2o-3.24.0.5'
    06-27 22:32:50.641 192.168.16.101:54321  3486   main      INFO: Subsystem for distributed import from HTTP/HTTPS successfully initialized
    06-27 22:32:50.641 192.168.16.101:54321  3486   main      INFO: HDFS subsystem successfully initialized
    06-27 22:32:50.645 192.168.16.101:54321  3486   main      INFO: S3 subsystem successfully initialized
    06-27 22:32:50.663 192.168.16.101:54321  3486   main      INFO: GCS subsystem successfully initialized
    06-27 22:32:50.663 192.168.16.101:54321  3486   main      INFO: Flow dir: '/home/Anliven/h2oflows'
    06-27 22:32:50.681 192.168.16.101:54321  3486   main      INFO: Cloud of size 1 formed [/192.168.16.101:54321]
    06-27 22:32:50.690 192.168.16.101:54321  3486   main      INFO: Registered parsers: [GUESS, ARFF, XLS, SVMLight, AVRO, PARQUET, CSV]
    06-27 22:32:50.691 192.168.16.101:54321  3486   main      INFO: Watchdog extension initialized
    06-27 22:32:50.692 192.168.16.101:54321  3486   main      INFO: XGBoost extension initialized
    06-27 22:32:50.692 192.168.16.101:54321  3486   main      INFO: KrbStandalone extension initialized
    06-27 22:32:50.692 192.168.16.101:54321  3486   main      INFO: Registered 3 core extensions in: 318ms
    06-27 22:32:50.692 192.168.16.101:54321  3486   main      INFO: Registered H2O core extensions: [Watchdog, XGBoost, KrbStandalone]
    06-27 22:32:51.041 192.168.16.101:54321  3486   main      INFO: Found XGBoost backend with library: xgboost4j_gpu
    06-27 22:32:51.041 192.168.16.101:54321  3486   main      INFO: XGBoost supported backends: [WITH_GPU, WITH_OMP]
    06-27 22:32:51.229 192.168.16.101:54321  3486   main      INFO: Registered: 174 REST APIs in: 537ms
    06-27 22:32:51.229 192.168.16.101:54321  3486   main      INFO: Registered REST API extensions: [Amazon S3, XGBoost, Algos, AutoML, Core V3, Core V4]
    06-27 22:32:51.492 192.168.16.101:54321  3486   main      INFO: Registered: 249 schemas in 263ms
    06-27 22:32:51.493 192.168.16.101:54321  3486   main      INFO: H2O started in 2407ms
    06-27 22:32:51.493 192.168.16.101:54321  3486   main      INFO: 
    06-27 22:32:51.493 192.168.16.101:54321  3486   main      INFO: Open H2O Flow in your web browser: http://192.168.16.101:54321
    06-27 22:32:51.493 192.168.16.101:54321  3486   main      INFO: 
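
Once the log reports "Open H2O Flow in your web browser", the same address also serves H2O's REST API, so a quick reachability check can be scripted. The following is a minimal sketch using only the standard library; it assumes the server exposes the `/3/Cloud` status endpoint, and the helper names `flow_url` and `cloud_info` are made up for this example.

```python
import json
import urllib.request


def flow_url(ip, port=54321):
    """Build the base URL that the H2O log prints for Flow."""
    return f"http://{ip}:{port}"


def cloud_info(base_url, timeout=2.0):
    """Return the parsed JSON from <base_url>/3/Cloud, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(base_url + "/3/Cloud", timeout=timeout) as response:
            return json.load(response)
    except (OSError, ValueError):
        # OSError covers connection errors and timeouts; ValueError covers malformed JSON.
        return None


base = flow_url("192.168.16.101", 54321)  # the -ip and -port values used above
print(base)
```

Calling `cloud_info(base)` against the running server should return a dictionary describing the cluster, and `None` when nothing is listening at that address.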
    

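    3.2 - Data

    On the start page click importFiles, or choose Data --> Import Files from the top menu.
    In the Import Files dialog that appears, enter a path in Search and click the search button (magnifying-glass icon), then pick the data file under Search Results; the selection is shown under Selected Files.
    Note: the Search path can be the absolute path of the data file, or a path relative to the location of h2o.jar, for example ../h2o-bk/datasets.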
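
To see where a relative Search path ends up, it can be resolved against the directory H2O was started from. The sketch below does this for the example above, using the "Cur dir" value from the startup log in section 3.1; `resolve_search_path` is a hypothetical helper, and it assumes that relative paths are interpreted against that start directory (which here is also where h2o.jar lives).

```python
import posixpath


def resolve_search_path(start_dir, search_path):
    """Resolve an absolute or relative Search path against the H2O start directory."""
    # posixpath.join keeps `search_path` unchanged when it is already absolute.
    return posixpath.normpath(posixpath.join(start_dir, search_path))


start_dir = "/home/Anliven/Downloads/h2o-3.24.0.5"  # "Cur dir" from the startup log
resolved = resolve_search_path(start_dir, "../h2o-bk/datasets")
print(resolved)
```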

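    Click the Import button to see the result of the file import.

    Click Parse these files to customize how the data file is parsed. In most cases it is best to keep the defaults and simply click "Parse".

    Click View or iris_wheader1.hex to see the details.

    Under Actions, click the Split... button to configure how the data is divided into train and test sets.

    Click the Create button.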

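    3.3 - Model

    Click "train", then click "Build Model..."; the algorithm selection screen appears.

    Select Deep learning and set the response_column parameter to class, leaving all other parameters at their defaults.

    Then click the "Build Model" button at the bottom of the dialog to start training.

    When training finishes, click the View button to inspect the model's parameters and build process.

    If a model was built earlier, choose Model ---> List All Models from the start page and click the model of interest to see its parameters and build process.

    3.4 - Prediction

    From the model view click Predict..., then specify a name and the dataset.

    Alternatively, choose Score ---> Predict from the start page, then specify a name and select the model and dataset.

    After confirming the parameters, click Predict to see the prediction results.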

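    4 - Other notes

    • Flow can perform most of the operations available from Python, but some data manipulations cannot be done in it.
    • Data loaded from Python can be inspected in Flow, and data loaded in Flow can be inspected from Python.
    • Water Meter, under the Admin menu, shows the load on each CPU core in the cluster.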
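
The second point can be tried from the Python side with a sketch like the one below. It assumes the `h2o` Python package is installed and that the server started in section 3.1 is still reachable; `inspect_shared_frame` is a hypothetical helper name, and the frame id is the one Flow showed in section 3.2.

```python
def inspect_shared_frame(url, frame_id):
    """Attach to a running H2O server and look up a frame that was parsed in Flow."""
    import h2o  # imported lazily: needs the h2o package and a reachable server

    h2o.connect(url=url)             # attach to the existing server instead of starting a new one
    frame = h2o.get_frame(frame_id)  # fetch a handle to the frame by its key
    return frame.names, frame.dim    # column names and [rows, columns]


# Example call (address and frame id taken from the Flow session above):
# names, dim = inspect_shared_frame("http://192.168.16.101:54321", "iris_wheader1.hex")
```

In the other direction, a frame imported with `h2o.import_file` from Python should show up in Flow's frame list on the same server.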
  • Original article: https://www.cnblogs.com/anliven/p/6847661.html