zoukankan      html  css  js  c++  java
  • ALINK(四十一):模型评估(六)聚类评估 (EvalClusterBatchOp)

    Java 类名:com.alibaba.alink.operator.batch.evaluation.EvalClusterBatchOp

    Python 类名:EvalClusterBatchOp

    功能介绍

    聚类评估是对聚类算法的预测结果进行效果评估,支持下列评估指标。

     

     

    参数说明

    名称

    中文名称

    描述

    类型

    是否必须?

    默认值

    predictionCol

    预测结果列名

    预测结果列名

    String

     

    labelCol

    标签列名

    输入表中的标签列名

    String

     

    null

    vectorCol

    向量列名

    输入表中的向量列名

    String

     

    null

    distanceType

    距离度量方式

    距离类型

    String

     

    "EUCLIDEAN"

    代码示例

    Python 代码

    from pyalink.alink import *
    import pandas as pd
    useLocalEnv(1)
    df = pd.DataFrame([
        [0, "0 0 0"],
        [0, "0.1,0.1,0.1"],
        [0, "0.2,0.2,0.2"],
        [1, "9 9 9"],
        [1, "9.1 9.1 9.1"],
        [1, "9.2 9.2 9.2"]
    ])
    inOp = BatchOperator.fromDataframe(df, schemaStr='id int, vec string')
    metrics = EvalClusterBatchOp().setVectorCol("vec").setPredictionCol("id").linkFrom(inOp).collectMetrics()
    print("Total Samples Number:", metrics.getCount())
    print("Cluster Number:", metrics.getK())
    print("Cluster Array:", metrics.getClusterArray())
    print("Cluster Count Array:", metrics.getCountArray())
    print("CP:", metrics.getCp())
    print("DB:", metrics.getDb())
    print("SP:", metrics.getSp())
    print("SSB:", metrics.getSsb())
    print("SSW:", metrics.getSsw())
    print("CH:", metrics.getVrc())

    Java 代码

    import org.apache.flink.types.Row;
    import com.alibaba.alink.operator.batch.BatchOperator;
    import com.alibaba.alink.operator.batch.evaluation.EvalClusterBatchOp;
    import com.alibaba.alink.operator.batch.source.MemSourceBatchOp;
    import com.alibaba.alink.operator.common.evaluation.ClusterMetrics;
    import org.junit.Test;
    import java.util.Arrays;
    import java.util.List;
    public class EvalClusterBatchOpTest {
      @Test
      public void testEvalClusterBatchOp() throws Exception {
        List <Row> df = Arrays.asList(
          Row.of(0, "0 0 0"),
          Row.of(0, "0.1,0.1,0.1"),
          Row.of(0, "0.2,0.2,0.2"),
          Row.of(1, "9 9 9"),
          Row.of(1, "9.1 9.1 9.1"),
          Row.of(1, "9.2 9.2 9.2")
        );
        BatchOperator <?> inOp = new MemSourceBatchOp(df, "id int, vec string");
        ClusterMetrics metrics = new EvalClusterBatchOp().setVectorCol("vec").setPredictionCol("id").linkFrom(inOp)
          .collectMetrics();
        System.out.println("Total Samples Number:" + metrics.getCount());
        System.out.println("Cluster Number:" + metrics.getK());
        System.out.println("Cluster Array:" + Arrays.toString(metrics.getClusterArray()));
        System.out.println("Cluster Count Array:" + Arrays.toString(metrics.getCountArray()));
        System.out.println("CP:" + metrics.getCp());
        System.out.println("DB:" + metrics.getDb());
        System.out.println("SP:" + metrics.getSp());
        System.out.println("SSB:" + metrics.getSsb());
        System.out.println("SSW:" + metrics.getSsw());
        System.out.println("CH:" + metrics.getVrc());
      }
    }

    运行结果

    Total Samples Number: 6
    Cluster Number: 2
    Cluster Array: ['0', '1']
    Cluster Count Array: [3.0, 3.0]
    CP: 0.11547005383792497
    DB: 0.014814814814814791
    SP: 15.588457268119896
    SSB: 364.5
    SSW: 0.1199999999999996
    CH: 12150.000000000042
  • 相关阅读:
    SER SERVER存储过程
    SQL SERVER连接、合并查询
    delete drop truncate 区别
    将一个表中的数据插入到另外的新表中
    strtol函数 将字符串转换为相应进制的整数
    malloc函数及用法
    求亲密数
    牛顿迭代法求开根号。 a^1/2_______Xn+1=1/2*(Xn+a/Xn)
    C语言中用于计算数组长度的函数 “strlen() ”。
    如何给sublime text3安装汉化包?so easy 哦
  • 原文地址:https://www.cnblogs.com/qiu-hua/p/14902458.html
Copyright © 2011-2022 走看看