  • ALINK (21): Data Processing (7), Numeric Data Processing (3), Max-Abs Scaling (MaxAbsScalerTrainBatchOp/MaxAbsScalerPredictBatchOp)

    Max-Abs Scaler Training (MaxAbsScalerTrainBatchOp)

    Java class name: com.alibaba.alink.operator.batch.dataproc.MaxAbsScalerTrainBatchOp

    Python class name: MaxAbsScalerTrainBatchOp

    Description

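    • The max-abs scaler divides every value in a selected column by the largest absolute value found in that column, which maps the data into the range -1 to 1.
    • The training component produces a model; the max-abs scaler prediction component (MaxAbsScalerPredictBatchOp) uses that model to transform the input data.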
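    The scaling rule above can be sketched in plain Python, independent of Alink. This is a minimal illustration and not Alink's implementation: the helper names fit_max_abs and transform_max_abs are hypothetical, and mapping an all-zero column to 0.0 is an assumption made here to avoid division by zero.

```python
def fit_max_abs(rows, selected_cols):
    """'Train' step: record the largest absolute value of each selected column."""
    return {col: max(abs(row[col]) for row in rows) for col in selected_cols}


def transform_max_abs(rows, model):
    """'Predict' step: divide each selected column by its recorded max-abs value."""
    scaled_rows = []
    for row in rows:
        scaled = dict(row)  # columns not in the model pass through unchanged
        for col, max_abs in model.items():
            # Assumption: an all-zero column is mapped to 0.0 rather than raising.
            scaled[col] = row[col] / max_abs if max_abs != 0 else 0.0
        scaled_rows.append(scaled)
    return scaled_rows


# Same sample data as the Alink examples in this article.
rows = [
    {"col1": "a", "col2": 10.0, "col3": 100},
    {"col1": "b", "col2": -2.5, "col3": 9},
    {"col1": "c", "col2": 100.2, "col3": 1},
    {"col1": "d", "col2": -99.9, "col3": 100},
    {"col1": "a", "col2": 1.4, "col3": 1},
    {"col1": "b", "col2": -2.2, "col3": 9},
    {"col1": "c", "col2": 100.9, "col3": 1},
]

model = fit_max_abs(rows, ["col2", "col3"])
scaled_rows = transform_max_abs(rows, model)

for row in scaled_rows:
    print(row["col1"], round(row["col2"], 4), round(row["col3"], 4))
```

    For this data the recorded maxima are 100.9 for col2 and 100 for col3, so every scaled value lies between -1 and 1. Rounded to four decimals, the values agree with the Results table at the end of this article.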

    Parameters

    Name         | Chinese name | Description                                  | Type     | Required? | Default
    selectedCols | 选择的列名   | List of the names of the columns to compute  | String[] |           |

    Code Examples

    Python Code

    from pyalink.alink import *
    import pandas as pd

    useLocalEnv(1)

    # Sample data: one string column and two numeric columns.
    df = pd.DataFrame([
        ["a", 10.0, 100],
        ["b", -2.5, 9],
        ["c", 100.2, 1],
        ["d", -99.9, 100],
        ["a", 1.4, 1],
        ["b", -2.2, 9],
        ["c", 100.9, 1]
    ])

    # Only these columns are scaled; col1 is passed through unchanged.
    selectedColNames = ["col2", "col3"]
    inOp = BatchOperator.fromDataframe(df, schemaStr='col1 string, col2 double, col3 long')

    # train (the setter is chained on one line; a bare continuation line is a syntax error)
    trainOp = MaxAbsScalerTrainBatchOp().setSelectedCols(selectedColNames)
    trainOp.linkFrom(inOp)

    # batch predict
    predictOp = MaxAbsScalerPredictBatchOp()
    predictOp.linkFrom(trainOp, inOp).print()

    Java Code

    import org.apache.flink.types.Row;
    import com.alibaba.alink.operator.batch.BatchOperator;
    import com.alibaba.alink.operator.batch.dataproc.MaxAbsScalerPredictBatchOp;
    import com.alibaba.alink.operator.batch.dataproc.MaxAbsScalerTrainBatchOp;
    import com.alibaba.alink.operator.batch.source.MemSourceBatchOp;
    import org.junit.Test;
    import java.util.Arrays;
    import java.util.List;
    public class MaxAbsScalerTrainBatchOpTest {
      @Test
      public void testMaxAbsScalerTrainBatchOp() throws Exception {
        List <Row> df = Arrays.asList(
          Row.of("a", 10.0, 100),
          Row.of("b", -2.5, 9),
          Row.of("c", 100.2, 1),
          Row.of("d", -99.9, 100),
          Row.of("a", 1.4, 1),
          Row.of("b", -2.2, 9),
          Row.of("c", 100.9, 1)
        );
        // Only these columns are scaled; col1 is passed through unchanged.
        String[] selectedColNames = new String[] {"col2", "col3"};
        BatchOperator <?> inOp = new MemSourceBatchOp(df, "col1 string, col2 double, col3 int");
        // train
        BatchOperator <?> trainOp = new MaxAbsScalerTrainBatchOp()
          .setSelectedCols(selectedColNames);
        trainOp.linkFrom(inOp);
        // batch predict
        BatchOperator <?> predictOp = new MaxAbsScalerPredictBatchOp();
        predictOp.linkFrom(trainOp, inOp).print();
      }
    }

    Results

    col1 | col2    | col3
    a    | 0.0991  | 1.0000
    b    | -0.0248 | 0.0900
    c    | 0.9931  | 0.0100
    d    | -0.9901 | 1.0000
    a    | 0.0139  | 0.0100
    b    | -0.0218 | 0.0900
    c    | 1.0000  | 0.0100

  • Original article: https://www.cnblogs.com/qiu-hua/p/14897456.html