  • [Spark Machine Learning Crash Course] Models, Part 08: Isotonic Regression (Python version)

    Contents

      How isotonic regression works

      Isotonic regression code (Spark Python)


    How isotonic regression works

       To be continued...


    Isotonic regression code (Spark Python)

      Data used in the code: https://pan.baidu.com/s/1jHWKG4I (password: acq1)

    # -*- coding: utf-8 -*-
    from pyspark import SparkContext
    sc = SparkContext('local')
    
    import math
    from pyspark.mllib.regression import IsotonicRegression, IsotonicRegressionModel
    from pyspark.mllib.util import MLUtils
    
    # Load and parse the data.
    def parsePoint(labeledData):
        # Return a (label, feature, weight) tuple; only the first feature is used.
        return (labeledData.label, labeledData.features[0], 1.0)
    
    data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_isotonic_regression_libsvm_data.txt")
    
    # Create (label, feature, weight) tuples from the input data, with the weight set to the default value 1.0.
    parsedData = data.map(parsePoint)
    
    # Split the data into training (60%) and test (40%) sets, using random seed 11.
    training, test = parsedData.randomSplit([0.6, 0.4], 11)
    
    # Create an isotonic regression model from the training data.
    # The isotonic parameter defaults to True (non-decreasing fit), so it is passed here only for demonstration.
    model = IsotonicRegression.train(training, isotonic=True)
    
    # Create (prediction, label) tuples.
    predictionAndLabel = test.map(lambda p: (model.predict(p[1]), p[0]))
    
    # Calculate the mean squared error between predicted and real labels.
    meanSquaredError = predictionAndLabel.map(lambda pl: math.pow((pl[0] - pl[1]), 2)).mean()
    print("Mean Squared Error = " + str(meanSquaredError))  # the original run reported: Mean Squared Error = 0.00863040529956
    
    # Save and load the model.
    model.save(sc, "myIsotonicRegressionModel")
    sameModel = IsotonicRegressionModel.load(sc, "myIsotonicRegressionModel")
    # Predict for the feature value of the first record; the original run reported 0.14987251.
    print(sameModel.predict(data.first().features[0]))
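
To give a feel for what `IsotonicRegression.train` and `model.predict` do conceptually, here is a minimal pure-Python sketch of the pool adjacent violators idea, working on the same (label, feature, weight) tuples as above. It is an illustration under stated assumptions, not Spark's implementation: the names `pava_fit` and `pava_predict` and the toy data are made up for this example, all feature values are assumed to be distinct, and all weights are assumed to be positive.

```python
from bisect import bisect_left


def pava_fit(points):
    """Fit a non-decreasing function to (label, feature, weight) tuples.

    Hypothetical helper for illustration. Returns two parallel lists,
    (boundaries, predictions), sorted by feature value.
    """
    # Each block holds: [weighted label sum, weight sum, min feature, max feature]
    blocks = []
    for label, feature, weight in sorted(points, key=lambda t: t[1]):
        blocks.append([label * weight, weight, feature, feature])
        # Pool adjacent blocks while the left mean exceeds the right mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            label_sum, weight_sum, _, hi = blocks.pop()
            blocks[-1][0] += label_sum
            blocks[-1][1] += weight_sum
            blocks[-1][3] = hi

    boundaries, predictions = [], []
    for label_sum, weight_sum, lo, hi in blocks:
        mean = label_sum / weight_sum
        boundaries.append(lo)
        predictions.append(mean)
        if hi != lo:  # a pooled block keeps both of its end points
            boundaries.append(hi)
            predictions.append(mean)
    return boundaries, predictions


def pava_predict(boundaries, predictions, x):
    """Predict by linear interpolation, clamping outside the fitted range."""
    if x <= boundaries[0]:
        return predictions[0]
    if x >= boundaries[-1]:
        return predictions[-1]
    i = bisect_left(boundaries, x)
    if boundaries[i] == x:
        return predictions[i]
    x0, x1 = boundaries[i - 1], boundaries[i]
    y0, y1 = predictions[i - 1], predictions[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Toy data (invented): the labels 3.0 and 2.0 violate the ordering and get pooled.
toy = [(1.0, 1.0, 1.0), (3.0, 2.0, 1.0), (2.0, 3.0, 1.0), (4.0, 4.0, 1.0)]
boundaries, predictions = pava_fit(toy)
print(boundaries)   # [1.0, 2.0, 3.0, 4.0]
print(predictions)  # [1.0, 2.5, 2.5, 4.0]
print(pava_predict(boundaries, predictions, 1.5))  # 1.75
```

The two out-of-order labels are replaced by their weighted mean 2.5, which is what makes the fitted values non-decreasing. A query between two boundaries is interpolated linearly, and a query outside the fitted range takes the nearest end value.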


  • Original post: https://www.cnblogs.com/itmorn/p/8029676.html