Machine Learning: Model Trees

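Unlike a regression tree, which predicts with the mean of the target values at each leaf, a model tree fits a linear model at every leaf. The tree as a whole therefore represents a piecewise linear function, that is, a model made up of several linear segments.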
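To make the difference between the two leaf types concrete, here is a minimal sketch on made-up data (the four points and all variable names are hypothetical, chosen only for illustration). It compares the squared error of a mean-valued leaf with that of a linear leaf on points that lie exactly on a line.

```python
import numpy as np

# Hypothetical leaf data: four points lying exactly on y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

# Regression-tree leaf: predict the mean of y for every sample
mean_pred = np.full_like(y, y.mean())
mean_sse = float(np.sum((y - mean_pred) ** 2))

# Model-tree leaf: predict with a fitted straight line
slope, intercept = np.polyfit(x, y, 1)  # polyfit returns the highest degree first
line_sse = float(np.sum((y - (intercept + slope * x)) ** 2))

print(mean_sse, line_sse)
```

When the data inside a leaf follow a trend, the linear leaf leaves far less residual error than the mean, so a model tree can usually get by with fewer splits.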

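#####################Model tree#####################
# These functions live in regTrees.py, which needs `from numpy import *` at the top.
def linearSolve(dataSet):       # fit a linear model to the data in one leaf
    m, n = shape(dataSet)
    X = mat(ones((m, n))); Y = mat(ones((m, 1)))        # all-ones (m, n) and (m, 1) matrices; column 0 of X stays 1 for the intercept
    X[:, 1:n] = dataSet[:, 0:n-1]; Y = dataSet[:, -1]   # X holds the features, Y holds the target values
    xTx = X.T * X
    if linalg.det(xTx) == 0.0:
        raise NameError('This matrix is singular, cannot do inverse,\n'
                        'try increasing the second value of ops')
    ws = xTx.I * (X.T * Y)      # least-squares regression coefficients
    return ws, X, Y

def modelLeaf(dataSet):         # leaf generator for the model tree: returns the coefficients
    ws, X, Y = linearSolve(dataSet)
    return ws

def modelErr(dataSet):          # sum of squared errors of the linear fit
    ws, X, Y = linearSolve(dataSet)
    yHat = X * ws
    return sum(power(Y - yHat, 2))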
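As a cross-check of the functions above, the same leaf fit can be sketched with plain NumPy arrays and `np.linalg.lstsq`. This is a swapped-in solver, not the code from regTrees: it avoids the explicit inverse and the determinant test. The names `fit_leaf` and `leaf_error` and the sample data are hypothetical.

```python
import numpy as np

def fit_leaf(data):
    """Least-squares fit for one leaf; the last column of data is the target."""
    data = np.asarray(data, dtype=float)
    m = data.shape[0]
    X = np.hstack([np.ones((m, 1)), data[:, :-1]])  # prepend the intercept column
    y = data[:, -1]
    ws, *_ = np.linalg.lstsq(X, y, rcond=None)      # solves min ||X ws - y||^2
    return ws, X, y

def leaf_error(data):
    """Sum of squared residuals of the linear fit."""
    ws, X, y = fit_leaf(data)
    residual = y - X @ ws
    return float(residual @ residual)

# Hypothetical noise-free points on y = 2 + 3x
sample = np.array([[0.0, 2.0], [1.0, 5.0], [2.0, 8.0], [3.0, 11.0]])
ws, _, _ = fit_leaf(sample)
print(ws, leaf_error(sample))
```

`lstsq` still returns a minimum-norm solution when the design matrix is rank deficient, so a leaf with too few distinct samples does not raise an exception here. Whether that is preferable to failing loudly depends on how the tree-building code uses the error value.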

main.py

#!/usr/bin/env python
# coding:utf-8

import regTrees
import matplotlib.pyplot as plt
from numpy import *

if __name__ == '__main__':
    myDat = regTrees.loadDataSet('exp2.txt')
    myMat = mat(myDat)
    # Build a model tree: linear leaves, linear-fit error, ops = (1, 10)
    myTree = regTrees.createTree(myMat, regTrees.modelLeaf, regTrees.modelErr, (1, 10))
    print(myTree)
    regTrees.plotBestFit('exp2.txt')

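The result is a two-piece function with the split at about 0.28.

The two pieces are y = 3.46877 + 1.1852x and y = 0.001698 + 11.96477x.

The data were actually generated from y = 3.5 + 1.0x and y = 0 + 12x, plus Gaussian noise, so the fitted coefficients are close to the true ones.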
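The comparison above can be reproduced on synthetic data. The sketch below does not read exp2.txt. It draws new points from the two true lines and fits each segment separately. The breakpoint 0.3, the noise scale 0.1, the sample count, the seed, and the assignment of the flatter line to the low-x side are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable
BREAK = 0.3                             # assumed breakpoint of the generating model
x = rng.uniform(0.0, 1.0, 2000)
noise = rng.normal(0.0, 0.1, x.size)    # assumed Gaussian noise scale
y = np.where(x < BREAK, 3.5 + 1.0 * x, 0.0 + 12.0 * x) + noise

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    A = np.column_stack([np.ones_like(xs), xs])
    coef, *_ = np.linalg.lstsq(A, ys, rcond=None)
    return coef

low = fit_line(x[x < BREAK], y[x < BREAK])      # estimate of (3.5, 1.0)
high = fit_line(x[x >= BREAK], y[x >= BREAK])   # estimate of (0.0, 12.0)
print(low, high)
```

The slope on the low-x side is estimated less precisely than the other coefficients, because that segment covers a narrow range of x. This is consistent with 1.1852 being the fitted value furthest from its true value in the result above.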


Original source: https://www.cnblogs.com/tonglin0325/p/6220521.html