  • Principal Component Analysis (PCA): Implementation Code

    Excerpted from Chapter 13 of "Machine Learning in Action":

    import numpy as np
    import matplotlib.pyplot as plt

    def loadDataSet(fileName, delim='\t'):
        # Read a delimited text file into a NumPy matrix of floats.
        with open(fileName) as fr:
            stringArr = [line.strip().split(delim) for line in fr]
        # list(...) is required on Python 3, where map() returns a lazy iterator.
        datArr = [list(map(float, line)) for line in stringArr]
        # np.asmatrix replaces the mat alias, which NumPy 2.0 removed.
        return np.asmatrix(datArr)

    def pca(dataMat, topNfeat=999999):
        meanVals = np.mean(dataMat, axis=0)             # per-column mean
        meanRemoved = dataMat - meanVals                # center the data
        covMat = np.cov(meanRemoved, rowvar=0)          # columns are the variables
        eigVals, eigVects = np.linalg.eig(np.asmatrix(covMat))
        eigValInd = np.argsort(eigVals)                 # ascending order
        eigValInd = eigValInd[:-(topNfeat + 1):-1]      # indices of the topNfeat largest
        redEigVects = eigVects[:, eigValInd]
        lowDDataMat = meanRemoved * redEigVects         # project onto the kept eigenvectors
        reconMat = (lowDDataMat * redEigVects.T) + meanVals  # map back to the original space
        return lowDDataMat, reconMat

    if __name__ == "__main__":
        dataMat = loadDataSet('testSet.txt')
        lowDMat, reconMat = pca(dataMat, 1)
        fig = plt.figure()
        ax = fig.add_subplot(111)
        # Original points as triangles; points reconstructed from one component as red circles.
        ax.scatter(dataMat[:, 0].flatten().A[0], dataMat[:, 1].flatten().A[0], marker='^', s=90)
        ax.scatter(reconMat[:, 0].flatten().A[0], reconMat[:, 1].flatten().A[0], marker='o', s=50, c='red')
        plt.show()
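
    The same steps as pca above can also be sketched with plain ndarrays. This is only an illustrative variant, not the book's code: the helper name pca_eigh, the use of np.linalg.eigh (which assumes a symmetric matrix, as a covariance matrix is), the extra explained-variance return value, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def pca_eigh(data, top_n):
    """PCA on an (n_samples, n_features) ndarray. Hypothetical helper, not from the book."""
    mean_vals = data.mean(axis=0)
    centered = data - mean_vals
    cov_mat = np.cov(centered, rowvar=False)
    # eigh assumes a symmetric matrix and returns real eigenvalues in ascending order
    eig_vals, eig_vects = np.linalg.eigh(cov_mat)
    order = np.argsort(eig_vals)[::-1][:top_n]   # indices of the largest eigenvalues
    components = eig_vects[:, order]
    low_d = centered @ components                # project onto the kept components
    recon = low_d @ components.T + mean_vals     # map back to the original space
    explained = eig_vals[order].sum() / eig_vals.sum()
    return low_d, recon, explained

rng = np.random.default_rng(0)
# Synthetic 2-D data stretched along one direction (made up for illustration)
x = rng.normal(size=200)
data = np.column_stack([x, 0.5 * x + 0.1 * rng.normal(size=200)])

low_d, recon, explained = pca_eigh(data, 1)
full_low_d, full_recon, full_explained = pca_eigh(data, 2)
print(low_d.shape, recon.shape)
```

    Keeping every component reconstructs the input exactly (up to rounding), while keeping one component gives the lossy reconstruction that the plot in the main script compares against the original points.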

    Input data format: a text file in which each line holds two floating-point numbers separated by a TAB character.
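
    As a rough illustration of that format, the sketch below writes three made-up rows to a temporary file and reads them back with np.loadtxt. The file name sample.txt and the values are hypothetical; they are not the contents of testSet.txt.

```python
import os
import tempfile
import numpy as np

# Three made-up rows in the described format: two floats separated by a tab
sample_text = "1.0\t2.5\n3.5\t4.0\n-0.5\t6.25\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write(sample_text)
    # delimiter="\t" splits each line on the tab character
    loaded = np.loadtxt(path, delimiter="\t")

print(loaded.shape)  # (3, 2)
```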

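    Result of the run: a scatter plot of the original points (triangles) with the points reconstructed from the first principal component overlaid as red circles.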

  • Original article: https://www.cnblogs.com/mush0m/p/3694959.html