  • Machine Learning: KNN in Python

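    Today's post covers K-NN, one of the more common classification algorithms in machine learning. NN stands for Nearest Neighbors. It is a supervised classification algorithm: given a test sample, compute its distance to every training sample in the training set,
    select the K training samples closest to it, and take a majority vote among those K samples; the test sample is assigned to whichever class appears most often. K-NN is simple, intuitive, and easy to understand. The main decisions are the value of K and how to measure the distance between samples; Euclidean distance is the most common choice. The short program below shows the algorithm in use.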
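
For reference, the two ingredients described above (Euclidean distance and majority voting) can be written compactly. The notation here is an assumption made for illustration and does not come from the original post: $x$ is the test sample, $x_i$ and $y_i$ are the $i$-th training sample and its label, $n$ is the number of features, and $N_K(x)$ is the set of indices of the $K$ training samples nearest to $x$.

```latex
d(x, x_i) = \sqrt{\sum_{j=1}^{n} \left( x_j - x_{i,j} \right)^2}
\qquad
\hat{y}(x) = \arg\max_{c} \sum_{i \in N_K(x)} \mathbf{1}\left[ y_i = c \right]
```

The first expression is the distance computed for every training sample; the second says the predicted label is the class that collects the most votes among the K nearest neighbours.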

    from sklearn import datasets
    import numpy as np
    import operator
    
    
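    def Knn_Classify(x, Train_data, labels, k):
        # Euclidean distance from x to every training sample
        # (x broadcasts across the rows of Train_data).
        diff_mat = x - Train_data
        Sq_dis = (diff_mat ** 2).sum(axis=1)
        Dis = Sq_dis ** 0.5
        # Indices of the training samples, nearest first.
        Index = Dis.argsort()
        # Count the class labels of the k nearest neighbours.
        C_count = {}
        for i in range(k):
            votelabel = labels[Index[i]]
            C_count[votelabel] = C_count.get(votelabel, 0) + 1
    
        # Sort (label, votes) pairs by vote count, most votes first.
        # dict.items() replaces the Python 2-only dict.iteritems().
        Sort_K = sorted(C_count.items(),
                        key=operator.itemgetter(1), reverse=True)
    
        return Sort_K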
    
    
    
    iris = datasets.load_iris()
    x_data = iris.data
    y_label = iris.target
    class_name = iris.target_names
    
    n_sample = len(x_data)
    
    np.random.seed(0)
    index = np.random.permutation(n_sample)
    x_data = x_data[index]
    y_label = y_label[index]
    
    ratio = 0.8
    
    train_x = x_data[ : int(ratio * n_sample)]
    train_y = y_label[ : int(ratio * n_sample)]
    test_x = x_data[int(ratio * n_sample) :]
    test_y = y_label[int(ratio * n_sample) : ]
    
    n_test = len(test_x)
    
    p_label = np.zeros((len(test_y)))
    
    for i in range(n_test):
        in_x = test_x[i, :]
        target_label = test_y[i]
        predict_value = Knn_Classify(in_x, train_x, train_y, 5)
        # predict_value is sorted by votes, so [0][0] is the winning label.
        p_label[i] = predict_value[0][0]
    #    print("the predict label is: ", predict_value)
    #    print("the target_label is: ", target_label)
    
    
    # Fraction of test samples whose predicted label matches the true label.
    t = (p_label == test_y)
    acc = t.sum() * 1.0 / len(test_y)
    
    print("the accuracy is: ", acc)
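
The post notes that the size of K is the main thing to tune. As a rough illustration of why it matters, here is a minimal standard-library sketch of the same distance-then-vote procedure. The function name `knn_predict` and the toy data are hypothetical and not part of the original post; the data includes one deliberately mislabelled point to show how a small K can follow noise.

```python
import math
from collections import Counter


def knn_predict(x, train_points, train_labels, k):
    """Return the majority label among the k training points nearest to x."""
    # Pair each training point's Euclidean distance to x with its label.
    distances = [(math.dist(x, p), label)
                 for p, label in zip(train_points, train_labels)]
    # Sort by distance only and keep the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Count the votes and return the label with the most of them.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two 2-D clusters, plus one noisy "b" point
# sitting next to the "a" cluster.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
                (1.5, 1.5)]
train_labels = ["a", "a", "a", "b", "b", "b", "b"]

query = (1.4, 1.4)
for k in (1, 3, 5):
    print(k, knn_predict(query, train_points, train_labels, k))
```

With K = 1 the query copies the label of the single noisy neighbour, while K = 3 and K = 5 let the surrounding "a" points outvote it. A larger K is not always better, though: once K approaches the size of the training set, the vote simply reflects which class has more samples.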
    
  • Original article: https://www.cnblogs.com/mtcnn/p/9412388.html