  • [Notes] A First Look at the KNN Algorithm (2)

    The KNN algorithm (2)

    Encapsulating machine learning algorithms
    How machine learning algorithms are encapsulated in scikit-learn

    First write the algorithm in PyCharm:

      import numpy as np
      from math import sqrt
      from collections import Counter
    
      def kNN_classify(k, X_train, y_train, x):
          """Predict the label of sample x by majority vote among its k nearest neighbors."""
    
          assert 1 <= k <= X_train.shape[0], "k must be valid"
          assert X_train.shape[0] == y_train.shape[0], \
              "the size of X_train must be equal to the size of y_train"
          assert X_train.shape[1] == x.shape[0], \
              "the feature number of x must be equal to X_train"
    
          # Euclidean distance from x to every training sample
          distances = [sqrt(np.sum((x_train - x)**2)) for x_train in X_train]
          # Indices of the training samples, sorted from nearest to farthest
          nearest = np.argsort(distances)
    
          # Labels of the k nearest samples, followed by a majority vote
          topK_y = [y_train[i] for i in nearest[:k]]
          votes = Counter(topK_y)
    
          return votes.most_common(1)[0][0]
    

    Prepare the required data in advance.
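The data used in the original post are not shown, so the following is only a minimal sketch of what "prepared data" could look like. The values are hypothetical toy numbers (two features, two classes); the names `X_train`, `y_train` and `x` follow the function signature above.

```python
import numpy as np

# Hypothetical toy data: 10 samples, 2 features each, labels 0 or 1.
X_train = np.array([
    [1.0, 1.1], [1.2, 0.9], [0.8, 1.0], [1.1, 1.3], [0.9, 0.8],
    [3.0, 3.1], [3.2, 2.9], [2.8, 3.0], [3.1, 3.3], [2.9, 2.8],
])
y_train = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# The single sample to classify: a 1-D array with one entry per feature.
x = np.array([2.9, 3.2])

print(X_train.shape, y_train.shape, x.shape)
```

These shapes satisfy the three assertions in `kNN_classify`: one label per training row, and as many entries in `x` as there are feature columns.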

    Use the %run magic command to load the function into the notebook

      %run KNN.py
    

    Calling the function then returns the predicted result.
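In a notebook, `%run KNN.py` executes the file and leaves `kNN_classify` available in the session, so the call itself is a single line. The sketch below repeats a condensed version of the function (validation omitted, distances vectorized) purely so that it runs on its own; the data and the choice `k=3` are hypothetical.

```python
import numpy as np
from collections import Counter

def kNN_classify(k, X_train, y_train, x):
    # Condensed restatement of the function in KNN.py (assertions omitted).
    distances = np.sqrt(np.sum((X_train - x) ** 2, axis=1))
    nearest = np.argsort(distances)
    votes = Counter(y_train[nearest[:k]].tolist())
    return votes.most_common(1)[0][0]

# Hypothetical toy data: three samples per class.
X_train = np.array([[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],
                    [3.0, 3.1], [3.2, 2.9], [2.8, 3.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
x = np.array([2.9, 3.2])

predict_y = kNN_classify(3, X_train, y_train, x)
print(predict_y)  # 1
```

The three training points closest to `x` all carry label 1, so the vote is unanimous.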

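    The k-nearest-neighbors algorithm is unusual: it can be regarded as an algorithm without a model. To keep it consistent with other algorithms, the training dataset itself can be regarded as the model.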

    Using kNN from scikit-learn

    This requires the KNeighborsClassifier class.

    Create an instance; n_neighbors=6 corresponds to k=6.
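The import and instantiation are not shown in the text. A minimal sketch, using the variable name `kNN_classifier` that appears in the `fit` call of the post:

```python
from sklearn.neighbors import KNeighborsClassifier

# n_neighbors plays the same role as k in the hand-written kNN_classify.
kNN_classifier = KNeighborsClassifier(n_neighbors=6)
print(kNN_classifier.n_neighbors)  # 6
```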

    Then call fit:

      kNN_classifier.fit(X_train, y_train)
    

    fit returns the classifier itself, so the return value does not need to be assigned to a variable.
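That `fit` returns the estimator itself can be checked directly. The sketch below uses hypothetical random data, since only the return value matters here.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data: 10 random samples with 2 features, 5 per class.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(10, 2))
y_train = np.array([0] * 5 + [1] * 5)

kNN_classifier = KNeighborsClassifier(n_neighbors=6)
returned = kNN_classifier.fit(X_train, y_train)
print(returned is kNN_classifier)  # True

# Because fit returns the estimator, creation and fitting can also be chained.
chained = KNeighborsClassifier(n_neighbors=6).fit(X_train, y_train)
```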

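    Then call the predict method to make a prediction.

    Note, however, that the argument must be a matrix (a 2-D array), not a 1-D array,
    so we first use reshape to change the shape of the sample.

    Finally, this gives the predicted class.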
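The reshape-then-predict step can be sketched as follows. The data are hypothetical toy values, not the author's; `reshape(1, -1)` turns the 1-D sample into a matrix with one row and as many columns as there are features.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: 10 samples, 2 features, two well-separated classes.
X_train = np.array([
    [1.0, 1.1], [1.2, 0.9], [0.8, 1.0], [1.1, 1.3], [0.9, 0.8],
    [3.0, 3.1], [3.2, 2.9], [2.8, 3.0], [3.1, 3.3], [2.9, 2.8],
])
y_train = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
x = np.array([2.9, 3.2])

kNN_classifier = KNeighborsClassifier(n_neighbors=6)
kNN_classifier.fit(X_train, y_train)

X_predict = x.reshape(1, -1)          # shape (2,) becomes (1, 2)
y_predict = kNN_classifier.predict(X_predict)

print(X_predict.shape)  # (1, 2)
print(y_predict)        # [1]
print(y_predict[0])     # 1
```

`predict` returns one label per row of its input, so the result for a single sample is an array of length 1 and the class is read from index 0.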

    Reorganizing our kNN code
    Create a file named kNN1.py in the same folder
    and write the KNN algorithm into it:

      import numpy as np
      from math import sqrt
      from collections import Counter
    
      class KNNClassifier:
    
          def __init__(self, k):
              """Initialize the kNN classifier."""
              assert k >= 1, "k must be valid"
              self.k = k
              self._X_train = None
              self._y_train = None
    
          def fit(self, X_train, y_train):
              """Train the kNN classifier on the training set X_train and y_train."""
              assert X_train.shape[0] == y_train.shape[0], \
                  "the size of X_train must be equal to the size of y_train"
              assert self.k <= X_train.shape[0], \
                  "the size of X_train must be at least k."
    
              self._X_train = X_train
              self._y_train = y_train
              return self
    
          def predict(self, X_predict):
              """Given the samples to predict, X_predict, return the vector of predicted labels."""
              assert self._X_train is not None and self._y_train is not None, \
                  "must fit before predict!"
              assert X_predict.shape[1] == self._X_train.shape[1], \
                  "the feature number of X_predict must be equal to X_train"
    
              y_predict = [self._predict(x) for x in X_predict]
              return np.array(y_predict)
    
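          def _predict(self, x):
              """Given a single sample x, return its predicted label."""
              assert x.shape[0] == self._X_train.shape[1], \
                  "the feature number of x must be equal to X_train"
    
              # Euclidean distance from x to every training sample
              distances = [sqrt(np.sum((x_train - x) ** 2))
                           for x_train in self._X_train]
    
              # Indices of the training samples, sorted from nearest to farthest
              nearest = np.argsort(distances)
    
              # Majority vote among the labels of the k nearest samples
              topK_y = [self._y_train[i] for i in nearest[:self.k]]
              votes = Counter(topK_y)
    
              return votes.most_common(1)[0][0]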
    
          def __repr__(self):
              return "KNN(k=%d)" % self.k
    

    Following the same steps as above then gives the result.
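"The same steps" presumably means the scikit-learn workflow: load the file, create the classifier, fit, reshape, predict. A standalone snippet cannot load kNN1.py, so the sketch below inlines a trimmed stand-in for `KNNClassifier` with the same interface. The assertions are omitted and the distance computation is vectorized, which is a variation and not the author's code; the data and the name `knn_clf` are hypothetical.

```python
import numpy as np
from collections import Counter

class KNNClassifier:
    """Trimmed stand-in for the class in kNN1.py (assertions omitted)."""

    def __init__(self, k):
        self.k = k
        self._X_train = None
        self._y_train = None

    def fit(self, X_train, y_train):
        # "Training" only stores the data: the training set is the model.
        self._X_train = X_train
        self._y_train = y_train
        return self

    def predict(self, X_predict):
        # One predicted label per row of X_predict.
        return np.array([self._predict(x) for x in X_predict])

    def _predict(self, x):
        # Vectorized Euclidean distances to all training samples.
        distances = np.sqrt(np.sum((self._X_train - x) ** 2, axis=1))
        nearest = np.argsort(distances)[:self.k]
        votes = Counter(self._y_train[nearest].tolist())
        return votes.most_common(1)[0][0]

    def __repr__(self):
        return "KNN(k=%d)" % self.k

# Hypothetical toy data: 10 samples, 2 features, two classes.
X_train = np.array([
    [1.0, 1.1], [1.2, 0.9], [0.8, 1.0], [1.1, 1.3], [0.9, 0.8],
    [3.0, 3.1], [3.2, 2.9], [2.8, 3.0], [3.1, 3.3], [2.9, 2.8],
])
y_train = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
x = np.array([2.9, 3.2])

knn_clf = KNNClassifier(k=6)
knn_clf.fit(X_train, y_train)
y_predict = knn_clf.predict(x.reshape(1, -1))

print(knn_clf)       # KNN(k=6)
print(y_predict)     # [1]
```

With `k=6`, the five class-1 points and one class-0 point are the nearest neighbors of `x`, so the vote is 5 to 1 in favor of label 1.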

    Thank you sincerely for reading this far. If you repost this article, please include a link to the original.
  • Original article: https://www.cnblogs.com/jokingremarks/p/14274566.html