zoukankan      html  css  js  c++  java
  • 机器学习:scikit-learn 做笑脸识别 (SVM, KNN, Logisitc regression)

    scikit-learn 是 Python 非常强大的一个做机器学习的包,今天介绍scikit-learn 里几个常用的分类器
    SVM, KNN 和 logistic regression,用来做笑脸识别。

    这里用到的是GENKI4K 这个数据库,每张图像先做一个人脸检测与剪切,然后提取HOG特征。这个数据库有 4000 张图,分成4组,做一个 cross validation,取平均值作为最终的识别率:

    import string, os, sys
    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.io
    import random
    from sklearn import neighbors, linear_model, svm
    
    
    dir = '/GENKI4K/Feature_Data'  
    print '----------- no sub dir'  
    
    # prepare the data
    files = os.listdir(dir)  
    for f in files:  
        print dir + os.sep + f
    
    file_path=dir+os.sep+files[14]
    
    #print file_path
    
    dic_mat = scipy.io.loadmat(file_path)
    
    data_mat=dic_mat['Hog_Feat']
    
    print 'feature: ',  data_mat.shape
    
    #print data_mat.dtype
    
    file_path2=dir+os.sep+files[15]
    
    #print file_path2
    
    dic_label=scipy.io.loadmat(file_path2)
    
    label_mat=dic_label['Label']
    
    file_path3=dir+os.sep+files[16]
    
    print 'fiel 3 path: ', file_path3
    
    dic_T=scipy.io.loadmat(file_path3)
    
    T=dic_T['T']
    T=T-1
    
    print T.shape
    
    label=label_mat.ravel()
    
    # Acc=np.zeros((1,4))
    
    Acc=[0,0,0,0]
    
    for i in range (0, 4):
        print "the fold %d" % (i+1)
        train_ind=[]
        for j in range (0, 4):
            if j==i:
                test_ind=T[j]
            else:
                train_ind.extend(T[j])
    #    print len(test_ind), len(train_ind)
    #    print max(test_ind), max(train_ind)
        train_x=data_mat[train_ind, :]
        test_x=data_mat[test_ind, :]
        train_y=label[train_ind]
        test_y=label[test_ind]
    #   SVM   
        clf=svm.LinearSVC()
    #   KNN 
    #    clf = neighbors.KNeighborsClassifier(n_neighbors=15)
    #    Logistic regression
    #    clf = linear_model.LogisticRegression()
    
        clf.fit(train_x, train_y)
        predict_y=clf.predict(test_x)
        Acc[i]=np.mean(predict_y == test_y)
        print "Accuracy: %.2f" % (Acc[i])
    
    print "The mean average classification accuracy: %.2f" % (np.mean(Acc))
    
    # SVM 的实验结果
    (4, 1000)
    the fold 1
    Accuracy: 0.89
    the fold 2
    Accuracy: 0.88
    the fold 3
    Accuracy: 0.89
    the fold 4
    Accuracy: 0.90
    The mean average classification accuracy: 0.89
    
    # KNN 的实验结果
    (4, 1000)
    the fold 1
    Accuracy: 0.83
    the fold 2
    Accuracy: 0.84
    the fold 3
    Accuracy: 0.84
    the fold 4
    Accuracy: 0.85
    The mean average classification accuracy: 0.84
    
    # logistic regression 的实验结果
    (4, 1000)
    the fold 1
    Accuracy: 0.91
    the fold 2
    Accuracy: 0.91
    the fold 3
    Accuracy: 0.90
    the fold 4
    Accuracy: 0.92
    The mean average classification accuracy: 0.91
    
  • 相关阅读:
    按单生产案例
    【转】linux中执行外部命令提示" error while loading shared libraries"时的解决办法
    【转】WARNING! File system needs to be upgraded. You have version null and I want version 7. Run the '${HBASE_HOME}/bin/hbase migrate' script. 的解决办法
    根据Rowkey从HBase中查询数据
    【转】在一个Job中同时写入多个HBase的table
    sqoop 使用
    给VMware下的Linux扩展磁盘空间(以CentOS6.3为例)
    chrome 版本 29.0.1547.76 m 解决打开新标签页后的恶心页面的问题
    tomcat7+jdk的keytool生成证书 配置https
    如何打包和生成你的Android应用程序
  • 原文地址:https://www.cnblogs.com/mtcnn/p/9412449.html
Copyright © 2011-2022 走看看