zoukankan      html  css  js  c++  java
  • scikit-learn(project中用的相对较多的模型介绍):2.3. Clustering(可用于特征的无监督降维)

    參考:http://scikit-learn.org/stable/modules/clustering.html


    在实际项目中,我们真的非常少用到那些简单的模型,比方LR、kNN、NB等。尽管经典,但在project中确实不有用。

    今天我们不关注详细的模型,而关注无监督的聚类方法。


    之所以关注无监督聚类方法。是由于。在实际项目中,我们除了使用PCA等方法降维外。有时候我们也会考虑使用聚类的方法降维特征


    Overview of clustering methods:


    A comparison of the clustering algorithms in scikit-learn

    Method name Parameters Scalability Usecase Geometry (metric used)
    K-Means number of clusters Very large n_samples, medium n_clusterswith MiniBatch code General-purpose, even cluster size, flat geometry, not too many clusters Distances between points
    Affinity propagation damping, sample preference Not scalable with n_samples Many clusters, uneven cluster size, non-flat geometry Graph distance (e.g. nearest-neighbor graph)
    Mean-shift bandwidth Not scalable withn_samples Many clusters, uneven cluster size, non-flat geometry Distances between points
    Spectral clustering number of clusters Medium n_samples, small n_clusters Few clusters, even cluster size, non-flat geometry Graph distance (e.g. nearest-neighbor graph)
    Ward hierarchical clustering number of clusters Large n_samples andn_clusters Many clusters, possibly connectivity constraints Distances between points
    Agglomerative clustering number of clusters, linkage type, distance Large n_samples andn_clusters Many clusters, possibly connectivity constraints, non Euclidean distances Any pairwise distance
    DBSCAN neighborhood size Very large n_samples, medium n_clusters Non-flat geometry, uneven cluster sizes Distances between nearest points
    Gaussian mixtures many Not scalable Flat geometry, good for density estimation Mahalanobis distances to centers
    Birch branching factor, threshold, optional global clusterer. Large n_clusters andn_samples Large dataset, outlier removal, data reduction. Euclidean distance between points



  • 相关阅读:
    以JPanel为基础实现一个图像框
    扩展JButton实现自己的图片按钮
    箴言录2014年4月22日
    搜集整理一些Cron表达式例子
    长途旅行感悟
    箴言录2014年4月19日
    Linux下显示硬盘空间的两个命令
    Linux命令复习和练习_02
    Dash:程序员的的好帮手
    Linux的桌面环境gnome、kde、xfce、lxde 等等使用比较
  • 原文地址:https://www.cnblogs.com/yangykaifa/p/6805927.html
Copyright © 2011-2022 走看看