  • K-Means++ code, very well written

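    function [L,C] = kmeanspp(X,k)
    %KMEANSPP Cluster multivariate data using the k-means++ algorithm.
    %   [L,C] = kmeanspp(X,k) produces a 1-by-size(X,2) vector L with one class
    %   label per column in X and a size(X,1)-by-k matrix C containing the
    %   centers corresponding to each class.
    %   Version: 2013-02-08
    %   Authors: Laurent Sorber (Laurent.Sorber@cs.kuleuven.be)
    L = [];
    L1 = 0;
    % Restart until the labels contain exactly k distinct clusters.
    while length(unique(L)) ~= k
        % The k-means++ initialization.
        % Pick one data point at random as the first center.
        % size(X,2) is the number of data points; C holds the centers.
        C = X(:,1+round(rand*(size(X,2)-1)));
        L = ones(1,size(X,2));
        for i = 2:k
            % C(:,L) repeats each point's assigned center, so column j of D
            % is X(:,j) minus its current center.
            D = X-C(:,L);
            % Cumulative sum of the Euclidean distances to the assigned centers.
            % Note: this samples in proportion to the distance itself; the
            % k-means++ paper samples in proportion to the squared distance.
            D = cumsum(sqrt(dot(D,D,1)));
            % Every point already coincides with its center: copy X(:,1)
            % into the remaining centers and return.
            if D(end) == 0, C(:,i:k) = X(:,ones(1,k-i+1)); return; end
            % The second argument of find limits it to the first match.
            % D/D(end) acts as a cumulative distribution, so points farther
            % from their center are more likely to be picked.
            C(:,i) = X(:,find(rand < D/D(end),1));
            % Neat trick: this one line assigns every point to its nearest center.
            [~,L] = max(bsxfun(@minus,2*real(C'*X),dot(C,C,1).'),[],1);
        end
        % The k-means algorithm.
        % any returns 1 if at least one element is nonzero, so the loop
        % runs until the labels stop changing.
        while any(L ~= L1)
            L1 = L;
            % l is the logical mask of the points in cluster i.
            for i = 1:k, l = L==i; C(:,i) = sum(X(:,l),2)/sum(l); end
            [~,L] = max(bsxfun(@minus,2*real(C'*X),dot(C,C,1).'),[],1);
        end
    end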

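    % Demo script: save the function above as kmeanspp.m and run this separately.
    % Two 2-D Gaussian blobs: 3 points around (0,0) and 4 points around (4,4).
    clear all; close all; clc
    x=[randn(3,2)*.4;randn(4,2)*.5+ones(4,1)*[4 4]];
    [L, C] = kmeanspp(x',2); % kmeanspp expects one data point per column
    L
    C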
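
    Why the max(bsxfun(...)) line assigns each point to its nearest center can be seen from a short expansion of the squared distance. The symbols x_j (column j of X) and c_i (column i of C) are notation introduced here for illustration; they do not appear in the original code.

    ```latex
    \begin{aligned}
    \arg\min_i \lVert x_j - c_i \rVert^2
      &= \arg\min_i \left( \lVert x_j \rVert^2 - 2\,\mathrm{Re}(c_i^{H} x_j) + \lVert c_i \rVert^2 \right) \\
      &= \arg\max_i \left( 2\,\mathrm{Re}(c_i^{H} x_j) - \lVert c_i \rVert^2 \right)
    \end{aligned}
    ```

    The term \lVert x_j \rVert^2 does not depend on i, so it can be dropped. In the code, 2*real(C'*X) supplies the first term for all pairs (i,j) at once, dot(C,C,1).' supplies \lVert c_i \rVert^2 as a column vector, and max along dimension 1 returns the index i of the closest center for every column j.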

  • Original article: https://www.cnblogs.com/lcj1105/p/4969870.html