K-means algorithm: PRML reading notes

The K-means algorithm uses squared Euclidean distance as the measure of dissimilarity between a data point and a prototype vector. Our goal is to partition the data set into some number $K$ of clusters, where we suppose for the moment that the value of $K$ is given.

Let $\{x_1, \dots, x_N\}$ be $N$ observations of a random $D$-dimensional Euclidean variable $x$, let $\mu_k$ ($k = 1, \dots, K$) be the prototype vector associated with cluster $k$, and let $r_{nk} \in \{0, 1\}$ be a binary indicator that equals 1 if data point $x_n$ is assigned to cluster $k$ and 0 otherwise. We can then define an objective function, sometimes called a distortion measure, given by

$$J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \lVert x_n - \mu_k \rVert^2 .$$

$J$ is the sum of the squared distances of each data point to its assigned vector $\mu_k$. We can think of the $\mu_k$ as the centres of the clusters. Our goal is to find values for the $\{r_{nk}\}$ and the $\{\mu_k\}$ that minimize $J$.

First we choose some initial values for the $\mu_k$. Then, in the first phase, we minimize $J$ with respect to the $r_{nk}$, keeping the $\mu_k$ fixed. In the second phase, we minimize $J$ with respect to the $\mu_k$, keeping the $r_{nk}$ fixed. This two-stage optimization is repeated until convergence.

In the first phase, the terms of $J$ for different $n$ are independent, so we simply assign the $n$-th data point to the closest cluster centre:

$$r_{nk} = \begin{cases} 1 & \text{if } k = \arg\min_j \lVert x_n - \mu_j \rVert^2 \\ 0 & \text{otherwise.} \end{cases}$$

In the second phase, $J$ is a quadratic function of $\mu_k$, so it can be minimized by setting its derivative with respect to $\mu_k$ to zero, giving

$$2 \sum_{n=1}^{N} r_{nk} (x_n - \mu_k) = 0 ,$$

which we can solve for $\mu_k$:

$$\mu_k = \frac{\sum_{n} r_{nk} x_n}{\sum_{n} r_{nk}} .$$

The denominator is the number of points assigned to cluster $k$, so this result has a simple interpretation: set $\mu_k$ equal to the mean of all the data points $x_n$ assigned to cluster $k$. For this reason, the procedure is known as the K-means algorithm.
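The two-phase procedure described above can be sketched in plain Python. This is only an illustrative sketch, not a reference implementation. The function names (`squared_distance`, `assign`, `update`, `distortion`, `k_means`) are made up for this example. The notes only say to "choose some initial values" for the centres, so sampling K data points at random is an assumption here, as are keeping the old centre when a cluster ends up empty and stopping once the assignments no longer change.

```python
import random


def squared_distance(x, mu):
    """Squared Euclidean distance ||x - mu||^2 between two points."""
    return sum((xi - mi) ** 2 for xi, mi in zip(x, mu))


def assign(points, centres):
    """Phase 1: with the centres fixed, label each point with its closest centre.

    Storing one label per point is equivalent to r_nk = 1 for that k, else 0.
    """
    return [
        min(range(len(centres)), key=lambda k: squared_distance(x, centres[k]))
        for x in points
    ]


def update(points, labels, centres):
    """Phase 2: with the labels fixed, move each centre to the mean of its points."""
    new_centres = []
    for k, old in enumerate(centres):
        members = [x for x, label in zip(points, labels) if label == k]
        if not members:
            # Assumption: an empty cluster keeps its previous centre.
            new_centres.append(old)
            continue
        new_centres.append(
            tuple(sum(x[d] for x in members) / len(members) for d in range(len(old)))
        )
    return new_centres


def distortion(points, labels, centres):
    """Objective J: sum of squared distances to the assigned centres."""
    return sum(squared_distance(x, centres[k]) for x, k in zip(points, labels))


def k_means(points, k, max_iter=100, seed=0):
    """Alternate the two phases until the labels stop changing (max_iter >= 1)."""
    rng = random.Random(seed)
    # Assumption: initial centres are k distinct data points chosen at random.
    centres = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centres)
        if new_labels == labels:
            break  # converged: assignments are unchanged
        labels = new_labels
        centres = update(points, labels, centres)
    return labels, centres, distortion(points, labels, centres)


# Hypothetical toy data: two well-separated groups of three 2-D points.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centres, J = k_means(data, k=2)
print(labels, centres, J)
```

Neither phase can increase J, which is why the loop terminates, but the result may be a local rather than a global minimum and depends on the initial centres. On data less cleanly separated than this toy set, it is common to run the procedure from several different seeds and keep the run with the smallest J.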

Original article: https://www.cnblogs.com/donggongdechen/p/9789561.html