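This is only a start...
Google Scholar's auto-generated APA citations are really, really handy...
From now on, for long posts on cnblogs I will write in Markdown first, convert it to HTML, and tweak it by hand; the built-in CMS is a pain... (Coach, I want to build my own blog! =.=)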
Introductory learning materials on clustering
-------------------------------
Jain, A. K. (2010). Data clustering: 50 years beyond K-means. Pattern Recognition Letters, 31(8), 651-666.
For some reason the link below now returns 404... a quick search still turns the paper up elsewhere
http://www.cs.msu.edu/~cse802/notes/JainDataClusteringPRL09.pdf
Matteo Matteucci's tutorial
http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/index.html
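Four clustering algorithms and their Mahout use cases
Exploring the secrets inside recommendation engines, Part 3: A closer look at recommendation engine algorithms - clustering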
http://www.ibm.com/developerworks/cn/web/1103_zhaoct_recommstudy3/index.html
A slide deck by Carlos Guestrin
http://www.cs.cmu.edu/~guestrin/Class/10701-S07/Slides/clustering.pdf
Segaran, T. (2007). Programming collective intelligence: building smart web 2.0 applications. O'Reilly Media.
http://www.amazon.com/dp/0596529325
《Data Clustering and Pattern Recognition (資料分群與樣式辨認)》 by Roger Jang (張智星)
http://neural.cs.nthu.edu.tw/jang/books/dcpr/index.asp
Google: Cluster Computing and MapReduce. The original URL returns 404; this is a PKU mirror, and the videos are still on YouTube
http://net.pku.edu.cn/~course/cs501/2008/resource/mapreduce-minilecture/listing.html
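Andrew Ng's famous ML course on Coursera. The current session does not seem to have reached clustering yet, but plenty of material saved from earlier sessions can be found online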
https://www.coursera.org/course/ml
MacKay, D. J. (2003). Information theory, inference and learning algorithms. Cambridge University Press.
http://www.inference.phy.cam.ac.uk/mackay/itila/book.html
k-means
------------------------------------------------------------
k-means on English and Chinese Wikipedia
http://en.wikipedia.org/wiki/K-means_clustering
http://zh.wikipedia.org/wiki/K%E5%B9%B3%E5%9D%87%E7%AE%97%E6%B3%95
Demo (requires a JRE)
http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html
pluskid's blog post
http://blog.pluskid.org/?p=17
An introduction to k-means on coolshell
http://coolshell.cn/articles/7779.html
The introduction on the Mahout wiki
https://cwiki.apache.org/confluence/display/MAHOUT/K-Means+Clustering
Matteo Matteucci's article
http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/kmeans.html
Roger Jang (張智星)'s article
http://neural.cs.nthu.edu.tw/jang/books/dcpr/dcKMeans.asp?title=3-3%20K-means%20Clustering
Coates, A., & Ng, A. Y. (2012). Learning feature representations with k-means. In Neural Networks: Tricks of the Trade (pp. 561-580). Springer Berlin Heidelberg.
http://www.stanford.edu/~acoates/papers/coatesng_nntot2012.pdf
Kanungo, T., Mount, D. M., Netanyahu, N. S., Piatko, C. D., Silverman, R., & Wu, A. Y. (2002). An efficient k-means clustering algorithm: Analysis and implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7), 881-892.
https://www.lri.fr/~antoine/Courses/Master-ISI/TD-TP/sujets_2009/pami02.pdf
The mind map I have put together so far, to be continued (copy the image link into the address bar to view the full-size image)
Fuzzy C-Means (FCM)
------------------------------------------------------------
Demo (requires a JRE)
http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/AppletFCM.html
Wikipedia
http://en.wikipedia.org/wiki/Fuzzy_clustering#Fuzzy_c-means_clustering
Matteo Matteucci's article
http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/cmeans.html
Mind map
Hierarchical Clustering
------------------------------------------------------------
Matteo Matteucci's article
http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/hierarchical.html
Mind map
Canopy
------------------------------------------------------------
Sample code using Hadoop
http://code.google.com/p/canopy-clustering/
The paper that proposed the algorithm
McCallum, A., Nigam, K., & Ungar, L. H. (2000, August). Efficient clustering of high-dimensional data sets with application to reference matching. In Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 169-178). ACM.
http://www.kamalnigam.com/papers/canopy-kdd00.pdf
Gaussian Mixture Model (GMM)
------------------------------------------------------------
pluskid's blog post
http://blog.pluskid.org/?p=39
Matteo Matteucci's article
http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/mixture.html
The Wikipedia entry on Mixture Model
http://en.wikipedia.org/wiki/Mixture_model
Expectation–maximization (EM)
------------------------------------------------------------
Wikipedia
http://en.wikipedia.org/wiki/Expectation-maximization_algorithm