https://github.com/endymecy/spark-ml-source-analysis/blob/master/%E8%81%9A%E7%B1%BB/k-means/k-means.md
How to choose K https://www.ibm.com/developerworks/cn/opensource/os-cn-spark-practice4/
K-Means Clustering Algorithm (Practice): An Image Compression Case Study Based on Spark MLlib http://hejunhao.me/archives/718