PP: Deep clustering based on a mixture of autoencoders


Problem: clustering

A clustering network transforms the data into another space and then selects one of the clusters. Next, the autoencoder associated with this cluster is used to reconstruct the data-point.
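The two-stage flow described above (pick a cluster, then reconstruct with that cluster's autoencoder) can be sketched as follows. This is only a minimal illustration, not the paper's implementation: the linear gating weights, the one-dimensional linear autoencoders, and all names (`forward`, `LinearAutoencoder`, `gate_weights`) are assumptions made for the example, and the parameters are set by hand rather than learned.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LinearAutoencoder:
    """Toy autoencoder with a 1-D bottleneck: encode z = u.x, decode z * u."""

    def __init__(self, direction):
        norm = math.sqrt(dot(direction, direction))
        self.u = [d / norm for d in direction]

    def reconstruct(self, x):
        z = dot(self.u, x)                # encoder
        return [z * ui for ui in self.u]  # decoder


def reconstruction_error(x, x_hat):
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def forward(x, gate_weights, autoencoders):
    # Clustering network (here: one linear layer + softmax, an assumption).
    probs = softmax([dot(w, x) for w in gate_weights])
    # Select the most probable cluster.
    k = max(range(len(probs)), key=probs.__getitem__)
    # Reconstruct the point with the autoencoder tied to that cluster.
    return k, probs, autoencoders[k].reconstruct(x)


# Hypothetical hand-set parameters for two clusters in 2-D.
gate_weights = [[1.0, 0.0], [0.0, 1.0]]
autoencoders = [LinearAutoencoder([1.0, 0.0]), LinearAutoencoder([0.0, 1.0])]

x = [3.0, 0.5]
k, probs, x_hat = forward(x, gate_weights, autoencoders)
error = reconstruction_error(x, x_hat)
```

In a trained model both the gating network and the autoencoders would be deep networks fitted jointly; the sketch only shows how the pieces connect at inference time.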

Introduction:

Traditional method: data -> extract a feature vector from each object -> aggregate groups of vectors in a feature space.

Each cluster is represented by an autoencoder network. (Open question: how?)

Common method: k-means. For high-dimensional datasets it is less useful, because inter-point distances become less informative in high-dimensional spaces.
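The loss of distance information in high dimensions can be checked numerically. The sketch below is an illustration under simple assumptions (points drawn uniformly from the unit hypercube, Euclidean distance; `relative_contrast` is a name made up for this example): it measures how much the farthest neighbour differs from the nearest one, relative to the nearest distance.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for distances from one random query
    point to n_points random points in the unit hypercube of size dim."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


low_dim = relative_contrast(2)      # nearest and farthest differ a lot
high_dim = relative_contrast(1000)  # all distances look nearly the same
print(f"contrast in 2-D: {low_dim:.2f}, in 1000-D: {high_dim:.2f}")
```

When the contrast is small, "nearest centroid" stops being a meaningful criterion, which is the weakness of plain k-means that the note refers to.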

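If the goal is to find patterns in a sequence, does the time dimension play the role of the high-dimensional case, with each pattern forming a cluster, while some subsequences cannot be assigned to any cluster?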

Representation learning has been used to map the input data into a low-dimensional feature space.

Attempts: apply unsupervised deep learning approaches to clustering. (Open question: how?)

However, most focus on clustering over a low-dimensional feature space.

Transform the data into more clustering-friendly representations:

A deep version of k-means is based on learning a data representation and applying k-means in the embedded space.
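As a rough sketch of "k-means in an embedded space": the code below first maps each point through an embedding and then runs ordinary Lloyd iterations on the embedded points. The embedding here is a fixed, hand-chosen linear projection standing in for the learned representation, and the evenly spaced centroid initialisation is chosen only for reproducibility; both are assumptions of this example, as are the function names.

```python
import math


def embed(x, projection):
    # Stand-in for a learned encoder: a fixed linear map (assumption).
    return [sum(w * xi for w, xi in zip(row, x)) for row in projection]


def kmeans(points, k, n_iter=20):
    # Deterministic initialisation with evenly spaced points (assumption;
    # random or k-means++ initialisation is more common in practice).
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid in the embedded space.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Toy 4-D data: the first two coordinates carry the group structure,
# the last two are noise.
data = [
    [0.0, 0.0, 1.0, 9.0], [0.2, 0.1, 7.0, 2.0], [0.1, 0.3, 4.0, 4.0],
    [5.0, 5.0, 3.0, 8.0], [5.1, 4.9, 6.0, 1.0], [4.8, 5.2, 2.0, 5.0],
]
projection = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

embedded = [embed(x, projection) for x in data]
labels, centroids = kmeans(embedded, k=2)
```

In the deep variant, the projection would be replaced by a neural encoder trained together with the clustering objective.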

How to represent a cluster:

A vector (e.g., a centroid) vs. an autoencoder network.
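The difference between the two representations can be illustrated with a small example. It is a sketch under assumptions: each "autoencoder" is a toy linear projection onto one unit direction, and the centroids and directions are set by hand. A point is assigned either to the nearest centroid vector or to the autoencoder that reconstructs it best.

```python
import math


def nearest_centroid(x, centroids):
    # Cluster represented by a vector: pick the closest centroid.
    return min(range(len(centroids)), key=lambda k: math.dist(x, centroids[k]))


def project(x, u):
    """Toy linear autoencoder with unit direction u: encode z = u.x, decode z * u."""
    z = sum(a * b for a, b in zip(u, x))
    return [z * a for a in u]


def best_autoencoder(x, directions):
    # Cluster represented by an autoencoder: pick the smallest
    # squared reconstruction error.
    errors = [math.dist(x, project(x, u)) ** 2 for u in directions]
    return min(range(len(errors)), key=errors.__getitem__)


# Hypothetical clusters: one spread along the x-axis, one along the y-axis.
centroids = [[1.0, 0.0], [0.0, 1.0]]
directions = [[1.0, 0.0], [0.0, 1.0]]

x = [-4.0, 0.1]  # lies almost exactly on the x-axis, far from both centroids
by_vector = nearest_centroid(x, centroids)
by_autoencoder = best_autoencoder(x, directions)
```

Here the centroid rule puts `x` in cluster 1 because that centroid happens to be closer, while the reconstruction rule puts it in cluster 0, whose subspace contains it almost exactly. An autoencoder can describe the shape of a cluster, not only its centre.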

Data collapsing problem: for every database (dataset), you have to re-tune the program.

For multivariate time series: how to find patterns?

1. Find patterns: SAX; TICC; sliding windows; derivatives.

2. VG, statistical features.

3.
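Two of the ideas in the list, sliding windows and SAX, can be combined as sketched below. This is a simplified single-channel version under assumptions: the window length must be divisible by the number of segments, the alphabet has four symbols, and the function names are invented for this example. For a multivariate series, the same procedure could be applied to each channel separately.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sliding_windows(series, width, step=1):
    # All contiguous subsequences of the given width.
    return [series[i:i + width] for i in range(0, len(series) - width + 1, step)]


def sax_word(window, n_segments, alphabet="abcd"):
    # 1. z-normalise the window.
    mu, sigma = mean(window), pstdev(window)
    z = [(v - mu) / sigma if sigma > 0 else 0.0 for v in window]
    # 2. Piecewise aggregate approximation: mean of each equal-length segment
    #    (assumes len(window) is divisible by n_segments).
    seg_len = len(z) // n_segments
    paa = [mean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]
    # 3. Discretise with breakpoints that split the standard normal
    #    distribution into equiprobable regions.
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / a) for j in range(1, a)]
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


series = [0, 0, 0, 0, 10, 10, 10, 10]
word = sax_word(series, n_segments=2)

windows = sliding_windows(list(range(10)), width=4, step=2)
words = [sax_word(w, n_segments=2) for w in windows]
```

Each window becomes a short symbolic word, so recurring patterns show up as repeated words that can then be counted or clustered.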

Supplementary knowledge:

1. Pattern recognition and clustering

Pattern recognition is a mature field in computer science with well-established techniques for the assignment of unknown patterns to categories, or classes. A pattern is defined as a vector of some number of measurements, called features. Usually, a pattern recognition system uses training samples from known categories to form a decision rule for unknown patterns. The unknown pattern is assigned to one of the categories according to the decision rule. Since we are interested in the classes of documents that have been assigned by the user, we can use pattern recognition techniques to try to classify previously unseen documents into the user's categories. While pattern recognition techniques require that the number and labels of categories are known, clustering techniques are unsupervised, requiring no external knowledge of categories. Clustering methods simply try to group similar patterns into clusters whose members are more similar to each other (according to some distance measure) than to members of other clusters. There is no a priori knowledge of patterns that belong to certain groups, or even how many groups are appropriate. Refer to basic pattern recognition and clustering texts such as [5, 6, 7] for further information.

We first employ pattern recognition techniques on documents to attempt to find features for classification, then focus on clustering the raw features of the documents.


Original article: https://www.cnblogs.com/dulun/p/12309740.html