[CVPR 2017] Semantic Autoencoder for Zero-Shot Learning: Paper Notes

http://openaccess.thecvf.com/content_cvpr_2017/papers/Kodirov_Semantic_Autoencoder_for_CVPR_2017_paper.pdf

Semantic Autoencoder for Zero-Shot Learning. Elyor Kodirov, Tao Xiang, Shaogang Gong. Queen Mary University of London, UK. {e.kodirov, t.xiang, s.gong}@qmul.ac.uk

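Highlights
- Improves zero-shot learning performance through dual (encoder-decoder) learning, similar in spirit to CycleGAN
- The architecture is very simple and has a closed-form solution, so training is very fast
- Applies effectively to a related task (supervised clustering), which demonstrates its generalization ability
Method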

Linear autoencoder

Model Formulation

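Setting the derivative of the objective with respect to W to zero gives an equation of the form AW + WB = C, a well-known Sylvester equation that can be solved efficiently with the Bartels-Stewart algorithm (MATLAB `sylvester`).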
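
As a rough illustration of the closed-form solve: assuming the tied-weight objective from the paper, min_W ||X - W^T S||_F^2 + λ||WX - S||_F^2, setting the gradient to zero gives AW + WB = C with A = S S^T, B = λ X X^T and C = (1 + λ) S X^T. The sketch below uses SciPy's `solve_sylvester` in place of MATLAB's `sylvester`. The function name `train_sae`, the toy dimensions, and the random data are illustrative assumptions, not part of the original notes.

```python
import numpy as np
from scipy.linalg import solve_sylvester


def train_sae(X, S, lam):
    """Solve A W + W B = C for the projection matrix W (k x d).

    X   : d x N matrix of visual features (one sample per column)
    S   : k x N matrix of semantic vectors (one sample per column)
    lam : trade-off weight between the decoder and encoder terms
    """
    A = S @ S.T                   # k x k
    B = lam * (X @ X.T)           # d x d
    C = (1.0 + lam) * (S @ X.T)   # k x d
    return solve_sylvester(A, B, C)


# Toy data (assumed, for illustration only).
rng = np.random.default_rng(0)
d, k, n = 8, 3, 50
X = rng.standard_normal((d, n))
S = rng.standard_normal((k, d)) @ X   # synthetic semantic vectors
lam = 0.5

W = train_sae(X, S, lam)

# Check that W satisfies the Sylvester equation up to numerical error.
C = (1.0 + lam) * (S @ X.T)
residual = np.linalg.norm(S @ S.T @ W + lam * W @ (X @ X.T) - C)
relative_residual = residual / np.linalg.norm(C)
print(W.shape, relative_residual)
```

The encoder is then `W` and the decoder is `W.T`, so no iterative training is involved.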

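Zero-shot learning: with the learned projection, there are two ways to classify a test sample:
- Project a test sample x_i from an unseen class into the semantic (attribute) space with W to obtain s_i, then assign it the label of the unseen class (which has no training samples) whose prototype is nearest in the semantic space.
- Project the semantic prototypes S of all unseen classes into the feature space with W^T, then assign a test sample x_i the label of the class whose projected prototype is nearest to it in the feature space.
- The two strategies give nearly the same accuracy.
Supervised clustering: in this problem the semantic space is the class label space (one-hot class labels). All test data are projected into the training class label space and then clustered with k-means.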
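
The two test-time strategies can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes Euclidean nearest-neighbour matching (the notes do not say which distance is used), and the helper names, the orthonormal toy `W`, and the synthetic prototypes are all hypothetical.

```python
import numpy as np


def nearest_prototype(queries, prototypes):
    """For each column of `queries`, return the index of the closest
    column of `prototypes` under Euclidean distance (an assumption)."""
    diff = queries.T[:, None, :] - prototypes.T[None, :, :]
    return np.argmin((diff ** 2).sum(axis=2), axis=1)


def predict_semantic_space(W, X_test, S_unseen):
    # Encoder direction: map features into the semantic space with W.
    return nearest_prototype(W @ X_test, S_unseen)


def predict_feature_space(W, X_test, S_unseen):
    # Decoder direction: map class prototypes into feature space with W^T.
    return nearest_prototype(X_test, W.T @ S_unseen)


# Toy setup (assumed): 3 unseen classes, 3-d semantic space, 6-d features.
rng = np.random.default_rng(0)
d, k = 6, 3
Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
W = Q.T                               # k x d, rows are orthonormal
S_unseen = 4.0 * np.eye(k)            # one prototype per column
labels = np.array([0, 1, 2, 0, 1, 2])
X_test = W.T @ S_unseen[:, labels] + 0.05 * rng.standard_normal((d, 6))

pred_sem = predict_semantic_space(W, X_test, S_unseen)
pred_feat = predict_feature_space(W, X_test, S_unseen)
print(pred_sem, pred_feat)
```

In this toy case both directions recover the same labels, which mirrors the observation that the two strategies perform about equally.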
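
A hedged sketch of the supervised clustering pipeline described above: learn W with one-hot labels as the semantic space, project test data with W, then run k-means. Several things here are assumptions made only to keep the example small: the toy Gaussian data, the test points being held-out samples from the same two clusters, λ = 1, and the plain Lloyd's k-means helper (any k-means implementation would do).

```python
import numpy as np
from scipy.linalg import solve_sylvester


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on the rows of `points`; returns cluster ids."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):   # keep the old centre if a cluster empties
                centers[j] = points[assign == j].mean(axis=0)
    return assign


rng = np.random.default_rng(0)
d = 5
means = np.zeros((2, d))
means[0, 0] = 3.0
means[1, 1] = 3.0


def sample(n_per_class):
    """Draw n_per_class points per cluster; features are returned as d x N."""
    y = np.repeat([0, 1], n_per_class)
    x = means[y] + 0.3 * rng.standard_normal((2 * n_per_class, d))
    return x.T, y


X_train, y_train = sample(40)
S_train = np.eye(2)[:, y_train]       # one-hot labels, 2 x N

lam = 1.0                             # assumed trade-off weight
W = solve_sylvester(S_train @ S_train.T,
                    lam * (X_train @ X_train.T),
                    (1.0 + lam) * (S_train @ X_train.T))

X_test, y_test = sample(20)
clusters = kmeans((W @ X_test).T, k=2)

# Cluster ids are arbitrary, so compare up to a label swap.
agreement = (clusters == y_test).mean()
agreement = max(agreement, 1.0 - agreement)
print(agreement)
```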

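Relation to existing models: existing zero-shot learning models typically learn a regularized projection from the feature space to the semantic space.

Alternatively, [54] projects the attributes into the feature space, so the learning objective becomes a regularized regression in the reverse direction.

The method in the paper combines the two. Moreover, because W* = W^T, W cannot have a large norm in this dual learning setup: otherwise, multiplying x by two large-norm matrices could not recover the original input. The regularization term can therefore be dropped.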
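
For reference, the three objectives can be written side by side. This is a reconstruction from memory of the paper rather than a copy of its equations; in particular, writing the regularizer as a squared Frobenius norm weighted by λ is an assumption.

```latex
% Visual-to-semantic regression (typical existing ZSL models)
\min_{W} \; \lVert WX - S \rVert_F^2 + \lambda \lVert W \rVert_F^2

% Semantic-to-visual regression (as in [54])
\min_{W} \; \lVert X - W^{\top} S \rVert_F^2 + \lambda \lVert W \rVert_F^2

% SAE: both terms combined, with tied weights W^* = W^{\top}
% and no explicit norm penalty on W
\min_{W} \; \lVert X - W^{\top} S \rVert_F^2 + \lambda \lVert WX - S \rVert_F^2
```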

Experiments

Zero-shot learning

Datasets: Semantic word vector representation is used for the large-scale datasets (ImNet-1 and ImNet-2). We train a skip-gram text model on a corpus of 4.6M Wikipedia documents to obtain the word2vec [38, 37] word vectors.

Features: all datasets use GoogLeNet features except ImNet-1, whose features are extracted with AlexNet.

Results:
- Our SAE model achieves the best results on all 6 datasets.
- On the small-scale datasets, the gap between our model's results and those of the strongest competitor ranges from 3.5% to 6.5%.
- On the large-scale datasets, the gaps are even bigger: on the largest, ImNet-2, our model improves over the state-of-the-art SS-Voc [22] by 8.8%.
- Both the encoder and decoder projection functions in our SAE model (SAE (W) and SAE (W^T), respectively) can be used for effective ZSL.
- Generalized zero-shot learning is evaluated with AUSUC (Area Under Seen-Unseen accuracy Curve), which measures how well a zero-shot learning method can trade off between recognising data from seen classes and data from unseen classes.
Clustering

Datasets: a synthetic dataset and Oxford Flowers-17 (848 images)

Results:
- On computational cost, our model (93s) is more expensive than MLCA (39s) but much cheaper than all the others (hours to days).
- Achieves the best clustering accuracy

Original post: https://www.cnblogs.com/Xiaoyan-Li/p/8578598.html