  • An introduction to some methods for improving BOF (bag-of-features)

    [1]

    Zhang et al.[1] propose a framework that encodes spatial information into the inverted index by integrating the local adjacency of visual words. Descriptive Visual Words (DVWs) and Descriptive Visual Phrases (DVPs) are proposed as the visual counterparts of text words and phrases, where a visual phrase is a frequently co-occurring visual word pair. The co-occurrence frequency in [1] is computed between two visual words that lie within a short distance of each other (they need not be direct neighbors).
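
    For illustration, a minimal Python sketch (not the authors' code) of the pair statistic described above: it counts how often two visual words appear within a short distance of each other in one image. The keypoint positions, per-keypoint word ids, and the distance threshold are assumed inputs, and the function name is hypothetical.

    from collections import Counter
    from itertools import combinations

    def pair_frequencies(points, words, radius):
        """points: list of (x, y); words: visual-word id per keypoint; radius: distance cutoff."""
        counts = Counter()
        r2 = radius * radius
        for i, j in combinations(range(len(points)), 2):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if dx * dx + dy * dy <= r2:          # within a short distance; adjacency not required
                counts[tuple(sorted((words[i], words[j])))] += 1   # unordered word pair
        return counts

    # Accumulating these counts over a training set and keeping the most frequent
    # pairs gives candidate visual phrases in the spirit of the DVPs above.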

    [2]

    Zheng et al.[2] propose a higher-level visual word representation called the visual synset. They first construct an intermediate visual descriptor, the delta visual phrase, from frequently co-occurring visual word sets that share a similar spatial context, and then cluster delta visual phrases into visual synsets according to their probabilistic semantics. This strengthens both the discriminative power and the invariance of the traditional BOF representation.

    [3]

    Chen et al.[3] propose a new technique for combining local, discriminatively trained classifiers over groups of (super-)pixels into a joint model over labels. The method generates samples by iterating forward a weakly chaotic dynamical system instead of using a trained CRF.

    [4]

    Csurka et al.[4] use a system that scores low-level patches according to their class relevance, propagates these posterior probabilities to pixels, and uses a low-level segmentation to guide the semantic segmentation. They first describe each patch with a high-level descriptor based on the Fisher kernel and score it with a set of linear classifiers; global image classifiers are then used to take into account the context of the objects to be segmented.
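
    A simplified sketch of the patch-to-pixel propagation step (my illustration, not the authors' implementation; the low-level segmentation and the global image classifiers mentioned above are omitted): each pixel's class posterior is the average of the posteriors of all patches covering it.

    import numpy as np

    def pixel_posteriors(patch_boxes, patch_probs, height, width, n_classes):
        """patch_boxes: iterable of integer (x0, y0, x1, y1); patch_probs: (N, n_classes) class posteriors."""
        acc = np.zeros((height, width, n_classes))
        cover = np.zeros((height, width, 1))
        for (x0, y0, x1, y1), probs in zip(patch_boxes, patch_probs):
            acc[y0:y1, x0:x1] += probs          # add this patch's posterior to every pixel it covers
            cover[y0:y1, x0:x1] += 1.0
        return acc / np.maximum(cover, 1.0)     # average where at least one patch covers a pixel

    # A per-pixel labeling is then pixel_posteriors(...).argmax(axis=-1).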

    [5]

    Herve et al.[5] use visual word pairs without relying on the co-occurrence of neighboring visual words or other spatial relations. They first construct a base vocabulary containing n words and then derive a pairs vocabulary from it; since no spatial information between words is captured, the pairs vocabulary has size n(n+1)/2. Once the pairs vocabulary is constructed, an SVM classifier is trained on the training samples to perform the automatic annotation task.
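
    A minimal sketch of the pairs vocabulary under the setup described above (the function names and the plain pair histogram are my assumptions): pairs are unordered and a word may be paired with itself, which is exactly what gives the n(n+1)/2 entries; the resulting histogram is the image representation that would then be fed to the SVM.

    def build_pairs_vocabulary(n):
        """Map each unordered word pair (a, b) with a <= b to a pair-word index."""
        index = {}
        for a in range(n):
            for b in range(a, n):
                index[(a, b)] = len(index)
        return index                            # len(index) == n * (n + 1) // 2

    def pair_histogram(word_ids, pair_index):
        """Histogram over all unordered pairs of visual words present in one image."""
        hist = [0] * len(pair_index)
        for i in range(len(word_ids)):
            for j in range(i + 1, len(word_ids)):
                a, b = sorted((word_ids[i], word_ids[j]))
                hist[pair_index[(a, b)]] += 1
        return hist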

    [6]

    Zhang et al.[6] propose to encode more spatial information through geometry-preserving visual phrases (GVPs), i.e., to incorporate information about the relative spatial locations of the features forming a visual phrase into its representation (hence "geometry-preserving"). In addition to co-occurrences, the GVP method can capture both the local and the long-range spatial layouts of the words.
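
    The following is a hedged sketch of the offset-space view of GVP matching, as I read the description above (the offset quantization cell and the counting of length-k groups are assumptions, not the authors' code): word matches between two images vote for their quantized relative offset, and k matches that agree on the same offset constitute a matching length-k GVP, which is why both local and long-range layouts are captured.

    from collections import Counter
    from math import comb

    def gvp_matches(feats_a, feats_b, k=2, cell=16):
        """feats_*: lists of (x, y, word). Returns the number of matching length-k GVPs."""
        index_b = {}
        for x, y, w in feats_b:
            index_b.setdefault(w, []).append((x, y))
        votes = Counter()
        for xa, ya, w in feats_a:
            for xb, yb in index_b.get(w, []):
                offset = (round((xb - xa) / cell), round((yb - ya) / cell))
                votes[offset] += 1              # one word match voting at this quantized offset
        # k word matches sharing an offset bin yield C(v, k) co-occurring length-k GVPs
        return sum(comb(v, k) for v in votes.values())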

    [7]

    Li et al.[7] propose the contextual bag-of-words (CBOW) representation, which integrates the semantic conceptual relation and the spatial neighboring relation between visual words. Local spatial consistency among a few spatial nearest neighbors is used to filter out false visual-word matches. However, they do not consider the co-occurrence of neighboring words.
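
    A minimal sketch of the local-spatial-consistency filter described above (my own illustration of the idea, not the paper's exact algorithm; k, min_support and the helper names are assumptions): a visual-word match is kept only if enough of its spatial nearest neighbors are matched to points that are also spatial nearest neighbors on the other side.

    import numpy as np

    def knn(points, i, k):
        """Indices of the k spatial nearest neighbors of point i (excluding i itself)."""
        d = np.linalg.norm(points - points[i], axis=1)
        return set(np.argsort(d)[1:k + 1])

    def filter_matches(pts_a, pts_b, matches, k=8, min_support=3):
        """pts_*: (N, 2) arrays; matches: list of (i, j) index pairs sharing a visual word."""
        map_ab = dict(matches)
        kept = []
        for i, j in matches:
            neigh_a = knn(pts_a, i, k)
            neigh_b = knn(pts_b, j, k)
            # neighbors of i whose own matches land inside the neighborhood of j
            support = sum(1 for p in neigh_a if map_ab.get(p) in neigh_b)
            if support >= min_support:
                kept.append((i, j))
        return kept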

    [8]

    Tirilly et al.[8] propose a new image representation called visual sentences, which makes it possible to consider simple spatial relations between visual words, and then use probabilistic Latent Semantic Analysis (pLSA) to eliminate the noisiest visual words. They capture the spatial information by finding an appropriate axis and projecting the keypoints onto it.
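
    A minimal sketch of the visual-sentence construction, assuming the projection axis is the principal axis of the keypoint coordinates (one possible choice; the paper's own axis-selection strategy is not detailed here): keypoints are ordered by their projection onto the axis, and the ordered visual words form the sentence used by the subsequent pLSA step.

    import numpy as np

    def visual_sentence(points, words):
        """points: (N, 2) keypoint coordinates; words: length-N list of visual-word ids."""
        centered = points - points.mean(axis=0)
        # principal axis = direction of largest variance of the keypoint cloud
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        axis = vt[0]
        order = np.argsort(centered @ axis)     # position along the axis
        return [words[i] for i in order]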

    [1] Shiliang Zhang, Qi Tian, Gang Hua, Qingming Huang, Shipeng Li; Descriptive Visual Words and Visual Phrases for Image Applications, ACM Int. Conf. on Multimedia 2009

    [2] Yan-Tao Zheng, Ming Zhao, Shi-Yong Neo, Tat-Seng Chua, Qi Tian; Visual Synset: Towards a Higher-level Visual Representation, CVPR 2008

    [3] Yutian Chen, Andrew Gelfand, Charless C. Fowlkes, Max Welling; Integrating Local Classifiers through Nonlinear Dynamics on Label Graphs with an Application to Image Segmentation, ICCV 2011

    [4] Gabriela Csurka and Florent Perronnin; A Simple High Performance Approach to Semantic Segmentation, BMVC 2008

    [5] Nicolas Herve, Nozha Boujemaa; Visual Word Pairs for Automatic Image Annotation, ICME 2009

    [6] Yimeng Zhang, Zhaoyin Jia, Tsuhan Chen; Image Retrieval with Geometry-Preserving Visual Phrases, CVPR 2011

    [7] Teng Li; Contextual Bag-of-Words for Visual Categorization, IEEE Trans. on Circuits and Systems for Video Technology 2011

    [8] Pierre Tirilly, Vincent Claveau, Patrick Gros; Language modeling for bag-of-visual words image categorization, CIVR 2008

  • Original post: https://www.cnblogs.com/moondark/p/2683035.html