zoukankan      html  css  js  c++  java
  • 计算Fisher vector和VLAD

    This short tutorial shows how to compute Fisher vector and VLAD encodings with VLFeat MATLAB interface.

    These encoding serve a similar purposes: summarizing in a vectorial statistic a number of local feature descriptors (e.g. SIFT). Similarly to bag of visual words, they assign local descriptor to elements in a visual dictionary, obtained with vector quantization (KMeans) in the case of VLAD or a Gaussian Mixture Models for Fisher Vectors. However, rather than storing visual word occurrences only, these representations store a statistics of the difference between dictionary elements and pooled local features.

    Fisher encoding

    The Fisher encoding uses GMM to construct a visual word dictionary. To exemplify constructing a GMM, consider a number of 2 dimensional data points (see also the GMM tutorial). In practice, these points would be a collection of SIFT or other local image features. The following code fits a GMM to the points:

    numFeatures = 5000 ;
    dimension = 2 ;
    data = rand(dimension,numFeatures) ;
    
    numClusters = 30 ;
    [means, covariances, priors] = vl_gmm(data, numClusters);
    

    Next, we create another random set of vectors, which should be encoded using the Fisher Vector representation and the GMM just obtained:

    numDataToBeEncoded = 1000;
    dataToBeEncoded = rand(dimension,numDataToBeEncoded);
    

    The Fisher vector encoding enc of these vectors is obtained by calling the vl_fisher function using the output of the vl_gmm function:

    encoding = vl_fisher(datatoBeEncoded, means, covariances, priors);
    

    The encoding vector is the Fisher vector representation of the data dataToBeEncoded.

    Note that Fisher Vectors support several normalization options that can affect substantially the performance of the representation.

    VLAD encoding

    The Vector of Linearly Agregated Descriptors is similar to Fisher vectors but (i) it does not store second-order information about the features and (ii) it typically use KMeans instead of GMMs to generate the feature vocabulary (although the latter is also an option).

    Consider the same 2D data matrix data used in the previous section to train the Fisher vector representation. To compute VLAD, we first need to obtain a visual word dictionary. This time, we use K-means:

    numClusters = 30 ;
    centers = vl_kmeans(dataLearn, numClusters);
    

    Now consider the data dataToBeEncoded and use the vl_vlad function to compute the encoding. Differently from vl_fisher, vl_vlad requires the data-to-cluster assignments to be passed in. This allows using a fast vector quantization technique (e.g. kd-tree) as well as switching from soft to hard assignment.

    In this example, we use a kd-tree for quantization:

    kdtree = vl_kdtreebuild(centers) ;
    nn = vl_kdtreequery(kdtree, centers, dataEncode) ;
    

    Now we have in the nn the indexes of the nearest center to each vector in the matrix dataToBeEncoded. The next step is to create an assignment matrix:

    assignments = zeros(numClusters,numDataToBeEncoded);
    assignments(sub2ind(size(assignments), nn, 1:length(nn))) = 1;
    

    It is now possible to encode the data using the vl_vlad function:

    enc = vl_vlad(dataToBeEncoded,centers,assignments);
    

    Note that, similarly to Fisher vectors, VLAD supports several normalization options that can affect substantially the performance of the representation.

    from: http://www.vlfeat.org/overview/encodings.html

  • 相关阅读:
    Java服务,内存OOM了,如何快速定位?
    Java内存分析工具MAT(Memory Analyzer Tool)安装使用实例
    jmap使用方法及原理
    可能发生Full gc 的情况
    java--jvm GC-常用参数配置
    JVM. GC 性能调优方法与思路
    《嫌疑人X的献身》——两个天才之间的思想火花
    爱的纯粹与代价
    18年下半年计划表
    阿里校招准备-总纲
  • 原文地址:https://www.cnblogs.com/GarfieldEr007/p/5052423.html
Copyright © 2011-2022 走看看