zoukankan      html  css  js  c++  java
  • RGB-D action recognition using linear coding

    First, a depth spatial-temporal descriptor is developed to extract the interested local regions in depth image. Then the intensity spatial-temporal descriptor and the depth spatial-temporal descriptor are combined and feeded into a linear coding framework to get an effective feature vector, which can be used for action classification. Finally, extensive experiments are conducted on a publicly available RGB-D action recognition dataset and the proposed method shows promising results.

    创新点就这个了:A linear coding framework is developed to fuse the intensity spatial-temporal descriptor and the depth spatial-temporal descriptor to form robust feature vector. In addition, we further exploit the temporal intrinsics of the video sequence and design a new pooling technology to improve the description performance.

    Feature extraction

    STIPs is an extension of SIFT (Scale-Invariant-Feature-Transform) in 3-dimensional space and uses one of Harris3D, Cuboid or Hessian as the detector.

    http://www.di.ens.fr/~laptev/download.html

    AFBB7_6_IO~G2CKV$$VGL{D

    patch的分割有重叠~~

    Y8`$3LAES{2R5`K7J]3Z9JF

    算是对depth map的预处理了 ~~

    So the STIPs features in the RGB images disclose more detail characters of the subjects themselves while in the depth images they extract more characters of the shape of the subjects.

    Coding approaches

    vector quantization (VQ)

    One disadvantage of the VQ is that it introduces significant quantization errors since only one element of the codebook is selected to represent the descriptor. To remedy this, one usually has to design a nonlinear SVM as the classifier which tries to compensate the quantization errors. However, using nonlinear kernels, the SVM has to pay a high training cost, including computation and storage. Considering the above defects, localityconstrained linear coding (LLC) –a more accurate and efficient coding approach[9]is adopted to replace VQ in this paper

    Pooling strategy

    Similar to the VQ coding approach, the LLC coding coefficients ci are expected to be combined into a global representation of the sample for classification.

    DGX_H$`D)V_K`L]B~OP5SH4

    S2{PZ15)$V}_I~J)%TPIG2R

    DataSet

    RGBD-HuDaAct[1]video database

    The video sample consists of synchronized and calibrated RGB-D frame sequences, which contains in each frame a RGB image and a depth image, respectively. The RGB and depth images in each frame have been calibrated with a standard stereocalibration method available in OpenCV so that the points with the same coordinate in RGB and depth images are corresponded.

    一片简洁的paper ,给我指明了方向 ~~

  • 相关阅读:
    设备上下文-CDC绘图细节
    程序设计思想-1
    消息反射--针对通知消息
    R语言-数据类型与运算符
    背景色与WM_ERASEBKGND
    IDEA 2017.3 新版本中创建 JSF Web 应用程序缺少 web.xml 的解决办法
    在 Fedora 26/27 GNOME 3.24/3.26 环境中安装 FCITX 小企鹅输入法(修订)
    Linux 环境下安装配置 TigerVNC Server 并启用当前会话远程服务(X0VNC)
    关于 gstreamer 和 webrtc 的结合,有点小突破
    VMware Workstation/Fusion 中安装 Fedora 23/24 及其他 Linux 系统时使用 Open VM Tools 代替 VMware Tools 增强工具的方法
  • 原文地址:https://www.cnblogs.com/sprint1989/p/3954058.html
Copyright © 2011-2022 走看看