  • Video Architecture Search

    2019-10-20 06:48:26

     

    This post is from the Google AI Blog: https://ai.googleblog.com/2019/10/video-architecture-search.html

     

    Examples of various EvaNet architectures. Each colored box (large or small) represents a layer with the color of the box indicating its type: 3D conv. (blue), (2+1)D conv. (orange), iTGM (green), max pooling (grey), averaging (purple), and 1x1 conv. (pink). Layers are often grouped to form modules (large boxes). Digits within each box indicate the filter size.
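To make the difference between two of the layer types in the caption concrete, the sketch below counts the weights in a full 3D convolution and in a (2+1)D convolution, which factorizes the 3D kernel into a spatial convolution followed by a temporal one. The function names and the choice of intermediate channel count `mid_channels` are assumptions made for illustration; they are not taken from the EvaNet work, and bias terms are ignored.

```python
def conv3d_params(t, k, c_in, c_out):
    """Weights in a 3D conv with a t x k x k kernel (bias ignored)."""
    return t * k * k * c_in * c_out


def conv2plus1d_params(t, k, c_in, c_out, mid_channels):
    """Weights in a (2+1)D conv: a 1 x k x k spatial conv into
    mid_channels, then a t x 1 x 1 temporal conv into c_out (bias ignored).
    mid_channels is a free design choice here (an assumption)."""
    spatial = k * k * c_in * mid_channels
    temporal = t * mid_channels * c_out
    return spatial + temporal


if __name__ == "__main__":
    # 3x3x3 kernel, 64 input and 64 output channels
    print(conv3d_params(3, 3, 64, 64))           # 110592
    print(conv2plus1d_params(3, 3, 64, 64, 64))  # 49152
```

With the same channel widths, the factorized layer uses fewer weights. A search space that mixes both layer types lets the search trade capacity against cost layer by layer.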

    The representative AssembleNet model evolved using the Moments-in-Time dataset. A node corresponds to a block of spatio-temporal convolutional layers, and each edge specifies their connectivity. Darker edges mean stronger connections. AssembleNet is a family of learnable multi-stream architectures, optimized for the target task.
    A comparison of AssembleNet with state-of-the-art, hand-designed models on the Charades (left) and Moments-in-Time (right) datasets. AssembleNet-50 and AssembleNet-101 have a number of parameters equivalent to a two-stream ResNet-50 and ResNet-101, respectively.

    TinyVideoNet (TVN) architectures evolved to maximize recognition performance while keeping their computation time within the desired limit. For instance, TVN-1 (top) runs at 37 ms on a CPU and 10 ms on a GPU. TVN-2 (bottom) runs at 65 ms on a CPU and 13 ms on a GPU.
    CPU runtime of TinyVideoNet models compared to prior models (left) and runtime vs. model accuracy of TinyVideoNets compared to (2+1)D ResNet models (right). Note that TinyVideoNets occupy a region of this time-accuracy space where no other models exist, i.e., extremely fast but still accurate.
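The TinyVideoNet caption describes evolving architectures to maximize accuracy under a runtime limit. The toy sketch below illustrates that general idea using tournament selection with aging, which is an assumed search strategy and not necessarily the one the authors used. The per-layer costs, the per-layer "gains", and the functions `runtime_ms`, `proxy_accuracy`, `mutate`, and `evolve` are all hypothetical. A real search would train and time each candidate network instead of summing made-up numbers.

```python
import random

# Hypothetical per-layer runtime (ms) and accuracy contribution.
# These are illustrative numbers, not measurements.
LAYER_COST_MS = {"conv3d": 12.0, "conv2plus1d": 7.0, "conv1x1": 2.0, "maxpool": 1.0}
LAYER_GAIN = {"conv3d": 3.0, "conv2plus1d": 2.5, "conv1x1": 0.8, "maxpool": 0.3}
LAYER_TYPES = sorted(LAYER_COST_MS)


def runtime_ms(arch):
    """Stand-in for measuring the runtime of an architecture."""
    return sum(LAYER_COST_MS[layer] for layer in arch)


def proxy_accuracy(arch):
    """Stand-in for training and evaluating an architecture."""
    return sum(LAYER_GAIN[layer] for layer in arch)


def mutate(arch, rng):
    """Return a copy of arch with one layer added, removed, or replaced."""
    child = list(arch)
    op = rng.choice(["add", "remove", "replace"])
    if op == "add" or (op == "remove" and len(child) == 1):
        # Never shrink below one layer; add instead.
        child.insert(rng.randrange(len(child) + 1), rng.choice(LAYER_TYPES))
    elif op == "remove":
        del child[rng.randrange(len(child))]
    else:
        child[rng.randrange(len(child))] = rng.choice(LAYER_TYPES)
    return child


def evolve(limit_ms, rounds=500, pop_size=20, sample_size=5, seed=0):
    """Evolve layer lists whose runtime stays within limit_ms.

    Assumes limit_ms is at least the cost of a single max-pooling layer,
    so that the initial population is feasible.
    """
    rng = random.Random(seed)
    population = [["maxpool"] for _ in range(pop_size)]
    for _ in range(rounds):
        # Tournament: the best of a random sample becomes the parent.
        parent = max(rng.sample(population, sample_size), key=proxy_accuracy)
        child = mutate(parent, rng)
        if runtime_ms(child) > limit_ms:
            continue  # reject candidates that exceed the runtime limit
        population.append(child)
        population.pop(0)  # aging: drop the oldest member
    return max(population, key=proxy_accuracy)


if __name__ == "__main__":
    best = evolve(limit_ms=37.0)  # 37 ms echoes the TVN-1 CPU figure
    print(best, runtime_ms(best), proxy_accuracy(best))
```

Rejecting over-budget children outright keeps every member of the population within the limit, so the returned architecture is always feasible. A softer alternative would be to penalize runtime in the fitness function.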

  • Original article: https://www.cnblogs.com/wangxiaocvpr/p/11706640.html