zoukankan      html  css  js  c++  java
  • Video Architecture Search

    Video Architecture Search

    2019-10-20 06:48:26

     

    This blog is from: https://ai.googleblog.com/2019/10/video-architecture-search.html 

     

    Examples of various EvaNet architectures. Each colored box (large or small) represents a layer with the color of the box indicating its type: 3D conv. (blue), (2+1)D conv. (orange), iTGM (green), max pooling (grey), averaging (purple), and 1x1 conv. (pink). Layers are often grouped to form modules (large boxes). Digits within each box indicate the filter size.

    The representative AssembleNet model evolved using the Moments-in-Time dataset. A node corresponds to a block of spatio-temporal convolutional layers, and each edge specifies their connectivity. Darker edges mean stronger connections. AssembleNet is a family of learnable multi-stream architectures, optimized for the target task.
    A figure comparing AssembleNet with state-of-the-art, hand-designed models on Charades (left) and Moments-in-Time (right) datasets. AssembleNet-50 or AssembleNet-101 has an equivalent number of parameters to a two-stream ResNet-50 or ResNet-101.

    TinyVideoNet (TVN) architectures evolved to maximize the recognition performance while keeping its computation time within the desired limit. For instance, TVN-1 (top) runs at 37 ms on a CPU and 10ms on a GPU. TVN-2 (bottom) runs at 65ms on a CPU and 13ms on a GPU.
    CPU runtime of TinyVideoNet models compared to prior models (left) and runtime vs. model accuracy of TinyVideoNets compared to (2+1)D ResNet models (right). Note that TinyVideoNets take a part of this time-accuracy space where no other models exist, i.e., extremely fast but still accurate.

  • 相关阅读:
    C++指针详解
    C++中#include包含头文件带 .h 和不带 .h 的区别
    #if的使用说明
    非常简单的语音朗读功能
    C#基础笔记(第十一天)
    C#基础笔记(第十天)
    手机管理系统
    编程书籍大集合
    centos 安装多实例数据库
    Python3 网络爬虫(请求库的安装)
  • 原文地址:https://www.cnblogs.com/wangxiaocvpr/p/11706640.html
Copyright © 2011-2022 走看看