CVPR 2021

CVPR 2021 Accepted Papers and Code, Organized by Topic
Categories:
1. Detection
    Image Object Detection
    Video Object Detection
    Activity Detection
    Anomaly Detection

2. Image Segmentation
    Panoptic Segmentation
    Semantic Segmentation
    Instance Segmentation

3. Human Pose Estimation
4. Face
5. Object Tracking
6. Medical Imaging
7. Neural Architecture Search (NAS)
8. GAN/Generative/Adversarial
9. Super Resolution
10. Image Restoration
11. Image Editing
12. Image Translation
13. 3D Vision
    3D Point Cloud
    3D Reconstruction

14. Neural Network Architecture
    Transformer
    Graph Neural Network (GNN)

15. Data Processing
    Data Augmentation
    Normalization (Batch Normalization)
    Image Clustering

16. Model Compression
    Knowledge Distillation

17. Model Evaluation
18. Dataset
19. Active Learning
20. Few-shot Learning
21. Continual Learning/Life-long Learning

Original link: https://blog.csdn.net/Extremevision/article/details/114259180

Categories:
1. Detection
    Image Object Detection
    Video Object Detection
    3D Object Detection
    Human-Object Interaction (HOI) Detection
    Camouflaged Object Detection
    Rotation Object Detection
    Saliency Object Detection
    Anomaly Detection in Images
    Keypoint Detection

2. Segmentation
    Image Segmentation
    Panoptic Segmentation
    Semantic Segmentation
    Instance Segmentation
    Superpixel
    Video Object Segmentation
    Matting
    Dense Prediction

3. Image Processing
    Super Resolution
    Image Restoration/Enhancement
    Image Shadow Removal/Image Reflection Removal
    Image Denoising/Deblurring/Deraining/Dehazing
    Image Editing/Image Inpainting
    Image Translation
    Image Quality Assessment
    Style Transfer

4. Estimation
    Pose Estimation
    Gesture Estimation
    Optical Flow/Pose/Motion Estimation
    Depth Estimation

5. Image & Video Retrieval/Video Understanding
    Action/Activity Recognition, Detection, and Segmentation
    Re-Identification
    Image/Video Captioning

6. Face
    Face Recognition/Detection
    Face Generation/Face Synthesis/Face Forgery/Face Editing
    Face Anti-Spoofing

7. 3D Vision
    Point Cloud
    3D Reconstruction

8. Object Tracking
9. Medical Imaging
10. Text Detection/Recognition
11. Remote Sensing Image
12. GAN/Generative/Adversarial
13. Image Generation/Image Synthesis
    View Synthesis

14. Scene Graph
    Scene Graph Generation
    Scene Graph Prediction
    Scene Graph Understanding

15. Visual Reasoning/VQA
16. Neural Network Structure Design
    Transformer
    Graph Neural Network (GNN)

17. Model Compression
    Knowledge Distillation
    Pruning
    Quantization

18. Model Training/Generalization
    Noisy Label

19. Model Evaluation
20. Neural Architecture Search (NAS)
21. Data Processing
    Data Augmentation
    Representation Learning
    Normalization/Regularization (Batch Normalization)
    Image Clustering
    Image Compression
    Anomaly Detection

22. Active Learning
23. Few-shot/Zero-shot/Meta Learning
24. Continual Learning/Life-long Learning
25. Transfer Learning/Domain Adaptation
26. Metric Learning
27. Contrastive Learning
28. Reinforcement Learning
29. Meta Learning
30. Dataset

Detection
Image Object Detection

[21] DAP: Detection-Aware Pre-training with Weak Supervision

[20] Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection

[19] Scale-aware Automatic Augmentation for Object Detection

[18] Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection

[17] OTA: Optimal Transport Assignment for Object Detection

[16] Distilling Object Detectors via Decoupled Features

[15] I^3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors

[14] Robust and Accurate Object Detection via Adversarial Learning

[13] You Only Look One-level Feature

[12] End-to-End Object Detection with Fully Convolutional Network

Commentary: Drop the Transformer; an FCN can also do end-to-end detection

[11] FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding

[10] Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection

Commentary: Generalized Focal Loss V2 in plain language

[9] MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection

[8] OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

[7] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

[6] General Instance Distillation for Object Detection

[5] Instance Localization for Self-supervised Detection Pretraining

[4] Multiple Instance Active Learning for Object Detection

[3] Towards Open World Object Detection

[2] Positive-Unlabeled Data Purification in the Wild for Object Detection

[1] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Commentary: An unsupervised pre-trained detector

Video Object Detection

[3] Depth from Camera Motion and Object Detection

[2] There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

[1] Dogfight: Detecting Drones from Drone Videos

3D Object Detection

[10] GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection

[9] Delving into Localization Errors for Monocular 3D Object Detection

[8] Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection

[7] LiDAR R-CNN: An Efficient and Universal 3D Object Detector

[6] M3DSSD: Monocular 3D Single Stage Object Detector

[5] MonoRUn: Monocular 3D Object Detection by Self-Supervised Reconstruction and Uncertainty Propagation

[4] ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection

[3] Center-based 3D Object Detection and Tracking

[2] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection

[1] Categorical Depth Distribution Network for Monocular 3D Object Detection

Human-Object Interaction (HOI) Detection

[4] Detecting Human-Object Interaction via Fabricated Compositional Learning

[3] Reformulating HOI Detection as Adaptive Set Prediction

[2] QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information

[1] End-to-End Human Object Interaction Detection with HOI Transformer

Camouflaged Object Detection

[1] Simultaneously Localize, Segment and Rank the Camouflaged Objects

Rotation Object Detection

[2] ReDet: A Rotation-equivariant Detector for Aerial Object Detection

[1] Dense Label Encoding for Boundary Discontinuity Free Rotation Detection
Commentary: DCL, a new method for rotated object detection

Saliency Object Detection

[1] Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion

Anomaly Detection in Images

[1] Multiresolution Knowledge Distillation for Anomaly Detection

Keypoint Detection

[1] Skeleton Merger: an Unsupervised Aligned Keypoint Detector

Segmentation
Image Segmentation

[7] DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation

[6] Self-Guided and Cross-Guided Learning for Few-Shot Segmentation

[5] Locate then Segment: A Strong Pipeline for Referring Image Segmentation

[4] Boundary IoU: Improving Object-Centric Image Segmentation Evaluation

[3] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

[2] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

[1] PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation

Panoptic Segmentation

[3] Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation

[2] Cross-View Regularization for Domain Adaptive Panoptic Segmentation

[1] 4D Panoptic LiDAR Segmentation

Semantic Segmentation

[15] PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in Clustering

[14] Source-Free Domain Adaptation for Semantic Segmentation

[13] RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening

[12] Coarse-to-Fine Domain Adaptive Semantic Segmentation with Photometric Alignment and Category-Center Regularization

[11] Cross-Dataset Collaborative Learning for Semantic Segmentation

[10] BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation

[9] Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations

[8] Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion

[7] Capturing Omni-Range Context for Omnidirectional Segmentation

[6] MetaCorrection: Domain-aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation

[5] Learning Statistical Texture for Semantic Segmentation

[4] Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation

[3] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation

[2] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

[1] PLOP: Learning without Forgetting for Continual Semantic Segmentation

Instance Segmentation

[4] Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency

[3] Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers

[2] BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation

[1] End-to-End Video Instance Segmentation with Transformers

Superpixel

[1] Learning the Superpixel in a Non-iterative and Lifelong Manner

Video Object Segmentation

[3] Efficient Regional Memory Network for Video Object Segmentation

[2] Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

[1] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion

Matting

[1] Real-Time High Resolution Background Matting

Dense Prediction

[3] Generic Perceptual Loss for Modeling Structured Output Dependencies

[2] Densely connected multidilated convolutional networks for dense prediction tasks

[1] Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Estimation
Human Pose Estimation

[7] Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors

[6] Graph Stacked Hourglass Networks for 3D Human Pose Estimation

[5] From Synthetic to Real: Unsupervised Domain Adaptation for Animal Pose Estimation

[4] DCPose: Deep Dual Consecutive Network for Human Pose Estimation

[3] Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

[2] CanonPose: Self-supervised Monocular 3D Human Pose Estimation in the Wild

[1] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers

Gesture Estimation

[3] Read and Attend: Temporal Localisation in Sign Language Videos
[project](https://www.robots.ox.ac.uk/~vgg/research/bslattend/)

[2] Skeleton Based Sign Language Recognition Using Whole-body Keypoints

[1] Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration

Optical Flow/Pose/Motion Estimation

[4] FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism

[3] GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

[2] Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments

[1] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

Depth Estimation

[4] Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging

[3] Generalizing to the Open World: Deep Visual Odometry with Online Adaptation

[2] Beyond Image to Depth: Improving Depth Prediction using Echoes

[1] PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation with Neural Positional Encoding and Distilled Matting Loss

Image Processing

[1] Invertible Image Signal Processing

Super Resolution

[5] Flow-based Kernel Prior with Application to Blind Super-Resolution

[4] ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic
Commentary: ClassSR accelerates image super-resolution, cutting computation by 50% without degrading SR performance

[3] Learning Continuous Image Representation with Local Implicit Image Function

[2] Data-Free Knowledge Distillation For Image Super-Resolution (an SR version of the DAFL algorithm)

[1] AdderSR: Towards Energy Efficient Image Super-Resolution (applies adder networks to image super-resolution)

Commentary: Huawei open-sources adder neural networks

Image Restoration/Enhancement

[2] NeX: Real-time View Synthesis with Neural Basis Expansion

[1] Multi-Stage Progressive Image Restoration

Image Shadow Removal/Image Reflection Removal

[3] From Shadow Generation to Shadow Removal

[2] Robust Reflection Removal with Reflection-free Flash-only Cues

[1] Auto-Exposure Fusion for Single-Image Shadow Removal

Image Denoising/Deblurring/Deraining/Dehazing

[3] Semi-Supervised Video Deraining with Dynamic Rain Generator

[2] ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring

[1] DeFMO: Deblurring and Shape Recovery of Fast Moving Objects

Image Editing/Inpainting

[7] TransFill: Reference-guided Image Inpainting by Merging Multiple Color and Spatial Transformations

[6] DeFLOCNet: Deep Image Editing via Flexible Low-level Controls

[5] Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE

[4] PISE: Person Image Synthesis and Editing with Decoupled GAN

[3] PD-GAN: Probabilistic Diverse GAN for Image Inpainting

[2] Anycost GANs for Interactive Image Synthesis and Editing

[1] Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

Image Translation

[6] ReMix: Towards Image-to-Image Translation with Limited Data

[5] Closing the Loop: Joint Rain Generation and Removal via Disentangled Image Translation

[4] CoMoGAN: continuous model-guided image-to-image translation

[3] Spatially-Adaptive Pixelwise Networks for Fast Image Translation

[2] Image-to-image Translation via Hierarchical Style Disentanglement
Commentary: Hierarchical style disentanglement makes multi-attribute face manipulation controllable at last (CVPR 2021 Oral)

[1] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

Image Quality Assessment

[1] SDD-FIQA: Unsupervised Face Image Quality Assessment with Similarity Distribution Distance

Style Transfer

[2] ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows

[1] Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes

Face

[4] Unsupervised Disentanglement of Linear-Encoded Facial Semantics

[3] High-fidelity Face Tracking for AR/VR via Deep Lighting Adaptation

[2] Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes

[1] SDD-FIQA: Unsupervised Face Image Quality Assessment with Similarity Distribution Distance

Face Recognition/Detection

[6] Affective Processes: stochastic modelling of temporal context for emotion and facial expression recognition

[5] Cross-Domain Similarity Learning for Face Recognition in Unseen Domains

[4] MagFace: A Universal Representation for Face Recognition and Quality Assessment

[3] CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement

[2] A 3D GAN for Improved Large-pose Facial Recognition

[1] WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition

Face Generation/Face Synthesis/Face Forgery/Face Editing

[9] Face Forensics in the Wild (a face forgery dataset)

[8] High-Fidelity and Arbitrary Face Editing

[7] Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection

[6] 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction

[5] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

[4] Image-to-image Translation via Hierarchical Style Disentanglement

[3] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

[2] PISE: Person Image Synthesis and Editing with Decoupled GAN

[1] Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders

Face Anti-Spoofing

[3] MagDR: Mask-guided Detection and Reconstruction for Defending Deepfakes

[2] Cross Modal Focal Loss for RGBD Face Anti-Spoofing

[1] Multi-attentional Deepfake Detection

Object Tracking

[11] Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark

[10] Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking

[9] IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking

[8] Transformer Tracking

[7] Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

[6] Track to Detect and Segment: An Online Multi-Object Tracker

[5] Learning a Proposal Classifier for Multiple Object Tracking

[4] Center-based 3D Object Detection and Tracking

[3] HPS: localizing and tracking people in large 3D scenes from wearable sensors

[2] Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking

[1] Rotation Equivariant Siamese Networks for Tracking

Image & Video Retrieval/Video Understanding

[6] Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers

[5] StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

[4] More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

[3] Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

[2] On Semantic Similarity in Video Retrieval

[1] QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval

Action/Activity Recognition, Detection, Segmentation, and Localization

[15] CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning

[14] Recognizing Actions in Videos from Unseen Viewpoints

[13] No frame left behind: Full Video Action Recognition

[12] Learning Salient Boundary Feature for Anchor-free Temporal Action Localization

[11] Temporal Context Aggregation Network for Temporal Action Proposal Refinement

[10] The Blessings of Unlabeled Background in Untrimmed Videos

[9] Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

[8] Coarse-Fine Networks for Temporal Activity Detection in Videos

[7] Learning Discriminative Prototypes with Dynamic Time Warping

[6] Temporal Action Segmentation from Timestamp Supervision

[5] ACTION-Net: Multipath Excitation for Action Recognition

[4] BASAR: Black-box Attack on Skeletal Action Recognition

[3] Understanding the Robustness of Skeleton-based Action Recognition under Adversarial Attack

[2] Temporal Difference Networks for Efficient Action Recognition

[1] Behavior-Driven Synthesis of Human Dynamics

Re-Identification

[7] Group-aware Label Transfer for Domain Adaptive Person Re-identification

[6] Lifelong Person Re-Identification via Adaptive Knowledge Accumulation

[5] Anchor-Free Person Search

[4] Intra-Inter Camera Similarity for Unsupervised Person Re-Identification

[3] Watching You: Global-guided Reciprocal Learning for Video-based Person Re-identification

[2] Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification

[1] Meta Batch-Instance Normalization for Generalizable Person Re-Identification

Image/Video Captioning

[6] Human-like Controllable Image Captioning with Verb-specific Semantic Roles

[5] Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos

[4] Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles

[3] Open-book Video Captioning with Retrieve-Copy-Generate Network

[2] VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

[1] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

Medical Imaging

[12] DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation

[11] Confluent Vessel Trees with Accurate Bifurcations

[10] Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ4Net

[9] XProtoNet: Diagnosis in Chest Radiography with Global and Local Explanations

[8] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

[7] Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles

[6] Discovering Hidden Physics Behind Transport Dynamics

[5] DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images

[4] Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning

[3] 3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management

[2] Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies

[1] Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-constrained Optimization

Text Detection/Recognition

[2] Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

[1] What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels

Remote Sensing Image

[2] PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation

[1] Deep Gradient Projection Networks for Pan-sharpening (super-resolution)

Neural Architecture Search (NAS)

[8] Dynamic Slimmable Network

[7] Prioritized Architecture Sampling with Monto-Carlo Tree Search

[6] Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator

[5] Contrastive Neural Architecture Search with Neural Architecture Comparators

[4] OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

[3] AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

[2] ReNAS: Relativistic Evaluation of Neural Architecture Search (the importance of ranking loss in NAS predictors)

[1] HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens (reducing the cost of NAS)

GAN/Generative/Adversarial

[16] LiBRe: A Practical Bayesian Approach to Adversarial Detection

[15] DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network

[14] Diverse Semantic Image Synthesis via Probability Distribution Modeling

[13] HumanGAN: A Generative Model of Humans Images

[12] MetaSimulator: Simulating Unknown Target Models for Query-Efficient Black-box Attacks

[11] Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders

[10] LOHO: Latent Optimization of Hairstyles via Orthogonalization

[9] PISE: Person Image Synthesis and Editing with Decoupled GAN

[8] Closed-Form Factorization of Latent Semantics in GANs

[7] PD-GAN: Probabilistic Diverse GAN for Image Inpainting

[6] Anycost GANs for Interactive Image Synthesis and Editing

[5] Efficient Conditional GAN Transfer with Knowledge Propagation across Classes

[4] Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

[3] Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs

[2] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

[1] A 3D GAN for Improved Large-pose Facial Recognition

Image Generation/Image Synthesis

[14] VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization

[13] A Closer Look at Fourier Spectrum Discrepancies for CNN-generated Images Detection

[12] Semi-supervised Synthesis of High-Resolution Editable Textures for 3D Humans

[11] Few-Shot Human Motion Transfer by Personalized Geometry and Texture Modeling

[10] Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ4Net

[9] Context-Aware Layout to Image Generation with Enhanced Object Appearance

[8] DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network

[7] HumanGAN: A Generative Model of Humans Images

[6] PISE: Person Image Synthesis and Editing with Decoupled GAN

[5] SMPLicit: Topology-aware Generative Model for Clothed People

[4] Diversifying Sample Generation for Data-Free Quantization

[3] Diverse Semantic Image Synthesis via Probability Distribution Modeling

[2] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

[1] Anycost GANs for Interactive Image Synthesis and Editing

View Synthesis

[4] Layout-Guided Novel View Synthesis from a Single Indoor Panorama

[3] NeX: Real-time View Synthesis with Neural Basis Expansion

[2] ID-Unet: Iterative Soft and Hard Deformation for View Synthesis

[1] Self-Supervised Visibility Learning for Novel View Synthesis

3D Vision

[2] A Deep Emulator for Secondary Motion of 3D Characters

[1] 3D CNNs with Adaptive Temporal Feature Resolutions

Point Cloud

[19] Denoise and Contrast for Category Agnostic Shape Completion

[18] Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation

[17] ReAgent: Point Cloud Registration using Imitation and Reinforcement Learning

[16] Equivariant Point Network for 3D Point Cloud Analysis

[15] PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds

[14] Skeleton Merger: an Unsupervised Aligned Keypoint Detector

[13] Cycle4Completion: Unpaired Point Cloud Completion using Cycle Transformation with Missing Region Coding

[12] Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion

[11] How Privacy-Preserving are Line Clouds? Recovering Scene Details from 3D Lines

[10] PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency

[9] Robust Point Cloud Registration Framework Based on Deep Graph Matching

[8] TPCN: Temporal Point Cloud Networks for Motion Forecasting

[7] PointGuard: Provably Robust 3D Point Cloud Classification

[6] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

[5] SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration

[4] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

[3] Diffusion Probabilistic Models for 3D Point Cloud Generation

[2] Style-based Point Generator with Adversarial Rendering for Point Cloud Completion

[1] PREDATOR: Registration of 3D Point Clouds with Low Overlap

3D Reconstruction

[8] Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction

[7] POSEFusion: Pose-guided Selective Fusion for Single-view Human Volumetric Capture

[6] Deep Implicit Moving Least-Squares Functions for 3D Reconstruction

[5] Model-based 3D Hand Reconstruction via Self-Supervised Learning

[4] 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction

[3] Learning Compositional Representation for 4D Captures with Neural ODE

[2] SMPLicit: Topology-aware Generative Model for Clothed People

[1] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers

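Model Compression

[2] Dynamic Slimmable Network

[1] Learning Student Networks in the Wild (a model compression and acceleration technique that needs no original training data)

Commentary: Huawei Noah's Ark Lab proposes a data-free network compression technique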

Knowledge Distillation

[9] Complementary Relation Contrastive Distillation

[8] Distilling Object Detectors via Decoupled Features

[7] Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation

[6] Knowledge Evolution in Neural Networks

[5] Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning

[4] Teachers Do More Than Teach: Compressing Image-to-Image Models (https://arxiv.org/abs/2103.03467)

[3] General Instance Distillation for Object Detection

[2] Multiresolution Knowledge Distillation for Anomaly Detection

[1] Distilling Object Detectors via Decoupled Features (distillation that separates foreground and background)

Pruning

[2] Neural Response Interpretation through the Lens of Critical Pathways

[1] Manifold Regularized Dynamic Network Pruning

Quantization

[2] Zero-shot Adversarial Quantization

[1] Learnable Companding Quantization for Accurate Low-bit Neural Networks

Neural Network Structure Design

[11] Convolutional Hough Matching Networks

[10] Capsule Network is Not More Robust than Convolutional Network

[9] Diverse Branch Block: Building a Convolution as an Inception-like Unit

[8] Scaling Local Self-Attention For Parameter Efficient Visual Backbones

[7] Fast and Accurate Model Scaling

[6] Involution: Inverting the Inherence of Convolution for Visual Recognition

[5] Inception Convolution with Efficient Dilation Search
Commentary: Inception convolution

[4] Coordinate Attention for Efficient Mobile Network Design

[3] Rethinking Channel Dimensions for Efficient Model Design

[2] Inverting the Inherence of Convolution for Visual Recognition

[1] RepVGG: Making VGG-style ConvNets Great Again

Commentary: RepVGG, a minimalist architecture with SOTA performance that makes VGG-style models great again

Transformer

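[3] Transformer Interpretability Beyond Attention Visualization

[2] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Commentary: an unsupervised pre-trained detector

[1] Pre-Trained Image Processing Transformer (a pre-trained model for low-level vision)
Commentary: Transformer takes another stronghold! It tops several low-level task leaderboards; Peking University, Huawei and others jointly propose the pre-trained model IPT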

Graph Neural Networks (GNN)

[2] Quantifying Explainers of Graph Neural Networks in Computational Pathology

[1] Sequential Graph Convolutional Network for Active Learning

Data Processing
Data Augmentation

[2] AutoDO: Robust AutoAugment for Biased Data with Label Noise via Scalable Probabilistic Implicit Differentiation

[1] KeepAugment: A Simple Information-Preserving Data Augmentation

Representation Learning

[7] Learning by Aligning Videos in Time (video representation)

[6] Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting

[5] Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks

[4] VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples

[3] Spatially Consistent Representation Learning

[2] Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning

[1] VirTex: Learning Visual Representations from Textual Annotations

Normalization/Regularization

[3] Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

[2] Meta Batch-Instance Normalization for Generalizable Person Re-Identification

[1] Representative Batch Normalization with Feature Calibration

Image Clustering

[4] Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes

[3] COMPLETER: Incomplete Multi-view Clustering via Contrastive Prediction

[2] Improving Unsupervised Image Clustering With Robust Learning

[1] Reconsidering Representation Alignment for Multi-view Clustering

Image Compression

[4] Learning Scalable ℓ∞-constrained Near-lossless Image Compression via Joint Lossy Image and Residual Compression

[3] Checkerboard Context Model for Efficient Learned Image Compression

[2] Slimmable Compressive Autoencoders for Practical Neural Image Compression

[1] Attention-guided Image Compression by Deep Reconstruction of Compressive Sensed Saliency Skeleton

Anomaly Detection

[1] Learning Placeholders for Open-Set Recognition

Model Training/Generalization

[4] Student-Teacher Learning from Clean Inputs to Noisy Inputs

[3] Uncertainty-guided Model Generalization to Unseen Domains

[2] Knowledge Evolution in Neural Networks

[1] PGT: A Progressive Method for Training Models on Long Videos

Noisy Label

[1] Partially View-aligned Representation Learning with Noise-robust Contrastive Loss

Model Evaluation

[1] Are Labels Necessary for Classifier Accuracy Evaluation? (can a model be tested on a test set that has no labels?)

Dataset

[6] Face Forensics in the Wild (a face forgery dataset)

[5] Benchmarking Representation Learning for Natural World Image Collections (natural image classification)

[4] Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark

[3] 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction

[2] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

[1] Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels

Active Learning

[3] VaB-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning

[2] Multiple Instance Active Learning for Object Detection

[1] Sequential Graph Convolutional Network for Active Learning

Few-shot Learning/Zero-shot Learning

[9] Self-Guided and Cross-Guided Learning for Few-Shot Segmentation

[8] Contrastive Embedding for Generalized Zero-Shot Learning

[7] Learning Dynamic Alignment via Meta-filter for Few-shot Learning

[6] Goal-Oriented Gaze Estimation for Zero-Shot Learning

[5] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

[4] Counterfactual Zero-Shot and Open-Set Visual Recognition

[3] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

[2] Few-shot Open-set Recognition by Transformation Consistency

[1] Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning

Continual Learning/Life-long Learning

[5] Rectification-based Knowledge Retention for Continual Learning

[4] Rainbow Memory: Continual Learning with a Memory of Diverse Samples

[3] Efficient Feature Transformations for Discriminative and Generative Continual Learning

[2] Rainbow Memory: Continual Learning with a Memory of Diverse Samples

[1] Learning the Superpixel in a Non-iterative and Lifelong Manner

Scene Graph
Scene Graph Generation

[3] Fully Convolutional Scene Graph Generation

[2] Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation

[1] Exploiting Edge-Oriented Reasoning for 3D Point-based Scene Graph Analysis

Scene Graph Prediction

[1] SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences

Scene Graph Understanding

[2] Bidirectional Projection Network for Cross Dimension Scene Understanding

[1] Monte Carlo Scene Search for 3D Scene Understanding

Visual Reasoning/VQA

[6] Domain-robust VQA with diverse datasets and methods but no target labels

[5] AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning

[4] Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution

[3] ACRE: Abstract Causal REasoning Beyond Covariation

[2] TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events

[1] Transformation Driven Visual Reasoning

Transfer Learning/Domain Adaptation

[15] Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation

[14] Progressive Domain Expansion Network for Single Domain Generalization

[13] Dynamic Domain Adaptation for Efficient Inference

[12] Adaptive Methods for Real-World Domain Generalization

[11] OTCE: A Transferability Metric for Cross-Domain Cross-Task Representations

[10] DRANet: Disentangling Representation and Adaptation Networks for Unsupervised Cross-Domain Adaptation

[9] MetaAlign: Coordinating Domain Alignment and Classification for Unsupervised Domain Adaptation

[8] Transferable Semantic Augmentation for Domain Adaptation

[7] Dynamic Transfer for Multi-Source Domain Adaptation

[6] Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation

[5] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation

[4] Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning

[3] Domain Generalization via Inference-time Label-Preserving Target Projections

[2] MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing

[1] FSDR: Frequency Space Domain Randomization for Domain Generalization

Metric Learning

[3] Noise-resistant Deep Metric Learning with Ranking-based Instance Selection

[2] Embedding Transfer with Label Relaxation for Improved Metric Learning

[1] Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales

Contrastive Learning

[3] Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification

[2] AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations from Self-Trained Negative Adversaries
Commentary: AdCo, adversarial contrastive learning

[1] Fine-grained Angular Contrastive Learning with Coarse Labels

Reinforcement Learning

[1] Unsupervised Learning for Robust Fitting: A Reinforcement Learning Approach

Meta Learning

[2] Meta-Mining Discriminative Samples for Kinship Verification

[1] MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition

REF

https://www.zhihu.com/topic/21674674/hot

https://zhuanlan.zhihu.com/p/354043252


Original article: https://www.cnblogs.com/emanlee/p/14847171.html