CVPR2021

My running summary is continually updated on GitHub:

Github.com/Sophia-11/A…

Reply "CVPR2021" to the [Computer Vision Alliance] official account

Reply "CVPR2019" to the [Computer Vision Alliance] official account

Reply "CVPR2020" to the [Computer Vision Alliance] official account

Image-to-Image Translation via Hierarchical Style Disentanglement Xinyang Li, Shengchuan Zhang, Jie Hu, Liujuan Cao, Xiaopeng Hong, Xudong Mao, Feiyue Huang, Yongjian Wu, Rongrong Ji arxiv.org/abs/2103.01…

FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation arxiv.org/pdf/2012.08…

Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition Stephen Hausler, Sourav Garg, Ming Xu, Michael Milford, Tobias Fischer arxiv.org/abs/2103.01…

Depth from Camera Motion and Object Detection Brent A. Griffin, Jason J. Corso arxiv.org/abs/2103.01…

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers arxiv.org/pdf/2011.09…

Multi-stage Progressive Image Restoration arxiv.org/abs/2102.02…

Weakly Supervised Learning of Rigid 3D Scene Flow arxiv.org/pdf/2102.08…

Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning Mamshad Nayeem Rizve, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah arxiv.org/abs/2103.01…

Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels arxiv.org/abs/2101.05…

Rethinking Channel Dimensions for Efficient Model Design arxiv.org/abs/2007.00…

Coarse-Fine Networks for Temporal Activity Detection in Videos Kumara Kahatapitiya, Michael S. Ryoo arxiv.org/abs/2103.01…

A Deep Emulator for Secondary Motion of 3D Characters Mianlun Zheng, Yi Zhou, Duygu Ceylan, Jernej Barbic arxiv.org/abs/2103.01…

Fair Attribute Classification through Latent Space de-biasing arxiv.org/abs/2012.01…

Auto-Exposure Fusion for Single-Image Shadow Removal Lan Fu, Changqing Zhou, Qing Guo, Felix Juefei-Xu, Hongkai Yu, Wei Feng, Yang Liu, Song Wang arxiv.org/abs/2103.01…

Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling arxiv.org/pdf/2102.06…

MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing Zhengjue Wang, Hao Zhang, Ziheng Cheng, Bo Chen, Xin Yuan arxiv.org/abs/2103.01…

AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling arxiv.org/pdf/2011.09…

Diffusion Probabilistic Models for 3D Point Cloud Generation Shitong Luo, Wei Hu arxiv.org/abs/2103.01…

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge Francisco Rivera Valverde, Juana Valeria Hurtado, Abhinav Valada arxiv.org/abs/2103.01…

Encoding in Style: A StyleGAN Encoder for Image-to-Image Translation arxiv.org/abs/2008.00…

Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph Xin Ye, Yezhou Yang arxiv.org/abs/2103.01…

RepVGG: Making VGG-style ConvNets Great Again arxiv.org/abs/2101.03…

Transformer Interpretability Beyond Attention Visualization arxiv.org/pdf/2012.09…

PREDATOR: Registration of 3D Point Clouds with Low Overlap arxiv.org/pdf/2011.13…

CVPR2020

Object detection

Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection Paper: arxiv.org/abs/1912.02… Code: github.com/sfzhang15/A…

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector Paper: arxiv.org/abs/1908.01…

Semi-supervised Semantic Image Segmentation with Self-correcting Networks

Deep Snake for Real-Time Instance Segmentation

CenterMask: Real-time Anchor-Free Instance Segmentation

SketchGCN: Semantic Sketch Segmentation with Graph Convolutional Networks

PolarMask: Single Shot Instance Segmentation with Polar Representation

xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation

BlendMask: Top-down Meets Bottom-up for Instance Segmentation

Face recognition

Towards Universal Representation Learning for Deep Face Recognition

Suppressing Uncertainties for Large-Scale Facial Expression Recognition

Face X-ray for More General Face Forgery Detection

ROAM: Recurrently Optimizing Tracking Model arxiv.org/abs/1907.12…

PF-Net: Point Fractal Network for 3D Point Cloud Completion

PointAugment: an Auto-augmentation Framework for Point Cloud Classification

Learning Multiview 3D Point Cloud Registration

C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds

RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion

In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction from 2D Landmarks

VIBE: Video Inference for Human Body Pose and Shape Estimation Code: github.com/mkocabas/VI…

Distribution-aware Coordinate Representation for Human Pose Estimation Code: github.com/ilovepose/D…

4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras

Optimal Least-Squares Solution to the Hand-Eye Calibration Problem

D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry

Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

Distribution-Aware Coordinate Representation for Human Pose Estimation

The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation

PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation

GAN

Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models

MSG-GAN: Multi-Scale Gradient GAN for Stable Image Synthesis

Robust Design of Deep Neural Networks against Adversarial Attacks based on Lyapunov Theory

Improved Few-Shot Visual Classification arxiv.org/pdf/1912.03

Meta-Transfer Learning for Zero-Shot Super-Resolution

Weakly supervised & unsupervised

Rethinking the Route Towards Weakly Supervised Object Localization (arxiv.org/abs/2002.11)

NestedVAE: Isolating Common Factors via Weak Supervision…

Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation

Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction

Visual Commonsense R-CNN arxiv.org/abs/2002.12…

GhostNet: More Features from Cheap Operations

Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral Distributions Paper: arxiv.org/abs/2003.01…

Model acceleration

GPU-Accelerated Mobile Multi-view Style Transfer

What it Thinks is Important is Important: Robustness Transfers through Input Gradients

Attentive Context Normalization for Robust Permutation-Equivariant Learning Paper: arxiv.org/abs/1907.02…

Bundle Adjustment on a Graph Processor arxiv.org/abs/2003.03…

… Transferring Dense Pose to Proximal Animal Classes

Representations, Metrics and Statistics for Shape Analysis of Elastic Graphs Paper: arxiv.org/abs/2003.00…

Learning in the Frequency Domain

Filter Grafting for Deep Neural Networks

ClusterFit: Improving Generalization of Visual Representations

Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction

Auto-Encoding Twin-Bottleneck Hashing Paper: arxiv.org/abs/2002.11…

Learning Representations by Predicting Bags of Visual Words

Holistically-Attracted Wireframe Parsing Paper: arxiv.org/abs/2003.01…

A General and Adaptive Robust Loss Function

A Characteristic Function Approach to Deep Implicit Generative Modeling

AdderNet: Do We Really Need Multiplications in Deep Learning? Paper: arxiv.org/pdf/1912.13…

12-in-1: Multi-Task Vision and Language Representation Learning

Making Better Mistakes: Leveraging Class Hierarchies with Deep Networks Paper: arxiv.org/abs/1912.09…

CARS: Continuous Evolution for Efficient Neural Architecture Search

Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training

GhostNet: More Features from Cheap Operations (an architecture that surpasses MobileNetV3, with strong performance on CPU) Code: github.com/iamhankai/g…

GhostNet outperforms other SOTA lightweight CNNs such as MobileNetV3 and FBNet.

AdderNet: Do We Really Need Multiplications in Deep Learning? (adder neural networks) Achieves very good performance with large-scale networks and datasets.

Frequency Domain Compact 3D Convolutional Neural Networks

A Semi-Supervised Assessor of Neural Architectures

Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection (NAS for detection: backbone, neck, and head)

CARS: Continuous Evolution for Efficient Neural Architecture Search (efficient NAS that combines the advantages of differentiable and evolutionary search and can output a Pareto front)

On Positive-Unlabeled Classification in GAN (PU + GAN)

Learning Multiview 3D Point Cloud Registration (3D point cloud)

Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

Action Modifiers: Learning from Adverbs in Instructional Videos Paper: arxiv.org/abs/1912.06617

PolarMask: Single Shot Instance Segmentation with Polar Representation zhuanlan.zhihu.com/p/84890413…

Rethinking Performance Estimation in Neural Architecture Search (NAS). In block-wise neural architecture search, the truly time-consuming part is performance estimation. This paper finds optimal parameters for block-wise NAS, making estimation faster and better correlated.

Distribution-Aware Coordinate Representation for Human Pose Estimation Code: github.com/ilovepose/D…

OCR

ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network

Self-training with Noisy Student improves ImageNet classification

Image Matching across Wide Baselines: From Paper to Practice

Towards Robust Image Classification Using Sequential Attention Models

Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications Code: github.com/bbrattoli/Z…

Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs

Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning

Object Relational Graph with Teacher-Recommended Learning for Video Captioning

Zooming Slow-Mo: Fast and Accurate One-stage Space-time Video Super-resolution

Blurry Video Frame Interpolation Paper: arxiv.org/abs/2002.12…

Hierarchical Conditional Relation Networks for Video Question Answering Address: arxiv.org/abs/2002.10…

Action Modifiers: Learning from Adverbs in Instructional Videos

Learning to Shade Hand-drawn Sketches arxiv.org/abs/2002.11…

Single Image Reflection Removal through Cascaded Refinement

Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data

Deep Image Harmonization via Domain Verification arxiv.org/abs/1911.13…

RoutedFusion: Learning Real-time Depth Map Fusion

Visual Commonsense R-CNN arxiv.org/abs/2002.12…

Out-of-distribution image detection arxiv.org/abs/2002.11…

Blurry Video Frame Interpolation arxiv.org/abs/2002.12…

Meta-transfer learning for zero-shot super-resolution arxiv.org/abs/2002.12…

3D indoor scene understanding arxiv.org/abs/2002.12…

Unbiased scene graph generation from biased training arxiv.org/abs/2002.11…

Auto-encoding twin-bottleneck hashing arxiv.org/abs/2002.11…

A social spatio-temporal graph convolutional neural network for human trajectory prediction arxiv.org/abs/2002.11…

Universal representation learning for deep face recognition arxiv.org/abs/2002.11…

Visual representation generalization arxiv.org/abs/1912.03…

Attenuate context bias arxiv.org/abs/2002.11…

Unsupervised reinforcement learning of transferable meta-skills arxiv.org/abs/1911.07…

Fast and accurate space-time video super-resolution arxiv.org/abs/2002.11…

Teacher-recommended learning for video captioning arxiv.org/abs/2002.11…

Rethinking the route towards weakly supervised object localization arxiv.org/abs/2002.11…

Learning a generic agent for vision-and-language navigation via pre-training arxiv.org/pdf/2002.10…

GhostNet lightweight neural network arxiv.org/pdf/1911.11…

AdderNet: Do we really need multiplications in deep learning? arxiv.org/pdf/1912.13…

CARS: Continuous evolution for efficient neural architecture search arxiv.org/abs/1909.04…

Single image reflection removal through cascaded refinement arxiv.org/abs/1911.06…

Filter grafting for deep neural networks arxiv.org/pdf/2001.05…

PolarMask: Unifying instance segmentation into the FCN framework arxiv.org/pdf/1909.13…

Semi-supervised semantic image segmentation arxiv.org/pdf/1811.07…

Defending against universal attacks through selective feature regeneration arxiv.org/pdf/1906.03…

Real-time fine-grained sketch-based image retrieval arxiv.org/abs/2002.10…

Ask the VQA model with sub-questions arxiv.org/abs/1906.03…

Learning a neural 3D texture space from 2D exemplars geometry.cs.ucl.ac.uk/projects/20…

NestedVAE: Isolating common factors via weak supervision arxiv.org/abs/2002.11…

Towards multi-future trajectory prediction arxiv.org/pdf/1912.06…

Robust image classification using sequential attention models arxiv.org/pdf/1912.02…