After accuracy on LFW surpassed human-level recognition, face recognition saw few significant breakthroughs, and research gradually turned to face recognition in video, face attribute learning and other directions. The number of face recognition papers accepted at the top CV conferences has also remained roughly steady.
Person re-identification, also known as pedestrian re-identification, uses computer vision to determine whether a specific pedestrian appears in an image or video sequence. It is widely regarded as a sub-problem of image retrieval: given an image of a pedestrian from one surveillance camera, retrieve images of the same pedestrian captured by other devices. It is designed to compensate for the limited field of view of fixed cameras and, combined with pedestrian detection and pedestrian tracking, can be widely applied in intelligent video surveillance, intelligent security and other fields.
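To make the retrieval view concrete, here is a minimal sketch of ranking a gallery against a query. It assumes each image has already been mapped to a feature vector; the toy 3-D features, the file names and the choice of cosine similarity are illustrative assumptions, not a method taken from this article.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_gallery(query_feature, gallery):
    """Return gallery image names sorted from most to least similar to the query."""
    scored = [(cosine_similarity(query_feature, feat), name)
              for name, feat in gallery.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]

# Hand-made toy features (assumption: a real system would use learned embeddings).
query = [0.9, 0.1, 0.0]
gallery = {
    "cam2_person_a.jpg": [0.8, 0.2, 0.1],
    "cam3_person_b.jpg": [0.1, 0.9, 0.2],
    "cam4_person_c.jpg": [0.0, 0.2, 0.9],
}

ranking = rank_gallery(query, gallery)
print(ranking)
```

The first entry of the ranking is the gallery image judged most likely to show the same person as the query; the rest of the list is what a human operator would scan next.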
Pedestrian re-identification is one of the main research directions in China, and the number of submissions is increasing year by year. Domestic groups are mainly at Tsinghua University, Peking University, Fudan University, University of Science and Technology of China, Sun Yat-sen University, the Chinese University of Hong Kong, Xi'an Jiaotong University, the Chinese Academy of Sciences, Xiamen University and other research institutions; overseas groups include the University of Technology Sydney, QMUL and UTSA. The number of pedestrian re-identification papers accepted at the top CV conferences has increased steadily.
1. Analysis from the perspective of top-conference acceptances
- Number of face recognition papers (keyword search for "face recognition", "face verification")
CVPR2013: 9
ICCV2013: 11
CVPR2014: 7
CVPR2015: 8
ICCV2015: 2
CVPR2016: 5
CVPR2017: 6
ICCV2017: 8
- Number of pedestrian re-identification papers (keyword search for "person re-identification", "person search", "person retrieval", "pedestrian retrieval")
CVPR2013: 1
ICCV2013: 3
CVPR2014: 3
CVPR2015: 7
ICCV2015: 8
CVPR2016: 11
CVPR2017: 14
ICCV2017: 16
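The two lists are easier to compare once summed per year. The sketch below copies the counts from the lists above and totals them by year; the variable and function names are illustrative. Note that ICCV is held only in odd years, so the even-year totals cover CVPR alone.

```python
# Accepted-paper counts copied from the two keyword-search lists above.
face = {"CVPR2013": 9, "ICCV2013": 11, "CVPR2014": 7, "CVPR2015": 8,
        "ICCV2015": 2, "CVPR2016": 5, "CVPR2017": 6, "ICCV2017": 8}
reid = {"CVPR2013": 1, "ICCV2013": 3, "CVPR2014": 3, "CVPR2015": 7,
        "ICCV2015": 8, "CVPR2016": 11, "CVPR2017": 14, "ICCV2017": 16}

def totals_by_year(counts):
    """Sum the counts of all venues that share the same 4-digit year suffix."""
    totals = {}
    for venue, n in counts.items():
        year = int(venue[-4:])
        totals[year] = totals.get(year, 0) + n
    return totals

face_by_year = totals_by_year(face)
reid_by_year = totals_by_year(reid)

for year in sorted(face_by_year):
    print(year, "face:", face_by_year[year], "re-id:", reid_by_year[year])
```

On these numbers, re-identification overtakes face recognition from 2015 onward (15 versus 10 papers), and by 2017 it has roughly twice as many accepted papers (30 versus 14).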
2. Similarities and differences between pedestrian re-identification and face recognition
- Pedestrian re-identification takes a photo of a person from one camera and checks whether that person appears again in other cameras. It has to cope with changes in camera viewpoint and in pedestrian pose.
- Face recognition is given a pair of face images and decides whether they show the same person, or searches a photo library for someone seen before.
- Like faces, pedestrian images have a consistent structure, but they are more complex and contain more parts that are hard to align.
- Large pedestrian datasets are hard to come by, unlike celebrity face datasets. Existing pedestrian re-identification datasets (DukeMTMC-reID, CUHK03, Market-1501, etc.) were recorded with real cameras on campuses. Early small datasets (VIPeR, etc.) can no longer provide a comprehensive assessment and are used less and less.
- Few pedestrian re-identification products have reached deployment, whereas a large number of face recognition applications already have.
- Multi-camera/cross-camera problems have received relatively little study so far.
These are the reasons why pedestrian re-identification is so popular in the academic world.
3. Large-scale datasets commonly used for pedestrian re-identification
- DukeMTMC-reID
The dataset was collected from eight different cameras at Duke University. It provides a training set and a test set: the training set contains 16,522 images and the test set contains 17,661 images. The training data covers 702 people, with an average of 23.5 training images per category (per person). It is currently the largest pedestrian re-identification dataset, and it also provides pedestrian attribute annotations (gender, long or short sleeves, backpack or not, etc.).
- Market-1501
The dataset was captured on the campus of Tsinghua University with images from six different cameras, one of which is low-resolution. It provides a training set and a test set: the training set contains 12,936 images and the test set contains 19,732 images. The images were automatically detected and cropped by a detector, so they include some detection errors (close to actual use). There are 751 people in the training set and 750 in the test set, so the training set has an average of 17.2 training images per category (per person).
- CUHK03
The dataset was collected at the Chinese University of Hong Kong with images from two different cameras. It provides both machine-detected and manually labeled bounding boxes. The machine-detected version contains some detection errors, which makes it closer to the actual situation. On average, each person has 9.6 training images.
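The per-person averages quoted for DukeMTMC-reID and Market-1501 follow directly from the training image and identity counts given above. A small check, using only those numbers (the dictionary layout and key names are illustrative):

```python
# Training-set statistics as quoted in the dataset descriptions above.
datasets = {
    "DukeMTMC-reID": {"train_images": 16522, "train_ids": 702},
    "Market-1501": {"train_images": 12936, "train_ids": 751},
}

# Average number of training images per identity, rounded to one decimal.
averages = {
    name: round(stats["train_images"] / stats["train_ids"], 1)
    for name, stats in datasets.items()
}

for name, avg in averages.items():
    print(f"{name}: {avg} training images per person")
```

CUHK03 is left out because the text gives only its average (9.6), not the underlying counts.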
4. Possible future research topics
- Transfer learning. A good face model learned on LFW may not work in practice, and the same is true of pedestrian re-identification. For example, study how to apply a model trained on Market-1501 to another dataset.
- Following the path of face recognition, extend pedestrian re-identification to attribute learning and video-based settings.
- Build larger and more difficult search libraries, such as Market-1501 + 500K (more distractor pedestrian candidates).
- Language-based pedestrian retrieval. Use natural language descriptions to find people.
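Why a larger search library such as Market-1501 + 500K is harder can be seen in a toy example: every added distractor that lies closer to the query than the true match pushes the true match down the ranking, and no distractor can push it up. The 2-D features and the Euclidean distance below are illustrative assumptions, not the protocol of any specific benchmark.

```python
import math

def rank_of_true_match(query, true_match, others):
    """1-based rank of the true match when the gallery is sorted by distance to the query."""
    d_true = math.dist(query, true_match)
    # Every other gallery item strictly closer than the true match outranks it.
    closer = sum(1 for feat in others if math.dist(query, feat) < d_true)
    return closer + 1

# Hand-made toy features (assumption: real features come from a trained model).
query = [0.0, 0.0]
true_match = [1.0, 0.0]
small_gallery = [[3.0, 0.0], [0.0, 4.0]]                # other identities
distractors = [[0.5, 0.5], [2.0, 2.0], [0.2, -0.3]]     # extra candidates

rank_small = rank_of_true_match(query, true_match, small_gallery)
rank_large = rank_of_true_match(query, true_match, small_gallery + distractors)
print("rank without distractors:", rank_small)
print("rank with distractors:", rank_large)
```

Here two of the three distractors happen to sit closer to the query than the true match, so the correct image drops from first to third place.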
Clustering visualization of pedestrian features, from [1]
[1] Zheng Z, Zheng L, Yang Y. A discriminatively learned CNN embedding for person re-identification. arXiv preprint arXiv:1611.05666, 2016.
More related articles:
- Summary of ICCV 2017 accepted papers on pedestrian retrieval/re-identification
- Training with GAN-generated images? Yes!
Zhihu column: Pedestrian recognition