A Summary of Six Common Landmark Recognition Algorithms

Abstract: Landmark recognition, built on deep learning and large-scale image training, can recognize thousands of objects and scenes. It is widely used in photo-based recognition, popular science for preschool education, image classification, and similar scenarios. This article summarizes six landmark recognition algorithms.

This article is shared from the Huawei Cloud community post “Landmark Recognition Algorithm”, originally written by A Du.

I. 1st Place Solution to Google Landmark Retrieval 2020

Algorithm idea:

Step1: Train the initial embedding model on the cleaned GLDv2 dataset.

Step2: Use the full GLDv2 dataset for transfer learning, starting from the model obtained in Step1.

Step3: Gradually increase the training image size (512 × 512, 640 × 640, 736 × 736) to further improve model performance.

Step4: Increase the training loss weight of the cleaned data and continue training the model.

Step5: Model fusion.

Notes:

1. The backbone is EfficientNet + global average pooling, and cosine softmax loss is used in training.

2. To handle class imbalance, weighted cross-entropy is used.
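The two notes above can be combined in one loss function. The following is a minimal NumPy sketch, not the competition code: the function names, the scale value of 30, and the inverse-frequency weighting scheme are all assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def inverse_frequency_weights(labels, num_classes):
    """Assumed weighting scheme: rarer classes get larger weights (weights sum to num_classes)."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    weights = 1.0 / np.maximum(counts, 1.0)
    return weights * num_classes / weights.sum()

def weighted_cosine_softmax_loss(embeddings, centers, labels, class_weights, scale=30.0):
    """Cross-entropy over scaled cosine logits, weighted per class."""
    # Cosine logits: dot products of unit-length embeddings and class centers.
    logits = scale * l2_normalize(embeddings) @ l2_normalize(centers).T
    # Numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(labels)), labels]
    w = class_weights[labels]
    return float((w * nll).sum() / w.sum())
```

Here `embeddings` has shape (N, D), `centers` has shape (C, D), and `labels` holds integer class indices. Dividing by the sum of the weights keeps the loss on a comparable scale across batches with different class mixes.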

Experience summary:

1. Cleaned data helps the model converge quickly.

2. The full, larger dataset helps the model learn better feature representations.

3. Increasing training resolution can improve model performance.

II. 3rd Place Solution to Google Landmark Retrieval 2020

Algorithm idea:

Step1: CGLDv2 (the cleaned GLDv2) is used to train a base model, which then extracts global image features for the full GLDv2. DBSCAN clustering on these features is used to update image categories for data cleaning.

Step2: The Corner-Cutmix image augmentation method is used for model training.
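To illustrate the clustering part of Step1, here is a minimal DBSCAN sketch over cosine distance. It is a generic textbook version written for this article, not the authors' pipeline; the `eps` and `min_samples` values and the idea of treating noise points (label -1) as images to drop are assumptions.

```python
import numpy as np

def dbscan(features, eps=0.3, min_samples=3):
    """Minimal DBSCAN on cosine distance. Returns one label per row, -1 for noise."""
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    dist = 1.0 - x @ x.T
    # Each point's neighborhood includes the point itself.
    neighbors = [np.flatnonzero(row <= eps) for row in dist]
    labels = np.full(len(x), -1)
    visited = np.zeros(len(x), dtype=bool)
    cluster = 0
    for i in range(len(x)):
        if visited[i]:
            continue
        visited[i] = True
        if len(neighbors[i]) < min_samples:
            continue  # not a core point; stays noise unless a core point reaches it later
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border or core point joins the current cluster
            if visited[j]:
                continue
            visited[j] = True
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # only core points expand the cluster
        cluster += 1
    return labels
```

In a cleaning setup of this kind, the cluster labels could serve as updated image categories and the noise points could be discarded. For datasets of realistic size, a library implementation with an index structure would be needed, since this sketch builds the full pairwise distance matrix.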

Notes:

The backbones are ResNeSt200 and ResNet152, each followed by global average pooling (GAP) and a 1 × 1 convolution that reduces the feature to 512 dimensions. The loss function is cross-entropy.
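The pooling and dimension-reduction head described above can be sketched in a few lines of NumPy. The function name and the shapes in the comments are illustrative assumptions; the weights would come from training.

```python
import numpy as np

def embed(feature_map, proj_weight, proj_bias):
    """Map a (C, H, W) backbone feature map to a low-dimensional embedding.

    proj_weight has shape (out_dim, C), for example (512, C).
    """
    pooled = feature_map.mean(axis=(1, 2))  # global average pooling -> (C,)
    # On a 1 x 1 spatial map, a 1 x 1 convolution is a matrix-vector product.
    return proj_weight @ pooled + proj_bias
```

Because both averaging and a 1 × 1 convolution are linear, applying the convolution at every spatial position and then pooling gives the same result as pooling first; pooling first is simply cheaper.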

III. Two-stage Discriminative Re-ranking for Large-scale Landmark Retrieval

Algorithm idea:

Step1: Use CNN features to run a KNN search and obtain similar images.

Step2: Insert the images that Step1 missed into the result list and re-rank.
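Step1 amounts to a nearest-neighbor search over image embeddings. A minimal brute-force sketch with cosine similarity follows; the function name and the default `k` are assumptions, and a large database would normally use an approximate nearest-neighbor index instead.

```python
import numpy as np

def knn_search(query, database, k=5):
    """Return indices and cosine similarities of the k database rows closest to query."""
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    sims = db @ q
    top = np.argsort(-sims, kind="stable")[:k]  # highest similarity first
    return top, sims[top]
```

The re-ranking in Step2 would then operate on the index list this function returns.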

Notes:

1. The backbone is ResNet-101 + Generalized Mean (GeM) pooling, and the training loss is ArcFace loss.

2. Global and local features are used together to clean the GLDv2 dataset for subsequent model training.
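Both components named in note 1 are short enough to sketch. The code below is an illustrative NumPy version under assumed hyperparameters (`p=3`, `scale=30`, `margin=0.3`), not the paper's implementation. It also omits the special handling that production ArcFace code applies when the angle plus the margin exceeds π.

```python
import numpy as np

def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized mean pooling over the spatial axes of a (C, H, W) map.

    p = 1 gives average pooling; larger p moves the result toward max pooling.
    """
    x = np.clip(feature_map, eps, None)
    return (x ** p).mean(axis=(1, 2)) ** (1.0 / p)

def arcface_logits(embeddings, class_weights, labels, scale=30.0, margin=0.3):
    """Scaled cosine logits with an additive angular margin on the true class."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    theta = np.arccos(np.clip(e @ w.T, -1.0, 1.0))
    theta[np.arange(len(labels)), labels] += margin  # penalize only the true class
    return scale * np.cos(theta)
```

The logits returned by `arcface_logits` would be fed to an ordinary softmax cross-entropy. The margin forces each embedding to sit closer to its class center than plain softmax would require.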

IV. 2nd Place and 2nd Place Solution to Kaggle Landmark Recognition and Retrieval Competition 2019

Algorithm idea:

1. Train ResNet152, ResNet200, and other models separately on the full GLDv2 data, using ArcFace loss and N-pairs loss. Concatenate the features from each backbone and reduce them to 512 dimensions with PCA; the result is the global feature of the image.

2. Run a KNN search with the global features, then re-rank the search results using SURF, Hessian-Affine, and RootSIFT local features. Database-side augmentation (DBA) and average query expansion (AQE) are also applied.
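AQE and DBA both replace a feature vector with an average over its nearest neighbors. A minimal sketch follows, assuming L2-normalized features and unweighted averaging; the function names and the default `k` are illustrative, and weighted variants also exist.

```python
import numpy as np

def l2n(x):
    """L2-normalize along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def average_query_expansion(query, database, k=2):
    """AQE: average the query with its top-k database neighbors, then renormalize."""
    sims = database @ query
    top = np.argsort(-sims, kind="stable")[:k]
    return l2n(query + database[top].sum(axis=0))

def database_augmentation(database, k=2):
    """DBA: apply the same averaging to every database feature."""
    sims = database @ database.T
    # k + 1 because each item is normally its own nearest neighbor.
    top = np.argsort(-sims, axis=1, kind="stable")[:, :k + 1]
    return l2n(database[top].sum(axis=1))
```

The expanded query is then used for a second search, which tends to pull in images that resemble the first-round results as well as the original query.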

V. Detect-to-Retrieve: Efficient Regional Aggregation for Image Search

Algorithm idea:

Step1: A GLD dataset annotated with bounding boxes is used to train a Faster R-CNN or SSD detection model that extracts landmark boxes.

Step2: The D2R-R-ASMK method is proposed for local feature extraction and feature aggregation within the detected boxes.

Step3: Search the database using the aggregated features.

Notes:

1. D2R-R-ASMK is built on DELF local feature extraction and ASMK feature aggregation.

2. Results are best with about 4.05 regions extracted per image on average; the memory usage of the search increases accordingly.
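The idea of aggregating local features per detected region can be sketched as below. This is a deliberately simplified stand-in: it uses plain sum-and-normalize aggregation rather than ASMK, and takes keypoints, descriptors, and boxes as given inputs rather than computing them with DELF and a detector. All names are hypothetical.

```python
import numpy as np

def regional_descriptors(keypoints, descriptors, boxes):
    """Aggregate local descriptors inside each box into one unit-length vector.

    keypoints: (N, 2) array of (x, y); descriptors: (N, D);
    boxes: iterable of (x0, y0, x1, y1). Boxes with no keypoints are skipped.
    """
    out = []
    for x0, y0, x1, y1 in boxes:
        inside = ((keypoints[:, 0] >= x0) & (keypoints[:, 0] <= x1) &
                  (keypoints[:, 1] >= y0) & (keypoints[:, 1] <= y1))
        if not inside.any():
            continue
        agg = descriptors[inside].sum(axis=0)  # stand-in for ASMK aggregation
        norm = np.linalg.norm(agg)
        if norm > 0:
            out.append(agg / norm)
    return np.array(out)
```

Since one vector is stored per region, the index grows with the number of regions kept per image, which is the memory trade-off mentioned in note 2.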

VI. Unifying Deep Local and Global Features for Image Search

Algorithm idea:

Step1: Extract global and local features jointly with a single network.

Step2: Use the global features to retrieve the top 100 most similar images.

Step3: Use the local features to re-rank the retrieved results.

Notes:

1. GeM pooling and ArcFace Loss are used for global features.

2. RANSAC is used to verify local feature matches.
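The re-ranking in Step3 can be illustrated with a small RANSAC routine that counts how many keypoint matches agree with a single geometric transform. This sketch assumes an affine model, a 3-pixel tolerance, and that putative matches between the query and each candidate are already available; none of these details come from the paper.

```python
import numpy as np

def ransac_affine_inliers(src, dst, iters=200, tol=3.0, seed=0):
    """Count matches consistent with one 2D affine transform.

    src, dst: (N, 2) arrays of matched keypoint coordinates.
    """
    n = len(src)
    if n < 3:
        return 0
    rng = np.random.default_rng(seed)
    src_h = np.hstack([src, np.ones((n, 1))])  # homogeneous coordinates, (N, 3)
    best = 0
    for _ in range(iters):
        sample = rng.choice(n, size=3, replace=False)
        # Solve src_h[sample] @ A = dst[sample] for the 3 x 2 affine matrix A.
        A, *_ = np.linalg.lstsq(src_h[sample], dst[sample], rcond=None)
        errors = np.linalg.norm(src_h @ A - dst, axis=1)
        best = max(best, int((errors < tol).sum()))
    return best

def rerank(candidate_ids, matches):
    """Reorder candidates by descending inlier count; matches maps id -> (src, dst)."""
    scores = {i: ransac_affine_inliers(*matches[i]) for i in candidate_ids}
    return sorted(candidate_ids, key=lambda i: -scores[i])
```

Here `candidate_ids` would be the top-100 list from the global-feature search. Because `sorted` is stable, candidates with equal inlier counts keep their global-feature order.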
