Overview
- When you hear “search by image”, do you first think of the reverse image search features offered by Baidu, Google, Alibaba, and other search engines? In fact, you can build an image search system of your own: set up your own photo library, select a picture to search with, and get back a number of similar pictures from the library.
- To search for similar images, this image search system is built on inner-product distance calculation and the VGG16 image feature extraction model. The article consists of five parts: system overview, VGG model, data preparation, system deployment, and summary.
System setup
- Open-source repository: github.com/thirtyonele…
- Base environment installation (Python version 3.6.10):
git clone https://github.com/thirtyonelee/image-retrieval.git && cd image-retrieval
pip install -r requirements.txt
- Build the base index. By default, the index is saved to "<ROOT_DIR>/index/train.h5":
python index.py
- Try a similarity search. The default engine is the Numpy inner-product engine, and the default test image is "<ROOT_DIR>/data/test/001_accordion_image_0001.jpg":
python retrieval.py
or
python retrieval.py --engine=numpy --test_data=<ROOT_DIR>/data/test/001_accordion_image_0001.jpg
Sample output:
[{'name': b'001_accordion_image_0003.jpg', 'score': 0.902732}, {'name': b'001_accordion_image_0003.jpg', 'score': 0.872308}, {'name': b'002_anchor_image_0004.jpg', 'score': 0.865453}]
Each 'name' is the file name of an indexed image under <ROOT_DIR>/data/train.
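The sample output above returns each name as a bytes value. A minimal sketch of turning such results into readable file paths follows. The `ROOT_DIR` value and the `to_paths` helper are hypothetical and not part of the project; the result list is copied from the sample output.

```python
import posixpath

# Hypothetical checkout location; replace with your own <ROOT_DIR>.
ROOT_DIR = "/path/to/image-retrieval"

# Result list copied from the sample output of the search command.
results = [
    {'name': b'001_accordion_image_0003.jpg', 'score': 0.902732},
    {'name': b'001_accordion_image_0003.jpg', 'score': 0.872308},
    {'name': b'002_anchor_image_0004.jpg', 'score': 0.865453},
]


def to_paths(results, root_dir):
    """Decode each bytes name and pair its data/train path with the score."""
    paths = []
    for item in results:
        name = item['name']
        # Names may arrive as bytes; decode them before building a path.
        if isinstance(name, bytes):
            name = name.decode('utf-8')
        paths.append((posixpath.join(root_dir, 'data', 'train', name), item['score']))
    return paths


paths = to_paths(results, ROOT_DIR)
for path, score in paths:
    print("%.6f  %s" % (score, path))
```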
System architecture
- The system is mainly composed of two parts: image feature extraction model VGG and vector retrieval engine. The VGG model is responsible for converting images into vectors, and the vector retrieval engine is responsible for storing vectors and retrieving similar vectors. The specific architecture is shown in the figure below:
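The retrieval half of this architecture can be sketched with Numpy alone. This is an illustration under stated assumptions, not the project's code: random 512-dimensional vectors stand in for VGG features, and the `l2_normalize` and `search` helpers and the file names are hypothetical.

```python
import numpy as np


def l2_normalize(vectors):
    """Scale each row to unit length so the inner product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.maximum(norms, 1e-12)


def search(index_vectors, index_names, query_vector, top_k=3):
    """Return the top_k index entries with the largest inner product to the query."""
    scores = index_vectors @ query_vector
    order = np.argsort(-scores)[:top_k]
    return [{"name": index_names[i], "score": float(scores[i])} for i in order]


# Stand-in for extracted features: 100 random vectors (assumed size 512).
rng = np.random.default_rng(0)
index_names = ["img_%03d.jpg" % i for i in range(100)]
index_vectors = l2_normalize(rng.normal(size=(100, 512)))

# Query with a lightly perturbed copy of entry 42, which should rank first.
query = index_vectors[42] + 0.01 * rng.normal(size=512)
query = l2_normalize(query[None, :])[0]

results = search(index_vectors, index_names, query)
print(results)
```

Because the vectors are normalized, a larger inner product means a more similar image, so results are sorted in descending score order.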
VGG model
- VGGNet was developed by researchers from the Visual Geometry Group at the University of Oxford and Google DeepMind. It took first place in the localization task and second place in the classification task at ILSVRC-2014. Its main contribution was to show that model performance can be improved effectively by increasing network depth while using small (3×3) convolution kernels, and VGGNet generalizes well to other datasets. The VGG model performs better than GoogLeNet on multiple transfer learning tasks and is a preferred choice for extracting CNN features from images, so VGG is selected as the deep learning model in this scheme.
- VGGNet explores the relationship between the depth of a CNN and its performance. By repeatedly stacking small 3×3 convolution kernels and 2×2 max-pooling layers, VGGNet successfully constructed CNNs that are 16 to 19 layers deep. This scheme uses the VGG16 model provided by the Keras applications module (keras.applications).
- VGG official website: www.robots.ox.ac.uk/~vgg/resear…
- VGG GitHub: github.com/machrisaa/t…
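The case for stacking small 3×3 kernels can be checked with a little arithmetic. The sketch below is an illustration, not project code: the helper names and the channel count of 64 are assumptions. It compares three stacked 3×3 convolutions with a single 7×7 convolution covering the same receptive field.

```python
def stacked_receptive_field(num_layers, kernel_size=3):
    """Receptive field of num_layers stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)


def conv_weights(kernel_size, channels, num_layers=1):
    """Weight count (biases ignored) when input and output both have `channels` channels."""
    return num_layers * kernel_size * kernel_size * channels * channels


channels = 64  # illustrative channel count

rf_three_3x3 = stacked_receptive_field(3)              # same coverage as one 7x7 kernel
weights_stacked = conv_weights(3, channels, num_layers=3)
weights_single = conv_weights(7, channels)

print("receptive field of three 3x3 layers:", rf_three_3x3)
print("weights, three 3x3 layers:", weights_stacked)
print("weights, one 7x7 layer:", weights_single)
```

The stacked version sees the same 7×7 region with fewer weights and two extra nonlinearities, which is one reason depth with small kernels works well.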
Data preparation
- The demo uses the PASCAL VOC image set, which contains 11,530 images in 20 categories.
- Dataset size: 2 GB. Download address: host.robots.ox.ac.uk/pascal/VOC/…
Note: you can also test with other images. The system supports the .jpg and .png image formats.
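If you bring your own images, it can help to filter a folder listing down to the supported formats first. The following is a small sketch, not the project's own loader: the helper names are hypothetical, and the case-insensitive extension matching is an assumption.

```python
import os

# Formats named in the note above.
SUPPORTED_EXTENSIONS = {".jpg", ".png"}


def is_supported_image(filename):
    """Check the file extension, ignoring case (an assumption, see lead-in)."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in SUPPORTED_EXTENSIONS


def filter_supported(filenames):
    """Keep only the file names with a supported image extension."""
    return [name for name in filenames if is_supported_image(name)]


# Made-up file names for illustration.
candidates = ["cat.jpg", "dog.PNG", "notes.txt", "scan.gif"]
supported = filter_supported(candidates)
print(supported)
```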
Conclusion
- The project supports a variety of distance calculation engines, such as Numpy, Faiss, ES, and Milvus.
- Other distance functions can be plugged in, such as Euclidean distance (L2), Hamming distance, Jaccard distance, and Tanimoto distance.
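As a rough illustration of what a custom distance function could look like, here are plain-Python versions of three of the metrics listed above. The function names and the `rank` helper are hypothetical; how the project actually registers a custom metric is not shown in this article.

```python
import math


def euclidean_distance(a, b):
    """L2 distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def jaccard_distance(a, b):
    """One minus the ratio of intersection size to union size of two sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def rank(query, items, distance):
    """Sort (name, vector) pairs by ascending distance to the query."""
    return sorted(items, key=lambda item: distance(query, item[1]))


# Made-up vectors for illustration.
items = [("a.jpg", [0.0, 0.0]), ("b.jpg", [3.0, 4.0]), ("c.jpg", [1.0, 1.0])]
ranked = rank([0.9, 1.2], items, euclidean_distance)
print([name for name, _ in ranked])
```

Unlike the inner-product score, where larger means more similar, these are distances, so the closest match has the smallest value.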
Appendix
Reference 1: github.com/willard-yua…
Reference 2: github.com/zilliz-boot…