DeepSort model training


Software and Hardware Environment

Ubuntu 18.04 64-bit
GTX 1070Ti
Anaconda with Python 3.7
PyTorch 1.6
CUDA 10.1

Preface

A previous article, Target tracking based on YOLOv5 and DeepSort, showed how to combine YOLOv5 and DeepSort for target detection and tracking. However, the tracking model shipped with the original project only covers pedestrians and is of no use for other targets. In this article, we train our own tracker model so that specific targets can be tracked.

For training the YOLOv5 detection model, refer to the previous post, YOLOv5 Model training.

Market-1501 data set

The Market-1501 data set was collected on the campus of Tsinghua University. It was photographed in summer, and was built and made public in 2015. It contains 1,501 pedestrians captured by six cameras (five HD cameras and one low-resolution camera), with 32,668 detected pedestrian bounding boxes. Each pedestrian is captured by at least two cameras and may have multiple images from one camera. The training set has 751 people and 12,936 images, an average of 17.2 images per person. The test set has 750 people and 19,732 images, an average of 26.3 images per person. The bounding boxes of the 3,368 query images were drawn by hand, while the bounding boxes in the gallery were produced by the DPM detector.

Data set directory structure

Market-1501-v15.09.15
├── bounding_box_test
├── bounding_box_train
├── gt_bbox
├── gt_query
├── query
└── readme.txt

There are five folders

bounding_box_test: used for testing
bounding_box_train: used for training
query: 750 identities, with one query image randomly selected for each camera
gt_query: the ground-truth annotations. For each query, the related images are marked as "good" or "junk". "Junk" images have no effect on retrieval accuracy, and they also include images taken by the same camera as the query
gt_bbox: hand-drawn bounding boxes, mainly used to judge whether a bounding box detected by DPM is good

Image naming rules

Take 0001_c1s1_000151_01.jpg as an example

0001 is the label of the person. There are 1,501 people, numbered from 0001 to 1501
c1 is the first camera (c stands for camera). There are 6 cameras
s1 is the first video sequence (s stands for sequence). Each camera has multiple video sequences
000151 is frame 000151 of c1s1. The video frame rate is 25 fps
01 is the first detection box on frame c1s1_000151. Because the DPM automatic detector is used, a frame may contain more than one pedestrian and therefore more than one box. 00 indicates that the box was annotated by hand
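
The naming rule above can be sketched as a small parser. This is only an illustration, not part of the data set's tooling: the names MarketName and parse_market_name are hypothetical, and the regular expression assumes that every file name follows exactly the pattern described above. The ID is kept as a string so that special IDs such as -1 are not altered.

```python
import re
from typing import NamedTuple


class MarketName(NamedTuple):
    person_id: str  # e.g. "0001"; kept as text so "-1" stays as-is
    camera: int     # camera number, 1 to 6
    sequence: int   # video sequence number for that camera
    frame: int      # frame number inside the sequence
    box: int        # detection box index; 0 means annotated by hand


# Assumed pattern: <id>_c<camera>s<sequence>_<frame>_<box>.jpg
_PATTERN = re.compile(r"^(-?\d+)_c(\d)s(\d+)_(\d+)_(\d+)\.jpg$", re.IGNORECASE)


def parse_market_name(filename: str) -> MarketName:
    """Split a Market-1501 style file name into its fields."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    person_id, camera, sequence, frame, box = match.groups()
    return MarketName(person_id, int(camera), int(sequence), int(frame), int(box))


if __name__ == "__main__":
    print(parse_market_name("0001_c1s1_000151_01.jpg"))
```

Only the person_id field matters for the training setup in this article; the other fields are parsed just to make the rule concrete.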

Data set download address:

Link: pan.baidu.com/s/1i9aiZx-E… Extract code: UP8X

How the Market data set is organized

The original data set structure looks like this

Market-1501-v15.09.15
├── bounding_box_test
├── bounding_box_train
├── gt_bbox
├── gt_query
├── query
└── readme.txt

bounding_box_train and bounding_box_test contain the image files directly, with no per-ID subfolders. For training, all images with the same ID (that is, the same person) need to be placed in one folder, with the ID used as the folder name. For example, all image files under bounding_box_train whose names start with 0002 are stored in a folder named 0002.

The same is done for bounding_box_test. The test set contains an ID of -1. I don't know why, but it doesn't affect training.

I wrote a simple script for the above operation

import os
import shutil
import sys

if __name__ == '__main__':
    # sys.argv[1] is the path of the Market-1501-v15.09.15 directory
    src = sys.argv[1]
    root = os.path.join(src, 'dataset')
    train_dir = os.path.join(root, 'train')
    test_dir = os.path.join(root, 'test')
    os.makedirs(train_dir)
    os.makedirs(test_dir)

    # process train
    for file in os.listdir(os.path.join(src, 'bounding_box_train')):
        print(file)
        person_id = file.split('_')[0]
        # create the ID folder if needed, then copy every image,
        # including the first one seen for each ID
        os.makedirs(os.path.join(train_dir, person_id), exist_ok=True)
        shutil.copy(os.path.join(src, 'bounding_box_train', file), os.path.join(train_dir, person_id))

    # process test
    for file in os.listdir(os.path.join(src, 'bounding_box_test')):
        person_id = file.split('_')[0]
        os.makedirs(os.path.join(test_dir, person_id), exist_ok=True)
        shutil.copy(os.path.join(src, 'bounding_box_test', file), os.path.join(test_dir, person_id))

Usage

python test.py Market-1501-v15.09.15

After the script finishes, a folder named dataset is generated under Market-1501-v15.09.15. Its structure is as follows

dataset/
├── train
│   ├── 0002
│   ├── 0007
│   ├── 0010
│   ├── 0011
│   ├── 0012
│   ├── 0020
│   ├── 0022
│   └── ...
└── test
    ├── 0000
    ├── 0001
    ├── 0003
    ├── 0004
    ├── 0005
    ├── 0006
    ├── 0008
    ├── 0009
    └── ...

The result is a data set that can be used for training directly, and the original files are left untouched.
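
Before training, it can be worth checking that nothing was lost during the reorganization. The following is a minimal sketch of such a check, under the assumption that every file in the flat source folder should end up in exactly one ID folder. The function names count_flat, count_per_id and check_split are hypothetical helpers, not part of the project.

```python
import os


def count_flat(src_dir: str) -> int:
    """Number of files directly inside src_dir (e.g. bounding_box_train)."""
    return sum(
        1 for name in os.listdir(src_dir)
        if os.path.isfile(os.path.join(src_dir, name))
    )


def count_per_id(split_dir: str) -> dict:
    """Map each ID folder under split_dir (e.g. dataset/train) to its file count."""
    counts = {}
    for person_id in sorted(os.listdir(split_dir)):
        id_dir = os.path.join(split_dir, person_id)
        if os.path.isdir(id_dir):
            counts[person_id] = len(os.listdir(id_dir))
    return counts


def check_split(src_dir: str, split_dir: str) -> bool:
    """True when the per-ID folders hold as many files as the flat source folder."""
    return count_flat(src_dir) == sum(count_per_id(split_dir).values())
```

For example, check_split('Market-1501-v15.09.15/bounding_box_train', 'Market-1501-v15.09.15/dataset/train') should return True if the copy was complete.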

The dependency setup is not repeated here; see the earlier article

git clone --recurse-submodules https://github.com/mikel-brostrom/Yolov5_DeepSort_Pytorch.git
cd Yolov5_DeepSort_Pytorch/deep_sort/deep_sort/deep

Next, copy the Market data set archive to Yolov5_DeepSort_Pytorch/deep_sort/deep_sort/deep and decompress it there. The data set can actually be placed anywhere, because its location is passed in as a parameter.

Finally, one change is needed. Edit model.py and change

def __init__(self, num_classes=751, reid=False):

to

def __init__(self, num_classes=752, reid=False):

Then you can start training

python train.py --data-dir Market-1501-v15.09.15

After training, the model file ckpt.t7 is generated under checkpoint. Find a video and test it

The meaning of num_classes

What does num_classes mean? Look at this code in train.py

trainloader = torch.utils.data.DataLoader(
    torchvision.datasets.ImageFolder(train_dir, transform=transform_train),
    batch_size=64,shuffle=True
)
testloader = torch.utils.data.DataLoader(
    torchvision.datasets.ImageFolder(test_dir, transform=transform_test),
    batch_size=64,shuffle=True
)

num_classes = max(len(trainloader.dataset.classes), len(testloader.dataset.classes))

As you can see, num_classes is the larger of the class counts (that is, the total number of IDs) of the training set and the test set. In Market-1501, the training set has 751 IDs and the test set has 752 (including the ID -1), so num_classes is 752.

Therefore, to train on this data set, we only need to change num_classes to 752 in model.py. There is no need to change train.py.
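
For a different data set, the right value can be worked out from the folder layout before editing model.py. The sketch below counts the ID folders the same way the train.py line above does, without needing PyTorch. It assumes that each ID is one subfolder under train and test, and that ImageFolder takes its classes from the sorted subfolder names. The function names list_classes and expected_num_classes are hypothetical.

```python
import os


def list_classes(split_dir: str) -> list:
    """Sorted subfolder names of split_dir, one per ID."""
    return sorted(
        entry.name for entry in os.scandir(split_dir) if entry.is_dir()
    )


def expected_num_classes(data_dir: str) -> int:
    """Larger of the train and test ID counts under data_dir."""
    train_classes = list_classes(os.path.join(data_dir, "train"))
    test_classes = list_classes(os.path.join(data_dir, "test"))
    return max(len(train_classes), len(test_classes))


if __name__ == "__main__":
    import sys
    # Pass the folder that contains the train and test subfolders
    print(expected_num_classes(sys.argv[1]))
```

The printed number is the value to put in model.py as the default of num_classes.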

Note

One last remark: the DeepSort tracking part of the project github.com/mikel-brost… is based on github.com/ZQPei/deep_… , with some changes. The training described above is based on Yolov5_DeepSort.

References

xugaoxiang.com/2020/10/17/…
github.com/mikel-brost…
github.com/ZQPei/deep_…
github.com/ZQPei/deep_…
github.com/ZQPei/deep_…
