Exploring YOLOv5 on a garbage target detection task

Author: Yu Minjun

Research background

As a systematic approach to waste management, garbage classification helps improve resource utilization, ease the pressure of waste production, and protect the environment. It is a necessary strategy in China's socialist modernization and urbanization, and it is attracting urgent attention around the world. Since 2019, as legislation, enforcement, and supervision of domestic waste classification have been rolled out in Shanghai, Hangzhou, and other key cities, public attention to the topic has grown steadily and personal awareness of waste sorting has improved considerably. At the same time, garbage comes in a huge variety and its classification is often ambiguous, so automating garbage classification is particularly important. Garbage target detection is an important part of that automation, and it is the task implemented in this article. Target detection means determining what the objects in an image are and where they are, that is, "target classification + localization". As representative one-stage detection algorithms, the YOLO series of models are favored by researchers for their speed and good generalization. YOLOv5 was released on GitHub a few days ago and immediately sparked a lot of discussion online. In this article, we try to implement garbage target detection on the TACO dataset with the YOLOv5 network model, simply by following the tutorial provided.

Dataset processing

TACO is a growing dataset of garbage objects photographed against backgrounds such as woods, roads, and beaches. It currently contains 60 categories of garbage objects, 1,500 images, and nearly 5,000 annotations. The dataset project is at: github.com/pedropro/TA… The dataset needs some preprocessing before use, for two reasons. First, its file layout and label format must meet the requirements of the YOLOv5 model. Second, the number of samples per garbage category is extremely uneven (as shown in Figure 1). The preprocessing code can be found in the Mo project [1], where the **_readme.ipynb** file details what each piece of code does.

[1]

'Clear plastic bottle': 5
'Plastic bottle cap': 7
'Drink can': 12
'Other plastic': 29
'Plastic film': 36
'Other plastic wrapper': 39
'Unlabeled litter': 58
'Cigarette': 59

To convert the COCO format to YOLO format, the following three tasks should be completed:

  • Store the generated labels and images in two separate directories, with each label file named after its image
  • Map the original category IDs of the target objects, in ascending order, to the range {0-7}
  • The location in the original label is {top_x, top_y, width, height}; convert it to {center_x, center_y, width, height} and normalize the values by the image width and height
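
The second task can be sketched in a few lines. This is only a minimal illustration, assuming the eight original category IDs are the ones listed above; the helper name `build_label_map` is hypothetical and is not part of the project code.

```python
def build_label_map(original_ids):
    """Map original category IDs, taken in ascending order, to 0..N-1."""
    return {orig: new for new, orig in enumerate(sorted(original_ids))}


# Original TACO category IDs of the eight selected classes (from the list above)
selected_ids = [5, 7, 12, 29, 36, 39, 58, 59]
label_map = build_label_map(selected_ids)
print(label_map)
```

Deriving the mapping from the sorted IDs, rather than typing it by hand, keeps the new indices in the same order as the class names in the list above.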

The core code section looks like this (cocotoyolo.py):

import os
import shutil

import numpy as np
import tqdm

# data_source, coco_labels_inverse, save_base_path and save_image_path
# are defined elsewhere in cocotoyolo.py (not shown in this excerpt)

# map garbage type original number to {0-7} space
label_transfer = {5: 0, 7: 1, 12: 2, 29: 3, 36: 4, 39: 5, 58: 6, 59: 7}
class_num = {}  # record the number of samples of each type

img_ids = data_source.getImgIds()
# go through each image and convert its labels
for index, img_id in tqdm.tqdm(enumerate(img_ids), desc='change .json file to .txt file'):
    img_info = data_source.loadImgs(img_id)[0]
    # flatten the path containing the folder into a file name
    save_name = img_info['file_name'].replace('/', '_')
    # remove the file extension
    file_name = save_name.split('.')[0]
    # get the width and height of the image
    height = img_info['height']
    width = img_info['width']
    # storage path of the converted .txt file
    save_path = save_base_path + file_name + '.txt'
    is_exist = False  # record whether the image contains a target garbage object
    with open(save_path, mode='w') as fp:
        # find the annotations that belong to this image
        annotation_id = data_source.getAnnIds(img_id)
        boxes = np.zeros((0, 5))
        if len(annotation_id) == 0:  # no annotations: leave an empty label file
            fp.write('')
            continue
        # get the COCO annotations
        annotations = data_source.loadAnns(annotation_id)
        lines = ''  # accumulate the converted YOLO format labels
        # iterate over the object annotations
        for annotation in annotations:
            # get the label of the garbage object
            label = coco_labels_inverse[annotation['category_id']]
            if label in label_transfer.keys():
                # the garbage type is one of the target types, so convert it
                is_exist = True
                box = annotation['bbox']
                if box[2] < 1 or box[3] < 1:
                    # skip boxes with no width or height in the original label
                    continue
                # top_x,top_y,width,height ==> cen_x,cen_y,width,height
                box[0] = round((box[0] + box[2] / 2) / width, 6)
                box[1] = round((box[1] + box[3] / 2) / height, 6)
                box[2] = round(box[2] / width, 6)
                box[3] = round(box[3] / height, 6)
                label = label_transfer[label]  # label mapping
                if label not in class_num.keys():
                    class_num[label] = 0
                class_num[label] += 1
                lines = lines + str(label)  # store the label first
                for i in box:  # then store the location information
                    lines += ' ' + str(i)
                lines += '\n'  # newline
        fp.writelines(lines)
    if is_exist:
        # copy the image to the specified directory
        shutil.copy('data/{}'.format(img_info['file_name']), os.path.join(save_image_path, save_name))
    else:
        # delete the generated label file if the image has no target object
        os.remove(save_path)
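
To check the box conversion in isolation, the arithmetic can be pulled out into a small function. This is a sketch under stated assumptions: the function name `coco_to_yolo` and the example numbers are hypothetical, chosen only so the result is easy to verify by hand.

```python
def coco_to_yolo(box, img_width, img_height):
    """Convert a COCO box [top_x, top_y, w, h] in pixels
    to a normalized YOLO box [center_x, center_y, w, h]."""
    x, y, w, h = box
    return [
        round((x + w / 2) / img_width, 6),
        round((y + h / 2) / img_height, 6),
        round(w / img_width, 6),
        round(h / img_height, 6),
    ]


# Hypothetical example: a 100x50 box whose top-left corner is at (200, 100)
# in a 1000x500 image
print(coco_to_yolo([200, 100, 100, 50], 1000, 500))
```

Here the center is at pixel (250, 125), so both normalized center coordinates are 0.25, and the normalized width and height are both 0.1.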

(2) Once the label set has been generated, the samples need to be split. First, the split follows two rules: the training set uses all the samples, and the test set accounts for 0.1 of the total (1086 and 109 images respectively; the training set uses every sample because of my limited hardware). Second, because the network model expects a particular storage layout, the generated images and labels must be placed in the corresponding folders; the directory layout is shown in Figure 2. See the sample.py file in the Mo project for the implementation.
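
The split rule described above can be sketched as follows. This is not the content of sample.py: the function name `split_samples`, the fixed random seed, the placeholder file names, and the use of `round` to reach 109 test images are all assumptions made for illustration.

```python
import random


def split_samples(names, test_ratio=0.1, seed=0):
    """Return (train, test): train keeps every sample,
    test is a random subset of about test_ratio of them."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    n_test = round(len(names) * test_ratio)
    test = rng.sample(names, n_test)
    return list(names), test


# Placeholder names standing in for the 1086 converted images
names = ['img_{:04d}'.format(i) for i in range(1086)]
train, test = split_samples(names)
print(len(train), len(test))
```

Because the test images are drawn from the same pool that is used for training, the metrics measured on the test set will be optimistic compared with a split that holds the test images out.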

Model configuration and training

The YOLOv5 project can be downloaded from github.com/ultralytics. For its deployment on the Mo platform, see the project of the same name [2]. Everything below follows the official tutorial [2]. For details about the Mo project, see the **_readme.ipynb** file in its root directory.

Model configuration

This section covers dependency installation and configuration file settings, both described only briefly. (1) Dependency installation. YOLOv5 is built on the PyTorch deep learning framework, so PyTorch must first be installed in the Python environment. An installation tutorial is available on the official website. The command below installs the CPU build of PyTorch that was the latest at the time of writing.

pip install torch==1.5.1+cpu torchvision==0.6.1+cpu -f https://download.pytorch.org/whl/torch_stable.html

In addition, running the YOLOv5 model requires further third-party packages, all of which are listed in the official requirements.txt. To make the installation go as smoothly as possible, I suggest two changes to this file:

  1. When I forked the official project, the required numpy version was 1.17, but installing it caused some problems. I therefore recommend changing the requirement to 1.17.3.
  2. The file points to a cocoapi.git download address that is very slow and may fail on Windows. I recommend replacing that line with the following:
git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI

(2) Configuration file settings. The YOLOv5 project mainly requires the following two configuration files:

  1. The taco.yaml configuration file in the data directory (create it yourself). This file sets the image directories of the training and test sets and the object class information. The label directory is derived automatically from the image directory, which is why the dataset directory layout must match Figure 2. The content of the file is as follows.
# train and val datasets (image directory or *.txt file with image paths)
train: taco/images/train/
val: taco/images/test/

# number of classes
nc: 8

# class names
names: ['Clear plastic bottle', 'Plastic bottle cap',
        'Drink can',
        'Other plastic',
        'Plastic film', 'Other plastic wrapper',
        'Unlabeled litter', 'Cigarette']
  2. The model configuration file in the models directory. According to the official tutorial [2], one can directly modify one of the existing yolov5*.yaml files in the models directory. Here I modified yolov5s.yaml (because the experiment uses yolov5s.pt as the pre-trained weights). Simply change the value of the nc property in this file to match the nc value in taco.yaml above.
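
The way the label path follows from the image path can be illustrated with a short sketch. It only approximates the convention (swap the images directory for labels and the extension for .txt) and is not YOLOv5's actual loading code; the function name `label_path_for` and the .jpg extension are assumptions.

```python
import os


def label_path_for(image_path):
    """Derive the label file path from an image path:
    the first 'images' becomes 'labels' and the extension becomes '.txt'."""
    stem, _ = os.path.splitext(image_path)
    return stem.replace('images', 'labels', 1) + '.txt'


print(label_path_for('taco/images/train/batch_1_000048.jpg'))
```

With the taco.yaml paths above, this means labels for taco/images/train/ are expected under taco/labels/train/, and labels for taco/images/test/ under taco/labels/test/.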

Model training

First, since my personal computer has relatively little memory, using the official training parameters directly exhausts memory and training cannot run. The experiment therefore had to reduce the input image size and batch size to get the model to train. Second, because the GPU in my computer is weak, the experiment trains on the CPU directly, so training is relatively slow. Given these two limitations, the parameters used for training are listed in Table 1.

Table 1. Parameter configuration for model training

Command-line argument   Meaning                                        Value
--img                   Uniform input image size                       320
--batch                 Number of images fed to the network per step   4
--epochs                Number of passes over the entire dataset       100
--data                  Path to the dataset configuration file         ./data/taco.yaml
--cfg                   Path to the model configuration file           ./models/yolov5s.yaml
--device                Training device (CPU or GPU)                   cpu
--weights               Weight file of the pre-trained model           yolov5s.pt

The call command for model training is as follows:

python train.py --img 320 --batch 4 --epochs 100 --data ./data/taco.yaml --cfg ./models/yolov5s.yaml --device cpu --weights yolov5s.pt

I will not go into detail about downloading the pre-trained weights; a Baidu search should turn up plenty of download sources within China. I have also placed the YOLOv5s weight file (yolov5s.pt) in the root directory of the project so that readers can train the model conveniently. Of course, the model can also be trained without pre-trained weights: just remove the --weights yolov5s.pt part of the command.

Results

According to the official tutorial, the results generated by training are automatically placed in the runs folder under the project root. Its weights folder stores the best-performing and the most recent weight files produced during training, which can be used to run the model. The results.txt file records the metrics output during training. The YOLOv5 project also visualizes these outputs automatically as charts and images. The metric curves from this experiment are shown in Figure 3: the top five plots are the results on the training set, and the bottom five are the results on the test set. The trained model can then be called for detection with the following command:

python detect.py --weights best.pt --img 320 --conf 0.4

The detection results on the target images are shown in Figure 4, and the generated images are saved in the inference/output folder. In image batch_1_000048, a Drink can object was detected incorrectly (this is mainly related to the conf threshold, which can be adjusted as appropriate).
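
The effect of the conf value can be illustrated with a small sketch. It shows only the thresholding idea, not what detect.py does internally (the real script also applies non-maximum suppression); the function name `filter_by_confidence` and the detection scores below are made up for illustration.

```python
def filter_by_confidence(detections, conf_threshold=0.4):
    """Keep only detections whose confidence score reaches the threshold."""
    return [d for d in detections if d['conf'] >= conf_threshold]


# Hypothetical detections for one image (class name and confidence score)
detections = [
    {'name': 'Plastic film', 'conf': 0.81},
    {'name': 'Drink can', 'conf': 0.45},
    {'name': 'Cigarette', 'conf': 0.22},
]
print([d['name'] for d in filter_by_confidence(detections, 0.4)])
print([d['name'] for d in filter_by_confidence(detections, 0.5)])
```

In this made-up case, raising the threshold from 0.4 to 0.5 drops the low-confidence Drink can box, at the cost of possibly discarding correct detections with similar scores.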

Experimental summary

Based on the garbage object detection task, this article experimented with the new YOLOv5 model on the TACO garbage object dataset. Because of limited personal hardware and time, the model obtained here does not actually perform well, as can be seen by comparing the official example metric plots with the detection results from my own experiment (YOLO series models need to train for a very long time). Interested readers can try increasing the input image size, batch size, and number of epochs to improve the detection performance.

References

Project addresses

[1] TACO: momodel.cn/workspace/5…
[2] yolov5: momodel.cn/workspace/5…

Main literature

[1] Converting COCO .json files to .txt files: www.jianshu.com/p/8ddd8f3fd…
[2] Train Custom Data: github.com/ultralytics…