In the second stage of the project, target detection is carried out on the fused image. Here, the original images from the KITTI dataset are used for target detection. The code uses the ultralytics ("U version") PyTorch YOLOv3 implementation: github.com/ultralytics…

1. Environment configuration

I am using Win10 + CUDA 11.1 + Python 3.8 + PyTorch 1.6.0.

2. Data preparation

Copy Annotations and JPEGImages into the data folder under the yolov3-master project directory, and create two new folders named ImageSets and labels (the latter is where voc_label.py will write the YOLO-format labels). Finally, copy the JPEGImages folder and rename the copy to images.

![](https://p6-tt-ipv6.byteimg.com/large/pgc-image/1655eb66604d469695cace3f071fb393)
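For reference, after these steps the data directory should look roughly like this (a sketch of the layout described above):

```
yolov3-master/
└── data/
    ├── Annotations/   # VOC-style XML annotation files
    ├── ImageSets/     # filled by maketxt.py below
    ├── labels/        # filled by voc_label.py below
    ├── JPEGImages/    # original KITTI PNG images
    └── images/        # copy of JPEGImages, files renamed to .jpg
```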

Note that the files in JPEGImages are PNG, while the files in images must be JPG. Simply run `ren *.png *.jpg` in CMD to complete the batch rename. (This only changes the file extension, not the image encoding; the data loader reads images by content, so that is sufficient here.)
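If you prefer to do the rename in Python instead of CMD, a minimal sketch (assuming the copied folder is data/images) looks like this:

```python
import os

# Rename every .png in data/images to .jpg. Like `ren *.png *.jpg`,
# this changes the extension only; the image data is not re-encoded.
img_dir = 'data/images'
for fname in os.listdir(img_dir):
    if fname.endswith('.png'):
        src = os.path.join(img_dir, fname)
        dst = os.path.join(img_dir, fname[:-4] + '.jpg')
        os.rename(src, dst)
```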

Create a new file maketxt.py in the project root directory:

```python
import os
import random

trainval_percent = 0.1  # fraction of images sampled into trainval (split into test/val)
train_percent = 0.9     # fraction of the trainval sample written to test.txt

xmlfilepath = 'data/Annotations'
txtsavepath = 'data/ImageSets'
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open('data/ImageSets/trainval.txt', 'w')
ftest = open('data/ImageSets/test.txt', 'w')
ftrain = open('data/ImageSets/train.txt', 'w')
fval = open('data/ImageSets/val.txt', 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftest.write(name)
        else:
            fval.write(name)
    else:
        ftrain.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
```

After running maketxt.py, you get four files in ImageSets that store the image names. (Note the somewhat confusing naming in this script: with the percentages above, 90% of the images end up in train.txt, while the 10% trainval sample is further split between test.txt and val.txt.)

![](https://p6-tt-ipv6.byteimg.com/large/pgc-image/236317823d584d13a2c607b6acd0f9c8)

Run voc_label.py in the root directory to generate the labels folder plus train.txt, test.txt, and val.txt in the data directory; these txt files store the file names together with their paths. train.txt and val.txt together serve as the training set, and test.txt is used as the validation set.

![](https://p6-tt-ipv6.byteimg.com/large/pgc-image/120b5370fec848fd9fc82a3f8091da38)
![](https://p1-tt-ipv6.byteimg.com/large/pgc-image/17c8f4715a204afaa66dc845f1412947)

The voc_label.py code is as follows:

```python
import xml.etree.ElementTree as ET
import pickle
import os
from os import listdir, getcwd
from os.path import join

sets = ['train', 'test', 'val']
classes = ["Car", "Pedestrian", "Cyclist"]

def convert(size, box):
    # Convert a VOC box (xmin, xmax, ymin, ymax) in pixels into YOLO's
    # normalized (x_center, y_center, width, height) format.
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)

def convert_annotation(image_id):
    in_file = open('data/Annotations/%s.xml' % (image_id))
    out_file = open('data/labels/%s.txt' % (image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        # difficult = obj.find('difficult').text
        cls = obj.find('name').text
        # if cls not in classes or int(difficult) == 1:
        #     continue
        # NOTE: uncomment the check above if your XML files contain classes
        # outside the `classes` list, otherwise index() will raise an error.
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
             float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

wd = getcwd()
print(wd)
for image_set in sets:
    if not os.path.exists('data/labels/'):
        os.makedirs('data/labels/')
    image_ids = open('data/ImageSets/%s.txt' % (image_set)).read().strip().split()
    list_file = open('data/%s.txt' % (image_set), 'w')
    for image_id in image_ids:
        list_file.write('data/images/%s.jpg\n' % (image_id))
        convert_annotation(image_id)
    list_file.close()
```
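To make the conversion concrete, here is a quick sanity check of convert with made-up numbers (1242×375 is a typical KITTI frame size; the box coordinates are hypothetical):

```python
# Hypothetical example: a 1242x375 KITTI image with a "Car" box at
# xmin=500, xmax=700, ymin=150, ymax=250 (VOC corner format) becomes
# YOLO's normalized (x_center, y_center, width, height):
x, y, w, h = convert((1242, 375), (500.0, 700.0, 150.0, 250.0))
print(round(x, 4), round(y, 4), round(w, 4), round(h, 4))
# -> 0.4831 0.5333 0.161 0.2667
# so the label line written for class index 0 ("Car") is roughly:
# 0 0.4831 0.5333 0.161 0.2667
```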

3. Modify the configuration file

Create a new kitti.data file under data and configure it as follows:

```
classes=3
train=data/train.txt
valid=data/test.txt
names=data/kitti.names
backup=backup/
eval=coco
```

Create kitti.names in the data folder with the following contents:

```
Car
Pedestrian
Cyclist
```
![](https://p1-tt-ipv6.byteimg.com/large/pgc-image/190d6ed9d8ea4e688305e99eedf28ce8)

Modify the cfg file

The cfg directory of the original project contains many YOLOv3 network configurations; this time we use yolov3-tiny.cfg.

We need to modify the yolov3-tiny.cfg file under cfg as follows:

First find each [yolo] section and change classes to the number of categories in your dataset (3 in my case); then change filters in the [convolutional] section immediately above each [yolo] section:

```
filters = 3 * (5 + classes) = 3 * (5 + 3) = 24
```
![](https://p1-tt-ipv6.byteimg.com/large/pgc-image/35487e2402074254a626c6c0e81c1cdc)
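For orientation, the edited region of yolov3-tiny.cfg should look roughly like the excerpt below (abbreviated; the anchors and all other keys stay unchanged):

```
[convolutional]
size=1
stride=1
pad=1
# filters = 3 * (5 + classes) = 3 * (5 + 3)
filters=24
activation=linear

[yolo]
mask = 0,1,2
# keep the existing anchors line as-is
classes=3
num=6
```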

4. Get the weights

Obtain the pretrained network parameters yolov3-tiny.weights from the download link pjreddie.com/media/files… (extraction code: T7VP).

5. Training
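The post doesn't show the actual training command. With the ultralytics yolov3 repository it is typically an invocation of train.py along these lines (a sketch only; flag names and defaults vary between versions of the repo, and the epoch/batch values here are placeholders):

```bash
python train.py --data data/kitti.data --cfg cfg/yolov3-tiny.cfg --weights weights/yolov3-tiny.weights --epochs 100 --batch-size 16
```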

After training, you get two model files (typically the latest checkpoint and the best-performing one):

![](https://p1-tt-ipv6.byteimg.com/large/pgc-image/e78fcc26486f4393bbd2961ea440e0e9)


This article is a repost; copyright belongs to the original author. In case of infringement, please contact us for removal.

Original address: blog.csdn.net/qq_34201858…