Welcome to PaddleDetection. We have collected the problems most frequently encountered when using PaddleDetection into this FAQ (Frequently Asked Questions).

Portal: github.com/PaddlePaddl…

The GitHub address is github.com/PaddlePaddl… You are welcome to try it out and support it with a star ~

Q: Why does the loss become NaN when training with a single GPU? A: The default learning rate in the configuration files is for multi-GPU (8-GPU) training. If you train on a single GPU, the learning rate must be scaled down accordingly (for example, divided by 8).

For faster_rcnn_r50, the scaling rules in the static graph are shown in the table below. The settings in each row are equivalent; the milestones column gives the boundaries of the piecewise learning-rate decay:

Number of GPUs | Batch size/card | Learning rate | Max iterations | Decay milestones
2              | 1               | 0.0025        | 720000         | [480000, 640000]
4              | 1               | 0.005         | 360000         | [240000, 320000]
8              | 1               | 0.01          | 180000         | [120000, 160000]
  • The above applies to static-graph training. In the dynamic graph, training is counted in epochs, so after changing the number of GPU cards only the learning rate needs to be modified, in the same way as in the static graph.
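The scaling rule behind the table can be sketched in a few lines of Python (a minimal sketch; the helper name is made up, and the defaults are the 8-GPU baseline row from the table above):

```python
def scale_schedule(gpus, base_lr=0.01, base_iters=180000,
                   base_milestones=(120000, 160000), base_gpus=8):
    """Scale an 8-GPU piecewise-decay schedule to a different GPU count.

    Halving the number of GPUs halves the effective batch size, so the
    learning rate is scaled down by the same factor while the iteration
    counts (and decay milestones) are scaled up.
    """
    factor = gpus / base_gpus
    lr = base_lr * factor
    max_iters = int(base_iters / factor)
    milestones = [int(m / factor) for m in base_milestones]
    return lr, max_iters, milestones

# Reproduces the 2-GPU row of the table: 0.0025, 720000, [480000, 640000]
print(scale_schedule(gpus=2))
```

The same function reproduces the 4-GPU row with `scale_schedule(gpus=4)`.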

Q: When using a custom dataset, how should num_classes be set in the configuration file? A: In the dynamic graph, num_classes can uniformly be set to the number of classes in the custom dataset. In the static graph (under the static directory), YOLO-series and anchor-free models can likewise be set to the number of classes in the custom dataset. For other models, such as the RCNN series, SSD, RetinaNet, and SOLOv2, the classification branch must distinguish background from foreground, so num_classes must be set to the number of custom dataset classes + 1, i.e., one background class is added.
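The rule above can be condensed into a small helper (a sketch for illustration; the function name and the family strings are made up, not part of PaddleDetection):

```python
def config_num_classes(num_dataset_classes, model_family, static_graph=False):
    """Return the num_classes value to write into the config file.

    In the dynamic graph, and for YOLO / anchor-free models in the static
    graph, num_classes is simply the number of dataset classes. Static-graph
    RCNN/SSD/RetinaNet/SOLOv2 classify background vs. foreground, so one
    extra background class must be added.
    """
    needs_background = {"rcnn", "ssd", "retinanet", "solov2"}
    if static_graph and model_family.lower() in needs_background:
        return num_dataset_classes + 1
    return num_dataset_classes

print(config_num_classes(20, "rcnn", static_graph=True))   # 21
print(config_num_classes(20, "yolo", static_graph=True))   # 20
```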

Q: PP-YOLOv2 training uses --eval for validation during training. Why does the first eval appear to hang, and how should I handle it? A: If a PP-YOLO-series model only loads the pretrained backbone weights and is trained from scratch, convergence is slow. Before the model has converged well, the predicted boxes are chaotic, and sorting and filtering them in NMS becomes very time-consuming, so eval looks as if it is hanging. This usually happens when a custom dataset with few samples is used, so that only a small number of training iterations have run before the first eval and the model has not yet converged. Troubleshoot along the following three lines.

  • The default configurations provided by PaddleDetection generally assume 8-card training, and batch_size in the configuration file is the batch size per card. If you do not train with 8 cards, or you modify batch_size, the initial learning_rate needs to be scaled down accordingly to achieve good convergence.

  • If you use a custom dataset with a small sample size, it is recommended to increase snapshot_epoch so that more training iterations have run by the time eval is first performed, ensuring the model has converged well.

  • If you train on a custom dataset, you can load the weights we released for COCO or VOC datasets and finetune from them to speed up convergence. Pretrained weights can be specified with -o pretrain_weights=XXX, where XXX can be a model weight link published in the Model Zoo.
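To see why a small dataset makes the first eval happen after very few updates, you can estimate the number of iterations run before the first eval in epoch-based training (a back-of-the-envelope sketch; the function name and the sample numbers are illustrative):

```python
import math

def iters_before_first_eval(num_samples, batch_size, num_gpus, snapshot_epoch):
    """Estimate iterations run before the first in-training eval.

    With epoch-based training, eval first fires after snapshot_epoch epochs,
    and each epoch is ceil(num_samples / effective batch size) iterations.
    """
    iters_per_epoch = math.ceil(num_samples / (batch_size * num_gpus))
    return iters_per_epoch * snapshot_epoch

# e.g. 500 images, batch size 24, 1 GPU, eval every epoch: only 21 updates,
# far too few for the model to converge before the first eval
print(iters_before_first_eval(500, 24, 1, 1))  # 21
```

Raising snapshot_epoch directly multiplies this count, which is the point of the second bullet above.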

Q: How can I better understand the reader and customize the reader configuration file?

A: Take the TrainReader configuration of the PP-YOLO series as an example; the comments explain each field:

TrainReader:
  # single-sample transforms, applied to each sample independently
  sample_transforms:
    - Decode: {}                       # decode the image
    - Mixup: {alpha: 1.5, beta: 1.5}   # Mixup data augmentation, optional op
    - RandomDistort: {}                # random color distortion, optional op
    - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}  # random canvas expansion and fill, optional op
    - RandomCrop: {}                   # random crop, optional op
    - RandomFlip: {}                   # random flip, optional op
  # batch transforms, applied to a whole batch
  batch_transforms:
    - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608], random_size: True, random_interp: True, keep_ratio: False}
    - NormalizeBox: {}
    - PadBox: {num_max_boxes: 50}
    - BboxXYXY2XYWH: {}
    - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
    - Permute: {}
    - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
  # batch size per card
  batch_size: 24
  # shuffle the samples
  shuffle: True
  # mixup_epoch: if greater than the maximum epoch, Mixup augmentation is used
  # throughout training; the default -1 means Mixup is not used. If you delete
  # "- Mixup: {alpha: 1.5, beta: 1.5}" above, you must also set mixup_epoch to -1
  # or delete this line.
  mixup_epoch: 25000
  # whether to use shared memory to accelerate data loading; make sure the shared
  # memory size (e.g. /dev/shm) is larger than 1 GB
  use_shared_memory: True

If single-scale training is required, remove the BatchRandomResize line from batch_transforms and add the following to sample_transforms:

    - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}

Note: Mixup, RandomDistort, RandomExpand, RandomCrop, and RandomFlip are all optional ops. If you comment out or delete Mixup, you must also comment out or delete the mixup_epoch line, or set it to -1, so that Mixup is not used.

Q: How can a user control which categories are output? That is, when an image contains objects of several classes, how can only some of them be output?

A: Users can modify the code themselves and add a conditional filter:

# filter by class_id
keep_class_id = [1, 2]
bbox_res = [e for e in bbox_res if int(e[0]) in keep_class_id]
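A self-contained illustration of the filter above (the layout [class_id, score, x1, y1, x2, y2] matches PaddleDetection's post-processed detections; the sample values below are made up):

```python
# Each detection: [class_id, score, x1, y1, x2, y2]
bbox_res = [
    [0.0, 0.91, 10, 10, 50, 50],   # class 0 -> dropped
    [1.0, 0.85, 20, 20, 60, 60],   # class 1 -> kept
    [2.0, 0.40, 30, 30, 70, 70],   # class 2 -> kept
]

# filter by class_id
keep_class_id = [1, 2]
bbox_res = [e for e in bbox_res if int(e[0]) in keep_class_id]
print(len(bbox_res))  # 2
```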

Github.com/PaddlePaddl…

Q: When training on a user-defined dataset, the prediction result labels are wrong.

A: When setting dataset paths, users often overlook anno_path in TestDataset. anno_path must be set to the annotation file of your own dataset.

TestDataset:
  !ImageFolder
    anno_path: annotations/instances_val2017.json
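One way to sanity-check that anno_path points at the right file is to load it and inspect the category names that will be drawn as prediction labels (a sketch; the file name and the two-class annotation below are made up for illustration):

```python
import json

# Minimal COCO-style annotation standing in for your own dataset's file
sample = {"categories": [{"id": 2, "name": "dog"}, {"id": 1, "name": "cat"}]}
with open("my_annotations.json", "w") as f:
    json.dump(sample, f)

def load_label_list(anno_path):
    """Category names in id order; these become the prediction label names.

    If anno_path still points at the default annotations/instances_val2017.json
    instead of your own file, predictions are drawn with the wrong labels.
    """
    with open(anno_path) as f:
        anno = json.load(f)
    return [c["name"] for c in sorted(anno["categories"], key=lambda c: c["id"])]

print(load_label_list("my_annotations.json"))  # ['cat', 'dog']
```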