PyTorch custom data set

To prepare data

Prepare the COCO128 dataset, which is the first 128 data of COCO Train2017. Organized by YOLOv5 directory:

Tree ~ / datasets/coco128 - L $2 / home/John/datasets/coco128 ├ ─ ─ images │ └ ─ ─ train2017 │ ├ ─ ─... │ └ ─ ─ 000000000650 JPG ├ ─ ─ labels │ └ ─ ─ train2017 │ ├ ─ ─... │ ├─ ├─ class.txt │ ├─ class.txtCopy the code

See Train Custom Data for details.

Define the Dataset

Datasets are an abstract class for a Dataset. To customize a Dataset, inherit Dataset and override the following methods:

__len__: len(dataset)Gets the data set size.
__getitem__: dataset[i]Access to the firstiA data.

See:

torch.utils.data.Dataset
torchvision.datasets.vision.VisionDataset

Examples of customizing a YOLOv5 dataset:

import os
from pathlib import Path
from typing import Any.Callable.Optional.Tuple

import numpy as np
import torch
import torchvision
from PIL import Image


class YOLOv5(torchvision.datasets.vision.VisionDataset) :

  def __init__(
    self,
    root: str,
    name: str,
    transform: Optional[Callable] = None,
    target_transform: Optional[Callable] = None,
    transforms: Optional[Callable] = None.) - >None:
    super(YOLOv5, self).__init__(root, transforms, transform, target_transform)
    images_dir = Path(root) / 'images' / name
    labels_dir = Path(root) / 'labels' / name
    self.images = [n for n in images_dir.iterdir()]
    self.labels = []
    for image in self.images:
      base, _ = os.path.splitext(os.path.basename(image))
      label = labels_dir / f'{base}.txt'
      self.labels.append(label if label.exists() else None)

  def __getitem__(self, idx: int) - >Tuple[Any.Any] :
    img = Image.open(self.images[idx]).convert('RGB')

    label_file = self.labels[idx]
    if label_file is not None:  # found
      with open(label_file, 'r') as f:
        labels = [x.split() for x in f.read().strip().splitlines()]
        labels = np.array(labels, dtype=np.float32)
    else:  # missing
      labels = np.zeros((0.5), dtype=np.float32)

    boxes = []
    classes = []
    for label in labels:
      x, y, w, h = label[1:]
      boxes.append([
        (x - w/2) * img.width,
        (y - h/2) * img.height,
        (x + w/2) * img.width,
        (y + h/2) * img.height])
      classes.append(label[0])

    target = {}
    target["boxes"] = torch.as_tensor(boxes, dtype=torch.float32)
    target["labels"] = torch.as_tensor(classes, dtype=torch.int64)

    if self.transforms is not None:
      img, target = self.transforms(img, target)

    return img, target

  def __len__(self) - >int:
    return len(self.images)
Copy the code

The above implementation, inherited the VisionDataset subclass. Its __getitem__ returns:

Image: PIL image, the size is(H, W)
target: dictContains the following fields:
- boxes (FloatTensor[N, 4]): Real annotation box[x1, y1, x2, y2].xThe scope of[0,W].yThe scope of[0,H]
- labels (Int64Tensor[N]): Category identification of the above boxes

Read the Dataset

dataset = YOLOv5(Path.home() / 'datasets/coco128'.'train2017')
print(f'dataset: {len(dataset)}')
print(f'dataset[0]: {dataset[0]}')
Copy the code

Output:

dataset: 128
dataset[0]: (<PIL.Image.Image image mode=RGB size=640x480 at 0x7F6F9464ADF0>, {'boxes': Tensor ([249.7296, 200.5402, 460.5399, 249.1901], [448.1702, 363.7198, 471.1501, 406.2300]... [0.0000, 188.8901, 172.6400, 280.9003]]),'labels': tensor([44, 51, 51, 51, 51, 44, 44, 44, 44, 44, 45, 45, 45, 45, 45, 45, 45, 45, 45, 50, 50, 50, 51, 51, 60, 42, 44, 45, 45, 45, 50, 51, 51, 51, 51, 51, 51, 51, 51, 44, 50, 50, 45])})Copy the code

Preview:

Use the DataLoader

Training requires batch extraction of data using DataLoader:

dataset = YOLOv5(Path.home() / 'datasets/coco128'.'train2017',
  transform=torchvision.transforms.Compose([
    torchvision.transforms.ToTensor()
  ]))

dataloader = DataLoader(dataset, batch_size=64, shuffle=True,
                        collate_fn=lambda batch: tuple(zip(*batch)))

for batch_i, (images, targets) in enumerate(dataloader):
  print(f'batch {batch_i}, images {len(images)}, targets {len(targets)}')
  print(f'  images[0]: shape={images[0].shape}')
  print(f'  targets[0]: {targets[0]}')
Copy the code

Output:

batch 0, images 64, targets 64
  images[0]: shape=torch.Size([3, 480, 640])
  targets[0]: {'boxes': Tensor ([249.7296, 200.5402, 460.5399, 249.1901], [448.1702, 363.7198, 471.1501, 406.2300]... [0.0000, 188.8901, 172.6400, 280.9003]]),'labels': tensor([44, 51, 51, 51, 51, 44, 44, 44, 44, 44, 45, 45, 45, 45, 45, 45, 45, 45,
        45, 50, 50, 50, 51, 51, 60, 42, 44, 45, 45, 45, 50, 51, 51, 51, 51, 51,
        51, 44, 50, 50, 50, 45])}
batch 1, images 64, targets 64
  images[0]: shape=torch.Size([3, 248, 640])
  targets[0]: {'boxes': Tensor ([337.9299, 167.8500, 378.6999, 191.3100], [383.5398, 148.4501, 452.6598, 191.4701], [467.9299, 149.9001, 540.8099, 193.2401, [196.3898, 142.7200, 271.6896, 190.0999], [134.3901, 154.5799, 193.9299, 189.1699], [89.5299, 162.1901, 124.3798, 188.3301], [1.6701, 154.9299, 56.8400, 188.3700]],'labels': tensor([20, 20, 20, 20, 20, 20, 20])}
Copy the code

The source code

datasets.py
test_datasets.py

reference

Loading data in PyTorch
Datasets & Dataloaders
Writing Custom Datasets, DataLoaders and Transforms

APIs:

torch.utils.data
torchvision.datasets

GoCoding personal practice experience sharing, please pay attention to the public account!

To prepare data

Define the Dataset

Read the Dataset

Use the DataLoader

The source code

reference

Related Posts

[PaperRead]CBAM: Convolutional Block Attention Module

Path planning based on MATLAB GUI artificial potential field algorithm for robot obstacle avoidance path planning (manual obstacles)

Association rule mining and Apriori algorithm