cv2.imread()

Now we use cv2.imread() to read an image:

image = cv2.imread('image_name.jpg')
image.shape
# output:(96, 64, 3)

image[0][0]
# output:
# array([175, 197, 239], dtype=uint8)

It can be seen that the output is (96, 64, 3), where 96 is the height of the image, 64 is the width, and 3 is the number of channels (BGR).
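As a side note, cv2.imread() returns None instead of raising an error when the file cannot be read, so it is worth checking the result. A minimal sketch (the file name is just a placeholder):

image = cv2.imread('image_name.jpg')
if image is None:
    raise FileNotFoundError('image_name.jpg could not be read')

h, w, c = image.shape
print(h, w, c)        # e.g. 96 64 3: height, width, channels
print(image.dtype)    # uint8, pixel values in the range 0-255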

There are two problems with reading images with cv2. First, the channels are stored in BGR order, which differs from the usual RGB format. Second, the data is laid out as [H, W, C], while PyTorch expects [C, H, W].

If we look at image[0][0], we can see that it is [175, 197, 239], with the blue value first.
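As a quick sanity check (not part of the original snippet), you can also reverse the channel axis with plain NumPy slicing; the result should match what cv2.cvtColor() produces in the next step:

# Reversing the last axis turns BGR into RGB without calling OpenCV.
rgb_by_slicing = image[..., ::-1]
print(rgb_by_slicing[0][0])   # [239 197 175]  (same pixel, channels reversed)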

cv2.cvtColor()

We can use cv2.cvtColor(image, cv2.COLOR_BGR2RGB) to convert BGR to RGB:

image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image.shape
# output:(96, 64, 3)

image[0][0]
# output:
# array([239, 197, 175], dtype=uint8)

The shape is still (96, 64, 3); only the three channel values of each pixel have been reversed.

image[0][0] is now [239, 197, 175].

np.transpose()

When we need to feed these arrays into the neural network, we also need to convert the format from [H, W, C] to [C, H, W].

In this case we can use:

image = np.transpose(image, (2, 0, 1))
image.shape
# output:(3, 96, 64)
 
Before feeding the data to PyTorch, we also need to add a batch_size dimension:
image = np.expand_dims(image, axis=0)
image.shape
# output:(1, 3, 96, 64)
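Putting it all together, here is a hedged sketch of the whole preprocessing chain done with PyTorch directly; torch.from_numpy plus permute/unsqueeze is equivalent to the NumPy calls above, and the file name is again a placeholder:

import cv2
import torch

image = cv2.imread('image_name.jpg')                # (H, W, 3), BGR, uint8
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)      # (H, W, 3), RGB

tensor = torch.from_numpy(image)                    # still (H, W, 3)
tensor = tensor.permute(2, 0, 1)                    # (3, H, W)
tensor = tensor.unsqueeze(0)                        # (1, 3, H, W)
tensor = tensor.float() / 255.0                     # most models expect floats in [0, 1]
print(tensor.shape)                                 # torch.Size([1, 3, 96, 64])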