Starting with this article, we will learn to recognize handwritten digits using convolutional neural networks. First, let's look at how to visualize the pixel information of the handwritten digit images in the MNIST dataset.

First, load the MNIST dataset

First, create a new data/mnist folder under the src directory and put the MNIST dataset files in it: mnist_images.png (a giant sprite image containing all of the handwritten digits) and mnist_labels_uint8 (a binary file storing the labels).

Then, create a new src/mnist/data.js file with the following contents. It cuts the sprite image into individual handwritten digits and converts the binary labels into JS.

/**
 * @license
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 * =============================================================================
 */

import * as tf from '@tensorflow/tfjs';

const IMAGE_SIZE = 784;
const NUM_CLASSES = 10;
const NUM_DATASET_ELEMENTS = 65000;

const TRAIN_TEST_RATIO = 5 / 6;

const NUM_TRAIN_ELEMENTS = Math.floor(TRAIN_TEST_RATIO * NUM_DATASET_ELEMENTS);
const NUM_TEST_ELEMENTS = NUM_DATASET_ELEMENTS - NUM_TRAIN_ELEMENTS;

const MNIST_IMAGES_SPRITE_PATH =
    'http://127.0.0.1:8080/mnist/mnist_images.png';
const MNIST_LABELS_PATH =
    'http://127.0.0.1:8080/mnist/mnist_labels_uint8';

/**
 * A class that fetches the sprited MNIST dataset and returns shuffled batches.
 *
 * NOTE: This will get much easier. For now, we do data fetching and
 * manipulation manually.
 */
export class MnistData {
  constructor() {
    this.shuffledTrainIndex = 0;
    this.shuffledTestIndex = 0;
  }

  async load() {
    // Make a request for the MNIST sprited image.
    const img = new Image();
    const canvas = document.createElement('canvas');
    const ctx = canvas.getContext('2d');
    const imgRequest = new Promise((resolve, reject) => {
      img.crossOrigin = '';
      img.onload = () => {
        img.width = img.naturalWidth;
        img.height = img.naturalHeight;

        const datasetBytesBuffer =
            new ArrayBuffer(NUM_DATASET_ELEMENTS * IMAGE_SIZE * 4);

        const chunkSize = 5000;
        canvas.width = img.width;
        canvas.height = chunkSize;

        for (let i = 0; i < NUM_DATASET_ELEMENTS / chunkSize; i++) {
          const datasetBytesView = new Float32Array(
              datasetBytesBuffer, i * IMAGE_SIZE * chunkSize * 4,
              IMAGE_SIZE * chunkSize);
          ctx.drawImage(
              img, 0, i * chunkSize, img.width, chunkSize, 0, 0, img.width,
              chunkSize);

          const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);

          for (let j = 0; j < imageData.data.length / 4; j++) {
            // All channels hold an equal value since the image is grayscale, so
            // just read the red channel.
            datasetBytesView[j] = imageData.data[j * 4] / 255;
          }
        }

        this.datasetImages = new Float32Array(datasetBytesBuffer);

        resolve();
      };
      img.src = MNIST_IMAGES_SPRITE_PATH;
    });

    const labelsRequest = fetch(MNIST_LABELS_PATH);
    const [imgResponse, labelsResponse] =
        await Promise.all([imgRequest, labelsRequest]);

    this.datasetLabels = new Uint8Array(await labelsResponse.arrayBuffer());

    // Create shuffled indices into the train/test set for when we select a
    // random dataset element for training / validation.
    this.trainIndices = tf.util.createShuffledIndices(NUM_TRAIN_ELEMENTS);
    this.testIndices = tf.util.createShuffledIndices(NUM_TEST_ELEMENTS);

    // Slice the images and labels into train and test sets.
    this.trainImages =
        this.datasetImages.slice(0, IMAGE_SIZE * NUM_TRAIN_ELEMENTS);
    this.testImages = this.datasetImages.slice(IMAGE_SIZE * NUM_TRAIN_ELEMENTS);
    this.trainLabels =
        this.datasetLabels.slice(0, NUM_CLASSES * NUM_TRAIN_ELEMENTS);
    this.testLabels =
        this.datasetLabels.slice(NUM_CLASSES * NUM_TRAIN_ELEMENTS);
  }

  nextTrainBatch(batchSize) {
    return this.nextBatch(
        batchSize, [this.trainImages, this.trainLabels], () => {
          this.shuffledTrainIndex =
              (this.shuffledTrainIndex + 1) % this.trainIndices.length;
          return this.trainIndices[this.shuffledTrainIndex];
        });
  }

  nextTestBatch(batchSize) {
    return this.nextBatch(batchSize, [this.testImages, this.testLabels], () => {
      this.shuffledTestIndex =
          (this.shuffledTestIndex + 1) % this.testIndices.length;
      return this.testIndices[this.shuffledTestIndex];
    });
  }

  nextBatch(batchSize, data, index) {
    const batchImagesArray = new Float32Array(batchSize * IMAGE_SIZE);
    const batchLabelsArray = new Uint8Array(batchSize * NUM_CLASSES);

    for (let i = 0; i < batchSize; i++) {
      const idx = index();

      const image =
          data[0].slice(idx * IMAGE_SIZE, idx * IMAGE_SIZE + IMAGE_SIZE);
      batchImagesArray.set(image, i * IMAGE_SIZE);

      const label =
          data[1].slice(idx * NUM_CLASSES, idx * NUM_CLASSES + NUM_CLASSES);
      batchLabelsArray.set(label, i * NUM_CLASSES);
    }

    const xs = tf.tensor2d(batchImagesArray, [batchSize, IMAGE_SIZE]);
    const labels = tf.tensor2d(batchLabelsArray, [batchSize, NUM_CLASSES]);

    return {xs, labels};
  }
}
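As a sanity check on the split constants in the listing: with a 5/6 train ratio, the 65,000 examples divide into 54,166 training and 10,834 test examples. This is the same arithmetic data.js performs:

```javascript
// Reproduce the train/test split arithmetic from data.js.
const NUM_DATASET_ELEMENTS = 65000;
const TRAIN_TEST_RATIO = 5 / 6;

const numTrain = Math.floor(TRAIN_TEST_RATIO * NUM_DATASET_ELEMENTS);
const numTest = NUM_DATASET_ELEMENTS - numTrain;

console.log(numTrain); // 54166
console.log(numTest);  // 10834
```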

The two path constants in the listing originally pointed to remote URLs, but those hosts may be unreachable in some regions, so we serve the files from a local static server instead:

const MNIST_IMAGES_SPRITE_PATH =
    'http://127.0.0.1:8080/mnist/mnist_images.png';
const MNIST_LABELS_PATH =
    'http://127.0.0.1:8080/mnist/mnist_labels_uint8';

Install http-server:

npm i http-server -g

Then execute:

hs src/data --cors

This serves the src/data directory via http-server as a static server reachable at http://127.0.0.1:8080, with cross-origin (CORS) requests allowed.

We load the data from the MNIST dataset with the following code.

import * as tf from '@tensorflow/tfjs';
import * as tfvis from '@tensorflow/tfjs-vis';
import { MnistData } from './data';

window.onload = async () => {
  const data = new MnistData();
  await data.load();
  // nextTestBatch loads examples from the test set; the argument is how many to load
  const examples = data.nextTestBatch(20);
  console.log(examples);
};

As you can see, the shape of labels is [20, 10]: 20 examples, each with 10 values covering the ten digits 0-9.

As shown in the figure, we can right-click labels in the printed object and select Store as Global Variable to save it as the variable temp1. Then we can print the labels data structure from the console with temp1.print(). It looks like this:

[[0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
 [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
 [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
 [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
 [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
 [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
 [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
 [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
 [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0]]

Each array represents a piece of data, and each piece of data has 10 values. Only one of these 10 values is 1, and the index of that value is the value of that piece of data.
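In other words, each one-hot row decodes back to its digit by taking the index of the 1, i.e. an argmax. A minimal plain-JS sketch (oneHotToDigit is an illustrative helper, not part of the tutorial code):

```javascript
// Decode a one-hot label row into its digit: the index of the single 1,
// which is the index of the largest value (argmax).
function oneHotToDigit(row) {
  return row.indexOf(Math.max(...row));
}

console.log(oneHotToDigit([0, 0, 0, 0, 0, 0, 0, 0, 1, 0])); // 8
console.log(oneHotToDigit([0, 0, 0, 0, 1, 0, 0, 0, 0, 0])); // 4
```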

xs is the input data. Its shape is [20, 784]: again 20 examples, each with 784 feature values. Why 784? The input here is image data, and each MNIST image is 28 × 28 pixels. Since the images are black and white there is a single channel, so 28 × 28 × 1 = 784. If the images were RGB you would multiply by 3 instead, but because they are black and white you multiply by 1.
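Because each image is flattened row by row into those 784 values, the pixel at (row, col) lands at index row * 28 + col. A quick sketch of that arithmetic (pixelIndex is an illustrative helper, not from the tutorial code):

```javascript
const IMAGE_WIDTH = 28;

// Map a (row, col) coordinate in the 28 x 28 image to its position
// in the flattened 784-element feature array.
function pixelIndex(row, col) {
  return row * IMAGE_WIDTH + col;
}

console.log(pixelIndex(0, 0));   // 0   (top-left pixel)
console.log(pixelIndex(27, 27)); // 783 (bottom-right pixel)
```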

Second, cut the pixel values of each handwritten digit out of the data

So how do we cut the pixel values of each handwritten digit out of xs?

We need the slice method from TensorFlow.js.

const x = tf.tensor1d([1, 2, 3, 4]);
x.slice([1], [2]).print(); // [2, 3]

The first argument to slice is begin; passing [1] means start at the second element of the first dimension. The second argument is size, the number of elements to take; [2] means take two elements, so the result is [2, 3].
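Note that this differs from Array.prototype.slice, whose second argument is an end index rather than a size; converting size into begin + size makes a plain array behave the same way (a sanity-check sketch, not tfjs code):

```javascript
const begin = 1;
const size = 2;

// tf's slice(begin, size) on 1-D data corresponds to
// Array.prototype.slice(begin, begin + size).
const result = [1, 2, 3, 4].slice(begin, begin + size);
console.log(result); // [2, 3]
```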

const x = tf.tensor2d([1, 2, 3, 4], [2, 2]);
x.print(); // [[1, 2], [3, 4]]
x.slice([1, 0], [1, 2]).print(); // [[3, 4]]

For a tensor like [[1, 2], [3, 4]], a begin of [1, 0] means: start at the second element of the first dimension and the first element of the second dimension, i.e. at the 3 in [[1, 2], [3, 4]]. A size of [1, 2] means take one element in the first dimension and two elements in the second, so the result is [[3, 4]].
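The same begin/size semantics can be mimicked on plain nested arrays, which may help build intuition (slice2d is a hypothetical helper, not the tfjs implementation):

```javascript
// slice2d(data, [beginRow, beginCol], [numRows, numCols]):
// take numRows rows starting at beginRow, then from each of those rows
// take numCols columns starting at beginCol.
function slice2d(data, [beginRow, beginCol], [numRows, numCols]) {
  return data
    .slice(beginRow, beginRow + numRows)
    .map(row => row.slice(beginCol, beginCol + numCols));
}

console.log(slice2d([[1, 2], [3, 4]], [1, 0], [1, 2])); // [[3, 4]]
```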

Now that we know how to use slice, we can use it to slice the pixels of each handwritten number.

for (let i = 0; i < 20; i++) {
    const imageTensor = examples.xs.slice([i, 0], [1, 784]);
}

However, a lot of data is being operated on here. Tensor operations run on the GPU, and the memory they allocate is not released automatically. TensorFlow.js provides a tidy method that disposes of intermediate tensors once the callback finishes, preventing memory leaks. We add it:

const imageTensor = tf.tidy(() => {
      return examples.xs.slice([i, 0], [1, 784]);
});

Third, turn the pixel data of each handwritten digit into a picture displayed on the page

First of all, we reshape the tensor sliced above into a 28 × 28 × 1 black-and-white image.

const imageTensor = tf.tidy(() => {
      return examples.xs.slice([i, 0], [1, 784])
        .reshape([28, 28, 1]);
});

Next we use the tf.browser.toPixels method of TensorFlow.js to turn it into an image and display it on the page. This method takes two arguments: the first is a tensor of image data, and the second is an HTML canvas element.

const canvas = document.createElement('canvas');
canvas.width = 28;
canvas.height = 28;
canvas.style.margin = '4px';
await tf.browser.toPixels(imageTensor, canvas);
document.body.appendChild(canvas);

The rendered result is as follows:

Fourth, integrate image rendering into the tfjs-vis visualization panel

The code is as follows:

import * as tf from '@tensorflow/tfjs';
import * as tfvis from '@tensorflow/tfjs-vis';
import { MnistData } from './data';
window.onload = async () => {
  const data = new MnistData();
  await data.load();
  // nextTestBatch loads examples from the test set; the argument is how many to load
  const examples = data.nextTestBatch(20);
  console.log(examples);
  const surface = tfvis.visor().surface({ name: 'Input Sample' });
  for (let i = 0; i < 20; i++) {
    const imageTensor = tf.tidy(() => {
      return examples.xs.slice([i, 0], [1, 784])
        .reshape([28, 28, 1]);
    });
    const canvas = document.createElement('canvas');
    canvas.width = 28;
    canvas.height = 28;
    canvas.style.margin = '4px';
    await tf.browser.toPixels(imageTensor, canvas);
    surface.drawArea.appendChild(canvas);
  }
};

Compared with the previous code, the main changes are the added line const surface = tfvis.visor().surface({ name: 'Input Sample' }); and replacing document.body.appendChild(canvas); with surface.drawArea.appendChild(canvas);. The final result is as follows: