An overview of TensorFlow.js speech recognition

This is a Hello World level article, offered with a cup of tea to the AI gurus 🍵

This article will take you through:

  1. What TensorFlow.js is
  2. How to use a pre-trained model for speech recognition
  3. How to train custom models with transfer learning
  4. How to use a custom model for speech recognition

We will recognize the Northeastern (Dongbei) Chinese dialect, as shown below

This article is in progress…

TensorFlow.js

First, take a look at an application built with TensorFlow.js. It is said that girls like it very much

It is recommended to try the real-time color try-on mode

Here is the young lady's live demonstration

Here's how this cool app works

Then you can visit the website of TensorFlow: tensorflow.google.cn

TensorFlow.js is a machine learning library that runs in browsers and in Node.js, with good real-time performance

A tensor is the extension of vectors and matrices to higher dimensions; it is essentially a multidimensional array

Tensors let you vectorize what would otherwise be N levels of nested for loops, and the computation can be accelerated on the GPU
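
As a minimal sketch (plain TensorFlow.js, independent of the components later in this article), here is what replacing nested loops with a vectorized tensor operation looks like:

import * as tf from '@tensorflow/tfjs'

// A 2 x 3 tensor is just a typed multidimensional array
const a = tf.tensor([[1, 2, 3], [4, 5, 6]])
console.log(a.shape) // [2, 3]

// Instead of two nested for loops, one vectorized call squares every element;
// with the WebGL backend this runs on the GPU
const squared = a.square()
squared.print() // [[1, 4, 9], [16, 25, 36]]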

As an aside, the relationship between artificial intelligence, machine learning, and deep learning is shown below

Speech recognition using pre-trained models

Pre-trained models: models that have already been trained and can be used out of the box. They come in a variety of formats; TensorFlow.js can load model files in the web format
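
For reference, a web-format model is a model.json file plus sharded binary weight files. A generic sketch of loading one looks like this (the URL and input shape below are made up for illustration; the speech model in this article is loaded through the speech-commands wrapper instead):

import * as tf from '@tensorflow/tfjs'

async function loadWebFormatModel() {
  // Hypothetical URL pointing at a model.json; the weight shards sit next to it
  const model = await tf.loadLayersModel('https://example.com/model/model.json')
  // Dummy input; the shape [1, 224, 224, 3] is only an assumption for this sketch
  const output = model.predict(tf.zeros([1, 224, 224, 3])) as tf.Tensor
  output.print()
}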

Some tensorflow.js models can be found here: github.com/tensorflow/…

It contains the speech model we will use in this article

You can open the voice command model demo link to try it out: storage.googleapis.com/tfjs-speech…

In addition, from that demo page we extracted the model link, the metadata link, and the sharded model data (group1-shard1of2, group1-shard2of2).

The basic definition of speech recognition: input audio, output a classification result

Basic principles of speech recognition: audio -> spectrogram -> convolutional neural network image recognition
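
To make the audio-to-spectrogram step concrete, here is a rough sketch using tf.signal.stft; the speech-commands model performs this step (and the CNN classification) internally, and the sample count and frame sizes below are only illustrative assumptions:

import * as tf from '@tensorflow/tfjs'

// Pretend these are raw microphone samples (about one second at 16 kHz)
const samples = tf.tensor1d(Array.from({ length: 16000 }, () => Math.random() * 2 - 1))

// Short-time Fourier transform; frame length 1024 and hop 512 are illustrative values
const stft = tf.signal.stft(samples, 1024, 512)

// The magnitude of the complex STFT is the spectrogram "image" the CNN then classifies
const re = tf.real(stft)
const im = tf.imag(stft)
const spectrogram = tf.sqrt(tf.add(tf.square(re), tf.square(im)))
console.log(spectrogram.shape) // [numFrames, numFrequencyBins]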

Note:

  1. To get microphone (recording) permission, the page must be served over HTTPS
  2. Model file links must use absolute paths
<template lang='pug'>
	.speech
		button.btn(
			v-for="(item, index) in labels"
			:class="{'current': index === currentIndex }"
		) {{ item }}
</template>

<script lang="ts"> /** * @description uses the pre-trained model for speech recognition */ import * as tf from'@tensorflow/tfjs'
import * as speechCommands from '@tensorflow-models/speech-commands'
import Vue from 'vue'
import Component from 'vue-class-component'

const PATH = window.location.origin + window.location.pathname

@Component
export default class SpeechComponent extends Vue {

	labels = []
	currentIndex = -1

	mounted() {
		this.init()
	}

	async init() {
		// Create the recognizer: browser-native Fourier transform, no custom vocabulary,
		// then the model file link and the metadata link (absolute paths)
		const recognizer: speechCommands.SpeechCommandRecognizer = speechCommands.create(
			'BROWSER_FFT',
			null,
			PATH + 'data/speech/model.json',
			PATH + 'data/speech/metadata.json'
		)
		// Make sure the model has finished loading
		await recognizer.ensureModelLoaded()
		console.warn('recognizer', recognizer)
		// Word labels the model can recognize
		this.labels = recognizer.wordLabels()
		console.warn('this.labels', this.labels)
		// Note: getUserMedia requires HTTPS
		// https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/getUserMedia
		recognizer.listen(
			result => {
				const { scores } = result
				const maxValue = Math.max(...(<Array<any>>scores))
				this.currentIndex = (<Array<any>>scores).indexOf(maxValue)
				return Promise.resolve()
			},
			{
				// overlapFactor: 0.0001,
				// probabilityThreshold:
			}
		)
	}
}
</script>

<style scoped lang='stylus' rel='stylesheet/stylus'>
.speech
	width 100vw
	height 100vh
	background-color #5CACEE
	.label
		float left
		background-color #C9C9C9
		font-size 17px
		text-align center
		height 25px
		margin 5px
		padding 5px
		display table
		color #ffffff
	.current
		background-color #CD69C9
		color #9AC0CD
</style>

How to use transfer learning to train custom models

Transfer learning: storing a model that solves an existing problem and applying it to a different but related problem. The original motivation is to save the time spent manually labeling samples, letting a model migrate from labeled source-domain data to unlabeled data, so that we end up with a model that works on the target domain.

In plain language: we start from a model that originally recognized only English words and train it into a model that can recognize Northeastern Chinese.

Specific steps: collect custom speech training data in the browser, including background noise so the model knows which sounds should not be recognized, then save and download the binary data file
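
Distilled from the full component below, the core speech-commands calls for this workflow look roughly like this (UI and error handling omitted; the example label string is just an illustration):

import * as speechCommands from '@tensorflow-models/speech-commands'

async function buildCustomDataset(recognizer: speechCommands.SpeechCommandRecognizer) {
  // Create a transfer learner on top of the pre-trained recognizer
  const transfer = recognizer.createTransfer('myTransfer')

  // Each call records one example; '_background_noise_' teaches the model what to ignore
  await transfer.collectExample('What are you looking at')
  await transfer.collectExample('_background_noise_')
  console.log(transfer.countExamples())

  // Train on the collected examples
  await transfer.train({ epochs: 100 })

  // Serialize the examples into a binary buffer that can be downloaded and reloaded later
  const data: ArrayBuffer = transfer.serializeExamples()
  return data
}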

<template lang="pug">
  .container
    button(@click="collect") What are you looking at button(@click="collect") Button (@click="collect"Background noise pre {{countInfo}} button(@click="save"<br/> button(@click="train"Training <br/> SPAN Recording switch my-switch(class="switch" :isOpen="isRecording" @onSwitch="handleSwitch")
    <br/>
    span {{ result }}
</template>

<script lang="ts"> /** * @description creates a migration learner using the pre-training model and generates custom training data */ import * as tf from'@tensorflow/tfjs'
import * as speechCommands from '@tensorflow-models/speech-commands'
import * as tfvis from '@tensorflow/tfjs-vis'
import Vue from 'vue'
import Component from 'vue-class-component'
import MySwitch from '../common_components/switch.vue'

const PATH = window.location.origin + window.location.pathname

@Component({
  components: {
    MySwitch
  }
})
export default class SpeechComponent extends Vue {

	labels = []
  currentIndex = -1
  transferRecognizer: speechCommands.TransferSpeechCommandRecognizer = null
  countInfo = ' '
  isRecording: Boolean = false
  result = ' '

	mounted() {
		this.createTransferRecognizer()
  }

  async createTransferRecognizer() {
    // Create the recognizer: browser-native Fourier transform, no custom vocabulary,
    // then the model file link and the metadata link (absolute paths)
    const recognizer: speechCommands.SpeechCommandRecognizer = speechCommands.create(
      'BROWSER_FFT',
      null,
      PATH + 'data/speech/model.json',
      PATH + 'data/speech/metadata.json'
    )
    // Make sure the model has finished loading
    await recognizer.ensureModelLoaded()
    console.warn('recognizer', recognizer)
    // Word labels the model can recognize
    this.labels = recognizer.wordLabels()
    console.warn('this.labels', this.labels)
    this.transferRecognizer = recognizer.createTransfer('myTransfer')
	}

  async collect (e) {
    const btn = e.target
    btn.disabled = true
    const label = btn.innerText
    await this.transferRecognizer.collectExample(
      label === 'Background noise' ? '_background_noise_' : label
    )
    btn.disabled = false
    this.countInfo = JSON.stringify(
      this.transferRecognizer.countExamples(),
      null,
      2,
    )
  }

  async train() {
    await this.transferRecognizer.train({
      // number of training iterations
      epochs: 100,
      callback: tfvis.show.fitCallbacks(
        { name: 'Training effect' },
        // metrics to monitor: loss and accuracy
        ['loss', 'acc'],
        { callbacks: ['onEpochEnd'] },
      )
    })
  }

  async handleSwitch () {
    this.result = ' '
    if (!this.isRecording) {
      this.isRecording = true
      await this.transferRecognizer.listen(
        result => {
          const { scores } = result
          console.warn('result', result)
          const maxValue = Math.max(...(<Array<any>>scores))
          const index = (<Array<any>>scores).indexOf(maxValue)
          const labels = this.transferRecognizer.wordLabels()
          console.log(labels[index])
          this.result = ''
          setTimeout(() => {
            this.result = `Aren't you saying "${ labels[index] }"?`
          }, 1000)
          return Promise.resolve()
        },
        {
          overlapFactor: 0.1,
          probabilityThreshold: 0.9,
        }
      )
    } else {
      this.isRecording = false
      this.transferRecognizer.stopListening()
    }
  }

  save () {
    const arrayBuffer: ArrayBuffer = this.transferRecognizer.serializeExamples()
    const blob = new Blob([arrayBuffer])
    const link = document.createElement('a')
    link.href = window.URL.createObjectURL(blob)
    link.download = 'data.bin'
    link.click()
  }

}
</script>

<style scoped lang="stylus" rel="stylesheet/stylus">
@import "~styles/custom.styl"
</style>

How to use a custom model for speech recognition

Load the custom training data, train the model, and then it can perform speech recognition

In the figure below, the probability for what was said is 94+ on the first test and 99+ on the second

<template lang="pug">
  .container(v-if="transferRecognizer")
    <br/>
    span 1. Wait for the model to finish training
    <br/>
    span 2. Turn on the recording switch and say "What are you looking at?" / "What do you think?"
    <br/>
    <br/>
    span Recording switch
    my-switch(
      :canOpen="canRecording"
      :isOpen="isRecording"
      @onSwitch="handleSwitch"
    )
    <br/>
    span {{ result }}
</template>

<script lang="ts"> /** * @description uses its own data set to train the model and perform speech recognition */ import * as tf from'@tensorflow/tfjs'
import * as speechCommands from '@tensorflow-models/speech-commands'
import * as tfvis from '@tensorflow/tfjs-vis'
import Vue from 'vue'
import Component from 'vue-class-component'
import MySwitch from '../common_components/switch.vue'

const PATH = window.location.origin + window.location.pathname

@Component({
  components: {
    MySwitch
  }
})
export default class SpeechComponent extends Vue {

	labels = []
  currentIndex = -1
  transferRecognizer: speechCommands.TransferSpeechCommandRecognizer = null
  countInfo = ' '
  isTrainDone = false
  isRecording = false
  canRecording = false
  result = ' '

	mounted() {
		this.createTransferRecognizer()
  }

  async createTransferRecognizer() {
    // Create the recognizer: browser-native Fourier transform, no custom vocabulary,
    // then the model file link and the metadata link (absolute paths)
    const recognizer: speechCommands.SpeechCommandRecognizer = speechCommands.create(
      'BROWSER_FFT',
      null,
      PATH + 'data/speech/model.json',
      PATH + 'data/speech/metadata.json'
    )
    // Make sure the model has finished loading
    await recognizer.ensureModelLoaded()
    console.warn('recognizer', recognizer)
    // Word labels the model can recognize
    this.labels = recognizer.wordLabels()
    console.warn('this.labels', this.labels)
    this.transferRecognizer = recognizer.createTransfer('myTransfer')
    // Load the previously saved training data
    const res = await fetch(PATH + 'data/speech/data.bin')
    const arrayBuffer = await res.arrayBuffer()
    this.transferRecognizer.loadExamples(arrayBuffer)
    console.warn(this.transferRecognizer.countExamples())
    await this.transferRecognizer.train({
      // number of training iterations
      epochs: 100,
      callback: tfvis.show.fitCallbacks(
        { name: 'Training effect' },
        // metrics to monitor: loss and accuracy
        ['loss', 'acc'],
        { callbacks: ['onEpochEnd'] },
      )
    })
    this.canRecording = true
  }

  async handleSwitch () {
    this.result = ' '
    if (!this.isRecording) {
      this.isRecording = true
      await this.transferRecognizer.listen(
        result => {
          const { scores } = result
          console.warn('result', result)
          const maxValue = Math.max(...(<Array<any>>scores))
          const index = (<Array<any>>scores).indexOf(maxValue)
          const labels = this.transferRecognizer.wordLabels()
          console.log(labels[index])
          this.result = ''
          setTimeout(() => {
            this.result = `Aren't you saying "${ labels[index] }"?`
          }, 1000)
          return Promise.resolve()
        },
        {
          overlapFactor: 0.1,
          probabilityThreshold: 0.9,
        }
      )
    } else {
      this.isRecording = false
      this.transferRecognizer.stopListening()
    }
  }

}
</script>

<style scoped lang="stylus" rel="stylesheet/stylus">
@import "~styles/custom.styl"
</style>