An overview of
This is a Hello World-level article, offered as a cup of tea to the AI gurus 🍵
This article will take you through:
- What tensorflow.js is
- How to use a pre-trained model for speech recognition
- How to train custom models with transfer learning
- How to use a custom model for speech recognition
We will recognize phrases in the Northeastern Chinese dialect, as follows
This article is in progress…
tensorflow.js
First, take a look at an application built with tensorflow.js. It is said that girls like it very much
Trying on colors in real time is the recommended feature
Here is a live demonstration
And here is how this cool app works
Then you can visit the website of TensorFlow: tensorflow.google.cn
Tensorflow.js is a machine learning library that runs in browsers and in Node.js. Running the model in the browser avoids a round trip to a server, which helps real-time use
A tensor is the extension of vectors and matrices to higher dimensions; in practice it is a multidimensional array
Tensor operations replace N levels of nested for loops with vectorized operations, which can then be accelerated on the GPU
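To make the "multidimensional array" idea concrete, here is a minimal sketch in plain TypeScript. It does not use the tensorflow.js API: `NestedArray`, `shapeOf` and `addScalar` are hypothetical names chosen for illustration, and the recursion in `addScalar` stands in for the nested loops that a tensor library would run as a single vectorized operation.

```typescript
// A nested array standing in for a tensor (plain TypeScript, not tensorflow.js).
type NestedArray = number | NestedArray[]

// Hypothetical helper: reads the shape by following the first element at each level.
// Assumes the array is rectangular (all sub-arrays at one level have equal length).
function shapeOf(value: NestedArray): number[] {
  const shape: number[] = []
  let current: NestedArray = value
  while (Array.isArray(current)) {
    shape.push(current.length)
    current = current[0]
  }
  return shape
}

// Hypothetical helper: adds a scalar to every element, whatever the number of dimensions.
// Each level of recursion corresponds to one level of nested for loop.
function addScalar(value: NestedArray, scalar: number): NestedArray {
  return Array.isArray(value)
    ? value.map(item => addScalar(item, scalar))
    : value + scalar
}

const vector = [1, 2, 3]                                   // 1 dimension
const matrix = [[1, 2, 3], [4, 5, 6]]                      // 2 dimensions
const cube = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]          // 3 dimensions

console.log(shapeOf(vector), shapeOf(matrix), shapeOf(cube))
console.log(JSON.stringify(addScalar(matrix, 10)))
```

The number of entries in the shape is the number of dimensions, which is also the depth of loop nesting that a hand-written version would need.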
It also helps to see how artificial intelligence, machine learning and deep learning relate to one another, as shown below
Speech recognition using pre-trained models
Pre-trained models: models that have already been trained and are ready to use out of the box. They are available in a variety of formats; tensorflow.js can use model files in the web format
Some tensorflow.js models can be found here: github.com/tensorflow/…
It contains the speech model we will use in this article
You can try the voice command model at this test link: storage.googleapis.com/tfjs-speech…
From the test link we also extracted the model link, the metadata link and the sharded model data (group1-shard1of2, group1-shard2of2).
The basic definition of speech recognition: input audio, output classified data
Basic principles of speech recognition: audio -> spectrogram -> convolutional neural network image recognition
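The "audio -> spectrogram" step can be sketched in a few lines of plain TypeScript. This is only an illustration under simplifying assumptions: it uses a naive discrete Fourier transform and non-overlapping frames, whereas a real recognizer relies on the browser's native FFT and typically adds windowing and overlap. The function names `magnitudeSpectrum` and `spectrogram` are hypothetical.

```typescript
// Magnitude spectrum of one frame, using a naive O(n^2) discrete Fourier transform.
// Only the first n/2 bins are kept, since the rest mirror them for real-valued input.
function magnitudeSpectrum(frame: number[]): number[] {
  const n = frame.length
  const bins: number[] = []
  for (let k = 0; k < n / 2; k++) {
    let re = 0
    let im = 0
    for (let t = 0; t < n; t++) {
      const angle = (2 * Math.PI * k * t) / n
      re += frame[t] * Math.cos(angle)
      im -= frame[t] * Math.sin(angle)
    }
    bins.push(Math.hypot(re, im))
  }
  return bins
}

// Cuts the signal into non-overlapping frames and computes one spectrum per frame.
// The result is a 2D array (time x frequency), which is what gets treated as an image.
function spectrogram(signal: number[], frameSize: number): number[][] {
  const rows: number[][] = []
  for (let start = 0; start + frameSize <= signal.length; start += frameSize) {
    rows.push(magnitudeSpectrum(signal.slice(start, start + frameSize)))
  }
  return rows
}

// Test tone: a sine wave that completes exactly 4 cycles in each 32-sample frame.
const samplesPerFrame = 32
const tone = Array.from({ length: 128 }, (_, t) =>
  Math.sin((2 * Math.PI * 4 * t) / samplesPerFrame),
)
const toneSpectrogram = spectrogram(tone, samplesPerFrame)
const loudestBin = toneSpectrogram[0].indexOf(Math.max(...toneSpectrogram[0]))
console.log(toneSpectrogram.length, toneSpectrogram[0].length, loudestBin)
```

Because each row is a frequency profile at one moment in time, stacking the rows gives a picture of the sound, and a convolutional network can classify that picture the same way it classifies any other image.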
Note:
- Recording needs microphone permission, which browsers only grant on HTTPS pages
- Links to the model files must be absolute URLs
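Both notes can be checked with small helpers. The sketch below is an assumption-laden illustration, not part of the original project: `absoluteModelUrl` and `canUseMicrophone` are hypothetical names, and the second function only approximates the browser's secure-context rule (HTTPS, or localhost during development). In a real page, `window.isSecureContext` gives the authoritative answer.

```typescript
// Hypothetical helper: resolves a model file path against the page URL,
// producing the absolute link that the model loader needs.
function absoluteModelUrl(relativePath: string, pageUrl: string): string {
  return new URL(relativePath, pageUrl).href
}

// Hypothetical helper: approximates the secure-context requirement for microphone access.
function canUseMicrophone(pageUrl: string): boolean {
  const { protocol, hostname } = new URL(pageUrl)
  return protocol === 'https:' || hostname === 'localhost' || hostname === '127.0.0.1'
}

const page = 'https://example.com/demo/index.html' // placeholder page URL
console.log(absoluteModelUrl('data/speech/model.json', page))
console.log(canUseMicrophone(page), canUseMicrophone('http://example.com/demo/'))
```

Resolving with `URL` rather than concatenating strings also copes with page URLs that end in a file name such as `index.html`.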
<template lang='pug'>
.speech
  button.btn(
    v-for="(item, index) in labels"
    :class="{'current': index === currentIndex }"
  ) {{ item }}
</template>
<script lang="ts">
/**
 * @description Uses the pre-trained model for speech recognition
 */
import * as tf from '@tensorflow/tfjs'
import * as speechCommands from '@tensorflow-models/speech-commands'
import Vue from 'vue'
import Component from 'vue-class-component'

const PATH = window.location.origin + window.location.pathname

@Component
export default class SpeechComponent extends Vue {
  labels: string[] = []
  currentIndex = -1

  mounted() {
    this.init()
  }

  async init() {
    const recognizer: speechCommands.SpeechCommandRecognizer = speechCommands.create(
      // Use the browser's native Fourier transform
      'BROWSER_FFT',
      // Custom vocabulary
      null,
      // Model link
      PATH + 'data/speech/model.json',
      // Metadata link
      PATH + 'data/speech/metadata.json',
    )
    // Make sure the model has finished loading
    await recognizer.ensureModelLoaded()
    console.warn('recognizer', recognizer)
    // Words that the model can recognize
    this.labels = recognizer.wordLabels()
    console.warn('this.labels', this.labels)
    // Note: getUserMedia requires HTTPS
    // https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/getUserMedia
    recognizer.listen(
      result => {
        const { scores } = result
        // Highlight the label with the highest score
        const maxValue = Math.max(...(<Array<any>>scores))
        this.currentIndex = (<Array<any>>scores).indexOf(maxValue)
        return Promise.resolve()
      },
      {
        // Same listening options as the later listings in this article
        overlapFactor: 0.1,
        probabilityThreshold: 0.9,
      },
    )
  }
}
</script>
<style scoped lang='stylus' rel='stylesheet/stylus'>
.speech
  width 100vw
  height 100vh
  background-color #5CACEE
  .label
    float left
    background-color #C9C9C9
    font-size 17px
    text-align center
    height 25px
    margin 5px
    padding 5px
    display table
    color #ffffff
  .current
    background-color #CD69C9
    color #9AC0CD
</style>
How to use transfer learning to train custom models
Transfer learning: taking a model that already solves one problem and applying it to a different but related problem. The original motivation is to save the time spent labeling samples by hand: what the model learned from existing source-domain data is transferred to new data, so that a model suited to the target domain can be trained.
In plain language: starting from a model that originally recognized only English words, we train one that recognizes Northeastern Chinese dialect phrases.
Concretely: collect custom speech training data in the browser, including background noise so that the model learns which sounds should not be recognized, then save and download the data as a binary file (data.bin)
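Before training, it is worth checking that every label, background noise included, has enough recordings. The sketch below assumes the counts arrive as a plain object mapping each label to its number of examples. `checkExamples` is a hypothetical helper, and the minimum of 5 per label is an arbitrary value for illustration, not a library recommendation.

```typescript
// Per-label example counts, as a plain object: label -> number of recordings.
type ExampleCounts = { [label: string]: number }

const BACKGROUND_NOISE = '_background_noise_'

// Hypothetical helper: lists problems that would make the collected data unreliable.
function checkExamples(counts: ExampleCounts, minPerLabel: number): string[] {
  const problems: string[] = []
  if (!(BACKGROUND_NOISE in counts)) {
    problems.push(`missing ${BACKGROUND_NOISE} examples`)
  }
  for (const label of Object.keys(counts)) {
    if (counts[label] < minPerLabel) {
      problems.push(`"${label}" has ${counts[label]} examples, need at least ${minPerLabel}`)
    }
  }
  return problems
}

// Example: one phrase is under-sampled and background noise was never recorded.
const collected: ExampleCounts = {
  'What are you looking at': 8,
  'What do you think': 3,
}
console.log(checkExamples(collected, 5))
```

An empty list means the data set passes this basic check; it says nothing about recording quality.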
<template lang="pug">
.container
  button(@click="collect") What are you looking at
  button(@click="collect") What do you think
  button(@click="collect") Background noise
  pre {{ countInfo }}
  button(@click="save") Save
  <br/>
  button(@click="train") Train
  <br/>
  span Recording switch
  my-switch(class="switch" :isOpen="isRecording" @onSwitch="handleSwitch")
  <br/>
  span {{ result }}
</template>
<script lang="ts">
/**
 * @description Creates a transfer recognizer from the pre-trained model and collects custom training data
 */
import * as tf from '@tensorflow/tfjs'
import * as speechCommands from '@tensorflow-models/speech-commands'
import * as tfvis from '@tensorflow/tfjs-vis'
import Vue from 'vue'
import Component from 'vue-class-component'
import MySwitch from '../common_components/switch.vue'

const PATH = window.location.origin + window.location.pathname

@Component({
  components: {
    MySwitch
  }
})
export default class SpeechComponent extends Vue {
  labels: string[] = []
  currentIndex = -1
  transferRecognizer: speechCommands.TransferSpeechCommandRecognizer = null
  countInfo = ' '
  isRecording: boolean = false
  result = ' '

  mounted() {
    this.createTransferRecognizer()
  }

  async createTransferRecognizer() {
    const recognizer: speechCommands.SpeechCommandRecognizer = speechCommands.create(
      // Use the browser's native Fourier transform
      'BROWSER_FFT',
      // Custom vocabulary
      null,
      // Model link
      PATH + 'data/speech/model.json',
      // Metadata link
      PATH + 'data/speech/metadata.json',
    )
    // Make sure the model has finished loading
    await recognizer.ensureModelLoaded()
    console.warn('recognizer', recognizer)
    // Words that the base model can recognize
    this.labels = recognizer.wordLabels()
    console.warn('this.labels', this.labels)
    // Create the transfer recognizer on top of the base model
    this.transferRecognizer = recognizer.createTransfer('myTransfer')
  }

  async collect(e) {
    const btn = e.target
    btn.disabled = true
    // The button text doubles as the training label
    const label = btn.innerText
    await this.transferRecognizer.collectExample(
      label === 'Background noise' ? '_background_noise_' : label
    )
    btn.disabled = false
    this.countInfo = JSON.stringify(
      this.transferRecognizer.countExamples(),
      null,
      2,
    )
  }

  async train() {
    await this.transferRecognizer.train({
      // Number of training epochs
      epochs: 100,
      callback: tfvis.show.fitCallbacks(
        { name: 'Training effect' },
        // Metrics to plot: loss and accuracy
        ['loss', 'acc'],
        { callbacks: ['onEpochEnd'] },
      )
    })
  }

  async handleSwitch() {
    this.result = ' '
    if (!this.isRecording) {
      this.isRecording = true
      await this.transferRecognizer.listen(
        result => {
          const { scores } = result
          console.warn('result', result)
          const maxValue = Math.max(...(<Array<any>>scores))
          const index = (<Array<any>>scores).indexOf(maxValue)
          const labels = this.transferRecognizer.wordLabels()
          console.log(labels[index])
          this.result = ' '
          setTimeout(() => {
            this.result = `Are you saying "${labels[index]}"?`
          }, 1000)
          return Promise.resolve()
        },
        {
          overlapFactor: 0.1,
          probabilityThreshold: 0.9,
        }
      )
    } else {
      this.isRecording = false
      this.transferRecognizer.stopListening()
    }
  }

  save() {
    // Serialize the collected examples and download them as data.bin
    const arrayBuffer: ArrayBuffer = this.transferRecognizer.serializeExamples()
    const blob = new Blob([arrayBuffer])
    const link = document.createElement('a')
    link.href = window.URL.createObjectURL(blob)
    link.download = 'data.bin'
    link.click()
  }
}
</script>
<style scoped lang="stylus" rel="stylesheet/stylus">
@import "~styles/custom.styl"
</style>
How to use a custom model for speech recognition
Load the custom training data and train the model; after that it is ready for speech recognition
In the figure below, the probability for "What are you looking at" is above 94% on the first test and above 99% on the second
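The probabilities quoted above are per-label scores, and the recognized phrase is simply the label with the highest score. The following sketch shows that selection together with a cut-off in the spirit of the `probabilityThreshold: 0.9` option used in the listings. `topPrediction` is a hypothetical helper written for this illustration, and the score values are made up; it is not the library's own code.

```typescript
interface Prediction {
  label: string
  probability: number
}

// Hypothetical helper: returns the most probable label,
// or null when even the best score is below the threshold.
function topPrediction(
  scores: ArrayLike<number>,
  labels: string[],
  probabilityThreshold: number,
): Prediction | null {
  let best = 0
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[best]) best = i
  }
  const probability = scores[best]
  return probability >= probabilityThreshold
    ? { label: labels[best], probability }
    : null
}

// Made-up scores for illustration, in the same order as the labels.
const dialectLabels = ['_background_noise_', 'What are you looking at', 'What do you think']
const confident = topPrediction(new Float32Array([0.03, 0.94, 0.03]), dialectLabels, 0.9)
const unsure = topPrediction(new Float32Array([0.5, 0.3, 0.2]), dialectLabels, 0.9)
console.log(confident, unsure)
```

A higher threshold means fewer false triggers at the cost of ignoring quieter or less clear utterances.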
<template lang="pug">
.container(v-if="transferRecognizer")
  <br/>
  span 1. Wait for the model to finish training
  <br/>
  span 2. Turn on the recording switch and say "What are you looking at?" / "What do you think?"
  <br/>
  <br/>
  span Recording switch
  my-switch(
    :canOpen="canRecording"
    :isOpen="isRecording"
    @onSwitch="handleSwitch"
  )
  <br/>
  span {{ result }}
</template>
<script lang="ts">
/**
 * @description Trains the model on our own data set and performs speech recognition
 */
import * as tf from '@tensorflow/tfjs'
import * as speechCommands from '@tensorflow-models/speech-commands'
import * as tfvis from '@tensorflow/tfjs-vis'
import Vue from 'vue'
import Component from 'vue-class-component'
import MySwitch from '../common_components/switch.vue'

const PATH = window.location.origin + window.location.pathname

@Component({
  components: {
    MySwitch
  }
})
export default class SpeechComponent extends Vue {
  labels: string[] = []
  currentIndex = -1
  transferRecognizer: speechCommands.TransferSpeechCommandRecognizer = null
  countInfo = ' '
  isTrainDone = false
  isRecording = false
  canRecording = false
  result = ' '

  mounted() {
    this.createTransferRecognizer()
  }

  async createTransferRecognizer() {
    const recognizer: speechCommands.SpeechCommandRecognizer = speechCommands.create(
      // Use the browser's native Fourier transform
      'BROWSER_FFT',
      // Custom vocabulary
      null,
      // Model link
      PATH + 'data/speech/model.json',
      // Metadata link
      PATH + 'data/speech/metadata.json',
    )
    // Make sure the model has finished loading
    await recognizer.ensureModelLoaded()
    console.warn('recognizer', recognizer)
    // Words that the base model can recognize
    this.labels = recognizer.wordLabels()
    console.warn('this.labels', this.labels)
    // Create the transfer recognizer on top of the base model
    this.transferRecognizer = recognizer.createTransfer('myTransfer')
    // Load the training data saved earlier
    const res = await fetch(PATH + 'data/speech/data.bin')
    const arrayBuffer = await res.arrayBuffer()
    this.transferRecognizer.loadExamples(arrayBuffer)
    console.warn(this.transferRecognizer.countExamples())
    await this.transferRecognizer.train({
      // Number of training epochs
      epochs: 100,
      callback: tfvis.show.fitCallbacks(
        { name: 'Training effect' },
        // Metrics to plot: loss and accuracy
        ['loss', 'acc'],
        { callbacks: ['onEpochEnd'] },
      )
    })
    // Training is finished, so recording can be enabled
    this.canRecording = true
  }

  async handleSwitch() {
    this.result = ' '
    if (!this.isRecording) {
      this.isRecording = true
      await this.transferRecognizer.listen(
        result => {
          const { scores } = result
          console.warn('result', result)
          const maxValue = Math.max(...(<Array<any>>scores))
          const index = (<Array<any>>scores).indexOf(maxValue)
          const labels = this.transferRecognizer.wordLabels()
          console.log(labels[index])
          this.result = ' '
          setTimeout(() => {
            this.result = `Are you saying "${labels[index]}"?`
          }, 1000)
          return Promise.resolve()
        },
        {
          overlapFactor: 0.1,
          probabilityThreshold: 0.9,
        }
      )
    } else {
      this.isRecording = false
      this.transferRecognizer.stopListening()
    }
  }
}
</script>
<style scoped lang="stylus" rel="stylesheet/stylus">
@import "~styles/custom.styl"
</style>