Here is another plug for my open source project govpr, a speaker recognition (voiceprint recognition) engine written in Go and based on the GMM-UBM algorithm. I am considering releasing a Java version, a C/C++ version, or even a Swift version in the future. GMM-UBM is a relatively dated algorithm in voiceprint recognition and cannot compete with the current i-vector approach. However, GMM-UBM is still the foundation of the field and the first thing to master when starting out with voiceprint recognition. If circumstances permit, I may also use Go to write open source projects for i-vector, or even for DNN-based speech recognition.
Introduction
govpr is a Go implementation of a speaker recognition (voiceprint recognition) engine based on GMM-UBM. It can be used in voice verification and identity recognition scenarios. Currently, only spoken Chinese digits are supported. The audio format is WAV (16000 Hz sample rate, 16 bits per sample, mono).
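Since the engine expects one specific audio format, it can help to check a file's header before feeding it in. The following is a minimal sketch, not part of govpr: it assumes the canonical 44-byte PCM WAV header layout (RIFF, WAVE, and a 16-byte "fmt " chunk immediately followed by the "data" chunk), and the names wavInfo, parseCanonicalHeader, checkFormat, and buildHeader are hypothetical helpers introduced only for illustration. Files with extra chunks before the audio data would need a real chunk parser.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// wavInfo holds the header fields relevant to the format check.
// Hypothetical type, not part of govpr.
type wavInfo struct {
	Channels      uint16
	SampleRate    uint32
	BitsPerSample uint16
}

// parseCanonicalHeader reads a canonical 44-byte RIFF/WAVE PCM header.
// Assumption: the "fmt " chunk starts at byte 12 with no extra chunks before it.
func parseCanonicalHeader(b []byte) (wavInfo, error) {
	if len(b) < 44 {
		return wavInfo{}, errors.New("header shorter than 44 bytes")
	}
	if string(b[0:4]) != "RIFF" || string(b[8:12]) != "WAVE" || string(b[12:16]) != "fmt " {
		return wavInfo{}, errors.New("not a canonical RIFF/WAVE header")
	}
	// Audio format 1 means uncompressed PCM.
	if binary.LittleEndian.Uint16(b[20:22]) != 1 {
		return wavInfo{}, errors.New("not uncompressed PCM")
	}
	return wavInfo{
		Channels:      binary.LittleEndian.Uint16(b[22:24]),
		SampleRate:    binary.LittleEndian.Uint32(b[24:28]),
		BitsPerSample: binary.LittleEndian.Uint16(b[34:36]),
	}, nil
}

// checkFormat returns an error unless the audio is 16 kHz, 16-bit, mono.
func checkFormat(info wavInfo) error {
	if info.SampleRate != 16000 || info.BitsPerSample != 16 || info.Channels != 1 {
		return fmt.Errorf("unsupported format: %d Hz, %d bits, %d channel(s)",
			info.SampleRate, info.BitsPerSample, info.Channels)
	}
	return nil
}

// buildHeader creates a canonical header with no audio data, for demonstration only.
func buildHeader(rate uint32, bits, channels uint16) []byte {
	b := make([]byte, 44)
	copy(b[0:], "RIFF")
	binary.LittleEndian.PutUint32(b[4:], 36) // RIFF chunk size with empty data
	copy(b[8:], "WAVE")
	copy(b[12:], "fmt ")
	binary.LittleEndian.PutUint32(b[16:], 16) // size of the fmt chunk
	binary.LittleEndian.PutUint16(b[20:], 1)  // PCM
	binary.LittleEndian.PutUint16(b[22:], channels)
	binary.LittleEndian.PutUint32(b[24:], rate)
	binary.LittleEndian.PutUint32(b[28:], rate*uint32(channels)*uint32(bits)/8) // byte rate
	binary.LittleEndian.PutUint16(b[32:], channels*bits/8)                      // block align
	binary.LittleEndian.PutUint16(b[34:], bits)
	copy(b[36:], "data")
	binary.LittleEndian.PutUint32(b[40:], 0) // empty data chunk
	return b
}

func main() {
	info, err := parseCanonicalHeader(buildHeader(16000, 16, 1))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("format check:", checkFormat(info))
}
```

In practice the header bytes would come from the first 44 bytes of the file on disk rather than from buildHeader.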
Installation
go get github.com/liuxp0827/govpr
Example
Here is a simple example; see Example for the detailed version. In this example each utterance is a pure 8-digit number. Verification yields a score, and a threshold can be set to decide whether the verified voice belongs to the speaker who trained the registered model.
package main

import (
	"io/ioutil"

	"github.com/liuxp0827/govpr"
	"github.com/liuxp0827/govpr/log"
	"github.com/liuxp0827/govpr/waveIO"
)

// engine is a thin wrapper around govpr.VPREngine.
type engine struct {
	vprEngine *govpr.VPREngine
}

func NewEngine(sampleRate, delSilRange int, ubmFile, userModelFile string) *engine {
	return &engine{
		vprEngine: govpr.NewVPREngine(sampleRate, delSilRange, ubmFile, userModelFile),
	}
}

func (this *engine) DestroyEngine() {
	this.vprEngine = nil
}

// TrainSpeech adds every training buffer and then trains the user model.
func (this *engine) TrainSpeech(buffers [][]byte) error {
	var err error
	count := len(buffers)
	for i := 0; i < count; i++ {
		err = this.vprEngine.AddTrainBuffer(buffers[i])
		if err != nil {
			log.Error(err)
			return err
		}
	}

	// Release the buffers once training has finished.
	defer this.vprEngine.ClearTrainBuffer()
	defer this.vprEngine.ClearAllBuffer()

	err = this.vprEngine.TrainModel()
	if err != nil {
		log.Error(err)
		return err
	}
	return nil
}

// RecSpeech verifies one utterance against the trained model and logs the score.
func (this *engine) RecSpeech(buffer []byte) error {
	err := this.vprEngine.AddVerifyBuffer(buffer)
	defer this.vprEngine.ClearVerifyBuffer()
	if err != nil {
		log.Error(err)
		return err
	}

	err = this.vprEngine.VerifyModel()
	if err != nil {
		log.Error(err)
		return err
	}

	Score := this.vprEngine.GetScore()
	log.Infof("vpr score: %f", Score)
	return nil
}

func main() {
	log.SetLevel(log.LevelDebug)

	vprEngine := NewEngine(16000, 50, "../ubm/ubm", "model/test.dat")

	trainlist := []string{
		"wav/train/01_32468975.wav",
		"wav/train/02_58769423.wav",
		"wav/train/03_59682734.wav",
		"wav/train/04_64958273.wav",
		"wav/train/05_65432978.wav",
	}

	trainBuffer := make([][]byte, 0)
	for _, file := range trainlist {
		buf, err := loadWaveData(file)
		if err != nil {
			log.Error(err)
			return
		}
		trainBuffer = append(trainBuffer, buf)
	}

	verifyBuffer, err := waveIO.WaveLoad("wav/verify/34986527.wav")
	if err != nil {
		log.Error(err)
		return
	}

	// Both methods log their own errors; stop if training fails.
	if err := vprEngine.TrainSpeech(trainBuffer); err != nil {
		return
	}
	vprEngine.RecSpeech(verifyBuffer)
}

func loadWaveData(file string) ([]byte, error) {
	data, err := ioutil.ReadFile(file)
	if err != nil {
		return nil, err
	}
	// Remove the 44-byte .wav header.
	data = data[44:]
	return data, nil
}
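The example stops at logging the score, so the accept-or-reject decision is left to the caller. One common way to pick a threshold is to look at the false-accept and false-reject rates it produces on known scores. The sketch below is not part of govpr: accept and errorRates are hypothetical helpers, the scores are made-up illustrative numbers rather than real engine output, and it assumes scores are float64 values where higher means a better match.

```go
package main

import "fmt"

// accept reports whether a verification score clears the threshold.
// Assumption: higher scores mean a closer match to the registered speaker.
func accept(score, threshold float64) bool {
	return score >= threshold
}

// errorRates returns the false-accept rate (impostors accepted) and the
// false-reject rate (genuine speakers rejected) at the given threshold.
func errorRates(genuine, impostor []float64, threshold float64) (far, frr float64) {
	var falseAccepts, falseRejects int
	for _, s := range impostor {
		if accept(s, threshold) {
			falseAccepts++
		}
	}
	for _, s := range genuine {
		if !accept(s, threshold) {
			falseRejects++
		}
	}
	if len(impostor) > 0 {
		far = float64(falseAccepts) / float64(len(impostor))
	}
	if len(genuine) > 0 {
		frr = float64(falseRejects) / float64(len(genuine))
	}
	return far, frr
}

func main() {
	// Made-up scores for illustration only; real values must come from
	// running verification on recordings of known and unknown speakers.
	genuine := []float64{2.0, 1.5, 0.5, 3.0}
	impostor := []float64{-1.0, 0.2, 1.6, -0.5}

	for _, t := range []float64{0.0, 1.0, 2.0} {
		far, frr := errorRates(genuine, impostor, t)
		fmt.Printf("threshold %.1f: FAR %.2f, FRR %.2f\n", t, far, frr)
	}
}
```

Raising the threshold lowers the false-accept rate at the cost of more false rejects, so the right value depends on how strict the application needs to be.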
A few days ago I also uploaded a simple HTTP API implementation based on Beego and MySQL. It was written in a hurry, so it is fairly rough; I will optimize it when I have time and write detailed documentation along the way.