1. Concepts

Sound: here, sound refers to the series of voltage changes picked up by the microphone, sampled as a stream of numbers in the range [-1, 1]. To play it back, the data needs to be converted into PCM format.

PCM: PCM data is described by three parameters: sampling rate, sample size (bit depth), and number of channels. Here is a diagram found online:


Input sampling frequency: the rate at which the microphone collects sound. Since the microphone converts the sound waveform into a signal in [-1, 1], this rate specifies how many samples are collected per unit of time.

Output sampling frequency: the number of samples played back per unit of time, generally the same as the input sampling frequency.
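As a quick worked example of how these parameters combine (not part of the demo itself), the raw data rate of a PCM stream is simply the product of the three:

// Raw PCM data rate = sample rate × channels × (bits per sample / 8)
const sampleRate = 44100   // samples per second
const channelCount = 1     // mono
const sampleBits = 16      // bits per sample

const bytesPerSecond = sampleRate * channelCount * (sampleBits / 8)
console.log(bytesPerSecond) // 88200 bytes per second, roughly 5.3 MB per minute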

2. Practice scenario

The following is a demo. It opens the computer's microphone in Chrome, records using the WebRTC-related APIs, converts the recording to PCM and WAV format, plays it with an audio element, and draws a frequency chart with Canvas. The general flow is as follows:

3. Implementation steps

1. Obtain the microphone permission

The navigator.getUserMedia method is used here. Of course, if you only target Chrome, the compatibility handling can be skipped. The main structural code is as follows:

navigator.getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia || navigator.msGetUserMedia
navigator.getUserMedia({ audio: true }, (stream) => {
  // recording logic goes here; stream is the audio source
}, (error) => {
  console.log(error)
})
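For reference, newer browsers also expose the promise-based navigator.mediaDevices.getUserMedia, which could be used in the same place (a sketch, not the demo's original code):

// Promise-based variant available in newer browsers
navigator.mediaDevices.getUserMedia({ audio: true })
  .then((stream) => {
    // same recording logic as above; stream is the audio source
  })
  .catch((error) => {
    console.log(error)
  })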

2. PCM data acquisition

The following uses window.AudioContext to parse the microphone stream, focusing on createMediaStreamSource, createScriptProcessor, and onaudioprocess. The specific structural code is as follows:

// First, create an AudioContext object as the carrier of the sound source
let audioContext = window.AudioContext || window.webkitAudioContext
const context = new audioContext()

// Feed the sound into this object; stream is the sound-source stream returned by getUserMedia
let audioInput = context.createMediaStreamSource(stream)

// createScriptProcessor creates a buffering node for the sound.
// The first parameter is the buffer size; the second and third are the numbers of input and output channels.
// config is a custom object here, see the source code
let recorder = context.createScriptProcessor(config.bufferSize, config.channelCount, config.channelCount)

// This callback buffers the audio; audioData is a custom object that implements the WAV conversion,
// PCM caching, and so on
recorder.onaudioprocess = (e) => {
  audioData.input(e.inputBuffer.getChannelData(0))
}

However, there needs to be a trigger point for the acquisition process. For example, the final demo of this practice looks like this:


audioInput.connect(recorder)          // audio source connects to the processing node
recorder.connect(context.destination) // processing node connects to the speaker
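Correspondingly, stopping the acquisition is just a matter of breaking those links again (a minimal sketch):

// Stop collecting: disconnect the nodes so onaudioprocess stops firing
audioInput.disconnect()
recorder.disconnect()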

After linking, the createScriptProcessor's onaudioprocess callback continuously returns Float32 sample data in the range [-1, 1]. Now all you have to do is gather it up and turn it into PCM file data.
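As an illustration, quantizing one chunk of [-1, 1] floats into 16-bit integers could look like the sketch below; the demo's actual conversion lives inside the audioData object described next, and floatTo16BitPCM is just a name used here:

// Convert one Float32 chunk in [-1, 1] to 16-bit signed PCM samples
function floatTo16BitPCM (input) {
  const output = new Int16Array(input.length)
  for (let i = 0; i < input.length; i++) {
    const s = Math.max(-1, Math.min(1, input[i]))  // clamp to [-1, 1]
    output[i] = s < 0 ? s * 0x8000 : s * 0x7FFF    // scale to the 16-bit range
  }
  return output
}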

3. AudioData definition

First, define an audioData object used to process the data. The overall structure is as follows; see the source code for the full implementation:

let audioData = {
  size: 0,                              // length of the recording
  buffer: [],                           // recording cache
  inputSampleRate: context.sampleRate,  // input sampling rate
  inputSampleBits: 16,                  // input sample bits: 8 or 16
  outputSampleRate: config.sampleRate,  // output sampling rate
  oututSampleBits: config.sampleBits,   // output sample bits: 8 or 16
  input: function (data) { /* store recording data in real time */ },
  getRawData: function () { /* merge and compress */ },
  covertWav: function () { /* convert to WAV file data */ },
  getFullWavData: function () { /* generate a file with Blob */ },
  closeContext: function () { /* close the AudioContext, otherwise many errors follow */ },
  reshapeWavData: function (sampleBits, offset, iBytes, oData) { /* handle 8-bit sample bytes */ },
  getWavBuffer: function () { /* WAV-format buffer data for drawing */ },
  getPcmBuffer: function () { /* PCM buffer data */ }
}
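As a rough sketch of the two most basic methods (assuming no downsampling or compression, which the real source also handles), input could simply accumulate the chunks and getRawData could merge them:

// Simplified sketch of input and getRawData (no downsampling/compression)
audioData.input = function (data) {
  this.buffer.push(new Float32Array(data)) // copy the chunk, since the underlying buffer is reused
  this.size += data.length
}

audioData.getRawData = function () {
  const result = new Float32Array(this.size)
  let offset = 0
  for (const chunk of this.buffer) {
    result.set(chunk, offset)
    offset += chunk.length
  }
  return result
}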

According to the GIF above:

A. Clicking "Record" first performs step 1 (obtaining microphone permission) and step 2 (acquiring PCM data); the audioData object's input method is then called inside onaudioprocess to store the buffer data.

B. Clicking the "Download PCM" label executes the audioData object's getRawData and getPcmBuffer methods in turn. The downloaded file is a TXT file rather than a PCM file, because I don't know how to generate a PCM file directly in a JS environment, so I download the TXT file and manually change its extension; the renamed file plays fine. The operation process is as follows:

PCM file online play link
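For completeness, the download itself can be done by wrapping the collected buffer in a Blob and clicking a temporary object URL (a sketch; the function name, file name, and MIME type here are placeholders):

// Trigger a browser download for the collected PCM data
function downloadBlob (arrayBuffer, fileName) {
  const blob = new Blob([arrayBuffer], { type: 'application/octet-stream' })
  const url = URL.createObjectURL(blob)
  const a = document.createElement('a')
  a.href = url
  a.download = fileName // e.g. 'record.txt', then manually rename to record.pcm
  a.click()
  URL.revokeObjectURL(url)
}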

4. PCM to WAV

PCM data has no header information; adding a fixed 44-byte header is enough to turn it into WAV. The header layout is fixed and can be used directly; the snippet below is adapted from similar code found online:

let writeString = function (str) {
  for (var i = 0; i < str.length; i++) {
    data.setUint8(offset + i, str.charCodeAt(i))
  }
}
// resource exchange file identifier
writeString('RIFF'); offset += 4
// total bytes from the next address to the end of the file, i.e. file size - 8
data.setUint32(offset, 36 + dataLength, true); offset += 4
// WAV file marker
writeString('WAVE'); offset += 4
// waveform format marker
writeString('fmt '); offset += 4
// filter bytes, generally 0x10 = 16
data.setUint32(offset, 16, true); offset += 4
// format category (PCM sample data)
data.setUint16(offset, 1, true); offset += 2
// number of channels
data.setUint16(offset, config.channelCount, true); offset += 2
// sampling rate: samples per second, i.e. the playback speed of each channel
data.setUint32(offset, sampleRate, true); offset += 4
// waveform data transfer rate (average bytes per second): channels × sample rate × bits per sample / 8
data.setUint32(offset, config.channelCount * sampleRate * (sampleBits / 8), true); offset += 4
// block align: bytes occupied by one sample across all channels
data.setUint16(offset, config.channelCount * (sampleBits / 8), true); offset += 2
// bits per sample
data.setUint16(offset, sampleBits, true); offset += 2
// data chunk identifier
writeString('data'); offset += 4
// total size of the sampled data, i.e. total data size - 44
data.setUint32(offset, dataLength, true); offset += 4
// write the sampled data
data = this.reshapeWavData(sampleBits, offset, bytes, data)
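Once the DataView has been filled, the WAV data can be wrapped in a Blob and handed straight to the audio element, which is roughly what getFullWavData is for (a sketch; how the audio element is selected here is an assumption):

// Wrap the finished WAV DataView in a Blob and play it with the page's audio element
const wavBlob = new Blob([data], { type: 'audio/wav' })
const audioEl = document.querySelector('audio')
audioEl.src = URL.createObjectURL(wavBlob)
audioEl.play()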

5. Data to frequency chart

The createAnalyser method of the AudioContext is used here to decompose the sound wave, as follows:

window.audioBufferSouceNode = context.createBufferSource() // create the sound source node
audioBufferSouceNode.buffer = buffer                       // the decoded sound file buffer
let gainNode = context.createGain()                        // create the volume controller
gainNode.gain.value = 2
audioBufferSouceNode.connect(gainNode)                     // sound source connects to the volume controller
let analyser = context.createAnalyser()                    // create the analyser
analyser.fftSize = 256
gainNode.connect(analyser)                                 // volume controller connects to the analyser
analyser.connect(context.destination)                      // analyser connects to the speaker

Then take the analyser.frequencyBinCount data and draw it with Canvas. The main code is as follows:

let drawing = function() {
  let array = new Uint8Array(analyser.frequencyBinCount)
  analyser.getByteFrequencyData(array)
  ctx.clearRect(0, 0, 600, 200)
  for(let i = 0; i < array.length; i++) {
    let _height = array[i]
    if (!top[i] || (_height > top[i])) { // falling "cap" above each bar
      top[i] = _height
    } else {
      top[i] -= 1
    }
    ctx.fillStyle = gradient
    ctx.fillRect(i * 20, 200 - _height, 4, _height) // draw the bar
    ctx.fillRect(i * 20, 200 - top[i], 4, 4)        // draw the falling cap at top[i]
  }
  requestAnimationFrame(drawing)
}
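For completeness, the buffer used by createBufferSource above can be obtained by decoding the recorded WAV data, after which playback and the drawing loop are started (a sketch; wavArrayBuffer stands in for the recorded WAV ArrayBuffer):

// Decode the recorded WAV data, then start playback and the drawing loop
context.decodeAudioData(wavArrayBuffer, (buffer) => {
  audioBufferSouceNode.buffer = buffer // feed the decoded buffer to the source node above
  audioBufferSouceNode.start(0)        // begin playback
  drawing()                            // start the animation loop
})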

4. Source address

Github address: audio