This article has two key words: audio visualization and Web Audio. The former is the practice; the latter is the technology behind it. Web Audio is a big topic, and this article focuses on how to get audio data; for more on the API itself, check out MDN.
Additionally, converting audio data into visual graphics requires some knowledge of Canvas (specifically the 2D context; the same applies below) and optionally WebGL, in addition to Web Audio. If you aren't familiar with them, you can start with the following resources:
- Canvas Tutorial
- WebGL Tutorial
What is audio visualization
Audio visualization means extracting frequency, waveform, and other data from a sound source, converting it into graphics or images displayed on the screen, and optionally adding interaction on top.
There are plenty of examples of audio-driven motion effects in Cloud Music, but some of them are too complex or too tied to specific business needs. So here are two relatively simple but representative examples.
The first is an audio bar graph implemented on Canvas.
The second is particle effects implemented with WebGL.
In practice, beyond transforming these basic shapes (rectangles, circles, and so on), audio can also be combined with natural motion and 3D graphics.
Click to see: Some visual effects on Pinterest
What is Web Audio
Web Audio is a set of APIs for processing and analyzing audio on the web. It can take different audio sources as input (including an <audio> element, a preloaded audio buffer, or a media stream), process and analyze the audio, and send it to an output device.
Next, I’ll focus on the role that Web Audio plays in visualization, as shown in the figure below.
In simple terms, it involves two processes: getting the data and mapping the data. Let's first solve the problem of getting the data, which can be done in the following 5 steps.
1. Create AudioContext
Before doing anything with audio, an AudioContext must be created. It is responsible for basic operations such as connecting audio input, decoding audio, and controlling playback and pause.
The creation method is as follows:
const AudioContext = window.AudioContext || window.webkitAudioContext;
const ctx = new AudioContext();
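One practical detail about the context: most browsers create it in a suspended state until a user gesture occurs, so a common pattern is to resume it on the first interaction. A minimal sketch (the resumeOnce handler name is just illustrative):

document.addEventListener('click', function resumeOnce() {
  document.removeEventListener('click', resumeOnce);
  // The context reports 'suspended' until it has been resumed
  if (ctx.state === 'suspended') {
    ctx.resume();
  }
});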
2. Create AnalyserNode
AnalyserNode is used to obtain the audio's frequency data (FrequencyData) and time-domain data (TimeDomainData), which is what makes audio visualization possible.
It just reads the audio and doesn’t make any changes to it.
const analyser = ctx.createAnalyser();
analyser.fftSize = 512;
The fftSize property, which MDN explains rather tersely, is a parameter of the Fast Fourier Transform (FFT).
It can be understood from the following perspectives:
1. What is its value?
fftSize must be a power of 2, such as 256 or 512. The higher the value, the more fine-grained the result.
For mobile web pages, the bitrate of the audio itself is usually 128 Kbps, so there is no need for an overly large frequency array to store source data that isn't very fine-grained to begin with. In addition, mobile screens are smaller than desktop ones, so the final display doesn't need to capture every frequency; conveying the rhythm is enough. That makes 512 a reasonable value.
2. What does it do?
fftSize determines the length of frequencyData, which is half of fftSize.
For why it is half, check out this article: Why is the FFT “mirrored”?
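As a quick sketch of the relationship, using the analyser created above:

// With fftSize = 512 (set above), frequencyBinCount is fftSize / 2,
// and it is the length of the array passed to getByteFrequencyData later on
console.log(analyser.frequencyBinCount); // 256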
3. Set the SourceNode
Now, we need to associate the audio node with the AudioContext as input to the entire audio analysis process.
In Web Audio, there are three types of Audio sources:
- MediaElementAudioSourceNode takes an <audio> element directly as input and supports streaming playback.
- AudioBufferSourceNode preloads the audio file through XHR and decodes it with the AudioContext.
- MediaStreamAudioSourceNode uses the user's microphone as input: after obtaining the user's audio or video stream through navigator.getUserMedia, it produces an audio source.
Of these three audio sources, MediaStreamAudioSourceNode has its own irreplaceable use cases (such as voice or video broadcasting). MediaElementAudioSourceNode and AudioBufferSourceNode, however, are relatively easy to confuse with each other, so they are the focus here.
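For completeness, here is a minimal sketch of the microphone case, using the promise-based navigator.mediaDevices.getUserMedia (the navigator.getUserMedia mentioned above is the older callback form):

navigator.mediaDevices.getUserMedia({ audio: true })
  .then(function (stream) {
    // Use the microphone stream as the audio source
    var source = ctx.createMediaStreamSource(stream);
    source.connect(analyser);
    // Connecting the analyser to ctx.destination here would also
    // play the microphone back through the speakers
  })
  .catch(function (err) {
    console.error('getUserMedia failed:', err);
  });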
MediaElementAudioSourceNode
MediaElementAudioSourceNode uses an <audio> tag as the audio source. Its API calls are very simple:
// Get the <audio> element
const audio = document.getElementById('audio');
// Create an audio source from the element
const source = ctx.createMediaElementSource(audio);
// Connect the audio source to the analyser
source.connect(analyser);
// Connect the analyser to the output device (headphones, speakers)
analyser.connect(ctx.destination);
AudioBufferSourceNode
One caveat: on Android, in Chrome versions below 69 (tested), the frequencyData obtained through MediaElementAudioSourceNode is an array filled with zeros.
Therefore, to stay compatible with these devices, you need to use AudioBufferSourceNode instead and preload the audio, as follows:
// Create an XHR
var xhr = new XMLHttpRequest();
xhr.open('GET', '/path/to/audio.mp3', true);
// Set the response type to ArrayBuffer
xhr.responseType = 'arraybuffer';
xhr.onload = function() {
var source = ctx.createBufferSource();
// Decode the response content
ctx.decodeAudioData(xhr.response, function(buffer) {
// Assign the decoded value to buffer
source.buffer = buffer;
// Done: connect the source to the context's destination (you can also connect it to the AnalyserNode here)
source.connect(ctx.destination);
});
};
xhr.send();
It may help to think of the AnalyserNode as a piece of middleware.
Compare regular <audio> playback with the playback flow in Web Audio: a plain <audio> element sends its output straight to the output device, whereas in Web Audio the chain becomes source → analyser → ctx.destination, with the AnalyserNode reading data as the audio passes through.
4. Play the audio
For an <audio> element used with MediaElementAudioSourceNode, playback is the familiar:
audio.play();
But if it’s AudioBufferSourceNode, it doesn’t have a play method. Instead:
// Create the AudioBufferSourceNode
const source = ctx.createBufferSource();
// buffer is the decoded audio data obtained via XHR
source.buffer = buffer;
// Call the start method to play
source.start(0);
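One related detail: an AudioBufferSourceNode can be started only once. To replay the same audio, create a fresh source node and reuse the decoded buffer; a minimal sketch (replay is a hypothetical helper, not part of the article's example):

function replay(ctx, analyser, buffer) {
  // Each playback needs its own AudioBufferSourceNode
  var source = ctx.createBufferSource();
  source.buffer = buffer;
  source.connect(analyser);
  analyser.connect(ctx.destination);
  source.start(0);
  return source;
}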
5. Obtain frequencyData
At this point, we have associated the audio input with an AnalyserNode and started playing the audio. For the Web Audio part, there is only one task left: getting frequency data.
Web Audio provides two related APIs for frequency data:
analyser.getByteFrequencyData
analyser.getFloatFrequencyData
Both work on a TypedArray; the only difference is precision.
getByteFrequencyData uses a Uint8Array with values from 0 to 255, while getFloatFrequencyData uses a Float32Array whose values are in decibels.
getByteFrequencyData is therefore the recommended choice for projects where performance matters more than precision.
Regarding the length of the array (256): as explained above, it is half of fftSize.
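For reference, the float variant is called the same way; a minimal sketch:

const floatData = new Float32Array(analyser.frequencyBinCount);
analyser.getFloatFrequencyData(floatData); // values in decibels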
Now, let’s see how to get the frequency array:
const bufferLength = analyser.frequencyBinCount;
const dataArray = new Uint8Array(bufferLength);
analyser.getByteFrequencyData(dataArray);
Note that getByteFrequencyData writes into an existing array rather than creating and returning a new one.
The advantage is that the code holds a single reference to dataArray, with no need to reassign it through function calls and parameter passing.
Two visualization schemes
Now that the Web Audio part is in place, we can use getByteFrequencyData to fill a Uint8Array of frequency data, tentatively named dataArray.
In principle, visualization relies on data that can be audio, temperature changes, or even random numbers. So, for the rest of the presentation, we just need to worry about mapping dataArray to graphical data, not Web Audio.
(To simplify the description of Canvas and WebGL, Canvas is referred to specifically as Canvas 2D.)
1. The Canvas scheme
Click here to see the source code for the first example
A Canvas animation is essentially the playback of a sequence of frames: on each frame, the canvas is cleared and then redrawn.
Here’s an excerpt from the sample code:
function renderFrame() {
requestAnimationFrame(renderFrame);
// Update the frequency data
analyser.getByteFrequencyData(dataArray);
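// Clear the previous frame before redrawing, as described above
// (WIDTH and HEIGHT are assumed to hold the canvas dimensions)
ctx.clearRect(0, 0, WIDTH, HEIGHT);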
// bufferLength specifies the number of rectangles in the histogram
for (var i = 0, x = 0; i < bufferLength; i++) {
// Map a rectangle height according to frequency
var barHeight = dataArray[i];
// Map a background color based on each rectangle height
var r = barHeight + 25 * (i / bufferLength);
var g = 250 * (i / bufferLength);
var b = 50;
// Draw a rectangle and fill it with a background color
ctx.fillStyle = "rgb(" + r + "," + g + "," + b + ")";
ctx.fillRect(x, HEIGHT - barHeight, barWidth, barHeight);
x += barWidth + 1;
}
}
renderFrame();
For visualization, the core logic is how to map frequency data to graphical parameters. The example above simply changes the height and color of each rectangle in the bar chart.
Canvas has a rich drawing API, and a lot of cool effects can be achieved from a purely 2D perspective alone. Just as with the DOM, combining simple elements can produce complex results.
2. The WebGL scheme
Click here to see the source code for the second example
Canvas drawing is computed on the CPU. A loop that runs 10,000 times, repeated on every frame, is more than the CPU can handle. That's why Canvas 2D is rarely used for particle effects; instead, WebGL is used to leverage the computing power of the GPU.
One concept in WebGL that tends to be unfamiliar is the shader: a general term for the rendering programs that run on the GPU. Shaders are written in GLSL (OpenGL Shading Language), which is, simply put, a C-style language. Here's a simple example:
void main()
{
gl_Position = projectionMatrix * modelViewMatrix * vec4(position, 1.0);
}
For a more detailed introduction to shaders, check out this article.
WebGL’s native API is quite complex, so we use three.js as the base library, which makes writing business logic easy.
Let’s take a look at what’s going on in the development process.
In this case, the audio array is wrapped in a simple object and passed to the shader as a uniforms property. What the shader does is simply map the data defined in its uniforms to vertex positions and colors on screen.
Vertex shaders and fragment shaders are rarely written in day-to-day front-end development, though they may be familiar to readers who have used Unity3D and similar technologies. You can also look for ready-made shaders on ShaderToy.
Next, let's introduce the following three classes in three.js:
1. THREE.Geometry
You can think of it as a shape. That is, it is up to the class to decide whether the final object is a sphere, a cuboid, or some other irregular shape.
So, you need to pass it some vertex coordinates. For example, if a triangle has three vertices, it passes in three vertex coordinates.
Of course, three.js has many common shapes built in, such as BoxGeometry and CircleGeometry.
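As a minimal sketch of the triangle example, assuming an older three.js release where THREE.Geometry and THREE.Face3 are still available:

var triangle = new THREE.Geometry();
// Three vertex coordinates define the shape
triangle.vertices.push(
  new THREE.Vector3(-1, 0, 0),
  new THREE.Vector3(1, 0, 0),
  new THREE.Vector3(0, 1, 0)
);
// A face connects the vertices by index
triangle.faces.push(new THREE.Face3(0, 1, 2));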
2. THREE.ShaderMaterial
You can think of it as color. Again, taking triangles as an example, a triangle can be black, white, gradient, etc. These colors are determined by the ShaderMaterial.
ShaderMaterial is a material defined by a vertex shader and a fragment shader.
3. THREE.Mesh
Once you have defined an object's shape and color, you combine them into a Mesh. With the Mesh in hand, you can add it to the scene, and the rest is the normal requestAnimationFrame flow.
Again, we have taken the key code from the example and annotated it.
I. Create the Geometry (this class inherits from THREE.BufferGeometry):
var geometry = ParticleBufferGeometry({
// TODO some parameters
});
II. Define the uniforms:
var uniforms = {
  dataArray: {
    value: null,
    type: 't' // corresponds to a THREE.DataTexture
  },
  // TODO other attributes
};
III. Create the ShaderMaterial:
var material = new THREE.ShaderMaterial({
  uniforms: uniforms,
  vertexShader: '',   // TODO pass in the vertex shader
  fragmentShader: '', // TODO pass in the fragment shader
  // TODO other parameters
});
IV. Create the Mesh:
var mesh = new THREE.Mesh(geometry, material);
V. Create the remaining objects three.js needs for rendering, including the renderer, camera, and scene:
var scene, camera, renderer;
renderer = new THREE.WebGLRenderer({
  antialias: true,
  alpha: true
});
camera = new THREE.PerspectiveCamera(45, 1, 1.0, 1e3);
scene = new THREE.Scene();
VI. The general render logic:
function animate() {
requestAnimationFrame(animate);
// TODO can trigger events to update frequency data
renderer.render(scene, camera);
}
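A hedged sketch of the "update frequency data" step above, assuming the uniforms object from step II and an older three.js release where THREE.LuminanceFormat is still available (newer releases would use THREE.RedFormat); updateAudioTexture is a hypothetical helper:

function updateAudioTexture() {
  analyser.getByteFrequencyData(dataArray);
  // Pack the frequency array into a 1-pixel-high texture the shader can sample
  var texture = new THREE.DataTexture(
    dataArray, dataArray.length, 1, THREE.LuminanceFormat
  );
  texture.needsUpdate = true;
  uniforms.dataArray.value = texture;
}

Calling something like this inside animate(), before renderer.render, keeps the uniform in sync with the music.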
Summary
This article first showed how to get audio frequency data through the relevant Web Audio APIs.
It then introduced the two visualization schemes, Canvas and WebGL, along with some common ways of mapping frequency data to graphical data.
In addition, the audio effects in the Cloud Music client have been live for some time now. After reading this article, would you like to try implementing an audio motion effect of your own?
Here are two codepen examples from the article:
- codepen.io/jchenn/pen/…
- codepen.io/jchenn/pen/…
This article was published by the NetEase Cloud Music front-end team. Reproduction in any form without authorization is prohibited. We're always hiring, so if you're ready for a change and you like Cloud Music, join us!