What is an audio editor?

  • The audio editor is an online audio editing tool for amateur and professional music producers. Using it requires some knowledge of music theory, for example recognizing beats, accompaniment, bars, 4/4 time, pitch, loudness, tension, and so on, so that users can create audio. At the same time, it provides corresponding audio sources and songs to improve creation efficiency.
  • Online Web Audio editor link: yan.qq.com/audioEditor
  • Technology stack: Vue + Vuex + Vue-CLI + vue-router + SVG + Axios + Element-UI, etc.

How to use the audio editor?

Here are a few video demonstrations of how to use it:

  • Using it directly through the main flow
  • Drawing pitch blocks
  • Adjusting loudness and tension

How to make an audio editor?

[1]. Basic principles

  • Pitch blocks: the user draws sound blocks on the page with the mouse; each block has its own width, height, and coordinate position in px -> the relationship between BPM (tempo) and px converts those px values into times -> the timing of each sound block is sent to the backend for synthesis -> the backend returns the synthesized audio link, which is played back while the playback line moves (see the sketch after this list).
  • Pitch line: get the position of the user's mouse on the screen -> fit that position against the song's AI-returned reference pitch line to obtain the pitch line drawn by the user -> convert it, again via the beat/BPM/px relationship, into a data collection and send it to the backend, which synthesizes a song -> the backend returns the link of this song, which the front end plays and operates on accordingly.
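
As a minimal sketch of the px-to-time conversion described in the first bullet (blockToNote, pxPerBeat, and the block shape are illustrative assumptions, not the editor's real API):

```js
// Convert a sound block's x position and width (px) into start time and
// duration (seconds), given the BPM and how many px represent one beat.
function blockToNote(block, bpm, pxPerBeat) {
  const secondsPerBeat = 60 / bpm;            // one beat lasts 60/BPM seconds
  const pxToSeconds = px => (px / pxPerBeat) * secondsPerBeat;
  return {
    start: pxToSeconds(block.x),              // when the note begins
    duration: pxToSeconds(block.width),       // how long it lasts
    pitch: block.pitch,                       // row on the piano roll
  };
}

// Example: at 120 BPM with 80 px per beat, a block at x=160, width=40
// starts at 1s and lasts 0.25s.
console.log(blockToNote({ x: 160, width: 40, pitch: 60 }, 120, 80));
```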

[2]. Overall data architecture design

  • The data design of the entire editor is roughly as shown in the figure above. It mainly relies on Vuex, a data management store, to manage the shared state and the data flow of the entire page when developing a large single-page application. The managed data is divided into several modules (a store sketch follows this list):
    • 1. Basic elements of the editor, such as: BPM, sound source, stage-related data such as width and height, tempo, playback line, etc.
    • 2. The sound block data of the whole stage: stagePitches, which is converted into a pitchList during synthesis; the backend synthesizes the audio, which is then played back.
    • 3. Pitch lines: divided into the AI-synthesized reference pitch line and the user-edited pitch line, plus the locally edited parts mapped to px.
    • 4. Operation flags: because synthesis depends on many state modifications, each modification must determine whether to re-synthesize before playback, which is strongly tied to the play/pause state switching described in the next section.
    • 5. Mode switching: e.g., switching between note / pitch line / phoneme modes.
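
A minimal Vuex sketch of these modules; only stagePitches is named in the text, so every other field and mutation name is an illustrative stand-in:

```js
import Vue from 'vue';
import Vuex from 'vuex';

Vue.use(Vuex);

export default new Vuex.Store({
  state: {
    // 1. basic editor elements
    bpm: 120,
    stageWidth: 0,
    stageHeight: 0,
    playLineX: 0,
    // 2. sound blocks on the stage (converted to a pitchList for synthesis)
    stagePitches: [],
    // 3. pitch lines: AI reference line and the user-edited line
    aiPitchLine: [],
    userPitchLine: [],
    // 4. operation flag: do we need to re-synthesize before playing?
    needResynthesize: false,
    // 5. current mode: 'note' | 'pitchLine' | 'phoneme'
    mode: 'note',
  },
  mutations: {
    addPitchBlock(state, block) {
      state.stagePitches.push(block);
      state.needResynthesize = true; // any edit marks the audio as stale
    },
    setNeedResynthesize(state, value) {
      state.needResynthesize = value;
    },
    setMode(state, mode) {
      state.mode = mode;
    },
  },
});
```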

[3]. Overall component framework

[4]. Breakthroughs on key and difficult points

(1) The problem of playback state switching -> solved with a finite state machine:

Once the play control button is clicked, there are several possible states, and each state needs different handling. At the same time, the handlers share some operations and are coupled with each other, so how to organize them became a problem. After sorting it out and referring to some source code, I found that this is really a state machine transformation. So what do we do next? Details follow:

1. First, list all the states that the page needs.

2. In the beginning, the code looked like this: lots of if-else, with many judgments inside each branch.

3. The transformation into a state machine takes the following form:
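
A minimal state machine sketch of that form; the state names and handler functions are assumptions for illustration:

```js
// Each state maps an event to the next state plus a handler, replacing the
// nested if-else.
const playerFSM = {
  state: 'init',
  transitions: {
    init:    { play:  { next: 'playing', run: startPlayback } },
    playing: { pause: { next: 'paused',  run: pausePlayback },
               end:   { next: 'ended',   run: stopPlayback } },
    paused:  { play:  { next: 'playing', run: resumePlayback } },
    ended:   { play:  { next: 'playing', run: startPlayback } },
  },
  dispatch(event) {
    const edge = this.transitions[this.state][event];
    if (!edge) return;          // event not valid in this state
    this.state = edge.next;     // switch state first...
    edge.run();                 // ...then run that state's logic
  },
};

function startPlayback()  { /* synthesize if needed, then play */ }
function pausePlayback()  { /* pause the audio element */ }
function resumePlayback() { /* continue from the paused position */ }
function stopPlayback()   { /* reset the playback line */ }

// Adding a new state later only means adding one entry to `transitions`.
playerFSM.dispatch('play');
```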

  • Too many if-else branches in the code lead to poor scalability: if you want to add new states later, you don't know where to start. With a state machine, however, you can keep extending the code with new states. This also references the TypeScript compiler source, which uses state machines to keep its processes clear.
State machines in the TypeScript source code
  • First, tsc divides the work into a number of states, each of which handles one piece of logic. For example:

1) createProgram parses the source code into an AST; 2) syntaxDiagnostics handles syntax errors; 3) semanticDiagnostics handles semantic errors. TypeScript moves through these different pieces of processing logic via state changes; reaching the end state means the process is finished. In this way, the overall process is easy to extend and modify: to add a stage, you only add a state; to change the processing logic of a certain state, you only modify that state's transitions in the state machine, rather than untangling a pile of if-else that is hard to extend and modify.

4. The state machine is finally formed; the flow chart of state switching is as follows:

The page starts in an initial state. When play is clicked, it switches to the playing state. If pause is clicked while playing, it switches to the paused state; clicking play in the paused state switches back to the playing state. When playback finishes, it switches to the ended state, and clicking play in the ended state switches back to the playing state, and so on until the audio ends.

5. In addition, the flow chart shows that every time playback starts, we need to judge whether re-synthesis is needed. Here we use the operation flags mentioned above: as long as one of those flagged states changes, we synthesize again.
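
A sketch of that check, reusing the assumed names from the store and state machine sketches above (`synthesize` and `setAudioUrl` are also assumptions):

```js
// Before entering the playing state, consult the operation flag: if any edit
// happened since the last synthesis, request a new audio link first.
async function startPlayback() {
  if (store.state.needResynthesize) {
    const url = await synthesize(store.state.stagePitches); // backend call
    store.commit('setAudioUrl', url);                       // assumed mutation
    store.commit('setNeedResynthesize', false);             // clear the flag
  }
  playAudio(store.state.audioUrl);
}
```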

(2) The playback progress is not smooth -> solved with requestAnimationFrame

After getting playback working, the playback progress was not smooth: the playback line moved very haltingly. The main reason is that the browser renders the page once every 16ms. Why does the browser render the page every 16ms?

  • Since the most widely used screens today have a fixed refresh rate (typically 60Hz), it makes no sense for the browser to redraw twice between hardware refreshes; it would only drain performance. So the browser uses this 16ms interval (1000ms / 60) to throttle drawing appropriately.
  • We need requestAnimationFrame to solve this problem. Why does it work? Mainly because the browser lets developers hook in right before rendering: requestAnimationFrame callbacks run after the current macro task (and the microtasks it queued) and before the next render. The steps are: execute a macro task -> execute all microtasks under that macro task -> execute requestAnimationFrame callbacks (if any) -> render -> execute the next macro task and its microtasks -> ... and so on.
1. First, declare a playAudio method responsible for playing the audio. After setting its basic properties, such as the playback link, listen for the playing-related events.

2. Then use requestAnimationFrame to move the line while the audio is playing. This solves the choppy playback progress.
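
A minimal sketch of these two steps; the element id and px scale are assumptions:

```js
const audio = new Audio();
const playLine = document.getElementById('play-line'); // assumed element

// 1. Set the source and start playback.
function playAudio(url) {
  audio.src = url;
  audio.play();
  requestAnimationFrame(movePlayLine);
}

// 2. Move the playback line once per rendered frame, so the motion is as
// smooth as the screen's refresh rate allows.
function movePlayLine() {
  const pxPerSecond = 100;                       // assumed scale
  playLine.style.transform =
    `translateX(${audio.currentTime * pxPerSecond}px)`;
  if (!audio.paused && !audio.ended) {
    requestAnimationFrame(movePlayLine);         // schedule the next frame
  }
}
```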

(3) When dragging a sound block, the mouse moves off the block and focus is lost. How to solve this? -> with a virtual mask block

  • When we move a sound block with the mouse, we may accidentally touch the scroll wheel, causing the mouse to lose focus on the sound block, so the two become separated and the block stops following the mouse. The problem is shown below.

  • This references Chrome's approach, where a virtual block is stuck to the pointer after the mouse is pressed so that the mouse never loses focus.

  • Solution: on mousedown, create a transparent mask that sticks to the mouse; wherever the mouse goes, the mask follows, and when the mouse is released, the mask is removed. This achieves a relatively smooth effect. Below, a red-tinted transparent layer covers the sound block, and focus is never lost.
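
A minimal sketch of the mask trick; the styles and handler names are assumptions:

```js
// On mousedown over a sound block, insert a full-screen transparent mask so
// subsequent mousemove events always hit the mask and focus is never lost.
let mask = null;

function onBlockMouseDown() {
  mask = document.createElement('div');
  Object.assign(mask.style, {
    position: 'fixed',
    left: '0',
    top: '0',
    width: '100vw',
    height: '100vh',
    zIndex: '9999',
    background: 'transparent',   // give it a red tint while debugging
    cursor: 'move',
  });
  mask.addEventListener('mousemove', onDragMove);
  mask.addEventListener('mouseup', onBlockMouseUp);
  document.body.appendChild(mask);
}

function onDragMove(e) {
  // move the selected sound block to follow e.clientX / e.clientY here
}

function onBlockMouseUp() {
  mask.remove();                 // release: remove the mask, drag ends
  mask = null;
}
```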

(4) How to make the piano keys sound -> with the Web Audio API

  • Clicking a piano key on the left should make a sound. How? The problems are where the audio source comes from and how to produce the pitch of each key.
  • First, the audio source is very simple: load a standard C4 sample, c4.mp4.

We then use the Web Audio API to fetch the audio, decode it into a buffer, and shift its pitch. The offset formula is playbackRate = 2 ** ((note - 60) / 12), where 60 is the MIDI pitch of C4 and note is the note value passed in; the pitch is converted accordingly (see zpl.fi/pitch-shift…). The resulting code is as follows:
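
A minimal sketch of this, applying the formula above (the buffer guard and function names are assumptions):

```js
const ctx = new AudioContext();
let c4Buffer = null;

// Fetch the C4 sample once and decode it into an AudioBuffer.
fetch('c4.mp4')
  .then(res => res.arrayBuffer())
  .then(data => ctx.decodeAudioData(data))
  .then(buffer => { c4Buffer = buffer; });

// Play any note by resampling the C4 buffer: one semitone is a factor of
// 2 ** (1/12) in playback rate; note 60 is C4 itself.
function playNote(note) {
  if (!c4Buffer) return;                          // sample not loaded yet
  const source = ctx.createBufferSource();
  source.buffer = c4Buffer;
  source.playbackRate.value = 2 ** ((note - 60) / 12);
  source.connect(ctx.destination);
  source.start();
}

// Example: the A4 key (note 69) plays C4 sped up by 2 ** (9/12) ≈ 1.68.
```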

(5) Pitch line related problems

1. Choosing a drawing technology for the pitch line: I surveyed and compared Canvas and SVG.
  • Canvas: advantages: good performance and fluency. Disadvantages: a large amount of basic code must be built up front before you can operate on anything, and the drawing workload is also very large. It's a lot of code.
  • SVG: advantages: it is a DOM element with basic DOM manipulation capabilities. Disadvantages: performance is not great, because directly manipulating the DOM triggers reflow and a re-render in the browser. But you write far less code.
  • All things considered, this is a PC page, so the user's machine will certainly perform better than on a mobile H5 page, and time was tight, so SVG was adopted.
2. How to draw pitch lines -> with SVG
  • The pitch points parsed from the sound blocks, one every 10ms, are converted to the corresponding pixels on the screen and drawn point by point into the d attribute of an SVG path.
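
A minimal sketch of that drawing step; timeToPx and pitchToPx stand in for the px conversions described earlier:

```js
// Convert a list of pitch points ({ time, pitch }, one every 10 ms) into an
// SVG path "d" string and assign it to a <path> element.
function drawPitchLine(points, pathEl, timeToPx, pitchToPx) {
  const d = points
    .map((p, i) => `${i === 0 ? 'M' : 'L'}${timeToPx(p.time)},${pitchToPx(p.pitch)}`)
    .join(' ');
  pathEl.setAttribute('d', d);   // one attribute update redraws the line
}

// Usage with assumed scales: 100 px per second, 10 px per semitone.
const path = document.querySelector('#pitch-line');   // assumed <path> element
drawPitchLine(
  [{ time: 0, pitch: 60 }, { time: 0.01, pitch: 60.2 }],
  path,
  t => t * 100,
  p => (72 - p) * 10,
);
```
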
3. How to solve the jagged line -> frame filling

  • As shown above, when the mouse moves very fast, the pitch line appears jagged. Why? The main reason is that the browser renders the page every 16ms: when the mouse moves faster than the browser renders, some mouse positions are lost, and the only remedy is to patch up the lost data. The concrete implementation is as follows:

    The frame-filling logic here calculates the distance between the previous mouse X coordinate and the current one, and the distance between their Y coordinates, then loops over the gap and fills in the missing data.
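
A minimal sketch of that frame-filling logic (treating one x-axis column as 1px is an assumption):

```js
// When the mouse jumps more than one column between two events, linearly
// interpolate the missing points between (prevX, prevY) and (x, y).
function fillFrames(prevX, prevY, x, y, setPoint) {
  const steps = Math.abs(x - prevX);
  if (steps <= 1) { setPoint(x, y); return; }
  const dx = (x - prevX) / steps;
  const dy = (y - prevY) / steps;
  for (let i = 1; i <= steps; i++) {
    // walk from the previous position to the current one, one px at a time
    setPoint(Math.round(prevX + dx * i), prevY + dy * i);
  }
}

// Usage inside the mousemove handler (lastX/lastY kept between events):
// fillFrames(lastX, lastY, e.offsetX, e.offsetY, updatePitchPoint);
```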

4. How to handle a large amount of data -> optimize string operations + grouped rendering

  • When the amount of data is very large, browser rendering slows down, and edits to the pitch line stop following the mouse. This is mainly because the DOM being rendered is too large. So I used grouped rendering and built strings instead of arrays: instead of storing all the data in one SVG, the data is split across multiple SVGs, so that when something changes, only the data in the affected SVG is updated.
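
A minimal sketch of grouped rendering; GROUP_SIZE and the container markup are assumptions:

```js
// Split the pitch line into fixed-size groups, one <svg>/<path> per group.
// Editing a point rebuilds only its own group's "d" string, so the browser
// re-renders one small path instead of the whole line.
const GROUP_SIZE = 500;                          // points per group
const groupPoints = [];                          // groupPoints[g] = [{x, y}, ...]

function setPoint(index, x, y) {
  const g = Math.floor(index / GROUP_SIZE);
  (groupPoints[g] = groupPoints[g] || [])[index % GROUP_SIZE] = { x, y };

  // Build the path string for this group only; string concatenation keeps
  // this cheap even for long lines.
  let d = '';
  for (const p of groupPoints[g]) {
    if (p) d += (d ? ' L' : 'M') + p.x + ',' + p.y;
  }
  document.querySelectorAll('#pitch-groups path')[g].setAttribute('d', d);
}
```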

5. How to solve the pitch line jumping after switching BPM -> intercept the data request: keep the old value until the data arrives, then apply the change

  • Our pitch line is strongly correlated with the BPM (because we need the AI's reference line). Vue follows the MVVM pattern, so when the BPM changes the page updates immediately, and only afterwards does the AI-fitted pitch line data arrive and get applied; only then do we get what we want. So there is a visible transitional state in between.

  • The solution is to hold the BPM at its original value until the data comes in, and only then change it to the new BPM. The code is as follows:
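
A hedged sketch of the intercept, assuming the BPM input does not write to the store directly; fetchAiPitchLine and the mutation names are illustrative:

```js
// Assumed component method bound to the BPM input's change event. The store
// keeps the old BPM until we commit the new one together with the fetched
// pitch line, so the page never renders the new BPM against the old line.
export default {
  methods: {
    async onBpmChange(newBpm) {
      // Store still holds the old BPM here, so the page keeps rendering the
      // old BPM + old pitch line while we wait.
      const aiLine = await fetchAiPitchLine(newBpm);  // assumed API call
      // Data has arrived: switch BPM and pitch line in one go.
      this.$store.commit('setBpm', newBpm);
      this.$store.commit('setAiPitchLine', aiLine);
    },
  },
};
```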

(6) Undo (Ctrl+Z) and redo (Ctrl+Y) shortcuts -> solved with the command pattern.

This is mainly based on the practice of the editor in three.js: with the command pattern, undo and redo operations are handled inside commands.

three.js

If you look at the source of the editor in three.js, you can see a built-in Commands directory containing all the operations that can be undone. Each command carries its own undo and redo logic.

So, borrowing this idea from three.js, I designed my own editor's undo and redo.

1. A history.js file is defined to manage all undoable operations, with two stacks: an undo stack and a redo stack. Every time an operation is executed, it is pushed onto the undo stack. When undoing, the last command is popped off the undo stack, its undo logic runs, and the command is pushed onto the redo stack. When redoing, the last command is popped off the redo stack, executed, and pushed back onto the undo stack, so the next undo can continue from there.

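A minimal sketch of history.js under that design; the command shape with execute()/undo() is an assumption modeled on the three.js editor:

```js
// history.js — the two-stack design described above.
export default class History {
  constructor() {
    this.undoStack = [];   // operations that can be undone
    this.redoStack = [];   // operations that can be redone
  }

  execute(command) {
    command.execute();
    this.undoStack.push(command);
    this.redoStack = [];   // a fresh action clears the redo branch
  }

  undo() {
    const command = this.undoStack.pop();
    if (!command) return;
    command.undo();
    this.redoStack.push(command);
  }

  redo() {
    const command = this.redoStack.pop();
    if (!command) return;
    command.execute();
    this.undoStack.push(command);
  }
}
```
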
2. Define an Editor class that holds our History instance, register it globally, and expose methods such as undo and redo.

3. Register the corresponding command wherever undo support is needed, and perform the actual operation inside the command. For example, register a command when deleting a sound block:

Then do the corresponding operation in DeletePitchCommand:
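
A sketch of what DeletePitchCommand might look like under the history.js design above; the store shape and names are assumptions:

```js
// A concrete command: deleting a sound block. The key point is that undo()
// restores exactly what execute() removed.
class DeletePitchCommand {
  constructor(store, blockId) {
    this.store = store;
    this.blockId = blockId;
    this.removed = null;   // remembered so undo can put it back
    this.index = -1;
  }

  execute() {
    const list = this.store.state.stagePitches;
    this.index = list.findIndex(b => b.id === this.blockId);
    this.removed = list[this.index];
    list.splice(this.index, 1);     // delete the block
  }

  undo() {
    this.store.state.stagePitches.splice(this.index, 0, this.removed);
  }
}

// Registering it where deletion happens:
// editor.history.execute(new DeletePitchCommand(store, block.id));
```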

This implements undo and redo.

(7) How to develop once and run on all three platforms (Mac/Windows/Web) -> with the help of Electron

I investigated the solutions currently on the market. Mainly: VS Code, which we front-end developers use every day, is packaged with Electron into a PC client, and that approach is relatively mature, so I chose Electron to package my code and generate client installation packages. There is an entrance on the page; click it and you can download the client. The implementation relies on Electron's Chromium kernel + Node.js mechanism; with the packaging tools introduced, the browser code can run as a desktop app.
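
A minimal Electron main process of that kind might look like this; the window size and the loaded URL are assumptions (a real build would more likely load the local bundle):

```js
// main.js — the same web build runs inside a Chromium window on Mac/Windows.
const { app, BrowserWindow } = require('electron');

function createWindow() {
  const win = new BrowserWindow({ width: 1280, height: 800 });
  // Load either the hosted editor or the local production build.
  win.loadURL('https://yan.qq.com/audioEditor');
}

app.whenReady().then(createWindow);

app.on('window-all-closed', () => {
  if (process.platform !== 'darwin') app.quit();
});
```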

The outlook for the future

  1. Switch the SVG implementation to Canvas; but that requires building Canvas-based events and also runs into browser rendering problems, so whether to do it is still under consideration.
  2. Recognize the audio sources and synthesize the data into a song on the front end? If this can be implemented, it would eliminate the user's waiting time while tweaking and the consumption of machine and CPU resources; this needs to be verified.
  3. Given the time, refactor the code to Vue 3 + TypeScript.

That's everything I have to share. If there are mistakes or omissions, please point them out.