Algorithms are romantic too: we chatted with an Agora audio expert about voice beautification, noise reduction and ultra-high sound quality

**It is less that audio technology is a niche and more that people are so used to the things around them that they do not notice the audio technology behind them.**

The romanticism of the audio expert

**LiveVideoStack:** Of the audio projects you have worked on, which left the deepest impression? Can you share your experience from that time?

**Feng Jianyuan:** I have worked on a wide range of audio projects, from real-time voice beautification, sound effects and audio-based detection of pornographic content to AI noise reduction and AI codecs. The one that impressed me most was the real-time voice beautification project. Through data mining we distilled a set of definitions of what makes a “good voice”, then used AI together with traditional algorithms to enhance the human voice along multiple dimensions. That work became the real-time voice beautification feature.

This project left the deepest impression mainly because, while working on it, we had to find out what makes good voices sound good across many different kinds of good voices. In addition, **“using different algorithms to make ordinary voices shine”** has itself refined my sense of aesthetics. Combining science and art also satisfies a good deal of my romantic side.

**LiveVideoStack:** What recent innovations and technology trends have you been watching?

**Feng Jianyuan:** The trend I have been focusing on recently is how to make better use of AI in our audio technology, including AI noise reduction, audio source separation, AI codecs, voice changing and so on.

**LiveVideoStack:** If you could recommend one book to someone who wants to work on audio algorithms, what would it be?

**Feng Jianyuan:** Phonetics: Transcription, Production, Acoustics, and Perception, by Henning Reetz and Allard Jongman. It is a well-aged classic that explains, systematically and scientifically, how speech sounds are produced and perceived. If you have this scientific understanding of speech before you design an algorithm, your work becomes far more effective.

Ultra-high sound quality and deep learning

**LiveVideoStack:** You are currently responsible for the design and development of Agora’s “ultra-high sound quality audio system”. What does “ultra-high sound quality” mean here, in terms of technology and product experience?

**Feng Jianyuan:** Ultra-high sound quality, as the name implies, goes beyond high sound quality. So let us first look at what high sound quality is: raise the sampling rate to 48 kHz, which covers the full range of frequencies the human ear can hear, and use a codec, noise reduction and AEC (acoustic echo cancellation) that do little damage to speech, so that the original sound quality is preserved. On top of this “high sound quality”, we also study how to beautify and enhance the voice in different scenarios, so that it sounds better and more detailed. We have also shipped a series of products, such as voice beautification for voice chat and for singing.
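
The 48 kHz figure follows from the sampling theorem: a signal sampled at a given rate can only represent frequencies up to half that rate. Below is a minimal sketch of that arithmetic. The 20 kHz upper limit of human hearing is a commonly cited value assumed here, not a number from the interview, and the function names are hypothetical.

```python
# Commonly cited upper limit of human hearing; an assumption, not from the interview.
HEARING_LIMIT_HZ = 20_000


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


def covers_hearing_range(sample_rate_hz: int, limit_hz: int = HEARING_LIMIT_HZ) -> bool:
    """True if the sample rate can represent every frequency up to limit_hz."""
    return nyquist_hz(sample_rate_hz) >= limit_hz


# Compare a few common sample rates against the assumed hearing limit.
for rate in (16_000, 32_000, 48_000):
    print(rate, nyquist_hz(rate), covers_hearing_range(rate))
```

Under this assumption, 16 kHz and 32 kHz sampling cut off below the hearing limit, while 48 kHz leaves headroom above it.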

**LiveVideoStack:** Could you talk about the role of deep learning in audio algorithms? How does Agora approach building audio features and products that combine deep learning and machine learning?

**Feng Jianyuan:** Deep learning has become an indispensable part of Agora’s audio algorithms. We have integrated deep learning into event detection, noise reduction, codecs and other areas. In practice, we combine traditional algorithms, physical modeling and deep learning when designing an algorithm, to optimize both quality and computational cost. We have also set up a deep learning model optimization team dedicated to optimizing deep learning operators, so that deep learning algorithms can be deployed quickly.

Behind what we take for granted

**LiveVideoStack:** Which challenges for audio algorithms in RTC (real-time communication) scenarios have not been fully solved? What still needs to be improved in audio systems that support RTC, and how far has China progressed in this area?

**Feng Jianyuan:** In RTC scenarios, network instability, echo, noise, spatial audio reproduction and other areas all still need improvement. In China, technologies such as noise reduction and echo cancellation have developed well: AI noise reduction is in use, and there are many excellent echo cancellation products in both software and hardware.

Network problems call not only for more stable network infrastructure but also, at the algorithm level, for codecs with lower bit rates, better PLC (packet loss concealment) and better spatial audio reproduction. Development abroad may be slightly ahead in these areas.
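
PLC fills in audio for packets that never arrive. As a rough illustration only, and not the method Agora uses, one of the simplest forms repeats the previous frame at reduced volume so that a burst of losses fades toward silence. The function name, the frame representation (lists of float samples, with `None` for a lost frame) and the decay factor below are all hypothetical.

```python
from typing import List, Optional

Frame = List[float]


def conceal_losses(
    frames: List[Optional[Frame]], frame_len: int, decay: float = 0.5
) -> List[Frame]:
    """Replace lost frames (None) with an attenuated copy of the previous output frame."""
    output: List[Frame] = []
    last: Frame = [0.0] * frame_len  # silence until the first frame arrives
    for frame in frames:
        if frame is None:
            # Lost packet: reuse the previous output, attenuated so that
            # consecutive losses fade out instead of looping at full volume.
            frame = [sample * decay for sample in last]
        output.append(frame)
        last = frame
    return output


# Two consecutive frames are lost between two received frames.
received = [[1.0, -1.0], None, None, [0.5, 0.5]]
print(conceal_losses(received, frame_len=2))
```

Production PLC is considerably more sophisticated than this naive sketch, which is only meant to show what “concealing” a lost packet means.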

**LiveVideoStack:** What is your next development direction?

**Feng Jianyuan:** A more extreme audio experience in high-sound-quality scenarios. Specifically, this includes a full-band ultra-low-bit-rate codec, a better full-band noise reduction system, and so on.

**LiveVideoStack:** Audio technology still seems to be a relatively small niche in China. Do you agree, and how do you see this phenomenon?

**Feng Jianyuan:** The total number of people working in audio technology is indeed not large, but its applications are very broad. From communications to entertainment, people move between sound environments all the time, yet the talent gap in audio technology remains large. It is less that audio technology is a niche and more that people are so used to the things around them that they do not notice the audio technology behind them.

**LiveVideoStack:** What are you looking forward to from the lecturer team and talks of the audio “new forces” track at LiveVideoStackCon 2021 Shanghai (sh2021.livevideostack.cn)? Which topic or lecturer are you most interested in so far?

**Feng Jianyuan:** I hope to hear more about the application scenarios and development directions of audio technology. For work, I am interested in the talk by Wu Hanjie of OPPO, which is about how to produce better sound. Out of personal interest, I am curious about the stories behind game production.

**LiveVideoStack:** Can you share one of your New Year’s resolutions with us?

**Feng Jianyuan:** I hope the pandemic ends soon and that wonderful interactions never stop.

Editor: Coco Liang
