Playing PCM audio files with Python

What is PCM

Real-world audio is a continuous analog signal, shown as the red line in the figure. A continuous signal like this can be recorded directly in the grooves of a vinyl record. Computers, however, store information as digital signals and cannot store a continuous signal. To record audio on a computer, the analog signal must therefore be converted into a digital signal that the computer can store. One of the most common conversion methods is Pulse Code Modulation (PCM). With PCM, the audio represented by the red line in the figure is converted into the values marked by the blue dots, which can be stored on the computer after encoding.

Relevant concepts

To convert an analog audio signal into a PCM file, you need to consider the following parameters: sampling rate, sampling bits, number of channels, and bit rate.

Sampling rate

Converting an analog signal into a digital signal requires sampling. In the figure above, for example, 26 points are taken from the analog signal and their values are recorded. Only these 26 values are saved; everything between the sampled points is lost, so converting an analog signal into a digital one inevitably loses data. To preserve more of the original signal, the number of sampling points must be increased. The number of samples taken per second is called the sampling rate and is expressed in Hz. The higher the sampling rate, the closer the waveform of the sampled digital signal is to that of the original analog signal. Common sampling rates are as follows:

Sampling rate	Use
8 kHz	Telephone
22.05 kHz	FM radio
44.1 kHz	CD
48 kHz	Standard DVD
96 kHz	Blu-ray Disc
192 kHz	Sound card
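
As a rough illustration of what the sampling rate means, the sketch below samples one second of a sine wave at two of the rates from the table above. The function name `sample_sine` and the 440 Hz test tone are choices made for this example only; they do not come from the figure.

```python
import math

def sample_sine(freq_hz, sample_rate, duration_s):
    """Sample a sine wave of freq_hz at sample_rate for duration_s seconds."""
    n_samples = int(sample_rate * duration_s)
    # Sample n is taken at time n / sample_rate
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

# One second of a 440 Hz tone (an arbitrary test frequency)
low = sample_sine(440, 8000, 1.0)
high = sample_sine(440, 44100, 1.0)
print(len(low), len(high))  # 8000 44100
```

The higher rate stores more than five times as many points for the same second of sound, which is why its waveform follows the original more closely.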

Sampling bits

The number of sampling bits (also called bit depth) is the number of bits used to represent the value of one sampling point. The blue dots in the figure take values in the range 0~15, so 4 bits are enough and the number of sampling bits is 4. The more sampling bits, the finer the amplitude resolution of the sound. Common values are as follows:

Sampling bits	Use
8	Telephone
16	CD
24	DVD
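
The relationship between a value range and the number of sampling bits can be checked with a few lines of Python. This is a minimal sketch: `bits_needed` and `quantize` are hypothetical helper names, and the mapping of the range [-1.0, 1.0] onto unsigned levels is one simple quantization scheme assumed for illustration.

```python
def bits_needed(max_value):
    """Smallest number of bits that can represent the integers 0..max_value."""
    return max(1, max_value.bit_length())

def quantize(x, bits):
    """Map x in [-1.0, 1.0] to an unsigned integer level in 0 .. 2**bits - 1."""
    levels = 2 ** bits
    x = max(-1.0, min(1.0, x))  # clamp out-of-range input
    return min(levels - 1, int((x + 1.0) / 2.0 * levels))

print(bits_needed(15))  # 4
print(quantize(-1.0, 4), quantize(0.0, 4), quantize(1.0, 4))  # 0 8 15
```

With 4 bits there are only 16 possible levels; with 16 bits there are 65536, so each sample can sit much closer to the original amplitude.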

Number of channels

The number of channels is the number of independent audio signals captured at different spatial positions during recording, or equivalently the number of microphones used during recording. Here is a diagram of a 5.1-channel home theater with six speakers:

C (Center): center
FL (Front Left): front left
FR (Front Right): front right
SL (Surround Left): left surround
SR (Surround Right): right surround
SW (Subwoofer): subwoofer

Bit rate

Bit rate is the number of bits of data per second, measured in bps. From the sampling rate, sampling bits, and number of channels, the bit rate of the corresponding digital signal can be calculated as sampling rate * sampling bits * number of channels. Common bit rates are as follows:

Bit rate	Use
96 kbps	FM radio
128, 160, 192, 320 kbps	MP3
400~1411 kbps	Lossless compressed audio
1411.2 kbps	PCM digital audio on CD
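
The formula can be checked against the CD entry in the table. The sketch below uses the CD parameters listed earlier (44.1 kHz, 16 bits) with 2 channels; the function name `pcm_bit_rate` is just a label for this example.

```python
def pcm_bit_rate(sample_rate, bits_per_sample, channels):
    """Bit rate of uncompressed PCM in bits per second."""
    return sample_rate * bits_per_sample * channels

cd = pcm_bit_rate(44100, 16, 2)
print(cd, cd / 1000)  # 1411200 1411.2
```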

PCM format

A PCM file is the sampled and encoded binary data written directly to a file. For example, data with 16 sampling bits and 2 channels is stored as follows:

+--------------------+--------------------+--------------------+--------------------+-----
| channel 1 (16 bit) | channel 2 (16 bit) | channel 1 (16 bit) | channel 2 (16 bit) | ...
+--------------------+--------------------+--------------------+--------------------+-----

Each 16-bit value is the data of one sampling point from one channel, and the data of the two channels is interleaved. PCM data is pure binary data: a player that receives it without knowing the sampling rate, sampling bits, and number of channels cannot cut the binary data back into the original samples. A container format is therefore needed to describe these basic parameters alongside the binary data, and the most commonly used one is WAVE.
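
To make the interleaved layout concrete, here is a minimal sketch that splits 16-bit stereo PCM bytes back into one list of samples per channel. It assumes signed little-endian samples, which is an assumption about the data (raw PCM files carry no such information); `split_channels` is a hypothetical helper name.

```python
import struct

def split_channels(pcm_bytes, channels=2):
    """Split interleaved 16-bit little-endian PCM into one list per channel."""
    n_samples = len(pcm_bytes) // 2  # 2 bytes per 16-bit sample
    samples = struct.unpack("<%dh" % n_samples, pcm_bytes[:n_samples * 2])
    # Every `channels`-th sample, starting at offset ch, belongs to channel ch
    return [list(samples[ch::channels]) for ch in range(channels)]

# Three stereo frames: channel 1 holds 1, 2, 3 and channel 2 holds -1, -2, -3
raw = struct.pack("<6h", 1, -1, 2, -2, 3, -3)
left, right = split_channels(raw)
print(left, right)  # [1, 2, 3] [-1, -2, -3]
```

Calling the same function with the wrong number of channels would still return numbers, just the wrong ones, which is exactly why the parameters have to travel with the data.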

The WAVE format

The standard WAVE format is shown below:

It consists of the following parts:

RIFF header
1. ChunkID: fixed to the string “RIFF”
2. ChunkSize: the length of the rest of the file, with value 4 + (8 + Subchunk1Size) + (8 + Subchunk2Size)
3. Format: fixed to the string “WAVE”
“fmt ” part
1. Subchunk1ID: fixed to the string “fmt ” (note the space after the t)
2. Subchunk1Size: the size of the rest of this subchunk; the value for PCM is 16
3. AudioFormat: the audio format; the value for PCM is 1, indicating that no compression is performed
4. NumChannels: the number of channels
5. SampleRate: the sampling rate
6. ByteRate: the byte rate, byte rate = bit rate / 8
7. BlockAlign: the block alignment size, computed as number of channels * sampling bits / 8
8. BitsPerSample: the number of sampling bits
Data section
1. Subchunk2ID: fixed to the string “data”
2. Subchunk2Size: the data size, data size = number of samples * number of channels * sampling bits / 8
3. Data: the audio data
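
The field list above can be turned into bytes with the `struct` module. The following is a sketch under a few assumptions: all integer fields are little-endian, the format subchunk ID is written as the four bytes `fmt ` (lowercase, with a trailing space), and `wave_header` is a hypothetical helper name. The result is checked by letting Python's built-in `wave` module parse it.

```python
import io
import struct
import wave

def wave_header(data_size, channels=1, bits=16, sample_rate=16000):
    """Build a 44-byte WAVE header for data_size bytes of PCM data."""
    byte_rate = sample_rate * channels * bits // 8
    block_align = channels * bits // 8
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 4 + (8 + 16) + (8 + data_size), b"WAVE",   # RIFF header
        b"fmt ", 16, 1, channels, sample_rate,              # format subchunk
        byte_rate, block_align, bits,
        b"data", data_size,                                 # data subchunk
    )

# Four silent 16-bit mono frames, wrapped and then parsed back
pcm = b"\x00\x00" * 4
wav_bytes = wave_header(len(pcm)) + pcm
with wave.open(io.BytesIO(wav_bytes), "rb") as w:
    params = (w.getnchannels(), w.getsampwidth(), w.getframerate(), w.getnframes())
print(len(wav_bytes) - len(pcm), params)  # 44 (1, 2, 16000, 4)
```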

After the PCM data is encapsulated into a WAVE file, the audio file can be played with any player.

Playing PCM audio files

There are two ways to play PCM in Python:

Encapsulate the PCM as a WAVE file
Play PCM files directly

Encapsulate as WAVE and play

Python has a built-in wave module for processing files in WAVE format. As the previous section showed, wrapping PCM data into a WAVE file only requires adding a header in front of the audio data. The wave module provides the interfaces needed to do this, in the following steps:

Set the number of channels
Set the sample width in bytes
Set sampling rate
Write the data from the PCM file to the Data section of the WAVE file

import wave

def pcm2wav(pcm_file, wav_file, channels=1, bits=16, sample_rate=16000):
    # Read the raw PCM data
    with open(pcm_file, 'rb') as pcmf:
        pcmdata = pcmf.read()

    # Open the WAVE file to be written
    with wave.open(wav_file, 'wb') as wavfile:
        # Set the number of channels
        wavfile.setnchannels(channels)
        # Set the sample width in bytes
        wavfile.setsampwidth(bits // 8)
        # Set the sampling rate
        wavfile.setframerate(sample_rate)
        # Write the data section
        wavfile.writeframes(pcmdata)

pcm2wav("f1.pcm", "f2.wav")

This converts the PCM file f1.pcm into the WAVE file f2.wav, which can be played by any audio player. In Python, the PyAudio library can act as the player: the frames read from the WAVE file are written to a PyAudio output stream.

import pyaudio
import wave

# Number of frames to read per write
chunk = 1024

# Open the file
f = wave.open(r"/usr/share/sounds/alsa/Rear_Center.wav", "rb")
# Initialize the player
p = pyaudio.PyAudio()
# Open an output stream with the file's sample width, number of channels, and sampling rate
stream = p.open(format=p.get_format_from_width(f.getsampwidth()),
                channels=f.getnchannels(),
                rate=f.getframerate(),
                output=True)
# Read the data and write it to the PyAudio stream
data = f.readframes(chunk)
while data:
    stream.write(data)
    data = f.readframes(chunk)

# Stop playing and release resources
stream.stop_stream()
stream.close()
p.terminate()
f.close()

Play PCM audio directly

Playing PCM audio directly with PyAudio is much the same as playing a WAVE file, except that the sampling bits, sampling rate, and number of channels must be set manually when opening the PyAudio stream, before PCM data is written to it.

import pyaudio

# Initialize the player: 16-bit samples (2 bytes), 1 channel, 16000 Hz
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(2), channels=1, rate=16000, output=True)

# Write PCM data directly to PyAudio's data stream
with open("f1.pcm", "rb") as f:
    stream.write(f.read())

# Stop playing and release resources
stream.stop_stream()
stream.close()
p.terminate()
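
The direct approach reads the whole PCM file into memory before writing it to the stream. For larger files, one possible variation is to read the file in blocks that each hold a whole number of frames. The sketch below shows only the reading side so that it runs without an audio device; `iter_pcm_chunks` is a hypothetical helper, and its default parameters (mono, 2 bytes per sample) are assumptions matching the stream settings used in this section.

```python
import os
import tempfile

def iter_pcm_chunks(path, frames_per_chunk=1024, channels=1, sample_width=2):
    """Yield successive blocks of raw PCM data, each a whole number of frames."""
    chunk_bytes = frames_per_chunk * channels * sample_width
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_bytes)
            if not data:
                break
            yield data

# Demo on a temporary file holding 2500 silent 16-bit mono frames
with tempfile.NamedTemporaryFile(suffix=".pcm", delete=False) as tmp:
    tmp.write(b"\x00\x00" * 2500)
sizes = [len(block) for block in iter_pcm_chunks(tmp.name)]
os.remove(tmp.name)
print(sizes)  # [2048, 2048, 904]
```

With a PyAudio stream opened with matching parameters, each yielded block could be passed to `stream.write(block)` in a loop, in the same way the WAVE example writes the result of `readframes`.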
