What is PCM

Real-world audio is a continuous analog signal, shown as the red line in the figure. Such a continuous signal can be recorded in the grooves of a vinyl record. Computers, however, store information as digital signals and cannot store a continuous signal. To record audio on a computer, the analog signal must therefore be converted into a digital signal that the computer can store. One of the most common conversion methods is Pulse Code Modulation (PCM). With PCM, the audio represented by the red line in the figure is converted to the values of the blue dots, which can be stored on the computer after encoding.

Relevant concepts

To convert an analog audio signal into a PCM file, you need to consider the following parameters: sampling rate, sampling bits, number of channels, and bit rate.

Sampling rate

Converting an analog signal into a digital signal requires sampling. In the figure above, 26 points were sampled from the analog signal and converted to numeric values. Only these 26 values are saved; everything between the sampled points is lost, so converting an analog signal into a digital signal inevitably loses data. To preserve more of the original signal, the number of sampling points should be increased. The number of samples taken per second is called the sampling rate and is expressed in Hz. The higher the sampling rate, the closer the waveform of the sampled digital signal is to that of the original analog signal. Common sampling rates are as follows:

Sampling rate    Use
8 kHz            Telephone
22.05 kHz        FM radio
44.1 kHz         CD
48 kHz           Standard DVD
96 kHz           Blu-ray Disc
192 kHz          Sound card
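
As a minimal sketch of what sampling means, the snippet below evaluates a sine wave (standing in for the analog signal) at evenly spaced instants. The function name `sample_sine` and its parameters are illustrative assumptions, not something taken from the original figure.

```python
import math

def sample_sine(freq_hz, sample_rate, duration_s):
    """Return samples of a sine wave taken `sample_rate` times per second."""
    n_samples = round(sample_rate * duration_s)
    # Sample k is taken at time k / sample_rate seconds
    return [math.sin(2 * math.pi * freq_hz * k / sample_rate)
            for k in range(n_samples)]

# A 440 Hz tone sampled for 10 ms at the telephone and CD sampling rates
phone = sample_sine(440, 8000, 0.01)
cd = sample_sine(440, 44100, 0.01)
print(len(phone), len(cd))  # 80 441
```

The same 10 ms of sound yields 80 values at 8 kHz and 441 values at 44.1 kHz, which is why the higher rate follows the original waveform more closely.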

Sampling bits

The number of sampling bits is the number of bits used to represent the value of one sampling point. The blue dots in the figure take values from 0 to 15, so 4 bits are enough and the number of sampling bits is 4. The more sampling bits are used, the more finely the amplitude of the sound can be represented. Common numbers of sampling bits are as follows:

Sampling bits    Use
8                Telephone
16               CD
24               DVD
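
The effect of the number of sampling bits can be sketched as a simple quantizer. The code below assumes amplitudes normalized to the range [-1.0, 1.0] and maps them to unsigned integers, matching the 0 to 15 range of the 4-bit example; the name `quantize` is hypothetical, and real formats often use signed values instead.

```python
def quantize(value, bits):
    """Map an amplitude in [-1.0, 1.0] to an unsigned integer of `bits` bits."""
    levels = 2 ** bits  # 4 bits give 16 levels, numbered 0..15
    # Scale [-1, 1] to [0, levels - 1] and round to the nearest level
    level = round((value + 1.0) / 2.0 * (levels - 1))
    # Clamp in case the input is slightly out of range
    return max(0, min(levels - 1, level))

print(quantize(-1.0, 4), quantize(1.0, 4))    # 0 15
print(quantize(-1.0, 16), quantize(1.0, 16))  # 0 65535
```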

Number of channels

The number of channels is the number of independent audio signals captured at different spatial positions during recording, that is, the number of microphones used. Here is a diagram of a 5.1-channel home theater with six speakers:

  • C (Center)
  • FL (Front Left)
  • FR (Front Right)
  • SL (Surround Left)
  • SR (Surround Right)
  • SW (Subwoofer)

Bit rate

Bit rate is the number of bits of data per second, measured in bps. From the sampling rate, sampling bits, and number of channels, the bit rate of the digital signal can be calculated as sampling rate * sampling bits * number of channels. Common bit rates are as follows:

Bit rate                    Use
96 kbps                     FM radio
128, 160, 192, 320 kbps     MP3
400 to 1411 kbps            Losslessly compressed audio
1411.2 kbps                 PCM digital audio on CD
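
As a worked example of the formula, the following sketch computes the bit rate of CD audio from the values in the tables above (44.1 kHz, 16 bits, 2 channels). The function name `pcm_bit_rate` is illustrative.

```python
def pcm_bit_rate(sample_rate, sample_bits, channels):
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate * sample_bits * channels

cd_bps = pcm_bit_rate(44100, 16, 2)
print(cd_bps)         # 1411200
print(cd_bps / 1000)  # 1411.2
```

The result, 1411.2 kbps, matches the last row of the table.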

PCM format

A PCM file is simply the sampled and encoded binary data written directly to a file. For example, data with 16 sampling bits and 2 channels is stored in the following layout:

+---------------------+---------------------+---------------------+---------------------+-----
| channel 1 (16 bit)  | channel 2 (16 bit)  | channel 1 (16 bit)  | channel 2 (16 bit)  | ...
+---------------------+---------------------+---------------------+---------------------+-----

Each group of 16 bits holds one sampling point from one channel, and the data of the two channels is interleaved. PCM data is just raw binary data: a player that does not know the sampling rate, sampling bits, and number of channels cannot split the binary data back into the original samples. A container format is therefore needed to describe these basic parameters, and the most commonly used one is WAVE.
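
The interleaved layout described above can be sketched with Python's `struct` module. This example assumes signed, little-endian 16-bit samples (the byte order used by WAVE files; the text above does not state one), and the helper names are hypothetical.

```python
import struct

def interleave_stereo(left, right):
    """Pack two lists of 16-bit samples as interleaved PCM bytes."""
    frames = bytearray()
    for l, r in zip(left, right):
        # '<hh' = two little-endian signed 16-bit integers: channel 1, channel 2
        frames += struct.pack('<hh', l, r)
    return bytes(frames)

def deinterleave_stereo(pcm):
    """Split interleaved 16-bit stereo PCM back into two lists of samples."""
    samples = struct.unpack('<%dh' % (len(pcm) // 2), pcm)
    # Even positions belong to channel 1, odd positions to channel 2
    return list(samples[0::2]), list(samples[1::2])

pcm = interleave_stereo([1, 2, 3], [-1, -2, -3])
print(pcm.hex())                 # 0100ffff0200feff0300fdff
print(deinterleave_stereo(pcm))  # ([1, 2, 3], [-1, -2, -3])
```

Splitting the bytes correctly is only possible because the sample width and channel count are known in advance, which is exactly the information a raw PCM file does not carry.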

The WAVE format

The standard WAVE format is shown below:

It consists of the following parts:

  1. RIFF header
    1. ChunkID: fixed to the string “RIFF”
    2. ChunkSize: the size of the rest of the file after this field, equal to 4 + (8 + Subchunk1Size) + (8 + Subchunk2Size)
    3. Format: fixed to the string “WAVE”
  2. “fmt ” subchunk
    1. Subchunk1ID: fixed to the string “fmt ” (note the trailing space after the t)
    2. Subchunk1Size: the size of the rest of this subchunk. For PCM the value is 16
    3. AudioFormat: the audio format (compression) code. For PCM the value is 1, indicating that no compression is performed
    4. NumChannels: the number of channels
    5. SampleRate: the sampling rate
    6. ByteRate: the byte rate, equal to bit rate / 8
    7. BlockAlign: the number of bytes in one sample frame (one sample for every channel), equal to number of channels * sampling bits / 8
    8. BitsPerSample: the number of sampling bits
  3. “data” subchunk
    1. Subchunk2ID: fixed to the string “data”
    2. Subchunk2Size: the size of the audio data, equal to number of samples * number of channels * sampling bits / 8
    3. Data: the audio data

After encapsulating the PCM data in a WAVE file, you can play the audio file with an ordinary player.
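
To make the field list concrete, here is a minimal sketch that builds the 44-byte header by hand with `struct` and then checks it by reading it back with the standard `wave` module. The function name `wav_header` and its default parameters are assumptions for illustration; it covers only the plain PCM case (AudioFormat 1, Subchunk1Size 16).

```python
import io
import struct
import wave

def wav_header(data_size, channels=1, bits=16, sample_rate=16000):
    """Build the 44-byte header of a plain PCM WAVE file."""
    byte_rate = sample_rate * channels * bits // 8
    block_align = channels * bits // 8
    return struct.pack(
        '<4sI4s4sIHHIIHH4sI',
        b'RIFF', 4 + (8 + 16) + (8 + data_size), b'WAVE',  # RIFF header
        b'fmt ', 16, 1, channels, sample_rate,              # "fmt " subchunk
        byte_rate, block_align, bits,
        b'data', data_size,                                 # "data" subchunk
    )

# One second of 16-bit mono silence at 16 kHz
data = bytes(32000)
header = wav_header(len(data))
print(len(header))  # 44

# Read the result back to confirm the fields were understood
with wave.open(io.BytesIO(header + data), 'rb') as w:
    params = (w.getnchannels(), w.getsampwidth(), w.getframerate(), w.getnframes())
print(params)  # (1, 2, 16000, 16000)
```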

Play the PCM audio file

There are two ways to play PCM in Python:

  1. Encapsulate the PCM as a WAVE file
  2. Play PCM files directly

Encapsulate as a WAVE file and play

Python's standard library includes the wave module for processing files in WAVE format. As the previous section showed, wrapping PCM data in a WAVE file only requires adding a header in front of the audio data. The wave module provides an interface for this, with the following steps:

  1. Set the number of channels
  2. Set the sample width in bytes
  3. Set sampling rate
  4. Write the data from the PCM file to the Data section of the WAVE file

import wave

def pcm2wav(pcm_file, wav_file, channels=1, bits=16, sample_rate=16000):
    # Read the raw PCM data
    with open(pcm_file, 'rb') as pcmf:
        pcmdata = pcmf.read()

    # Open the WAVE file to be written
    with wave.open(wav_file, 'wb') as wavfile:
        # Set the number of channels
        wavfile.setnchannels(channels)
        # Set the sample width in bytes
        wavfile.setsampwidth(bits // 8)
        # Set the sampling rate
        wavfile.setframerate(sample_rate)
        # Write the data section; the header is filled in automatically
        wavfile.writeframes(pcmdata)

pcm2wav("f1.pcm", "f2.wav")

This converts the PCM file f1.pcm into the WAVE file f2.wav, which an audio player can play. In Python, the PyAudio library provides a playback interface; together with the wave module it can play WAVE audio files:

import pyaudio
import wave

# Number of frames to read per chunk
chunk = 1024

# Open the WAVE file
f = wave.open(r"/usr/share/sounds/alsa/Rear_Center.wav", "rb")
# Initialize the player
p = pyaudio.PyAudio()
# Open an output stream with the file's sample width, number of channels, and sampling rate
stream = p.open(format=p.get_format_from_width(f.getsampwidth()),
                channels=f.getnchannels(),
                rate=f.getframerate(),
                output=True)
# Read the data and write it to the PyAudio stream
data = f.readframes(chunk)
while data:
    stream.write(data)
    data = f.readframes(chunk)

# Stop playback and release resources
stream.stop_stream()
stream.close()
p.terminate()
f.close()

Play PCM audio directly

Playing PCM audio directly with PyAudio is much the same as playing WAVE files, except that you need to manually set the sampling bits, sampling rates, and number of channels of the PyAudio player before writing PCM data to the PyAudio stream.

import pyaudio

# Initialize the player: 16-bit samples (2 bytes), 1 channel, 16 kHz
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(2), channels=1, rate=16000, output=True)

# Write PCM data directly to PyAudio's data stream
with open("f1.pcm", "rb") as f:
    stream.write(f.read())

# Stop playback and release resources
stream.stop_stream()
stream.close()
p.terminate()
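
For large PCM files, reading the whole file at once may be undesirable. As one possible variation (not part of the original example), the sketch below reads the file in chunks that always contain whole sample frames. The helper `iter_pcm_chunks` and its defaults are hypothetical, and the demo uses a temporary file of silence so that it runs without an audio device.

```python
import os
import tempfile

def iter_pcm_chunks(path, frames_per_chunk=1024, channels=1, sample_width=2):
    """Yield raw PCM data from `path` in chunks of whole sample frames."""
    chunk_bytes = frames_per_chunk * channels * sample_width
    with open(path, 'rb') as f:
        while True:
            data = f.read(chunk_bytes)
            if not data:
                break
            yield data

# Demo: a temporary file holding 2500 frames of 16-bit mono silence
with tempfile.NamedTemporaryFile(suffix=".pcm", delete=False) as tmp:
    tmp.write(bytes(2500 * 2))
sizes = [len(chunk) for chunk in iter_pcm_chunks(tmp.name)]
os.remove(tmp.name)
print(sizes)  # [2048, 2048, 904]
```

With a PyAudio stream opened using matching parameters, each chunk could be passed to `stream.write` in a `for` loop instead of writing the result of a single `f.read()`.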

References

  • Soundfile.sapp.org/doc/WaveFor…
  • People.csail.mit.edu/hubert/pyau…