About PCM

PCM is the abbreviation of Pulse Code Modulation, the most direct way to encode a waveform. Its position in audio is roughly comparable to that of BMP among image formats.

Sampling rate: The conversion from an analog signal to a digital signal, that is, from a continuous signal to a discrete one, is done by discrete sampling. The sampling rate is the number of samples taken per second. According to the Nyquist-Shannon sampling theorem, to avoid distortion the sampling rate must be greater than twice the maximum frequency of the signal. Since the range of human hearing is 20 Hz to 20 kHz, a sampling rate of about 40 kHz is sufficient, and anything beyond that is largely wasted. Sometimes, however, the sampling rate is reduced to save bandwidth and storage at the cost of sound quality, which is why sound data with sampling rates below 40 kHz is common.

Sample size: the number of bits used to quantize the amplitude of one sample, typically 8, 16, or 24 bits. 8-bit samples are only supported by early sound cards, and 24-bit samples only by professional ones; 16 bits is the usual choice.

Number of channels: the number of sound channels: one for mono, two for stereo, and possibly more (for example, the 7.1 format uses 8 channels). In general each channel comes from a separate microphone, so more channels give a better (more realistic) effect, at a correspondingly higher cost.

Frame: a frame is one sample for every channel. For example, with 16-bit two-channel audio, one frame is 4 bytes (2 channels × 16 bits = 32 bits).

1. Digital audio

An audio signal is a continuously varying analog signal, but a computer can only process and record binary digital signals. An audio signal captured from a natural sound source must therefore be converted into a digital audio signal before it can be sent to the computer for further processing.

A digital audio system reproduces the original sound by converting the sound waveform into a series of binary data. The device that performs this step is called an analog-to-digital converter (ADC). The ADC samples the sound wave tens of thousands of times per second; each sample point records the state of the original analog waveform at a particular moment and is called a sample, and the number of samples taken per second is called the sampling frequency. By stringing together a series of consecutive samples, a sound can be described inside the computer. For each sample, the digital audio system allocates a certain number of storage bits to record the amplitude of the sound wave; this is called the sampling resolution or sampling accuracy. The higher the sampling accuracy, the more faithfully the sound is reproduced.

Digital audio involves many concepts, but for a programmer writing audio software under Linux the most important thing is to understand the two key steps of sound digitization: sampling and quantization. Sampling reads the amplitude of the sound signal at regular intervals, while quantization converts each sampled amplitude into a digital value. In essence, sampling is digitization in time, while quantization is digitization in amplitude. Below are the technical indicators you will most often need in audio programming:

Sampling frequency

The sampling frequency is the number of amplitude samples taken per second when an analog sound waveform is digitized. Its selection should follow the Nyquist sampling theorem: if an analog signal is sampled, the highest signal frequency that can be recovered afterwards is half of the sampling frequency; equivalently, as long as the sampling frequency is more than twice the highest frequency of the input signal, the original signal can be reconstructed from the sampled series. The frequency range of normal human hearing is roughly 20 Hz to 20 kHz, so by the Nyquist theorem a sampling frequency of around 40 kHz is needed to keep the sound free of distortion. Commonly used audio sampling frequencies are 8 kHz, 11.025 kHz, 16 kHz, 22.05 kHz, 37.8 kHz, 44.1 kHz, and 48 kHz; higher sampling frequencies can reach DVD-quality sound. 8 kHz is the sampling frequency used for telephony.

Quantization bits

The number of quantization bits is the number of bits used to represent the amplitude of each sample, and it determines the dynamic range of the signal after digitization. Common values are 8, 12, and 16 bits. The more quantization bits, the larger the dynamic range and the closer the digitized signal comes to the original, but the more storage space is required.

Number of channels

The number of channels is another important factor in the quality of digitized audio. Sound can be mono or two-channel. Two-channel sound, also known as stereo, uses two separate signal paths in hardware; its sound quality and timbre are better than mono, but the digitized data takes twice as much storage space.

2. The sound card driver

For security reasons, applications under Linux cannot operate directly on hardware devices such as sound cards; they must go through drivers provided by the kernel. The essence of audio programming under Linux is therefore using the driver interface to perform the various operations on the sound card.

Controlling the hardware involves manipulating individual bits in its registers. This is device-specific work with strict timing requirements; if it were left to application programmers, sound card programming would become extremely complex and difficult. The driver's role is to hide these low-level hardware details and thereby simplify application development. At present, two sound card driver architectures are in common use under Linux: OSS and ALSA.

The first audio programming interface to appear on Linux was OSS (Open Sound System). It consists of a complete set of kernel driver modules that provide a unified programming interface for most sound cards. OSS has a relatively long history: some of its kernel modules (OSS/Free) are released free of charge with the Linux kernel source, while others are provided in binary form by 4Front Technologies. Thanks to this commercial backing, OSS became the de facto standard for audio programming under Linux, and applications that support OSS work well on most sound cards.

Although OSS is very mature, it is a commercial product that is not completely open source. ALSA (Advanced Linux Sound Architecture) fills this gap: it is another sound card driver architecture for audio programming under Linux. Besides providing a set of kernel driver modules as OSS does, ALSA also provides a function library that simplifies application development; compared with the raw ioctl-based interface provided by OSS, programming against the ALSA library is more convenient. The main features of ALSA are:

Support multiple sound card devices

Modular kernel driver

Support for SMP and multithreading

Provide application development function library

Compatible with OSS applications

The biggest difference between ALSA and OSS is that ALSA is a free project maintained by volunteers, while OSS is a commercial product. OSS is therefore ahead of ALSA in hardware compatibility and supports more sound card types. Although ALSA is not used as widely as OSS, it has a friendlier programming interface and is fully compatible with OSS applications, which makes it the better choice for application programmers.
