Voice is an analog signal, whereas the data network is
digital. Converting analog waveforms into digital information
is the job of an encoder-decoder (CODEC). There are many
standards for digitizing an analog voice signal, and the
process is often quite complex. Most conversions use pulse
code modulation (PCM) or a variation of it.
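To make the idea of PCM concrete, the following minimal sketch samples a test tone at regular intervals and maps each sample to an integer code. It uses plain linear quantization, not the companded quantization of any specific standard, and the function names and the 1 kHz test tone are assumptions chosen for illustration.

```python
import math


def sample_tone(freq_hz, sample_rate_hz, duration_ms):
    """Sample a unit-amplitude sine wave at regular intervals."""
    n_samples = sample_rate_hz * duration_ms // 1000
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


def quantize_linear(x, bits):
    """Map a value in [-1, 1] to a signed integer code of the given width."""
    max_code = 2 ** (bits - 1) - 1  # 127 for 8 bits
    return round(x * max_code)


def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Bit rate of an uncompressed PCM stream in bits per second."""
    return sample_rate_hz * bits_per_sample


# 20 ms of a 1 kHz tone sampled 8000 times per second, 8 bits per sample
samples = sample_tone(1000, 8000, 20)
codes = [quantize_linear(s, 8) for s in samples]
print(len(codes), "samples,", pcm_bit_rate(8000, 8), "bit/s")
```

With 8000 samples per second and 8 bits per sample the stream runs at 64 kb/s, which is the bit rate of telephone-quality PCM.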
In addition, the CODEC compresses the data stream and sometimes
provides echo cancellation. Compressing the waveform saves
bandwidth, which is especially useful on low-speed links
because more VoIP calls can be carried at the same time.
Another way to save bandwidth is silence suppression: no
packets are sent while nobody is speaking.
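As an illustration of silence suppression, the sketch below drops frames whose energy falls below a threshold. It is a minimal sketch and not the voice activity detector of any particular codec: the function names, the 160-sample frame (20 ms at 8 kHz) and the -40 dB threshold are assumptions chosen for the example.

```python
import math

FULL_SCALE = 32768.0  # assumed 16-bit signed PCM input


def frame_energy_db(frame):
    """Mean power of a frame in dB relative to full scale."""
    if not frame:
        return float("-inf")
    mean_square = sum(s * s for s in frame) / len(frame)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square / FULL_SCALE ** 2)


def suppress_silence(samples, frame_len=160, threshold_db=-40.0):
    """Return (start_index, frame) pairs for frames worth sending."""
    kept = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if frame_energy_db(frame) > threshold_db:
            kept.append((start, frame))
    return kept


# One frame of tone, one frame of silence, one frame of tone
tone = [int(10000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(160)]
silence = [0] * 160
frames = suppress_silence(tone + silence + tone)
print("frames sent:", len(frames), "of 3")
```

Real implementations usually add hangover time and comfort noise so that the listener does not hear abrupt gaps; those refinements are left out here.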
The table further below lists the best-known codecs in use.
Its parameters are:
- Bit rate - The rate at which bits are transmitted over a
communication path, normally expressed in kilobits per second
(kb/s).
- Sampling rate - The number of samples taken per second when
digitizing sound. The quality of the digital reproduction
improves as the number of samples taken per second increases.
- Frame size - The time between packets sent, that is, the
duration of audio each frame covers, in milliseconds.
- MOS (Mean Opinion Score) - A subjective measure of sound
quality on a scale from 1 (bad) to 5 (excellent).
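The bit rate and the time between packets together determine how much a call costs on the wire. The sketch below works this out under stated assumptions: each packet carries 40 bytes of uncompressed IPv4, UDP and RTP headers, layer 2 overhead and signalling are ignored, and the 20 ms packet interval and 512 kb/s link are example values. The function names are hypothetical.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no header compression


def payload_bytes(bit_rate_kbps, packet_ms):
    """Codec payload carried in one packet, in bytes."""
    # kb/s multiplied by ms gives bits; divide by 8 for bytes
    return bit_rate_kbps * packet_ms / 8


def ip_bandwidth_kbps(bit_rate_kbps, packet_ms):
    """One-way bandwidth of a call at the IP layer, in kb/s."""
    packet_bytes = payload_bytes(bit_rate_kbps, packet_ms) + IP_UDP_RTP_HEADER_BYTES
    packets_per_second = 1000 / packet_ms
    return packet_bytes * 8 * packets_per_second / 1000


def calls_per_link(link_kbps, bit_rate_kbps, packet_ms):
    """Whole number of one-way call streams that fit in the link."""
    return int(link_kbps // ip_bandwidth_kbps(bit_rate_kbps, packet_ms))


for name, rate in (("G.711", 64), ("G.729", 8)):
    print(
        name,
        payload_bytes(rate, 20), "payload bytes,",
        ip_bandwidth_kbps(rate, 20), "kb/s on the wire,",
        calls_per_link(512, rate, 20), "calls on a 512 kb/s link",
    )
```

Under these assumptions a 64 kb/s codec needs 80 kb/s per direction and an 8 kb/s codec needs 24 kb/s, so the headers matter far more for low bit rate codecs.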
To better understand the codec process and the parameters
shown in the table, we recommend reading the section on the
G.711 codec process, which explains how the G.711 codec works.
| Codec | Standardized by | Description | Bit rate (kb/s) | Sampling rate (kHz) | Frame size (ms) | Remarks | MOS (Mean Opinion Score) |
|---|---|---|---|---|---|---|---|
| G.711 * | ITU-T | Pulse code modulation (PCM) | 64 | 8 | Sampling | U-law (US, Japan) and A-law (Europe) companding | 4.1 |
| G.721 | ITU-T | Adaptive differential pulse code modulation (ADPCM) | 32 | 8 | Sampling | Now described in G.726; obsolete. | |
| G.722 | ITU-T | 7 kHz audio-coding within 64 kbit/s | 64 | 16 | Sampling | Subband codec that divides the 16 kHz band into two subbands, each coded using ADPCM | |
| G.722.1 | ITU-T | Coding at 24 and 32 kbit/s for hands-free operation in systems with low frame loss | 24/32 | 16 | 20 | | |
| G.723 | ITU-T | Extensions of Recommendation G.721 adaptive differential pulse code modulation to 24 and 40 kbit/s for digital circuit multiplication equipment application | 24/40 | 8 | Sampling | Superseded by G.726; obsolete. This is a completely different codec from G.723.1 | |
| G.723.1 | ITU-T | Dual rate speech coder for multimedia communications transmitting at 5.3 and 6.3 kbit/s | 5.3/6.3 | 8 | 30 | Part of H.324 video conferencing. It encodes speech or other audio signals in frames using linear predictive analysis-by-synthesis coding. The excitation signal for the high rate coder is Multipulse Maximum Likelihood Quantization (MP-MLQ) and for the low rate coder is Algebraic-Code-Excited Linear-Prediction (ACELP). | 3.8-3.9 |
| G.726 | ITU-T | 40, 32, 24, 16 kbit/s adaptive differential pulse code modulation (ADPCM) | 16/24/32/40 | 8 | Sampling | ADPCM; replaces G.721 and G.723. | 3.85 |
| G.727 | ITU-T | 5-, 4-, 3- and 2-bit/sample embedded adaptive differential pulse code modulation (ADPCM) | Variable | | Sampling | ADPCM. Related to G.726 | |
| G.728 | ITU-T | Coding of speech at 16 kbit/s using low-delay code excited linear prediction | 16 | 8 | 2.5 | CELP. | 3.61 |
| G.729 ** | ITU-T | Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear-prediction (CS-ACELP) | 8 | 8 | 10 | Low delay (15 ms) | 3.92 |
| GSM 06.10 | ETSI | Regular-Pulse Excitation Long-Term Predictor (RPE-LTP) | 13 | 8 | 20 | Used for GSM cellular telephony. | |
| LPC10 | USA Government | Linear-predictive codec | 2.4 | 8 | 22.5 | 10 coefficients. | |
| Speex | | | 2.15-24.6 (NB), 4-44.2 (WB) | 8, 16, 32 | 30 (NB), 34 (WB) | | |
| iLBC | | | 13.3 | 8 | 30 | | |
| DoD CELP | US Department of Defense (DoD) | | 4.8 | | 30 | | |
| EVRC | 3GPP2 | Enhanced Variable Rate CODEC | 9.6/4.8/1.2 | 8 | 20 | Used in CDMA networks | |
| DVI | Interactive Multimedia Association (IMA) | DVI4 uses adaptive delta pulse code modulation (ADPCM) | 32 | Variable | Sampling | | |
| L16 | | Uncompressed audio data samples | 128 | Variable | Sampling | | |
* G.711 has two versions, U-law (μ-law; US, Japan) and A-law
(Europe). U-law is tied to the T1 standard used in North
America and Japan; A-law is tied to the E1 standard used in
the rest of the world. The difference lies in the companding
curve applied when quantizing the analog signal. In both
schemes the signal is quantized logarithmically rather than
linearly, so quiet sounds get finer steps than loud ones.
For more information about the differences, see G.711
A Law versus u Law.
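The logarithmic behaviour of the two laws can be sketched with their continuous companding curves, using the usual parameters μ = 255 and A = 87.6. This is only an illustration: G.711 itself specifies piecewise-linear approximations of these curves with 8-bit code words, and the function names and the simple uniform quantizer below are assumptions made for the example.

```python
import math

MU = 255.0  # mu-law (U-law) parameter
A = 87.6    # A-law parameter


def mu_law_compress(x):
    """Continuous mu-law curve for x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def a_law_compress(x):
    """Continuous A-law curve for x in [-1, 1]."""
    ax = abs(x)
    if ax < 1.0 / A:
        y = A * ax / (1.0 + math.log(A))          # linear segment near zero
    else:
        y = (1.0 + math.log(A * ax)) / (1.0 + math.log(A))
    return math.copysign(y, x)


def quantize_uniform(v, levels=127):
    """Round a value in [-1, 1] to the nearest of evenly spaced steps."""
    return round(v * levels) / levels


# A quiet sample: compare plain linear quantization with companding first
x = 0.01
linear_error = abs(quantize_uniform(x) - x)
companded_error = abs(mu_law_expand(quantize_uniform(mu_law_compress(x))) - x)
print("linear error:", linear_error, "companded error:", companded_error)
```

For a quiet sample like this one, quantizing after companding leaves a much smaller error than quantizing the raw value with the same number of steps, which is the reason both laws use a logarithmic curve.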
** There are several versions of the G.729 codec, which are
worth explaining because this codec is very widely used today:
- G.729: the original codec.
- G.729A (Annex A): a simplification of G.729 that is compatible
with it. It is less complex but gives lower quality.
- G.729B (Annex B): G.729 with silence suppression; not compatible
with the previous ones.
- G.729AB: G.729A with silence suppression; only compatible with
G.729B.
All of these versions run at 8 kb/s, but there are also
versions at 6.4 kb/s (Annex D) and 11.8 kb/s (Annex E).