This document provides a glossary of audio-related terminology, including a list of widely used, generic terms and a list of terms that are specific to Android.
Generic Terms
These are audio terms that are widely used, with their conventional meanings.
Digital Audio
acoustics
The study of the mechanical properties of sound, for example how the physical placement of transducers such as speakers and microphones on a device affects perceived audio quality.
attenuation
A multiplicative factor less than or equal to 1.0, applied to an audio signal to decrease the signal level. Compare to "gain."
audiophile
An audiophile is an individual who is concerned with a superior music reproduction experience, especially someone willing to make tradeoffs (of expense, component size, room design, etc.) beyond what an ordinary person might choose.
bits per sample or bit depth
Number of bits of information per sample.
channel
A single stream of audio information, usually corresponding to one location of recording or playback.
downmixing
To decrease the number of channels, e.g. from stereo to mono, or from 5.1 to stereo. This can be accomplished by dropping some channels, mixing channels, or more advanced signal processing. Simple mixing without attenuation or limiting has the potential for overflow and clipping. Compare to "upmixing."
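The overflow risk mentioned above can be illustrated with a minimal sketch of a stereo-to-mono downmix. It assumes samples are floats in the range [-1.0, 1.0]; the function names are hypothetical and are not part of any Android API.

```python
def downmix_stereo_to_mono(frames):
    """Average the left and right samples of each stereo frame.

    `frames` is a list of (left, right) tuples of floats in [-1.0, 1.0].
    Averaging attenuates each channel by 0.5, so the result stays in range.
    """
    return [(left + right) * 0.5 for left, right in frames]


def clamp(sample, lo=-1.0, hi=1.0):
    """Hard-limit a sample to the representable range (this is clipping)."""
    return max(lo, min(hi, sample))


def downmix_without_attenuation(frames):
    """Sum the channels with no attenuation; out-of-range sums are clipped."""
    return [clamp(left + right) for left, right in frames]
```

With the input `[(0.75, 0.5), (-1.0, -1.0)]`, the averaging version keeps both frames in range, while the unattenuated sum exceeds the range on both frames and is clipped.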
DSD
Direct Stream Digital, a proprietary audio encoding based on pulse-density modulation. Whereas PCM encodes a waveform as a sequence of individual audio samples of multiple bits, DSD encodes a waveform as a sequence of bits at a very high sample rate. For DSD, there is no concept of "samples" in the conventional PCM sense. Both PCM and DSD represent multiple channels by independent sequences. DSD is better suited to content distribution than as an internal representation for processing, as it can be difficult to apply traditional DSP algorithms to DSD. DSD is used in Super Audio CD (SACD) and in DSD over PCM (DoP) for USB. See the Wikipedia article for more information.
duck
To temporarily reduce the volume of one stream, when another stream becomes active. For example, if music is playing and a notification arrives, then the music stream could be ducked while the notification plays. Compare to "mute."
FIFO
A hardware module or software data structure that implements first-in, first-out queueing of data. In the context of audio, the data stored in the queue are typically audio frames. A FIFO can be implemented by a circular buffer.
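As an illustration of a FIFO implemented by a circular buffer, here is a minimal single-threaded sketch. The class name `FrameFifo` and its methods are hypothetical; this is not the audio_utils FIFO, which is non-blocking and designed for use across threads.

```python
class FrameFifo:
    """Fixed-capacity FIFO of audio frames backed by a circular buffer."""

    def __init__(self, capacity):
        self._buf = [None] * capacity
        self._capacity = capacity
        self._read = 0    # index of the oldest stored frame
        self._count = 0   # number of frames currently stored

    def write(self, frames):
        """Store as many frames as fit; return how many were accepted."""
        written = 0
        for frame in frames:
            if self._count == self._capacity:
                break  # full: the remaining frames are not accepted
            self._buf[(self._read + self._count) % self._capacity] = frame
            self._count += 1
            written += 1
        return written

    def read(self, max_frames):
        """Remove and return up to max_frames frames, oldest first."""
        out = []
        while self._count > 0 and len(out) < max_frames:
            out.append(self._buf[self._read])
            self._read = (self._read + 1) % self._capacity
            self._count -= 1
        return out
```

The write index wraps around to the start of the storage once it reaches the end, which is what makes the buffer "circular."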
frame
A set of samples, one per channel, at a point in time.
frames per buffer
The number of frames handed from one module to the next at once; for example the audio HAL interface uses this concept.
gain
A multiplicative factor greater than or equal to 1.0, applied to an audio signal to increase the signal level. Compare to "attenuation."
HD audio
High-Definition audio, a synonym for "high-resolution audio." Not to be confused with Intel High Definition Audio.
Hz
The units for sample rate or frame rate.
high-resolution audio
There is no standard definition, but high-resolution usually means any representation with greater bit-depth and sample rate than CDs (which are stereo 16-bit PCM at 44.1 kHz), and with no lossy data compression applied. Equivalent to "HD audio." See the Wikipedia article for more information.
latency
Time delay as a signal passes through a system.
lossless
A lossless data compression algorithm preserves bit accuracy across encoding and decoding. The result of decoding any previously encoded data is equivalent to the original data. Examples of lossless audio content distribution formats include CDs, PCM within WAV, and FLAC. Note that the authoring process may reduce the bit depth or sample rate from that of the masters. Distribution formats that preserve the resolution and bit accuracy of masters are the subject of "high-resolution audio."
lossy
A lossy data compression algorithm attempts to preserve the most important features of media across encoding and decoding. The result of decoding any previously encoded data is perceptually similar to the original data, but it is not identical. Examples of lossy audio compression algorithms include MP3 and AAC. As analog values are from a continuous domain, whereas digital values are discrete, ADC and DAC are lossy conversions with respect to amplitude. See also "transparency."
mono
One channel.
multichannel
See "surround sound." Strictly, since stereo is more than one channel, it is also "multi" channel. But that usage would be confusing.
mute
To (temporarily) force volume to be zero, independently from the usual volume controls.
overrun
An audible glitch caused by failure to accept supplied data in sufficient time. Note that the Wikipedia article for "buffer overrun" describes an unrelated failure. Compare to "underrun."
panning
To direct a signal to a desired position within a stereo or multi-channel field.
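One common way to pan a mono signal within a stereo field is a constant-power pan law, sketched below. The choice of pan law, the position range of -1.0 to +1.0, and the function name are assumptions made for illustration.

```python
import math


def pan_mono_to_stereo(sample, position):
    """Return (left, right) for a mono sample at the given pan position.

    position: -1.0 is full left, 0.0 is center, +1.0 is full right.
    The cosine/sine weights keep left**2 + right**2 constant, so the
    total power does not change as the position moves.
    """
    angle = (position + 1.0) * math.pi / 4.0  # maps position to 0 .. pi/2
    return sample * math.cos(angle), sample * math.sin(angle)
```

At the center position both channels receive the sample scaled by about 0.707, rather than 0.5, which is what keeps the total power constant.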
ramp
To gradually increase or decrease the level of a particular audio parameter, for example volume or the strength of an effect. A volume ramp is commonly applied when pausing and resuming music, to avoid a hard audible transition.
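A volume ramp can be sketched as a gain that changes linearly across a block of samples. The function name and the linear shape are assumptions for illustration; real implementations may use other curves.

```python
def apply_linear_ramp(samples, start_gain, end_gain):
    """Scale samples by a gain that moves linearly from start to end."""
    n = len(samples)
    if n == 0:
        return []
    if n == 1:
        return [samples[0] * end_gain]
    step = (end_gain - start_gain) / (n - 1)
    return [s * (start_gain + i * step) for i, s in enumerate(samples)]
```

For example, ramping from a gain of 1.0 to 0.0 over the last block before a pause fades the signal out instead of cutting it off abruptly.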
sample
A number representing the audio value for a single channel at a point in time.
sample rate or frame rate
Number of frames per second; note that "frame rate" is thus more accurate, but "sample rate" is conventionally used to mean "frame rate."
sonification
The use of sound to express feedback or information, for example touch sounds and keyboard sounds.
stereo
Two channels.
stereo widening
An effect applied to a stereo signal, to make another stereo signal which sounds fuller and richer. The effect can also be applied to a mono signal, in which case it is a type of upmixing.
surround sound
Various techniques for increasing the ability of a listener to perceive sound position beyond stereo left and right.
transparency
The ideal result of lossy data compression, as stated in the Wikipedia article. A lossy data conversion is said to be transparent if it is perceptually indistinguishable from the original by a human subject.
underrun
An audible glitch caused by failure to supply needed data in sufficient time. See the Wikipedia article on buffer underrun. Compare to "overrun."
upmixing
To increase the number of channels, e.g. from mono to stereo, or from stereo to surround sound. This can be accomplished by duplication, panning, or more advanced signal processing. Compare to "downmixing."
virtualizer
An effect that attempts to spatialize audio channels, such as trying to simulate more speakers, or give the illusion that various sound sources have position.
volume
Loudness, the subjective strength of an audio signal.
Hardware and Accessories
These terms are related to audio hardware and accessories.
Inter-device interconnect
These technologies connect audio and video components between devices, and are readily visible at the external connectors. The HAL implementor may need to be aware of these, as well as the end user.
Bluetooth
A short-range wireless technology. The major audio-related Bluetooth profiles and Bluetooth protocols are described at these Wikipedia articles:
- A2DP for music
- SCO for telephony
DisplayPort
Digital display interface by VESA.
HDMI
High-Definition Multimedia Interface, an interface for transferring audio and video data. For mobile devices, either a micro-HDMI (type D) or MHL connector is used.
Intel HDA
Intel High Definition Audio (commonly shortened to HDA) is a specification for, among other things, a front-panel connector. Not to be confused with generic "high-definition audio" or "high-resolution audio."
MHL
Mobile High-Definition Link is a mobile audio/video interface, often over a micro-USB connector.
phone connector
A mini or sub-mini phone connector connects a device to wired headphones, headset, or line-level amplifier.
SlimPort
An adapter from micro-USB to HDMI.
S/PDIF
Sony/Philips Digital Interface Format is an interconnect for uncompressed PCM. See the Wikipedia article for more information.
Thunderbolt
Thunderbolt is a multimedia interface that competes with USB and HDMI for connecting to high-end peripherals.
USB
Universal Serial Bus. See the Wikipedia article for more information.
Intra-device interconnect
These technologies connect internal audio components within a given device, and are not visible without disassembling the device. The HAL implementor may need to be aware of these, but not the end user.
See these Wikipedia articles:
- I²C, for control channel
- I²S, for audio data
Audio Signal Path
These terms are related to the signal path that audio data follows from an application to the transducer, or vice-versa.
ADC
Analog to digital converter, a module that converts an analog signal (continuous in both time and amplitude) to a digital signal (discrete in both time and amplitude). Conceptually, an ADC consists of a periodic sample-and-hold followed by a quantizer, although it does not have to be implemented that way. An ADC is usually preceded by a low-pass filter to remove any high frequency components that are not representable using the desired sample rate. See the Wikipedia article for more information.
AP
Application processor, the main general-purpose computer on a mobile device.
codec
Coder-decoder, a module that encodes and/or decodes an audio signal from one representation to another. Typically this is analog to PCM, or PCM to analog. Strictly, the term "codec" is reserved for modules that both encode and decode; however, it can also refer more loosely to only one of these. See the Wikipedia article for more information.
DAC
Digital to analog converter, a module that converts a digital signal (discrete in both time and amplitude) to an analog signal (continuous in both time and amplitude). A DAC is usually followed by a low-pass filter to remove any high frequency components introduced by digital quantization. See the Wikipedia article for more information.
DSP
Digital Signal Processor, an optional component which is typically located after the application processor (for output), or before the application processor (for input). The primary purpose of a DSP is to off-load the application processor, and provide signal processing features at a lower power cost.
PDM
Pulse-density modulation is a form of modulation used to represent an analog signal by a digital signal, where the relative density of 1s versus 0s indicates the signal level. It is commonly used by digital to analog converters. See the Wikipedia article for more information.
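To make "relative density of 1s versus 0s" concrete, the following sketch encodes a signal with a first-order delta-sigma modulator, which is one common way to produce PDM. The function name is hypothetical, and the convention that a 1 bit represents +1.0 and a 0 bit represents -1.0 is an assumption of this example.

```python
def pdm_encode(samples):
    """Encode samples in [-1.0, 1.0] as a list of 0/1 bits.

    A running error accumulates the difference between the input and the
    value represented by each emitted bit (+1.0 for 1, -1.0 for 0), so
    the average of the bit stream tracks the input level.
    """
    bits = []
    error = 0.0
    for x in samples:
        error += x
        if error >= 0.0:
            bits.append(1)
            error -= 1.0
        else:
            bits.append(0)
            error += 1.0
    return bits
```

A constant input of 0.0 yields an equal mix of 1s and 0s, while a constant input of 0.5 yields three 1s for every 0.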
PWM
Pulse-width modulation is a form of modulation used to represent an analog signal by a digital signal, where the relative width of a digital pulse indicates the signal level. It is commonly used by analog to digital converters. See the Wikipedia article for more information.
transducer
A transducer converts between variations in physical "real-world" quantities and electrical signals. In audio, the physical quantity is sound pressure, and the transducers are the loudspeaker and microphone. See the Wikipedia article for more information.
Sample Rate Conversion
downsample
To resample, where sink sample rate < source sample rate.
Nyquist frequency
The Nyquist frequency, equal to 1/2 of a given sample rate, is the maximum frequency component that can be represented by a discretized signal at that sample rate. For example, the human hearing range is typically assumed to extend up to approximately 20 kHz, and so a digital audio signal must have a sample rate of at least 40 kHz to represent that range. In practice, sample rates of 44.1 kHz and 48 kHz are commonly used, with Nyquist frequencies of 22.05 kHz and 24 kHz respectively. See the Wikipedia articles on Nyquist frequency and hearing range for more information.
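The arithmetic in this definition can be written out as a short sketch. The function names are hypothetical. The last function goes slightly beyond the definition: it illustrates aliasing, where a pure tone above the Nyquist frequency, sampled without a low-pass filter, appears at a lower "folded" frequency.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency representable at the given sample rate."""
    return sample_rate_hz / 2.0


def min_sample_rate(max_frequency_hz):
    """Lowest sample rate whose Nyquist frequency reaches max_frequency_hz."""
    return 2.0 * max_frequency_hz


def aliased_frequency(tone_hz, sample_rate_hz):
    """Apparent frequency of a sampled pure tone, folded about Nyquist."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)
```

For instance, under these definitions a 30 kHz tone sampled at 48 kHz is indistinguishable from an 18 kHz tone, which is why an ADC is usually preceded by a low-pass filter.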
resampler
Synonym for sample rate converter.
resampling
The process of converting sample rate.
sample rate converter
A module that resamples.
sink
The output of a resampler.
source
The input to a resampler.
upsample
To resample, where sink sample rate > source sample rate.
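The terms in this section can be tied together with a minimal resampler sketch for a single channel. It uses linear interpolation, which is chosen here only for brevity; it is not how AudioResampler is implemented, and a production-quality resampler also applies low-pass filtering to limit aliasing. The function name is hypothetical.

```python
def resample_linear(source, source_rate, sink_rate):
    """Convert one channel of samples from source_rate to sink_rate.

    Each sink sample is linearly interpolated between the two nearest
    source samples. sink_rate > source_rate upsamples; sink_rate <
    source_rate downsamples.
    """
    if not source:
        return []
    sink_len = int(len(source) * sink_rate / source_rate)
    last = len(source) - 1
    sink = []
    for i in range(sink_len):
        pos = i * source_rate / sink_rate   # position in source samples
        idx = min(int(pos), last)
        frac = pos - idx
        nxt = source[min(idx + 1, last)]    # hold the last sample at the end
        sink.append(source[idx] * (1.0 - frac) + nxt * frac)
    return sink
```

Doubling the rate inserts an interpolated sample between each pair of source samples, and halving it keeps every second sample.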
Android-Specific Terms
These are terms specific to the Android audio framework, or that may have a special meaning within Android beyond their general meaning.
ALSA
Advanced Linux Sound Architecture. As the name suggests, it is an audio framework primarily for Linux, but it has influenced other systems. See Wikipedia article for the general definition. As used within Android, it refers primarily to the kernel audio framework and drivers, not to the user-mode API. See tinyalsa.
audio device
Any audio I/O end-point that is backed by a HAL implementation.
AudioEffect
An API and implementation framework for output (post-processing) effects and input (pre-processing) effects. The API is defined at android.media.audiofx.AudioEffect.
AudioFlinger
The sound server implementation for Android. AudioFlinger runs within the mediaserver process. See Wikipedia article for the generic definition.
audio focus
A set of APIs for managing audio interactions across multiple independent apps. See Managing Audio Focus and the focus-related methods and constants of android.media.AudioManager.
AudioMixer
The module within AudioFlinger responsible for combining multiple tracks and applying attenuation (volume) and certain effects. The Wikipedia article may be useful for understanding the generic concept. But that article describes a mixer more as a hardware device or a software application, rather than a software module within a system.
audio policy
Service responsible for all actions that require a policy decision to be made first, such as opening a new I/O stream, re-routing after a change, and stream volume management.
AudioRecord
The primary low-level client API for receiving data from an audio input device such as a microphone. The data is usually in pulse-code modulation (PCM) format. The API is defined at android.media.AudioRecord.
AudioResampler
The module within AudioFlinger responsible for sample rate conversion.
audio source
An audio source is an enumeration of constants that indicates the desired use case for capturing audio input. As of API level 21 and above, audio attributes are preferred.
AudioTrack
The primary low-level client API for sending data to an audio output device such as a speaker. The data is usually in PCM format. The API is defined at android.media.AudioTrack.
audio_utils
An audio utility library for features such as PCM format conversion, WAV file I/O, and non-blocking FIFO, which is largely independent of the Android platform.
client
Usually the same as application or app, but sometimes the "client" of AudioFlinger is actually a thread running within the mediaserver system process. An example of that is when playing media that is decoded by a MediaPlayer object.
HAL
Hardware Abstraction Layer. HAL is a generic term in Android. With respect to audio, it is a layer between AudioFlinger and the kernel device driver with a C API, which replaces the earlier C++ libaudio.
FastCapture
A thread within AudioFlinger that sends audio data to lower latency "fast tracks" and drives the input device when configured for reduced latency.
FastMixer
A thread within AudioFlinger that receives and mixes audio data from lower latency "fast tracks" and drives the primary output device when configured for reduced latency.
fast track
An AudioTrack or AudioRecord client with lower latency but fewer features, on some devices and routes.
MediaPlayer
A higher-level client API than AudioTrack, for playing either encoded content, or content which includes multimedia audio and video tracks.
media.log
An AudioFlinger debugging feature, available in custom builds only, for logging audio events to a circular buffer where they can then be dumped retroactively when needed.
mediaserver
An Android system process that contains a number of media-related services, including AudioFlinger.
NBAIO
An abstraction for "non-blocking" audio input/output ports used within AudioFlinger. The name can be misleading, as some implementations of the NBAIO API actually do support blocking. The key implementations of NBAIO are for pipes of various kinds.
normal mixer
A thread within AudioFlinger that services most full-featured AudioTrack clients, and either directly drives an output device or feeds its sub-mix into FastMixer via a pipe.
OpenSL ES
An audio API standard by the Khronos Group. Android versions since API level 9 support a native audio API that is based on a subset of OpenSL ES 1.0.1.
silent mode
A user-settable feature to mute the phone ringer and notifications, without affecting media playback (music, videos, games) or alarms.
SoundPool
A higher-level client API than AudioTrack, used for playing sampled audio clips. It is useful for triggering UI feedback, game sounds, etc. The API is defined at android.media.SoundPool.
Stagefright
See Media.
StateQueue
A module within AudioFlinger responsible for synchronizing state among threads. Whereas NBAIO is used to pass data, StateQueue is used to pass control information.
strategy
A grouping of stream types with similar behavior, used by the audio policy service.
stream type
An enumeration that expresses a use case for audio output. The audio policy implementation uses the stream type, along with other parameters, to determine volume and routing decisions. Specific stream types are listed at android.media.AudioManager.
tee sink
See the separate article on tee sink in Audio Debugging.
tinyalsa
A small user-mode API above the ALSA kernel interface, with a BSD license, recommended for use in HAL implementations.
ToneGenerator
A higher-level client API than AudioTrack, used for playing DTMF signals. See the Wikipedia article on dual-tone multi-frequency signaling, and the API definition at android.media.ToneGenerator.
track
An audio stream, controlled by the AudioTrack or AudioRecord API.
volume attenuation curve
A device-specific mapping from a generic volume index to a particular attenuation factor for a given output.
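As an illustration of such a mapping, the sketch below converts a volume index to a linear attenuation factor through a curve that is linear in decibels. The curve shape, the index range of 0 to 15, and the -50 dB floor are hypothetical values chosen for the example; they are not taken from any Android device configuration.

```python
def attenuation_for_index(index, max_index=15, min_db=-50.0):
    """Map a volume index to a multiplicative attenuation factor.

    Index 0 is treated as silence. Indices 1..max_index are spread
    linearly in decibels from min_db up to 0 dB, then converted to a
    linear factor with 10 ** (dB / 20).
    """
    if max_index < 2 or not 0 <= index <= max_index:
        raise ValueError("index out of range")
    if index == 0:
        return 0.0
    db = min_db * (max_index - index) / (max_index - 1)
    return 10.0 ** (db / 20.0)
```

The highest index maps to a factor of 1.0 (no attenuation), and each step down reduces the level by the same number of decibels.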
volume index
A unitless integer that expresses the desired relative volume of a stream. The volume-related APIs of android.media.AudioManager operate in volume indices rather than absolute attenuation factors.