ALSA Parameter Setting and Constraint Handling

Basic Introduction

ALSA (Advanced Linux Sound Architecture) is a framework that provides a generic API for implementing applications that either provide audio capture and playback services or want to make use of such providers. In the most basic setup the provider of the audio playback and capture services is the ALSA kernel subsystem, with a kernel driver that talks directly to a hardware device, and the consumer of these services is a userspace application that wants to capture or play back audio, for example a music player. The ALSA framework is quite flexible though, and more complex setups are possible and often required. ALSA provides a stackable plugin system. In the typical setup on a modern system an application will not communicate directly with a kernel driver, but will rather talk to a so-called sound server. The sound server is typically capable of providing services beyond what is supported by the raw hardware. This includes, for example, mixing multiple audio streams from different applications and reformatting and resampling audio streams. The sound server will then communicate with the kernel driver.

In such a setup the application acts as a client to the sound server and the sound server acts as a server to the application. Furthermore the sound server acts as a client to the kernel driver, whereas the kernel driver acts as a server to the sound server.

The following sections will refer to audio servers and audio clients. An audio server is a service that provides audio playback or capture capabilities to an audio client. A hardware driver is typically an audio server while an application is typically an audio client. But, for example, a sound server providing mixing capabilities to other applications will act as an audio client towards a kernel driver and as an audio server towards a normal application.

Different kinds of hardware devices have different sets of capabilities. As a generic framework ALSA has to abstract these capabilities and needs a device-independent way to describe and configure them. In ALSA this is done by having a defined generic set of parameter types that are used to describe an audio stream. Parameter types include things like the samplerate or the number of audio channels to be used during playback or capture. They are generic and the same set of parameters is used for each audio stream. A full set of these parameters is called hardware parameters (or hwparams for short) in ALSA terminology. When an application wants to configure an audio stream it populates a set of parameters with the requested configuration and then passes them to the kernel driver. The kernel driver will take this generic device configuration and translate it to a hardware-specific configuration that will, for example, be written to the hardware’s configuration registers.

Configuration Space

If we only take into account the interval parameters, the configuration space can be thought of as a multidimensional hypercube with each of the parameters representing one dimension. The configuration space spans all valid configurations, but not every configuration in the configuration space is necessarily a valid configuration. This can make things tricky when it comes to finding a single valid configuration point.

For example, imagine the following simplified case where we only consider the samplerate and the number of channels, thus reducing the number of dimensions to 2. A device supports 16 kHz, 32 kHz and 48 kHz samplerates and at 16 kHz supports both 1 channel and 2 channels. At 32 kHz and 48 kHz it only supports 1 channel. The initial configuration space for this device will span from 1 to 2 channels and from 16 kHz to 48 kHz for the samplerate, even though, for example, 2 channels at 48 kHz is not a valid configuration. Now if the application performs a refine operation with a space spanning from 1 to 2 channels and from 32 kHz to 48 kHz, the result will be a space spanning 1 channel and from 32 kHz to 48 kHz.
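The two-dimensional example can be modelled in a few lines of userspace C. This is only a brute-force sketch for illustration: the names struct space, device_points and space_refine() are hypothetical and not part of ALSA, and the real refine operation works through constraint rules rather than by enumerating valid points.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical 2-D configuration space with closed integer ranges. */
struct space {
    unsigned int ch_min, ch_max;     /* number of channels */
    unsigned int rate_min, rate_max; /* samplerate in Hz */
    bool empty;                      /* true if no valid point is left */
};

/* The example device: each supported rate with its maximum channel count. */
static const struct {
    unsigned int rate;
    unsigned int ch_max;
} device_points[] = {
    { 16000, 2 },
    { 32000, 1 },
    { 48000, 1 },
};

/* Grow the bounding box 'out' so that it includes the point (ch, rate). */
static void space_include(struct space *out, unsigned int ch, unsigned int rate)
{
    if (out->empty) {
        out->ch_min = out->ch_max = ch;
        out->rate_min = out->rate_max = rate;
        out->empty = false;
        return;
    }
    if (ch < out->ch_min)
        out->ch_min = ch;
    if (ch > out->ch_max)
        out->ch_max = ch;
    if (rate < out->rate_min)
        out->rate_min = rate;
    if (rate > out->rate_max)
        out->rate_max = rate;
}

/* Refine: shrink 'req' to the bounding box of all valid points inside it. */
static struct space space_refine(struct space req)
{
    struct space out = { .empty = true };
    size_t i;
    unsigned int ch;

    for (i = 0; i < sizeof(device_points) / sizeof(device_points[0]); i++) {
        unsigned int rate = device_points[i].rate;

        if (rate < req.rate_min || rate > req.rate_max)
            continue;
        for (ch = 1; ch <= device_points[i].ch_max; ch++) {
            if (ch >= req.ch_min && ch <= req.ch_max)
                space_include(&out, ch, rate);
        }
    }
    return out;
}
```

Refining the request of 1 to 2 channels at 32 kHz to 48 kHz yields 1 channel at 32 kHz to 48 kHz, while a request for exactly 2 channels in the same rate range leaves the space empty.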

Since not every hardware supports every kind of configuration an ALSA driver can specify constraints for each of the different parameters. A simple constraint can for example be that the hardware only supports stereo audio streams or a specific set of samplerates. A userspace application can query those constraints and then pick the configuration that it finds most suitable for its use case while staying within the constraints specified by the driver.

Having a static set of constraints already works well for a lot of use cases, but not all hardware limitations can be properly described by them. The possible values for one parameter might depend on the setting of one or more other parameters. E.g. a piece of hardware might support mono and stereo playback at a 24 kHz samplerate, but due to memory bandwidth constraints it can only sustain a mono stream at 48 kHz.

In order to support this in ALSA the server and the client go through a bargaining process until they have agreed on one specific configuration. Often the client has certain priorities associated with how much it cares about getting a specific value for a configuration parameter, and some parameters might be more important than others. So the client will start by setting only one parameter to a specific value and then pass this set to the server, which will update the remaining parameters with the new constraints and then pass it back to the application. Based on the new constraints the application will pick the next setting, and the process continues until the application is satisfied with all parameters.

The application does not necessarily have to fully specify all parameters and typically will not do so. It can leave some (or all) of them open, in which case the server will pick one specific valid configuration. Once the application is satisfied with the parameter set it passes it to the driver and tells it to use this configuration. The driver will resolve any ambiguities by picking one valid setting and then pass the final configuration back to the application. This bargaining process is called parameter refining in ALSA terminology.

E.g. an application might have the audio it wants to play back available in both mono and stereo, but both at a 48 kHz samplerate. So as a first step it will specify that it wants 48 kHz and then pass the configuration set to the kernel for further refinement. As the next step the application will pick either stereo or mono depending on what is available.

Real-world constraints can take many different forms, e.g. one device might support a fixed set of sample rates, another device might support any sample rate that can be derived from a base frequency, and yet another device might only support an even number of channels. For communicating between the driver and the application ALSA only knows two types of constraints: masks and intervals.

The first type is called a mask. Masks are used for parameters which have only a small set of discrete choices available, e.g. the sample format. A mask is a bitmap where each bit has an associated value (e.g. bit 0 means format A, bit 1 means format B). If a bit is set, its associated value is available in the current configuration space; if it is not set, the value is not available.
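As a minimal sketch of the idea, a mask can be modelled as a plain 32-bit bitmap. The format numbering and the helper names below are hypothetical and chosen for illustration only; they are not the ALSA mask implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value numbering, for illustration only. */
enum { FMT_A = 0, FMT_B = 1, FMT_C = 2 };

/* Is the value associated with 'bit' (0..31) still available? */
static bool mask_test(uint32_t mask, unsigned int bit)
{
    return (mask >> bit) & 1u;
}

/*
 * Refine 'mask' by another mask: only values allowed by both remain.
 * Returns false if no value is left, i.e. the mask became empty.
 */
static bool mask_refine(uint32_t *mask, uint32_t other)
{
    *mask &= other;
    return *mask != 0;
}

/* True if exactly one value is left in the mask. */
static bool mask_single(uint32_t mask)
{
    return mask != 0 && (mask & (mask - 1)) == 0;
}
```

Refining a mask containing formats A and B with a mask containing formats B and C leaves only format B, at which point the parameter has been narrowed down to a single value.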

The second type is called an interval. An interval is defined as having a minimum value and a maximum value. Each limit of the interval can either be open (numbers in the interval are strictly greater than the minimum or strictly smaller than the maximum) or closed (numbers in the interval are greater than or equal to the minimum or smaller than or equal to the maximum). Additionally an interval can have the constraint that it must be integer, meaning only integer values are considered to be included in the interval, whereas for an interval without that constraint all rational numbers within the limits are considered to be included. Depending on the unit of the interval the integer requirement might come naturally. E.g. the number of channels must be an integer, whereas for the samplerate there is no such requirement. Finally the interval also has an empty property which will be set to 1 when the interval is empty, i.e. no numbers fall within its range.

A special form of an interval is the so-called degenerate interval, which is an interval that refers to only a single value. In ALSA an interval is considered degenerate if either the minimum and the maximum value are the same, or the minimum is the maximum - 1 and the maximum is an open limit. The latter is necessary to be able to define degenerate intervals which contain a non-integer rational value. E.g. an interval that contains only 1.5 is expressed as (1, 2).

Another special interval is the empty interval. An empty interval does not contain any values. Any configuration space that contains an empty interval is considered to be an invalid configuration space, and any attempt to configure a device with such a configuration space will result in failure.
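The interval properties described above can be sketched as a small userspace model. The struct layout and the function names are assumptions made for this illustration; they follow the description in the text and are not a copy of the kernel implementation.

```c
#include <stdbool.h>

/* Hypothetical model of an interval as described in the text. */
struct interval {
    unsigned int min, max;
    bool openmin, openmax; /* true: the limit itself is excluded */
    bool integer;          /* true: only integer values are included */
    bool empty;            /* true: no value is included */
};

/* No value between the limits, taking open limits into account. */
static bool interval_void(const struct interval *i)
{
    return i->min > i->max ||
           (i->min == i->max && (i->openmin || i->openmax));
}

/* Degenerate: min == max, or min == max - 1 with an open maximum. */
static bool interval_single(const struct interval *i)
{
    if (i->empty)
        return false;
    return i->min == i->max || (i->min + 1 == i->max && i->openmax);
}

/* Intersect 'i' with 'v'. Returns false if the result is empty. */
static bool interval_refine(struct interval *i, const struct interval *v)
{
    if (i->empty || v->empty) {
        i->empty = true;
        return false;
    }
    if (v->min > i->min) {
        i->min = v->min;
        i->openmin = v->openmin;
    } else if (v->min == i->min && v->openmin) {
        i->openmin = true;
    }
    if (v->max < i->max) {
        i->max = v->max;
        i->openmax = v->openmax;
    } else if (v->max == i->max && v->openmax) {
        i->openmax = true;
    }
    if (v->integer)
        i->integer = true;
    if (interval_void(i)) {
        i->empty = true;
        return false;
    }
    /* For integer intervals an open limit excludes the limit value itself. */
    if (i->integer) {
        if (i->openmin) {
            i->min++;
            i->openmin = false;
        }
        if (i->openmax) {
            i->max--;
            i->openmax = false;
        }
    }
    i->empty = interval_void(i);
    return !i->empty;
}
```

With this model the interval (1, 2) is degenerate and stands for a single non-integer value, but it becomes empty as soon as it is intersected with an integer interval, since no integer lies strictly between 1 and 2.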

ALSA knows the following types of parameters.

SNDRV_PCM_HW_PARAM_ACCESS: How the memory for the audio buffer can be accessed and the layout of the samples in the buffer. The access method refers to whether the buffer is accessed using mmap() or using read()/write() (or one of the SNDRV_PCM_IOCTL_WRITE* and SNDRV_PCM_IOCTL_READ* ioctls). The layout refers to whether the samples are stored interleaved or non-interleaved. Interleaved means that the samples of one frame are stored in the same buffer one after another, e.g. ‘L,R,L,R,…’, whereas non-interleaved means that each channel has its own buffer, e.g. ‘L,L,…’ and ‘R,R,…’. The ALSA API also specifies a so-called COMPLEX layout which is used in cases where the layout can be described as neither interleaved nor non-interleaved. This can for example be channels 1 and 2 interleaved in one buffer, and channels 3 and 4 interleaved in a second buffer.
SNDRV_PCM_HW_PARAM_FORMAT: Format the sample is stored in. This refers to the representation of how the audio data is stored in memory. This includes the type of encoding ([linear] PCM, A-Law, DSD, …), the number of bits per sample, whether it is unsigned or signed as well as the endianness.
SNDRV_PCM_HW_PARAM_SUBFORMAT: Allows selecting a variation of the specified format, but it is reserved for future use and must currently always be SNDRV_PCM_SUBFORMAT_STD.
SNDRV_PCM_HW_PARAM_SAMPLE_BITS: The number of bits per sample. This refers to what ALSA calls physical bits, which is the effective number of bits the sample will take when stored in memory, including padding. E.g. both the SNDRV_PCM_FORMAT_S24_{LE,BE} and SNDRV_PCM_FORMAT_S24_3{LE,BE} formats are 24-bit formats, but the former take up 32 (physical) bits in memory while the latter only take up 24 bits.
SNDRV_PCM_HW_PARAM_FRAME_BITS: Similar to SAMPLE_BITS, but the number of bits for a full frame.
SNDRV_PCM_HW_PARAM_CHANNELS: The number of channels per frame. E.g. 1 for mono, 2 for stereo.
SNDRV_PCM_HW_PARAM_RATE: The samplerate of the playback or capture audio signal.
SNDRV_PCM_HW_PARAM_PERIOD_TIME: The length of one period in microseconds.
SNDRV_PCM_HW_PARAM_PERIOD_SIZE: The length of one period in frames.
SNDRV_PCM_HW_PARAM_PERIOD_BYTES: The length of one period in bytes.
SNDRV_PCM_HW_PARAM_PERIODS: The number of periods in the buffer.
SNDRV_PCM_HW_PARAM_BUFFER_TIME: The length of the buffer in microseconds.
SNDRV_PCM_HW_PARAM_BUFFER_SIZE: The length of the buffer in frames.
SNDRV_PCM_HW_PARAM_BUFFER_BYTES: The length of the buffer in bytes.
SNDRV_PCM_HW_PARAM_TICK_TIME: This parameter is deprecated and no longer used, it should be ignored for all practical use cases. The only reason it is still part of the API definition is to satisfy the stable ABI requirement.

Some of these parameters are obviously interlinked, e.g. the PERIOD_TIME, PERIOD_SIZE and PERIOD_BYTES refer to the same value, just in different units. Similarly BUFFER_TIME and PERIOD_TIME are linked by the number of PERIODS. For these kinds of interlinked parameters ALSA has some built-in rules that are always installed that will make sure that when one of the parameters or the constraints for the parameter are updated the others will be updated as well.

E.g. imagine a driver that sets a constraint that it can support periods of a size from 1000 to 100000 bytes and can support between 2 and 16 periods. Even though the driver does not set an explicit constraint for the size of the buffer, due to the built-in rules the buffer size will have a constraint from 2000 to 1600000 bytes. Similarly if an application decides to select 4 periods, but does not yet decide on the buffer size or the period size, the buffer size constraint will automatically be updated to 4000 to 400000 bytes.

Device Driver Constraint Handling

As explained in the previous section, the application and the driver typically go through a negotiation process. This negotiation process can start as soon as a PCM device has been opened. Hence drivers must set up all their constraints in their open callback, which is called when the PCM device is opened and before the application is notified that the device was opened successfully.

It is possible to set up multiple constraints for the same parameter. In this case the result will always be the intersection of all the constraints. E.g. if one constraint specifies that the number of channels must be an even number and another constraint specifies that the number of channels must be between 1 and 8, the final constraint will be that the number of channels must be either 2, 4, 6 or 8.
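The channel example can be sketched in userspace C as two constraints that are applied repeatedly until neither of them changes the range any more. The names struct range, apply_minmax(), apply_step() and constrain_channels() are hypothetical; the sketch only models the intersection behaviour and assumes values small enough that the rounding does not overflow.

```c
#include <stdbool.h>

/* Hypothetical closed integer range; empty when min > max. */
struct range {
    unsigned int min, max;
};

/* Clamp the range to [lo, hi]. Returns true if the range changed. */
static bool apply_minmax(struct range *r, unsigned int lo, unsigned int hi)
{
    bool changed = false;

    if (r->min < lo) {
        r->min = lo;
        changed = true;
    }
    if (r->max > hi) {
        r->max = hi;
        changed = true;
    }
    return changed;
}

/* Round both limits inward to multiples of 'step' (step > 0). */
static bool apply_step(struct range *r, unsigned int step)
{
    unsigned int lo = (r->min + step - 1) / step * step; /* round up */
    unsigned int hi = r->max / step * step;              /* round down */
    bool changed = lo != r->min || hi != r->max;

    r->min = lo;
    r->max = hi;
    return changed;
}

/* Even number of channels AND between 1 and 8 channels. */
static bool constrain_channels(struct range *r)
{
    bool changed;

    do {
        changed = apply_step(r, 2);
        changed |= apply_minmax(r, 1, 8);
    } while (changed && r->min <= r->max);

    return r->min <= r->max;
}
```

Starting from a range of 1 to 1000 channels the result is the range 2 to 8. The range alone does not exclude the odd values; it is the step constraint staying in force that limits the valid values to 2, 4, 6 and 8.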

Simple Constraints

Many devices have rather simple constraints for most of their parameters. For specifying these constraints the ALSA core provides the snd_pcm_hardware struct. Each PCM runtime has one of these structs embedded, and it can be initialized by the driver in the driver's open callback.

struct snd_pcm_hardware {
    unsigned int info;      /* SNDRV_PCM_INFO_* */
    u64 formats;            /* SNDRV_PCM_FMTBIT_* */
    unsigned int rates;     /* SNDRV_PCM_RATE_* */
    unsigned int rate_min;      /* min rate */
    unsigned int rate_max;      /* max rate */
    unsigned int channels_min;  /* min channels */
    unsigned int channels_max;  /* max channels */
    size_t buffer_bytes_max;    /* max buffer size */
    size_t period_bytes_min;    /* min period size */
    size_t period_bytes_max;    /* max period size */
    unsigned int periods_min;   /* min # of periods */
    unsigned int periods_max;   /* max # of periods */
    size_t fifo_size;       /* fifo size in bytes */
};

Static and Dynamic Constraints

Constraints that are set up by drivers can be grouped into two different categories: static constraints and dynamic constraints. Most of the simple constraints mentioned above are static constraints.

For dynamic constraints it is possible to specify dependencies on one or more parameters. If any of the dependencies change, the constraint will be re-evaluated. Dynamic constraints can be used to model constraints that depend on other parameters. They can also be used to model constraints that cannot be fully expressed using intervals and masks.

Built-in Constraints

By default the ALSA core already installs a set of built-in constraints that will always be present for each runtime. These constraints mainly take care of enforcing some … and make sure that interlinked parameters stay synchronized.

The following list gives an overview of all the existing built-in dynamic constraints.

FORMAT = List of formats that have SAMPLE_BITS physical bits
SAMPLE_BITS = Set of physical bits that are valid for FORMAT
SAMPLE_BITS = FRAME_BITS / CHANNELS
FRAME_BITS = SAMPLE_BITS * CHANNELS
CHANNELS = FRAME_BITS / SAMPLE_BITS
FRAME_BITS = PERIOD_BYTES * 8 / PERIOD_SIZE
FRAME_BITS = BUFFER_BYTES * 8 / BUFFER_SIZE
RATE = PERIOD_SIZE * 1000000 / PERIOD_TIME
RATE = BUFFER_SIZE * 1000000 / BUFFER_TIME
PERIODS = BUFFER_SIZE / PERIOD_SIZE
PERIOD_SIZE = BUFFER_SIZE / PERIODS
PERIOD_SIZE = PERIOD_BYTES * 8 / FRAME_BITS
PERIOD_SIZE = PERIOD_TIME * RATE / 1000000
BUFFER_SIZE = PERIOD_SIZE * PERIODS
BUFFER_SIZE = BUFFER_BYTES * 8 / FRAME_BITS
BUFFER_SIZE = BUFFER_TIME * RATE / 1000000
PERIOD_BYTES = PERIOD_SIZE * FRAME_BITS / 8
BUFFER_BYTES = BUFFER_SIZE * FRAME_BITS / 8
PERIOD_TIME = PERIOD_SIZE * 1000000 / RATE
BUFFER_TIME = BUFFER_SIZE * 1000000 / RATE

Note that all these rules work on intervals and masks rather than just a single value, which means they follow the rules for multiplying and dividing intervals. The result of such an operation is the interval that contains all possible values that can result from applying the operation to any two values that are in the operand intervals. So for multiplication this is from a.min * b.min to a.max * b.max and for division this is from a.min / b.max to a.max / b.min.
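The interval arithmetic can be sketched with closed integer ranges. This is a simplified model with hypothetical names (struct range, range_mul(), range_div()): it ignores open limits, and for division it rounds the lower limit down and the upper limit up so that the result still contains every possible quotient.

```c
/* Hypothetical closed range of non-negative integers. */
struct range {
    unsigned long long min, max;
};

/* a * b: from a.min * b.min to a.max * b.max. */
static struct range range_mul(struct range a, struct range b)
{
    struct range r = { a.min * b.min, a.max * b.max };

    return r;
}

/*
 * a / b: from a.min / b.max to a.max / b.min.
 * Assumes b.min > 0. The lower limit is rounded down and the upper
 * limit is rounded up, so no possible quotient is lost.
 */
static struct range range_div(struct range a, struct range b)
{
    struct range r;

    r.min = a.min / b.max;
    r.max = (a.max + b.min - 1) / b.min;
    return r;
}
```

Using the period example from earlier in this document, multiplying a period size of 1000 to 100000 bytes by 2 to 16 periods gives a buffer size of 2000 to 1600000 bytes, and multiplying it by exactly 4 periods gives 4000 to 400000 bytes.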

Additionally the core also installs static constraints to make sure that CHANNELS, BUFFER_SIZE, BUFFER_BYTES, SAMPLE_BITS and FRAME_BITS are integer numbers.

Standard Constraints

The ALSA core provides a set of standard helper functions that can be used to set common types of constraints. All these functions take a PCM runtime as their first parameter. This is the runtime to which the constraint is applied. Most of the functions also take a snd_pcm_hw_param_t which specifies the parameter to which the constraint should be applied. Some functions, if they add a dynamic constraint, additionally allow specifying the condition under which the constraint is valid.

Mask

int snd_pcm_hw_constraint_mask(struct snd_pcm_runtime *runtime, snd_pcm_hw_param_t var,
                               u_int32_t mask);
int snd_pcm_hw_constraint_mask64(struct snd_pcm_runtime *runtime, snd_pcm_hw_param_t var,
                                 u_int64_t mask);

Minimum and Maximum

int snd_pcm_hw_constraint_minmax(struct snd_pcm_runtime *runtime, snd_pcm_hw_param_t var,
                                 unsigned int minimum, unsigned int maximum);

Setting a minimum-maximum constraint will ensure that only values that are greater than or equal to the minimum and less than or equal to the maximum will be considered a valid configuration.

A commonly used idiom is to set both minimum and maximum to the same value to constrain the parameter to a single valid value.

ret = snd_pcm_hw_constraint_minmax(runtime, SNDRV_PCM_HW_PARAM_CHANNELS, 2, 2);
if (ret < 0)
    ...

Integer

int snd_pcm_hw_constraint_integer(struct snd_pcm_runtime *runtime, snd_pcm_hw_param_t var);

Setting an integer constraint on a parameter will make sure that only integer values will be considered a valid configuration.

For example, a driver (or the hardware) often requires that the buffer size is a multiple of the period size. This can be achieved by installing an integer constraint for the number of periods.

ret = snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
if (ret < 0)
    ...

List

struct snd_pcm_hw_constraint_list {
        unsigned int count;
        const unsigned int *list;
        unsigned int mask;
};

int snd_pcm_hw_constraint_list(struct snd_pcm_runtime *runtime,
                               unsigned int cond, snd_pcm_hw_param_t var,
                               const struct snd_pcm_hw_constraint_list *l);

A list constraint restricts the valid configuration values to a discrete set of values.
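The effect of a list constraint on an interval can be sketched as follows. This is a userspace model with hypothetical names (struct range, range_refine_list()), not the kernel implementation: the limits of the range are narrowed to the smallest and the largest list entry that still lie within the range.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical closed integer range. */
struct range {
    unsigned int min, max;
};

/*
 * Narrow 'r' to the smallest and largest list entries inside it.
 * Returns false and leaves 'r' untouched if no entry lies within 'r'.
 */
static bool range_refine_list(struct range *r, const unsigned int *list,
                              size_t count)
{
    bool found = false;
    unsigned int lo = 0, hi = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        unsigned int v = list[i];

        if (v < r->min || v > r->max)
            continue;
        if (!found || v < lo)
            lo = v;
        if (!found || v > hi)
            hi = v;
        found = true;
    }
    if (!found)
        return false;

    r->min = lo;
    r->max = hi;
    return true;
}
```

For a list of 8000, 16000, 44100 and 48000 Hz, a requested range of 10000 Hz to 46000 Hz is narrowed to 16000 Hz to 44100 Hz, while a range of 20000 Hz to 30000 Hz contains no list entry and therefore has no valid configuration.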

Variable Denominator

struct snd_ratnum {
        unsigned int num;
        unsigned int den_min, den_max, den_step;
};

struct snd_pcm_hw_constraint_ratnums {
        int nrats;
        struct snd_ratnum *rats;
};

int snd_pcm_hw_constraint_ratnums(struct snd_pcm_runtime *runtime,
                                  unsigned int cond, snd_pcm_hw_param_t var,
                                  struct snd_pcm_hw_constraint_ratnums *r);

This constraint can be used if the valid values can be expressed as a fraction with a constant numerator and a variable denominator.

This is for example useful where the samplerate is derived from a static reference clock with a programmable divider. The following example shows a system with two selectable reference clocks, one running at 12.288 MHz and the other at 11.2896 MHz, and a programmable divider in the range of 128 to 1024.

static struct snd_ratnum rats[] = {
    {
        .num = 12288000,
        .den_min = 128,
        .den_max = 1024,
        .den_step = 1,
    }, {
        .num = 11289600,
        .den_min = 128,
        .den_max = 1024,
        .den_step = 1,
    },
};

static struct snd_pcm_hw_constraint_ratnums ratnums = {
    .nrats = ARRAY_SIZE(rats),
    .rats = rats,
};

ret = snd_pcm_hw_constraint_ratnums(runtime, 0, SNDRV_PCM_HW_PARAM_RATE, &ratnums);
if (ret < 0)
    ...
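To see which samplerates such a setup can produce exactly, the fraction can be evaluated in a small userspace sketch. The struct ratnum below mirrors the fields of snd_ratnum, but the helper functions ratnum_matches() and rate_supported() are hypothetical and only test a single rate for an exact match; a divider step of 1 is assumed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace copy of the fields of snd_ratnum, for illustration. */
struct ratnum {
    unsigned int num;
    unsigned int den_min, den_max, den_step;
};

/* Two reference clocks, 12.288 MHz and 11.2896 MHz, divider 128..1024. */
static const struct ratnum example_rats[] = {
    { .num = 12288000, .den_min = 128, .den_max = 1024, .den_step = 1 },
    { .num = 11289600, .den_min = 128, .den_max = 1024, .den_step = 1 },
};

/* True if rate == num / den for a denominator allowed by 'r'. */
static bool ratnum_matches(const struct ratnum *r, unsigned int rate)
{
    unsigned int den;

    if (rate == 0 || r->den_step == 0 || r->num % rate != 0)
        return false;
    den = r->num / rate;
    if (den < r->den_min || den > r->den_max)
        return false;
    return (den - r->den_min) % r->den_step == 0;
}

/* True if any of the example reference clocks can produce 'rate'. */
static bool rate_supported(unsigned int rate)
{
    size_t i;

    for (i = 0; i < sizeof(example_rats) / sizeof(example_rats[0]); i++) {
        if (ratnum_matches(&example_rats[i], rate))
            return true;
    }
    return false;
}
```

With these values 48 kHz is reached from the 12.288 MHz clock with a divider of 256 and 44.1 kHz from the 11.2896 MHz clock with a divider of 256, whereas 192 kHz would need a divider of 64, which is below the supported minimum.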

Variable Numerator

struct snd_ratden {
        unsigned int num_min, num_max, num_step;
        unsigned int den;
};

struct snd_pcm_hw_constraint_ratdens {
        int nrats;
        struct snd_ratden *rats;
};

int snd_pcm_hw_constraint_ratdens(struct snd_pcm_runtime *runtime,
                                  unsigned int cond, snd_pcm_hw_param_t var,
                                  struct snd_pcm_hw_constraint_ratdens *r);

Most Significant Bits

int snd_pcm_hw_constraint_msbits(struct snd_pcm_runtime *runtime,
                                 unsigned int cond, unsigned int width, unsigned int msbits);

A most significant bits constraint can be used to set the number of most significant bits depending on the bit width of the selected sample format. This rule is only applied if the format parameter has been narrowed down to a single format.

Step Size

int snd_pcm_hw_constraint_step(struct snd_pcm_runtime *runtime,
                               unsigned int cond, snd_pcm_hw_param_t var,
                               unsigned long step);

A step size constraint ensures that any valid configuration value is a multiple of the specified step size.

E.g. to specify that a driver only supports an even number of channels the following constraint can be used.

ret = snd_pcm_hw_constraint_step(runtime, 0, SNDRV_PCM_HW_PARAM_CHANNELS, 2);
if (ret < 0)
    ...

Power of Two

int snd_pcm_hw_constraint_pow2(struct snd_pcm_runtime *runtime,
                               unsigned int cond, snd_pcm_hw_param_t var);

A power of two constraint ensures that any valid configuration value is a power of two.

No Resample

int snd_pcm_hw_rule_noresample(struct snd_pcm_runtime *runtime,
                               unsigned int base_rate);

This constraint is mostly interesting for devices that run the ADC or the DAC at a fixed samplerate but have a hardware (asynchronous) samplerate converter to support applications that request playback or capture at a different samplerate. This constraint will become active if the application requests that no samplerate conversion is performed by the hardware, in which case the available rates will be restricted to the specified rate.

Custom Dynamic Constraints

Most drivers will be able to properly express their constraints using either static constraints or the dynamic constraint functions that are provided by the ALSA core. For the remaining cases a driver can install a custom rule using snd_pcm_hw_rule_add().

int snd_pcm_hw_rule_add(struct snd_pcm_runtime *runtime, unsigned int cond,
                        int var, snd_pcm_hw_rule_func_t func, void *private, int dep, ...);

When adding a custom rule for one parameter that has a dependency on another parameter it is important to also add a rule for the inverse constraint. E.g. if the hardware supports both 16 kHz and 48 kHz samplerates and at 16 kHz supports both 32-bit and 16-bit samples, but at 48 kHz only supports 16-bit samples, two rules should be added. The first rule, targeting the RATE parameter, should restrict the rate to 16 kHz if the current format selection does not contain a 16-bit format. The second rule, targeting the FORMAT parameter, should restrict the available formats to 16-bit formats if the current samplerate selection does not contain 16 kHz.
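The interplay of the two rules can be sketched as a userspace model. Everything in it is hypothetical (struct params, the FMT_* bits, rule_rate(), rule_format() and params_refine()); real rule functions operate on struct snd_pcm_hw_params and are registered with snd_pcm_hw_rule_add(). The sketch only shows why both directions are needed and that the rules are re-applied until nothing changes any more.

```c
#include <stdbool.h>

/* Hypothetical format bits, for illustration only. */
#define FMT_S16 (1u << 0)
#define FMT_S32 (1u << 1)

/* Hypothetical parameter set: a rate range and a format mask. */
struct params {
    unsigned int rate_min, rate_max; /* Hz */
    unsigned int formats;            /* mask of FMT_* bits */
};

/* Rule targeting RATE: without a 16-bit format only 16 kHz is possible. */
static bool rule_rate(struct params *p)
{
    bool changed = false;

    if (p->formats & FMT_S16)
        return false;
    if (p->rate_min < 16000) {
        p->rate_min = 16000;
        changed = true;
    }
    if (p->rate_max > 16000) {
        p->rate_max = 16000;
        changed = true;
    }
    return changed;
}

/* Rule targeting FORMAT: without 16 kHz only 16-bit formats are possible. */
static bool rule_format(struct params *p)
{
    unsigned int old = p->formats;

    if (p->rate_min <= 16000 && p->rate_max >= 16000)
        return false;
    p->formats &= FMT_S16;
    return p->formats != old;
}

static bool params_valid(const struct params *p)
{
    return p->rate_min <= p->rate_max && p->formats != 0;
}

/* Apply both rules until neither changes the parameters any more. */
static bool params_refine(struct params *p)
{
    bool changed;

    do {
        changed = rule_rate(p);
        changed |= rule_format(p);
    } while (changed && params_valid(p));

    return params_valid(p);
}
```

If only the FORMAT rule existed, an application that first selects the 32-bit format would still see a rate range of 16 kHz to 48 kHz and could go on to pick 48 kHz, a combination the hardware does not support. With both rules in place the rate is narrowed to 16 kHz as soon as the 32-bit format is selected.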

Constraint Handling for ASoC drivers

Constraint handling in ASoC drivers follows the same principles as constraint handling in regular drivers. But a few things have to be considered to make it work correctly and seamlessly. An ASoC sound card is made up of different components, each component having its own driver.

struct snd_soc_pcm_stream {
    const char *stream_name;
    u64 formats;            /* SNDRV_PCM_FMTBIT_* */
    unsigned int rates;     /* SNDRV_PCM_RATE_* */
    unsigned int rate_min;      /* min rate */
    unsigned int rate_max;      /* max rate */
    unsigned int channels_min;  /* min channels */
    unsigned int channels_max;  /* max channels */
    unsigned int sig_bits;      /* number of bits of content */
};

struct snd_soc_dai_driver {
    [...]
    /* DAI capabilities */
    struct snd_soc_pcm_stream capture;
    struct snd_soc_pcm_stream playback;
    unsigned int symmetric_rates:1;
    unsigned int symmetric_channels:1;
    unsigned int symmetric_samplebits:1;
    [...]
};