Using portable, multi-OS sound systems

Author: Christian Vincenot

In the first part of this article, we talked about several different sound systems and APIs that were available for Linux and its various desktop environments. Many developers, however, need to write applications that will work across multiple environments, including different operating systems. Here’s how we can accomplish this.

The way to achieve multi-system portability is to add one more layer on top of the existing ones and to program against that higher-level API, which detects which sound systems are available and uses them. This abstraction library makes the developer’s life easier and keeps the lower levels completely invisible to the program. A few projects have tried to implement this idea:

  • libao — This library is available on most Linux distributions and works with many sound systems, ranked by priority: 30 for ALSA; 20 for OSS, Irix, and Sun; 10 for esd and aRTs; 0 for NULL and the file outputs. The back end tries the sound system with the highest priority first and works down the list until one is found.

    The only real problems with libao are that the API lacks power and that only sound output is supported. That makes libao useful for applications that need simple sound output without having to care about the underlying sound system.

  • CSL — The Common Sound Layer seems to be an almost dead project now (its latest news dates from 2001). It supports only aRTs and OSS, but it supports them in full duplex, with more options and with latency management.
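The priority-based detection described for libao can be sketched in a few lines of C. This is only an illustration of the idea, not libao's actual code: the Backend structure, select_backend(), the stub probe functions, and the demo table are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend descriptor; names and fields are illustrative
   and are not taken from libao's source. */
typedef struct {
    const char *name;
    int priority;            /* higher values are preferred */
    int (*probe)(void);      /* returns nonzero if the backend is usable */
} Backend;

/* Return the usable backend with the highest priority, or NULL if none. */
const Backend *select_backend(const Backend *list, size_t count)
{
    assert(list != NULL || count == 0);
    const Backend *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (best != NULL && list[i].priority <= best->priority)
            continue;        /* cannot beat the current choice */
        if (list[i].probe != NULL && list[i].probe())
            best = &list[i];
    }
    return best;
}

/* Stub probes standing in for real detection code. */
static int probe_yes(void) { return 1; }
static int probe_no(void)  { return 0; }

/* Demo table using the priorities quoted above; ALSA is pretended absent. */
static const Backend demo_backends[] = {
    { "alsa", 30, probe_no  },
    { "oss",  20, probe_yes },
    { "esd",  10, probe_yes },
    { "null",  0, probe_yes },
};
```

With this demo table, select_backend(demo_backends, 4) falls back from the unavailable ALSA entry to the OSS entry, which is the behavior an application relies on when it leaves the choice to the abstraction library.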

PortAudio: the holy grail?

PortAudio is a free, powerful cross-platform audio library, released under an MIT-style license (and therefore GPL-compatible), that works on Windows, Macintosh (8, 9, X), Linux, FreeBSD, Solaris, SGI, and BeOS. The latest stable version (V18) supports an impressive number of sound systems: Windows DirectSound, Windows MME, Macintosh Sound Manager for OS 7-9 and Carbon, Core Audio for OS X, OSS, ASIO for Mac and Windows, Silicon Graphics Irix, and BeOS. It uses the same interrupt-driven (callback) method as JACK, but an extra utility called PABLIO also lets a programmer access the audio stream like a file, by writing to a FIFO that is read by the callback. Notes on compilation can be found in the PortAudio Tutorial. Here is a short example:

#include <stdio.h>
#include "portaudio.h"

static int myCallback(void *inputBuffer, void *outputBuffer,
                       unsigned long framesPerBuffer, PaTimestamp outTime, void *userData)
{
    float *out = (float *) outputBuffer;
    float *in  = (float *) inputBuffer;
    float leftInput, rightInput;
    unsigned int i;
    if (inputBuffer == NULL) return 0;

    /* Read input buffer, process data, and fill output buffer. */
    for(i=0; i<framesPerBuffer; i++)
    {
        leftInput = *in++;     	/* Get interleaved samples from input buffer. */
        rightInput = *in++;
        *out++ = leftInput;     /* L output treatment (placeholder: pass-through) */
        *out++ = rightInput;    /* R output treatment (placeholder: pass-through) */
    }
    return 0;
}

int main(void)
{
    PortAudioStream *stream;
    Pa_Initialize();
    Pa_OpenDefaultStream(
        &stream,
        2, 2,            	/* stereo input and output */
        paFloat32, 44100.0,	/* PA supports several sample formats; here, 32-bit floats */
        64, 0,          	/* 64 frames per buffer, let PA determine numBuffers */
        myCallback, NULL);
    Pa_StartStream(stream);
    Pa_Sleep(10000);    	/* Sleep for 10 seconds while processing. */
    Pa_StopStream(stream);
    Pa_CloseStream(stream);
    Pa_Terminate();
    return 0;
}
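To make the PABLIO idea more concrete, here is a minimal sketch of a FIFO that sits between file-like writes and a callback. It is not PABLIO's implementation: SampleFifo, fifo_write(), and fifo_read_into() are hypothetical names, the capacity is arbitrary, and the sketch is single-threaded, whereas a real version must be safe against the callback running concurrently with the writer.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_CAPACITY 1024   /* in samples; illustrative size */

/* Hypothetical FIFO type; it ignores the thread safety that a real
   callback-driven system needs. */
typedef struct {
    float  data[FIFO_CAPACITY];
    size_t head;             /* next slot to write */
    size_t tail;             /* next slot to read */
    size_t count;            /* samples currently stored */
} SampleFifo;

void fifo_init(SampleFifo *f) { f->head = f->tail = f->count = 0; }

/* Application side: file-like write. Returns the number of samples accepted. */
size_t fifo_write(SampleFifo *f, const float *src, size_t n)
{
    assert(f != NULL && src != NULL);
    size_t written = 0;
    while (written < n && f->count < FIFO_CAPACITY) {
        f->data[f->head] = src[written++];
        f->head = (f->head + 1) % FIFO_CAPACITY;
        f->count++;
    }
    return written;
}

/* Callback side: fill the output buffer, padding with silence on underrun.
   Returns the number of real samples delivered. */
size_t fifo_read_into(SampleFifo *f, float *out, size_t n)
{
    assert(f != NULL && out != NULL);
    size_t delivered = 0;
    for (size_t i = 0; i < n; i++) {
        if (f->count > 0) {
            out[i] = f->data[f->tail];
            f->tail = (f->tail + 1) % FIFO_CAPACITY;
            f->count--;
            delivered++;
        } else {
            out[i] = 0.0f;   /* silence */
        }
    }
    return delivered;
}
```

In this arrangement the application would call fifo_write() wherever it would otherwise write to a device file, and the audio callback would call fifo_read_into() on its output buffer, so the program keeps a blocking, file-like style while the library stays callback-driven underneath.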

The underlying systems are completely invisible to the programmer, and the API is simple yet capable. PortAudio looks nearly perfect, with one catch: what about ALSA or JACK support? The latest stable version, V18, was released in 2001. Since then, the PortAudio developers have been writing a brand new support library and API (version 2) with many improvements, including support for ALSA and JACK. This development version, V19, is already usable and its API is frozen. Even though many things are not yet fully working and some V18 back ends (especially the Macintosh ones) have not been carried over yet, you can already use it in your applications.

Summary table
System    | Type      | Multiplexing/full duplex | API           | Design             | Latency           | Portability                     | Other good/bad points
----------+-----------+--------------------------+---------------+--------------------+-------------------+---------------------------------+------------------------
OSS/Free  | Driver    | None                     | -power-simple | Device file access | Low, blocking I/O | Unix                            | Deprecated
ALSA      | Driver    | Potential SBSM and FD    | Powerful      | File-like/Callback | Low               | Linux                           |
ESD       | SS        | SBSM and FD              | Simple        | File-like          | Medium            | E/GNOME                         | N.T., popular
aRTs      | SS        | SBSM and FD (buggy?)     | Simple        | File-like          | High              | KDE                             | N.T., good sound
JACK      | SS        | SBSM and FD              | Powerful      | Callback           | Low               | POSIX                           | So much to say!
libao     | Abs. Lib. | Output only              | Simple        | File-like          |                   | OSS, ALSA, esd, NAS, and Unixes | No JACK or Win
PortAudio | Abs. Lib. | SBSM and FD              | Powerful      | File-like/Callback |                   | Too many                        | No ALSA or JACK
PA V19    | Abs. Lib. | SBSM and FD              | Powerful      | File-like/Callback |                   | Like V18 plus ALSA and JACK     | Designed but unfinished

Conclusion

My goal in these articles was to give an overview of the audio systems available to programmers who wish to use sound in their applications. I’ve omitted some sound systems that were too specific (NAS and OpenAL by Loki Software) or that are part of a bigger project whose linking into the program would make the binary too heavy (SDL and GStreamer).

I think that ALSA and JACK are the future of sound under Linux, and PortAudio V19 will certainly be a safe choice for programmers seeking compatibility with other systems. That’s why I suggest programmers use the power of these systems unless they have specific needs that would be better served by others.

Vincenot has been a Linux user for eight years, and is currently a student at University Louis Pasteur in Strasbourg.