Author: Christian Vincenot
To get multi-system portability, the idea is to add one more layer on top of the existing ones and to make calls to that higher-level API, which detects the available sound systems and uses them. Such an abstraction library makes the developer’s life easier and hides the lower levels from the program completely. A few projects have tried to implement this idea:
- libao: This library is available on most Linux distributions and works with many sound systems, ranked by priority: 30 for ALSA; 20 for OSS, Irix, and Sun; 10 for esd and aRTs; 0 for the NULL and file outputs. The back end tries the sound system with the highest priority first and works its way down until one is found.
The only real problems with libao are that the API lacks power and that only sound output is supported. That makes libao useful for applications that need simple sound output without having to care about the sound system.
- CSL: The Common Sound Layer seems to be an almost dead project now (its latest news dates from 2001). It supports only aRTs and OSS, but it supports them in full duplex, with more options and with latency management.
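The priority-based fallback described for libao can be pictured with a small sketch. This is not libao's code: the `backend_t` type, the `probe` function pointers, and `pick_backend()` are hypothetical names made up for illustration, and only the priority values come from the list above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical back-end descriptor. The priorities follow the article's
   list; probe() stands in for a real availability check. */
typedef struct {
    const char *name;
    int priority;
    int (*probe)(void);   /* returns nonzero if the back end is usable */
} backend_t;

static int probe_unavailable(void) { return 0; }
static int probe_available(void)   { return 1; }

/* Return the usable back end with the highest priority, or NULL if
   none of the probes succeeds. */
static const backend_t *pick_backend(const backend_t *list, size_t count)
{
    const backend_t *best = NULL;
    size_t i;

    assert(list != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (!list[i].probe())
            continue;                     /* not present on this machine */
        if (best == NULL || list[i].priority > best->priority)
            best = &list[i];
    }
    return best;
}

/* Example table: pretend ALSA is missing but OSS and esd are present. */
static const backend_t example_backends[] = {
    { "alsa", 30, probe_unavailable },
    { "oss",  20, probe_available   },
    { "esd",  10, probe_available   },
    { "null",  0, probe_available   },
};
```

With this example table, `pick_backend(example_backends, 4)` returns the "oss" entry, because the higher-priority ALSA probe fails. The application never needs to know which back end was chosen, which is the whole point of an abstraction library.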
PortAudio: the holy grail?
PortAudio is a free (under an MIT-like license, so GPL compatible), powerful, cross-platform audio library that works on Windows, Macintosh (8, 9, X), Linux, FreeBSD, Solaris, SGI, and BeOS. The latest stable version (V18) supports an impressive number of sound systems: Windows DirectSound, Windows MME, Macintosh Sound Manager for OS 7-9 and Carbon, Core Audio for OS X, OSS, ASIO for Mac and Windows, Silicon Graphics Irix, and BeOS. It works with the same interrupt-driven callback method as JACK, but an extra utility called PABLIO also lets a programmer access the audio stream like a file, by writing to a FIFO that is read by the callback. Notes on compilation can be found in the PortAudio Tutorial. Here is a short piece of code:
```c
#include <stdio.h>
#include "portaudio.h"

static int myCallback(void *inputBuffer, void *outputBuffer,
                      unsigned long framesPerBuffer,
                      PaTimestamp outTime, void *userData)
{
    float *out = (float *) outputBuffer;
    float *in = (float *) inputBuffer;
    float leftInput, rightInput;
    unsigned int i;

    if (inputBuffer == NULL)
        return 0;

    /* Read input buffer, process data, and fill output buffer. */
    for (i = 0; i < framesPerBuffer; i++) {
        leftInput = *in++;   /* Get interleaved samples from input buffer. */
        rightInput = *in++;
        *out++ = (...);      /* L output treatment */
        *out++ = (...);      /* R output treatment */
    }
    return 0;
}

int main(void)
{
    PortAudioStream *stream;

    Pa_Initialize();
    Pa_OpenDefaultStream(&stream,
                         2, 2,               /* stereo input and output */
                         paFloat32, 44100.0, /* PA can use different data types. Here 32-bit floats */
                         64, 0,              /* 64 frames per buffer, let PA determine numBuffers */
                         myCallback, NULL);
    Pa_StartStream(stream);
    Pa_Sleep(10000);   /* Sleep for 10 seconds while processing. */
    Pa_StopStream(stream);
    Pa_CloseStream(stream);
    Pa_Terminate();
    return 0;
}
```
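The FIFO idea behind PABLIO, mentioned above, can be sketched without PortAudio at all. The following is only an illustration under stated assumptions: `fifo_t`, `fifo_write()`, `fifo_callback()`, and the capacity are hypothetical and are not PABLIO's API, and the code is single-threaded, whereas a real implementation has to be safe when the callback runs in another thread or interrupt context.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_CAPACITY 256   /* in samples; illustrative size only */

/* Hypothetical single-threaded FIFO of float samples. */
typedef struct {
    float  data[FIFO_CAPACITY];
    size_t head;    /* next slot to read */
    size_t count;   /* samples currently stored */
} fifo_t;

/* Application side: "write" samples as if to a file.
   Returns how many samples were accepted before the FIFO filled up. */
static size_t fifo_write(fifo_t *f, const float *src, size_t n)
{
    size_t written = 0;

    assert(f != NULL && src != NULL);
    while (written < n && f->count < FIFO_CAPACITY) {
        size_t tail = (f->head + f->count) % FIFO_CAPACITY;
        f->data[tail] = src[written++];
        f->count++;
    }
    return written;
}

/* Callback side: shaped like the output half of an audio callback.
   Drains the FIFO into an interleaved stereo buffer and pads with
   silence when the FIFO runs dry. Returns 0 to keep the stream going. */
static int fifo_callback(void *outputBuffer, unsigned long framesPerBuffer,
                         void *userData)
{
    float *out = (float *) outputBuffer;
    fifo_t *f = (fifo_t *) userData;
    unsigned long i, samples = framesPerBuffer * 2;   /* 2 channels */

    assert(out != NULL && f != NULL);
    for (i = 0; i < samples; i++) {
        if (f->count > 0) {
            out[i] = f->data[f->head];
            f->head = (f->head + 1) % FIFO_CAPACITY;
            f->count--;
        } else {
            out[i] = 0.0f;   /* underrun: output silence */
        }
    }
    return 0;
}
```

The application only ever calls something like `fifo_write()`, which feels like writing to a file, while the audio system keeps calling the callback at its own pace. The FIFO decouples the two sides, at the cost of some extra latency.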
The underlying systems are completely invisible to the programmer, and the API is quite simple yet very capable. PortAudio really looks perfect, but there’s one catch: what about ALSA or JACK support? The latest stable version, V18, was released in 2001. Since then, the PortAudio developers have decided to write a brand new support library and API (version 2) with many improvements, among them support for ALSA and JACK. This development version, V19, is already usable and its API is frozen. So, even though many things are not fully working yet and some V18 back ends have not been ported (especially the Mac ones), you can already use it in your applications.
System | Type | Multiplexing/full duplex | API | Design | Latency | Portability | Other good/bad points |
OSS/Free | Driver | None | Simple, lacks power | Device file access | Low, blocking I/O | Unix | Deprecated |
ALSA | Driver | Potential SBSM and FD | Powerful | File-like/Callback | Low | Linux | |
ESD | SS | SBSM and FD | Simple | File-like | Medium | E/GNOME | N.T., popular |
aRTs | SS | SBSM and FD (buggy?) | Simple | File-like | High | KDE | N.T., good sound |
JACK | SS | SBSM and FD | Powerful | Callback | Low | POSIX | So much to say! |
libao | Abs. Lib. | Output only | Simple | File-like | | OSS, ALSA, esd, NAS, and Unixes | No JACK or Win |
PortAudio | Abs. Lib. | SBSM and FD | Powerful | File-like/Callback | | Too many to list | No ALSA or JACK |
PA V19 | Abs. Lib. | SBSM and FD | Powerful | File-like/Callback | | Like V18 plus ALSA and JACK | Designed but unfinished |
Conclusion
My goal in these articles was to give an overview of the audio systems available to programmers who wish to use sound in their applications. I’ve omitted some sound systems that were too specific (NAS, and OpenAL by Loki Software) or that are part of a bigger project whose linking into the program would make the binary too heavy (SDL and GStreamer).
I think that ALSA and JACK are the future of sound under Linux, and PortAudio V19 will certainly be a safe choice for programmers seeking compatibility with other systems. That’s why I suggest programmers use the power of these systems unless they have specific needs that would be better served by others.
Vincenot has been a Linux user for eight years, and is currently a student at University Louis Pasteur in Strasbourg.