Finding voice codecs for free software

Author: Nathan Willis

In a recent article about VoIP softphones, I touched on the problem of proprietary, patent-encumbered codecs. To recap, the point was that a triumph of open protocols, like Session Initiation Protocol (SIP) and Inter-Asterisk EXchange (IAX), is hollow if the marketplace standardizes on closed, proprietary codecs for delivering the voice data itself. But how do you find the good free codecs? Here are some options.

If softphone vendors adopt only proprietary codecs, Linux distributors and software vendors will be forced to either pay royalties and license fees to the patent owners, or remove support for the popular standards, leaving their users in a VoIP ghetto — much like the situation today with many distros and MP3.

But not-for-profit software projects would suffer, too, by being forced to choose between poor interoperability and running through a legal minefield. And that is assuming that the specifications were open enough to permit coding a compliant library; many commercial codecs are completely off-limits to unpaying eyes.

Clearly, an ounce of prevention is worth a pound of cure here. Most softphones can support multiple codecs. The trick is making sure free ones are promoted and adopted as widely as possible. Even closed source applications can and should support free codecs. It’s good business sense — particularly the can’t-get-cheaper-than-free licensing.
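To make the "promote free codecs" idea concrete, here is a minimal sketch of how a softphone that supports several codecs might pick one. It is not real SIP or SDP negotiation code: the function name, the lowercase codec labels, and the preference order are all hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence

# Hypothetical preference order: unencumbered codecs first, restricted ones last.
# The labels are illustrative, not identifiers taken from any real softphone.
LOCAL_PREFERENCE = ("speex", "pcmu", "ilbc", "g729")


def choose_codec(remote_offer: Sequence[str],
                 local_preference: Sequence[str] = LOCAL_PREFERENCE) -> Optional[str]:
    """Return the first locally preferred codec that the remote side also offers."""
    offered = {name.lower() for name in remote_offer}
    for codec in local_preference:
        if codec in offered:
            return codec
    return None  # no codec in common


if __name__ == "__main__":
    print(choose_codec(["G729", "PCMU"]))
    print(choose_codec(["G729", "Speex", "iLBC"]))
```

With an ordering like this, a proprietary codec is used only when the other end offers nothing freer.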

If you are not sure which voice codecs are free and which are not, you are not alone. Clear information on the legal status of many codecs is hard to come by, and what is available is scattered in a hundred different places. Here’s a rundown of what’s out there.

Free as they get

First off are the codecs that I would call completely free: free of patents, free of all licensing restrictions, and free of royalties.

Speex is a voice-specific compression format developed by Xiph.org and released under a BSD-style license. It works at 8, 16, and 32kHz sample rates over a dozen bitrates ranging from 2.2Kbps to 44.2Kbps. Speex is an official part of the GNU Project, and the copyright to the code belongs to Xiph.org, a 501(c)(3) non-profit organization established to protect open multimedia standards and software.

Sun Microsystems released implementations of several codecs into the public domain, including G.711, one of the older ITU-approved standards. G.711 is a low-delay, high-quality format that samples at 8kHz and produces a 64Kbps stream.
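For readers curious about what G.711 actually does, the sketch below shows the mu-law variant squeezing a 16-bit sample into 8 bits. It follows the widely published bias-and-segment formulation and is not Sun's code. The function names and constants are my own, and the A-law variant is left out.

```python
BIAS = 0x84   # 132, added before the segment search
CLIP = 32635  # largest magnitude that still fits after adding BIAS


def linear_to_ulaw(sample: int) -> int:
    """Encode one signed 16-bit PCM sample as an 8-bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # The segment (exponent) is set by the position of the highest set bit.
    exponent = magnitude.bit_length() - 8
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are transmitted with all bits inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(byte: int) -> int:
    """Decode one 8-bit mu-law byte back to a signed PCM value."""
    byte = ~byte & 0xFF
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if byte & 0x80 else magnitude


if __name__ == "__main__":
    for pcm in (0, 100, -100, 1000, 32767):
        code = linear_to_ulaw(pcm)
        print(pcm, hex(code), ulaw_to_linear(code))
```

Each sample becomes a single byte with no look-ahead, which is why the format adds so little delay.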

Sun also released a suite of Adaptive Differential Pulse Code Modulation (ADPCM) codecs into the public domain. They correspond to the International Telecommunication Union (ITU) G.721, G.723, G.726, and G.727 standards and function at 16, 24, 32, and 40Kbps.
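The bitrates quoted for G.711 and the ADPCM family follow from simple arithmetic: bits per sample multiplied by the sample rate. The sketch below assumes the usual 8kHz telephony sample rate, 8 bits per sample for G.711, and 2 to 5 bits per sample for ADPCM. Those per-sample figures come from the ITU standards rather than from this article, and the function name is only illustrative.

```python
SAMPLE_RATE_HZ = 8000  # assumed narrowband telephony sample rate


def bitrate_kbps(bits_per_sample: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Bitrate of a fixed-rate, sample-by-sample codec in kilobits per second."""
    return bits_per_sample * sample_rate_hz / 1000


if __name__ == "__main__":
    print("G.711:", bitrate_kbps(8), "Kbps")
    for bits in (2, 3, 4, 5):
        print(f"ADPCM at {bits} bits/sample:", bitrate_kbps(bits), "Kbps")
```

Under these assumptions, G.711 comes out at 64Kbps and the four ADPCM rates at 16, 24, 32, and 40Kbps, matching the figures quoted above.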

Mostly free

A second group comprises codecs that are usable in free software projects, but with some form of restriction.

Jack Jansen released his own, faster ADPCM implementation built on an algorithm referred to alternately as Intel/DVI or IMA ADPCM. The algorithm initially debuted in an Intel hardware device, but was later adopted as a standard by the now-defunct Interactive Multimedia Association. The accompanying license requires a copyright notice but is otherwise free of restrictions. Jansen’s implementation of ADPCM is available for download as a zip file.

Similarly, a Linear Predictive Coding (LPC) codec named OpenLPC requires a copyright notice for redistribution, although the company cited in the notice (Future Dynamics) appears to be out of business.

The US Department of Defense drafted two voice-compression proposals: a 2.4Kbps derivative of LPC called LPC-10, and a 4.8Kbps hybrid called Code-Excited Linear Prediction (CELP). They are officially known as Federal Standard 1015 and Federal Standard 1016, respectively, and are available from Tony Robinson’s comp.speech archive.
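Unlike the sample-by-sample codecs, LPC-10 and CELP work on frames of speech, so their bitrates come from bits per frame divided by the frame duration. The sketch below uses the commonly cited frame layouts (54 bits every 22.5ms for LPC-10, 144 bits every 30ms for CELP). Those numbers are not taken from this article, so check them against the standards themselves. The function name is hypothetical.

```python
def frame_bitrate_bps(bits_per_frame: int, frame_ms: float) -> float:
    """Bitrate of a frame-based codec in bits per second."""
    return bits_per_frame * 1000 / frame_ms


if __name__ == "__main__":
    # Frame sizes below are assumptions based on commonly cited figures.
    print("LPC-10 (FS-1015):", frame_bitrate_bps(54, 22.5), "bps")
    print("CELP (FS-1016):", frame_bitrate_bps(144, 30), "bps")
```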

The US government holds the copyright on this code, but has indicated in the past that it may remove it. Also, certain countries under sanction or embargo may be barred from obtaining the source. On the other hand, none of the technology is patented, so independently developed implementations are free of these legal concerns.

Ramalho G.711 Lossless (RGL) is a G.711 implementation offered by Vovida.org under the Vovida Software License (VSL). The VSL is an open source license, approved by the Open Source Initiative (OSI), but it does require a copyright notice. Vovida.org is a unit of Cisco Systems.

Barely free

Next are codecs which are patented, or otherwise closed, but are free to use under certain conditions.

Global IP Sound (GIPS) offers its Internet Low Bitrate Codec (iLBC) free of charge. That is welcome, because many programs use it, but the license terms allow GIPS to change its mind at any time, so watch out.

VoiceAge of Montreal offers object code for three of its codecs free for non-commercial usage. They are (cynically, if you ask me) named Open G.729A, Open AMR, and Open AMR-WB. To use them you must accept the restrictions in the license agreement, so read very carefully; frankly, there is not much that you are allowed to do.

Maybe free

Finally, some codecs are available under confusing or disputed license arrangements. I include these in case you wish to research them further. In the meantime, use them at your own risk.

The Low-Delay Code-Excited Linear Prediction (LD-CELP) codec, also known as G.728, is a good example. It is an enhancement of the CELP codec and was released publicly, but with no license attached. At least two versions of the code are floating around, one attributed to Alex Zatsmann of Analog Devices and one to Michael Concannon. Some sources cite Analogical Systems as the owner.

GSM 06.10 is the audio codec used by GSM mobile phone networks. A well-maintained free implementation of this codec is available, but there are disputed patent claims by Philips Electronics.

Be aware of the potential consequences when using any codec with unclear licensing terms.

Final word

Most of the sites referenced above contain more information about voice compression in general, including where to find updated implementations. Many of the free codecs date back to the early or mid-’90s, but don’t be fooled into thinking that makes them out-of-date.

Phil Frisbie has put a lot of work into his HawkVoice library, an LGPLed collection of 15 or so of these free voice codecs, rewritten to use a uniform interface. He told me:

“When GSM was created a workstation could not encode and decode a single voice stream in real-time, but now your PDA can encode and decode a dozen GSM streams at once! Open source developers should not worry that the latest IP-laden codecs are denied to them. There is STILL much life in already available open source codecs to provide any application with good speech compression.”

Lucky for us, the VoIP marketplace is nascent enough that no dominant codecs have emerged in competing systems. There is still time for free and open codecs to make a play for the current generation of softphones. As Richard Stallman said, “The practice of using the non-free codecs is one of the major obstacles that free software faces, and the only way to surmount it is for people to start pushing back.”
