To write an audio application that samples, edits, or otherwise manipulates sound, the first decision you have to make is choosing which platform you want to lock yourself into. After all, even the most basic real-time audio playback functions are close to the bare metal of the operating system. If you’re going to put time and maybe money into an audio development effort, of course you want the widest swath of platforms for release. PortAudio answers the call by delivering a free, cross-platform, open-source audio I/O library. It lets you write simple audio programs in C that will compile and run on many platforms, including Windows, Mac, and Linux/Unix.
PortAudio, which provides a very simple API for recording and/or playing sound using a simple callback function, is intended to promote the exchange of audio synthesis software between developers on different platforms. It includes example programs that synthesize sine waves and pink noise, perform fuzz distortion on a guitar, list available audio devices, and much more. Carnegie Mellon University’s PortMusic project, which includes MIDI and soon will provide sound file support, recently selected PortAudio as its audio component.
Playing Musical Platforms
The PortAudio library supports an array of platforms including Windows, Linux, and Macintosh variants (see Table 1), but if you don’t have prior audio development experience, you will quickly find yourself adrift in a sea of API standards. After computer audio went mainstream with the Windows 3.1 MultiMedia Extension (MME) and the ubiquitous .WAV file back in 1991, a variety of solutions followed. First came DirectSound in the Windows 95 era, which unfortunately lacked a record capability. The Windows 2000/XP generation then introduced the fastest solution for Windows users: the Windows Driver Model.
Platform Code | Minimum PortAudio Version | Description |
---|---|---|
pablio | 19.0 | PortAudio Blocking I/O (PABLIO) |
pa_asio | 18.1 | ASIO for Windows and Macintosh |
pa_beos | 18.1 | BeOS |
pa_jack | 19.0 | JACK for Linux and OSX |
pa_linux_alsa | 19.0 | Advanced Linux Sound Architecture (ALSA) |
pa_mac_sm | 18.1 | Macintosh Sound Manager for OS 8, 9, and Carbon |
pa_mac_core | 18.1 | Macintosh Core Audio for OS X |
pa_sgi | 18.1 | Silicon Graphics AL |
pa_unix_oss | 18.1 | Open Sound System (OSS) implementation for various Unix variants |
pa_win_ds | 18.1 | Windows DirectSound |
pa_win_wdmks | 19.0 | Windows Driver Model Kernel Streaming (WDMKS) |
pa_win_wmme | 18.1 | Windows MultiMedia Extension |
Table 1. Platforms Supported by PortAudio
The latency requirements of your application should dictate your choice of API. If your sound program does not require a quick response time (close to a “live” performance), you are certainly free to use the MME or DirectSound platform. However, if you require very low latency (below 20ms response time), you will need ASIO or WDMKS. The downside of ASIO is that it usually requires proprietary drivers, which at best need end-user installation and at worst are not even available for cheaper audio systems. (For more details, refer to the SoundCard FAQ.)
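As a rough illustration of the latency figures above, the duration of a single audio buffer follows directly from its size and the sample rate. The sketch below is plain C with no PortAudio calls. The helper names `buffer_duration_ms` and `max_frames_for_budget` are made up for this example, and the result covers one buffer only, so real end-to-end latency (several buffers plus driver and hardware overhead) will be higher.

```c
#include <assert.h>

/* Time covered by one buffer of audio, in milliseconds.
   Hypothetical helper, not part of PortAudio. */
static double buffer_duration_ms(unsigned long frames, double sample_rate)
{
    assert(sample_rate > 0.0);
    return 1000.0 * (double)frames / sample_rate;
}

/* Largest whole number of frames that fits inside a latency budget.
   Hypothetical helper, not part of PortAudio. */
static unsigned long max_frames_for_budget(double budget_ms, double sample_rate)
{
    assert(budget_ms >= 0.0 && sample_rate > 0.0);
    return (unsigned long)(budget_ms * sample_rate / 1000.0);
}
```

Under these assumptions, a 64-frame buffer at 44,100 Hz lasts about 1.45 ms, and a 20 ms budget at the same rate corresponds to 882 frames.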
Getting Ready to Sound Off
To start programming with PortAudio, the first thing you need to do is go to www.portaudio.com and pick out a relevant distro. Because V18.1, the last official release, is nearing the three-year-old mark, you might as well start with a current V19 code snapshot. (An older precompiled DLL for PortAudio V17 also is available, but that’s all as of this writing.) Either way, it’s a matter of unpacking a ZIP file or tarball, because PortAudio is pretty much distributed in a source-only format.
As you might expect with any streaming interface, PortAudio supports two different programming models: a blocking API and a non-blocking API. The non-blocking API was developed first. The blocking API came later and is still unofficial. Although simple command-line type tools can use a blocking API with little impact, a modern GUI application would need to invoke a thread to manage blocking I/O calls. Otherwise, the app looks dead to both the OS and the end-user during I/O.
This article examines only the non-blocking API. A typical non-blocking PortAudio application requires the following steps:
- Write a callback function that PortAudio (PA) will call when audio processing is needed.
- Initialize the PA library and open a stream for audio I/O.
- Start the stream: PA now will call your callback function repeatedly in the background.
- Inside your callback, you can read audio data from the inputBuffer and/or write data to the outputBuffer.
- Stop the stream by returning a 1 from your callback or calling a stop function.
- Close the stream and terminate the library.
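The callback contract in the steps above can be sketched without any audio hardware. The following is a simulation only: `AudioCallback`, `SawState`, `saw_callback`, and `run_stream` are hypothetical names invented for this example, not part of PortAudio, whose real callback takes additional arguments and whose streams run in the background rather than in a loop like this one. The sketch shows a callback that writes a sawtooth into the output buffer and returns 1 once it has produced enough frames, which is the signal to stop the stream.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an audio callback signature. */
typedef int (*AudioCallback)(const float *input, float *output,
                             unsigned long frames, void *userData);

typedef struct {
    unsigned long framesLeft; /* frames still to play before stopping */
    float phase;              /* current sawtooth value in [-1, 1) */
} SawState;

/* Fill the output buffer with a sawtooth wave.
   Returns 1 when playback is finished, 0 to keep going. */
static int saw_callback(const float *input, float *output,
                        unsigned long frames, void *userData)
{
    SawState *st = (SawState *)userData;
    unsigned long i;
    (void)input; /* output-only stream: nothing to read */

    for (i = 0; i < frames; i++) {
        if (st->framesLeft == 0) {
            output[i] = 0.0f; /* pad the rest of the buffer with silence */
            continue;
        }
        output[i] = st->phase;
        st->phase += 0.01f;
        if (st->phase >= 1.0f)
            st->phase -= 2.0f; /* wrap around to form the sawtooth */
        st->framesLeft--;
    }
    return st->framesLeft == 0 ? 1 : 0;
}

/* Toy "host" standing in for the audio driver: calls the callback
   repeatedly until it returns non-zero, and reports the call count. */
static unsigned long run_stream(AudioCallback cb, void *userData,
                                unsigned long framesPerBuffer)
{
    float buffer[256];
    unsigned long calls = 0;

    assert(cb != NULL);
    assert(framesPerBuffer > 0 && framesPerBuffer <= 256);
    do {
        calls++;
    } while (cb(NULL, buffer, framesPerBuffer, userData) == 0);
    return calls;
}
```

With a state of 100 frames left and 64 frames per buffer, the toy host calls the callback twice: the first call fills a whole buffer, and the second plays the remaining 36 frames, pads with silence, and returns 1.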
Hello PortAudio, A Sample Application
Although ASIO, WDMKS, and DirectSound layers are available, the sample application discussed in this section uses the Windows MME, the lowest common denominator. First, you need to build a static library out of the following modules:
- “Common” base library
- “Win” platform library (You will disable DirectSound and ASIO for simplicity’s sake.)
- Layer-specific interface module
You do this from a command prompt by using the Visual C++ command-line tools as follows (you may want to make this a .BAT file):
    cd pa_snapshot_v19\portaudio\pa_common
    del *.lib
    copy ..\pa_win
    cl /c /DPA_NO_DS /DPA_NO_ASIO *.c
    lib /out:portaudio.lib *.obj
    cd ..\pa_win_wmme
    cl /c pa_win_wmme.c /I../pa_common
On this foundation, you can pick out a test program and link the thing together to see how it goes:
    cd ..\pa_tests
    cl patest_saw.c /I../pa_common /link ..\pa_common\portaudio.lib ..\pa_win_wmme\pa_win_wmme.obj
What you get is about five seconds of pure sawtooth wave pleasure! But, that’s not the point. You now have a platform-independent, sound-synthesizing piece of code with which you could implement any number of effects.
PortAudio comes with about four dozen test programs. Look at the guitar fuzz distortion box simulator “pa_fuzz.c” (see below) so you can rock on like Peter Frampton and Joe Walsh. Use essentially the same build command as before:
    cd ..\pa_tests
    cl pa_fuzz.c /I../pa_common /link ..\pa_common\portaudio.lib ..\pa_win_wmme\pa_win_wmme.obj

pa_fuzz.c:

    #include <stdio.h>
    #include <math.h>
    #include "portaudio.h"
    /*
    ** Note that many of the older ISA sound cards on PCs do NOT
    ** support full duplex audio (simultaneous record and playback).
    ** And some support only full duplex at lower sample rates.
    */
    #define SAMPLE_RATE (44100)
    #define PA_SAMPLE_TYPE paFloat32
    #define FRAMES_PER_BUFFER (64)

    typedef float SAMPLE;

    /* Non-linear amplifier with soft distortion curve. */
    float CubicAmplifier( float input )
    {
        float output, temp;
        if( input < 0.0 ) {
            temp = input + 1.0f;
            output = (temp * temp * temp) - 1.0f;
        } else {
            temp = input - 1.0f;
            output = (temp * temp * temp) + 1.0f;
        }
        return output;
    }
You can represent the signal in many ways with PortAudio. The most common mechanism is to use float values from -1.0 to +1.0 to represent the audio signal (paFloat32). You can also use 16-bit integers if you are more comfortable with that or some other representation. The CubicAmplifier() function simulates the distortion that an analog amplifier would produce, the mathematics of which are beyond the scope of the current discussion.
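To make the two sample formats concrete, the sketch below converts 16-bit integer samples to floats in the -1.0 to +1.0 range, runs them through the CubicAmplifier() function from the listing, and converts them back. The helpers `int16_to_float`, `float_to_int16`, and `fuzz_int16_buffer` are assumptions made for this example and are not PortAudio functions. The scale factors (32768 on the way in, 32767 on the way out) are one common convention among several.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Non-linear amplifier with soft distortion curve (as in pa_fuzz.c). */
float CubicAmplifier(float input)
{
    float output, temp;
    if (input < 0.0) {
        temp = input + 1.0f;
        output = (temp * temp * temp) - 1.0f;
    } else {
        temp = input - 1.0f;
        output = (temp * temp * temp) + 1.0f;
    }
    return output;
}

/* Hypothetical helper: map a 16-bit sample to the range [-1.0, 1.0). */
static float int16_to_float(int16_t sample)
{
    return (float)sample / 32768.0f;
}

/* Hypothetical helper: clamp to [-1.0, 1.0], then scale to 16 bits. */
static int16_t float_to_int16(float sample)
{
    assert(sample == sample); /* NaN has no meaningful integer value */
    if (sample > 1.0f)
        sample = 1.0f;
    if (sample < -1.0f)
        sample = -1.0f;
    return (int16_t)(sample * 32767.0f);
}

/* Hypothetical helper: apply the fuzz curve in place to 16-bit samples. */
static void fuzz_int16_buffer(int16_t *samples, size_t count)
{
    size_t i;
    assert(samples != NULL || count == 0);
    for (i = 0; i < count; i++) {
        float x = int16_to_float(samples[i]);
        samples[i] = float_to_int16(CubicAmplifier(x));
    }
}
```

An input sample of 16384 (0.5 as a float) comes out as 28671 (0.875), which shows how the curve boosts mid-level signals while keeping the output within range.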