Before starting audio production, it's helpful to decide how a title will use audio and where the audio files will reside. This section discusses the following choices.
Type of audio: samples (AIFF files) for music and sound effects, or MIDI scores with either samples or synthetic instruments attached to the MIDI notes.
Run-time location of files: RAM-resident or CD-ROM-resident.
Playback for CD-ROM-resident sounds: spooled or streamed from disc.
RAM budget: stereo or mono, sample rate, and/or 2:1 compression.
RAM-resident vs. ROM-resident sound
First decide whether the title itself will be RAM-resident or will use data streaming.
A RAM-resident title places all necessary code within the 3 MB (actually 2.2 MB) of available system RAM during any run-time segment. RAM-resident titles leave the CD-ROM available for accessing audio through spooling, streaming, or loading.
A title that uses data streaming reads most of its data and code from the CD-ROM on a real-time, interactive basis. In this case, you cannot spool an audio sample file, but you can include audio as part of the data stream.
Since seek times (the time it takes to find the sound on the CD-ROM) limit how interactive sound can be, RAM-resident sounds have a distinct advantage. It is possible, however, to have multiple audio streams and dynamically fade, pan, and mix them according to what's going on in the title.
Spooled vs. streamed sound
Spooled sound comes directly from a file on disc. A program can play short individual sound effects like an explosion or a door opening. Background music can be spooled in using a separate task.
Streamed sound is played using functions from the 3DO DataStreamer library. The advantage of streamed audio is that it lets a title use audio sample files even though the title is not RAM-resident. Note that video and audio data together cannot be streamed at more than 300 KB/sec.
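The 300 KB/sec ceiling can be turned into a quick budget check. The following is only a rough sketch, not part of the DataStreamer library: the function names are hypothetical, and it assumes that "KB" means 1024 bytes and that samples are 16-bit.

```python
# Assumption: "KB" means 1024 bytes and samples are 16-bit (2 bytes).
KB = 1024
STREAM_LIMIT = 300 * KB  # combined audio + video ceiling from the text, in bytes/sec


def audio_bytes_per_sec(sample_rate_hz, channels, bytes_per_sample=2, compression_ratio=1):
    """Data rate of one audio stream in bytes per second."""
    return sample_rate_hz * channels * bytes_per_sample // compression_ratio


def remaining_for_video(audio_rates):
    """Bytes/sec left for video after the given audio streams.

    A negative result means the audio alone exceeds the stream limit.
    """
    return STREAM_LIMIT - sum(audio_rates)


cd_quality = audio_bytes_per_sec(44100, 2)
print(cd_quality, remaining_for_video([cd_quality]))
```

Under these assumptions, one uncompressed 44.1 kHz stereo stream uses more than half of the available bandwidth, and two such streams would not fit at all.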
For more information
The 3DO Jumpstart for Programmers provides examples for playing a sound effect and for spooling sound with the Sound Spooler.
The 3DO DataStreamer Programmer's Guide discusses the DataStreamer.
Sound file characteristics
Decisions about where sound will reside and how it will be played influence decisions about the characteristics of the sound file you generate. There's always a trade-off between file size and sound quality.
Stereo vs. mono
An early consideration should be whether to use stereo or mono sounds. Remember that stereo sound at a 44.1 kHz sample rate requires about 10 MB per minute. If mono is adequate, the size is cut in half. Certain sound effects require stereo (panning, flanging, tennis matches, and so on), but many sound effects can be mono. If CD-ROM space is available, spooled or streamed files should be stereo, since more and more television sets provide stereo sound.
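The size figures above follow from simple arithmetic. Here is a minimal sketch, assuming uncompressed 16-bit samples (the sample width that makes the 10 MB figure work out) and using a hypothetical helper name:

```python
def pcm_bytes(seconds, sample_rate_hz=44100, channels=2, bytes_per_sample=2):
    """Size in bytes of uncompressed sample data (assumes 16-bit samples by default)."""
    return int(seconds * sample_rate_hz * channels * bytes_per_sample)


stereo_minute = pcm_bytes(60)            # 44.1 kHz stereo
mono_minute = pcm_bytes(60, channels=1)  # 44.1 kHz mono

MB = 2 ** 20
print(round(stereo_minute / MB, 1), round(mono_minute / MB, 1))
```

One minute of 44.1 kHz stereo comes to 10,584,000 bytes, which matches the figure of roughly 10 MB per minute; the mono version is exactly half that.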
Sample rate
When you decide on the sample rate, remember that many televisions produce high-quality sound, so you may not be able to get by with 22 kHz samples. Take care to avoid audio aliasing: sample any given sound at no less than twice the frequency of its highest component. Human speech and other sounds with many upper partials may sound boxy at 22 kHz.
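The aliasing rule can be expressed as two small helpers. This is an illustrative sketch with hypothetical function names; it assumes "22 kHz" refers to the common 22,050 Hz rate.

```python
def highest_representable_hz(sample_rate_hz):
    """Highest frequency a given sample rate can capture without aliasing."""
    return sample_rate_hz / 2


def min_sample_rate_hz(highest_component_hz):
    """Lowest sample rate that avoids aliasing for a sound's highest component."""
    return 2 * highest_component_hz


# A 22,050 Hz sample rate cannot represent anything above 11,025 Hz.
print(highest_representable_hz(22050))
# A sound with components up to 15 kHz needs a sample rate of 30 kHz or more.
print(min_sample_rate_hz(15000))
```

Content above half the sample rate has to be filtered out before sampling, which is one reason sounds rich in upper partials lose clarity at 22 kHz.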
Compression
You can use compressed sound as long as the correct DSP instrument is attached during playback.
For 2:1 compression, use the SquashSound MPW tool, which is part of the 3DO system release, or the SoundHack tool, which is documented in this book.
For 4:1 ADPCM compression, you can also use SoundHack. Developers at 3DO have found, however, that a 22 kHz sound compressed at a 2:1 ratio is preferable to a 44.1 kHz sound compressed at a 4:1 ratio.
When you decide whether to compress your audio files, remember that there is a DSP bandwidth trade-off: compression saves RAM/ROM space, but decompression can impose a significant system resource overhead, because the DSP code required to decompress sample files takes up more DSP cycles and code space.
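It may help to see why these two options are compared directly: they occupy the same amount of space. The sketch below uses a hypothetical helper and assumes that the compression ratio is applied to uncompressed 16-bit data and that "22 kHz" means 22,050 Hz.

```python
def stored_bytes_per_sec(sample_rate_hz, compression_ratio, channels=1, bytes_per_sample=2):
    """Storage rate in bytes/sec after compression (assumes a 16-bit source)."""
    return sample_rate_hz * channels * bytes_per_sample // compression_ratio


low_rate_2to1 = stored_bytes_per_sec(22050, 2)   # 22 kHz at 2:1
high_rate_4to1 = stored_bytes_per_sec(44100, 4)  # 44.1 kHz at 4:1

print(low_rate_2to1, high_rate_4to1)
```

Both come to 22,050 bytes per second per channel under these assumptions, so the choice between them rests on sound quality and DSP cost rather than on file size.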