Thursday, September 27, 2007

Wavetable synthesis of sound


Wave Table Synthesis
The majority of synthesizers available today use some form of sampled-sound or wavetable synthesis. The idea behind this technique is to use small sound samples of real instruments. Based on what musical note needs to be played, the original sound sample is modified or modulated digitally to create the desired sound. This is a much simpler approach, and the synthesizer can sound very much like the real instrument.


Looping and Envelope Generation
One of the primary techniques used in wavetable synthesizers to conserve sample memory is the looping of sampled sound segments. For many instrument sounds, the sound can be modeled as consisting of two major sections: the attack section and the sustain section. The attack section is the initial part of the sound, where the amplitude and the spectral characteristics of the sound may be changing very rapidly. The sustain section is the part of the sound following the attack, where the characteristics of the sound change less dynamically. Usually the sustain section can be looped for a long time, recreating the sound of a natural instrument such as a flute or violin.


The figure on the left shows a waveform with the portions that could be considered the attack and sustain sections indicated. In this example, the spectral characteristics of the waveform remain constant throughout the sustain section, while the amplitude decreases at a fairly constant rate. This is an exaggerated example; in most natural instrument sounds, both the spectral characteristics and the amplitude continue to change throughout the duration of the sound. The sustain section, if one can be identified, is the section over which the characteristics of the sound are relatively constant.


A lot of memory can be saved in wavetable synthesis systems by storing only a short segment of the sustain section of the waveform, and then looping this segment during playback. If the original sound had a fairly constant spectral content and amplitude during the sustained section, then the sound resulting from this looping operation should be a good approximation of the sustained section of the original.

For many acoustic string or wind instruments (such as the violin, flute, or saxophone), the spectral characteristics of the sound remain almost constant during the sustain section, while the amplitude (or volume) of the signal decays. This can be reproduced with a looped segment by decaying its volume by some factor over time.


A typical wavetable synthesis system would store sample data for the attack section and the looped section of an instrument sound. These sample segments might be referred to as the initial sound and the loop sound. The initial sound is played once through, and then the loop sound is played repetitively until the note ends. An envelope generator function is used to create an envelope which is appropriate for the particular instrument, and this envelope is applied to the output samples during playback.
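To make the idea concrete, here is a minimal sketch (in Python) of this playback scheme: the initial sound is played once, the loop sound is then repeated until the note ends, and the envelope gain is multiplied into each output sample. The function name and parameters are just for illustration, not taken from any particular synthesizer.

def render_note(initial, loop, envelope):
    """Play the initial segment once, then repeat the loop segment,
    applying the per-sample envelope gains until the envelope runs out.
    Assumes loop is non-empty."""
    out = []
    for n in range(len(envelope)):
        if n < len(initial):
            s = initial[n]                            # attack segment, played once
        else:
            s = loop[(n - len(initial)) % len(loop)]  # loop segment, repeated
        out.append(s * envelope[n])                   # apply envelope gain
    return out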


Playback of the sample data corresponding to the initial sound (with the attack portion of the envelope applied) begins as soon as a note starts to play, for example the moment a person presses a key on the keyboard. The length of the initial sound depends on the sample used and on what kind of instrument sound it is. The lengths of the attack and decay sections of the envelope are usually fixed for a given instrument sound. The moment the attack section finishes, the sustain section is played (while the key is still pressed). The sustain section continues to repeat the loop samples while applying the sustain slope of the envelope (which decays slowly), until the key is released. Releasing the key triggers the release portion of the envelope.
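Below is a rough sketch of such an envelope generator: the attack and decay lengths are fixed per instrument, the sustain level decays slowly for as long as the key is held, and the release ramp starts when the key is let go. All parameter names and values here are made up for illustration.

def adsr_envelope(attack, decay, sustain_level, sustain_decay, release,
                  held_samples):
    """Return a list of gains covering key-down (held_samples) plus release."""
    env = []
    level = 0.0
    for n in range(held_samples):
        if n < attack:                        # attack: ramp 0 -> 1
            level = (n + 1) / attack
        elif n < attack + decay:              # decay: ramp 1 -> sustain_level
            frac = (n - attack + 1) / decay
            level = 1.0 + frac * (sustain_level - 1.0)
        else:                                 # sustain: slow decay while key held
            level *= sustain_decay
        env.append(level)
    for n in range(release):                  # release: ramp current level -> 0
        env.append(level * (1.0 - (n + 1) / release))
    return env

# Example: 44.1 kHz sample rate, 10 ms attack, 30 ms decay, half-second key hold.
env = adsr_envelope(attack=441, decay=1323, sustain_level=0.8,
                    sustain_decay=0.99999, release=4410, held_samples=22050)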


Sustain Loop length
The sustain loop length depends on the length of the sustain segment being looped. The loop length is measured as a number of samples, and it should be equal to an integer number of periods of the fundamental pitch of the sound being played.
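As a quick sketch, the loop length in samples for a given number of whole periods can be computed from the sample rate and the fundamental frequency; the values below are only examples.

def loop_length_samples(sample_rate, fundamental_hz, periods=4):
    """Length (in samples) of `periods` whole cycles of the fundamental."""
    return round(periods * sample_rate / fundamental_hz)

# Example: 4 periods of middle C (261.63 Hz) at 44.1 kHz is about 674 samples.
print(loop_length_samples(44100, 261.63, periods=4))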
Percussive sounds

Instruments like the violin, flute, or other strings have sustain sections because they are capable of playing the same note continuously for long durations. Percussive sounds like drums or cymbals do not have a sustain section, as their sound starts and decays very quickly. For such instruments, looping of a sustain section cannot be employed; these sounds are stored as a single sample which is played back as it is. The figure on the left shows the waveform of a snare drum sound. Note that it does not have a sustain section.


Wavetable samples
There is a lot of processing involved on an audio sample recorded from a natural instrument before it can be used in a wavetable synthesis system. We need to extract the initial (attack) portion and a sustain portion that can be looped. The portion to be looped must also be corrected so that its end and start points blend with each other; otherwise there will be a glitch every time it loops around. The dynamic range of the sample may also need to be compressed to save sample memory.
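One common way of blending the loop points is to crossfade the tail of the loop with the audio that leads into the loop start, so that wrapping back to the start sounds like the original transition in the recording. The sketch below shows that idea; it assumes the recording has at least fade_len samples before loop_start, and all names are illustrative.

def smooth_loop(sample, loop_start, loop_end, fade_len):
    """Crossfade the tail of the loop with the audio just before loop_start,
    so that wrapping from the loop end back to the loop start has no glitch."""
    loop = list(sample[loop_start:loop_end])
    pre = sample[loop_start - fade_len:loop_start]   # audio leading into the loop
    n = len(loop)
    for i in range(fade_len):
        w = (i + 1) / fade_len                       # fade weight 0 -> 1
        loop[n - fade_len + i] = (1.0 - w) * loop[n - fade_len + i] + w * pre[i]
    return loop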
Samples for various instruments can then be combined in a table, which is called a soundbank or patch table. The synthesizer loads the specified sample into memory from the table when the user wants to play a particular instrument.
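A soundbank can be thought of as a simple lookup table from instrument (patch) names to their prepared sample data. The sketch below uses tiny placeholder lists in place of real sample data, and the field names are assumptions made for the example.

soundbank = {
    "acoustic_piano": {
        "initial": [0.0, 0.3, 0.7, 0.5],    # attack segment (placeholder data)
        "loop":    [0.4, 0.1, -0.3, -0.1],  # sustain loop (placeholder data)
        "root_pitch_hz": 261.63,            # pitch at which the sample was recorded
    },
    "snare_drum": {
        "initial": [0.9, -0.8, 0.5, -0.2],  # one-shot percussive sample, no loop
        "loop":    None,
        "root_pitch_hz": None,
    },
}

def load_patch(name):
    """Fetch a patch's sample data when the user selects that instrument."""
    return soundbank[name]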
Pitch Shifting/ Transpose
We know that the notes of a piano are related to each other in terms of pitch: the same note in two adjacent octaves differs in frequency by a factor of two. For example, note C1 has a frequency of 32.7032 Hz, while note C2 has a frequency of 32.7032 × 2 = 65.4064 Hz.
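This doubling per octave can be checked in a couple of lines; in equal temperament each of the 12 semitones within the octave multiplies the frequency by 2**(1/12).

c1 = 32.7032                     # frequency of note C1, from the text above
c2 = c1 * 2                      # one octave up: 65.4064 Hz, note C2
c_sharp1 = c1 * 2 ** (1 / 12)    # one semitone up from C1 (equal temperament)
print(c2, round(c_sharp1, 4))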
In order to minimize sample memory requirements, wavetable synthesizers use pitch-shifting techniques so that the same sample can be used to play various pitches (notes) of the same instrument.
For example, if the sample memory contains a sample of a middle C note on the acoustic piano, then this same sample data could be used to generate the C Sharp (C#) note or D note above middle C using pitch shifting.
Pitch shifting can be achieved by controlling the speed of the playback of the stored sample.
Suppose an audio sample used for wavetable synthesis contains 100 frames (100 digital samples) and is played back at a rate of 10 frames per second. This playback rate determines the frequency of the sound being produced, say F1.
Now if we change the speed to 20 frames per second, the frequency of the produced sound doubles (say F2 = 2 × F1).
If the sample contained a tone of 32.7032 Hz, it will sound like note C1 in the first case. In the second case, since the frequency is doubled by changing the playback speed, it will sound like a 65.4064 Hz note, which is note C2.
You can see that the same digital sample can be used to play two notes one octave apart just by changing the speed at which the stored sample is played back.
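The sketch below shows one simple way this can be done in code: step through the stored sample at a chosen ratio and interpolate between stored values, so a ratio of 2.0 plays the sample an octave higher and 2**(1/12) shifts it up by one semitone. The linear interpolation and the names used are just illustrative; real synthesizers may use higher-quality interpolation.

def pitch_shift(samples, ratio):
    """Resample `samples` at `ratio` times the original playback rate.
    ratio = 2.0 plays twice as fast (one octave up);
    ratio = 2 ** (1 / 12) shifts up by one semitone."""
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        # linear interpolation between neighbouring stored samples
        out.append((1.0 - frac) * samples[i] + frac * samples[i + 1])
        pos += ratio
    return out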
