FM Synthesis: History & Background

A Brief History

As mentioned in Introduction II, John Chowning brought a copy of the MUSIC IV software from Bell Labs to Stanford, where he founded the CCRMA and started experiments in sound synthesis. Although frequency modulation was already a method used in analog sound synthesis, it was Chowning who developed the concept of frequency modulation (FM) synthesis with digital means in the late 1960s.

The concept of frequency modulation, already used for transmitting radio signals, was transferred to the audible domain by John Chowning, since he saw the potential to create complex (as in rich) timbres with a few operations (Chowning, 1973).

For one sinusoid modulating the frequency of a second, frequency modulation can be written as:

\[ y(t) = \sin(2 \pi f_c t + I_m \sin(2 \pi f_m t) ) \]

\(f_c\) denotes the so-called carrier frequency, \(f_m\) the modulation frequency and \(I_m\) the modulation index. [Fig.1] shows a flow chart for this operation in the style of MUSIC IV.

[Fig.1] Flow chart for FM with two operators (Chowning, 1973).
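As a minimal sketch, the two-operator formula above can be evaluated sample by sample, with the carrier phase taken as \(2 \pi f_c t\). The function names, the sampling rate of 48 kHz and the parameter values are assumptions chosen for illustration; they are not taken from Chowning's paper or from MUSIC IV.

```python
import math

def fm_sample(t, f_c, f_m, index):
    """Evaluate y(t) = sin(2*pi*f_c*t + I_m*sin(2*pi*f_m*t)) at time t (seconds)."""
    modulator = index * math.sin(2 * math.pi * f_m * t)
    return math.sin(2 * math.pi * f_c * t + modulator)

def fm_signal(f_c, f_m, index, duration, sample_rate=48000):
    """Return a list of FM samples for the given duration in seconds."""
    n = round(duration * sample_rate)
    return [fm_sample(i / sample_rate, f_c, f_m, index) for i in range(n)]

# Illustrative values: 440 Hz carrier, 110 Hz modulator, index 2
signal = fm_signal(f_c=440.0, f_m=110.0, index=2.0, duration=0.01)
```

With an index of 0 the modulator term vanishes and the output reduces to a plain sinusoid at the carrier frequency; raising the index adds more sideband energy.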

In many musical applications, the use of dynamic spectra is desirable. The parameters of the FM algorithm shown above are therefore controlled with temporal envelopes, as shown in [Fig.2]. The change of the modulation index over time is especially important, since it results in percussive sound qualities. In musical applications, multiple carriers and modulators, referred to as operators, are connected in different configurations to generate richer timbres.

[Fig.2] Flow chart for dynamic FM with two operators (Chowning, 1973).
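The idea of dynamic FM can be sketched by letting an envelope scale both the amplitude and the modulation index over time. The linear decay used here, as well as all names and values, are assumptions for illustration; the envelope shapes in the original flow chart may differ.

```python
import math

def linear_decay(t, duration):
    """Hypothetical envelope: falls linearly from 1 at t=0 to 0 at t=duration."""
    return max(0.0, 1.0 - t / duration)

def dynamic_fm(f_c, f_m, peak_index, duration, sample_rate=48000):
    """Two-operator FM whose amplitude and modulation index follow an envelope."""
    n = round(duration * sample_rate)
    out = []
    for i in range(n):
        t = i / sample_rate
        env = linear_decay(t, duration)
        index = peak_index * env  # time-varying modulation index
        phase = 2 * math.pi * f_c * t + index * math.sin(2 * math.pi * f_m * t)
        out.append(env * math.sin(phase))
    return out

# Illustrative values: the spectrum is rich at the onset and thins out as the index decays
tone = dynamic_fm(f_c=200.0, f_m=280.0, peak_index=5.0, duration=0.1)
```

Because the index decays together with the amplitude, the sound starts bright and becomes more sinusoidal as it fades, which is the behaviour associated with percussive timbres.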

FM synthesis is considered an abstract algorithm: it does not come with a related analysis method for deriving the parameters of a desired sound. Instead, sounds need to be programmed or designed by hand. However, there have been attempts at an automatic parametrization of FM synthesizers (Horner, 2003).

John Chowning, a composer by profession, combined the novel FM synthesis approach with digital spatialization techniques to create quadraphonic pieces of electronic music on a completely new level. In Turenas, completed in 1972, artificial Doppler shifts and direct-to-reverberation techniques are used to intensify the perceived motion and distance of panned sounds in the loudspeaker setup. The sounds used in this piece are generated solely by means of FM, resulting in a characteristic quality like the synthetic bell-like sounds beginning at 1:30 or the recurring short percussive events.


  • John Chowning. Turenas: the realization of a dream. Proc. of the 17es Journées d’Informatique Musicale, Saint-Etienne, France, 2011.
  • Andrew Horner. Auto-programmable FM and wavetable synthesizers. Contemporary Music Review, 22(3):21–29, 2003.
  • John M. Chowning. The synthesis of complex audio spectra by means of frequency modulation. Journal of the Audio Engineering Society, 21(7):526–534, 1973.
    Envelopes: Exponential

    For percussive, plucked or struck instrument sounds, the envelope needs to model an exponential decay. This is very useful for string-like sounds and, most importantly for many electronic musicians, it is at the very core of kick drum sounds.

    In contrast to the ADSR envelope, the exponential one does not contain a sustain portion for holding a sound. The only parameter is the decay rate, allowing quick adjustment. As an alternative to an actual exponential, a modified reciprocal function can be used for easier implementation. The factor $d$ controls the rate of the decay and thus the decay time:

    $$ e = \frac{1}{(1+(d t))} $$

    The following example adds a short linear attack before the exponential decay. This minimizes clicks which otherwise occur through the rapid step from $0$ to $1$:

    [Interactive example: envelope plot with controls for attack time and decay time]
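A minimal sketch of such an envelope, assuming a linear attack that rises from $0$ to $1$ and is followed by the reciprocal decay $1/(1+dt)$ measured from the end of the attack. The function name and the parameter values are hypothetical and chosen for illustration only.

```python
def envelope(t, attack_time, d):
    """Linear attack from 0 to 1, then reciprocal decay 1 / (1 + d * t_decay).

    t           -- time in seconds since the note onset
    attack_time -- duration of the linear attack in seconds
    d           -- decay factor; larger values give a faster decay
    """
    if t < 0:
        return 0.0
    if t < attack_time:
        return t / attack_time
    return 1.0 / (1.0 + d * (t - attack_time))

# Sample the envelope at 48 kHz for 0.5 s (illustrative values)
sample_rate = 48000
env = [envelope(i / sample_rate, attack_time=0.005, d=20.0) for i in range(sample_rate // 2)]
```

With this form, the envelope has dropped to half of its peak value $1/d$ seconds after the end of the attack, which gives a direct link between $d$ and the perceived decay time.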

    Controlling SC with the Mouse

    A quick means of control is often needed when testing and designing synthesis and processing algorithms in SuperCollider. One option is to map the mouse position to control-rate buses. Combined with a touch display, this can even be an interesting means for expressive control. This example first creates a control bus with two channels. The node ~mouse uses the MouseX and MouseY UGens to write to the two channels of this bus:

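    // mouse xy control with buses (value range 0..1 assumed)
    ~mouse_BUS = Bus.control(s,2);
    ~mouse   = {
        Out.kr(~mouse_BUS.index,   MouseX.kr(0,1));
        Out.kr(~mouse_BUS.index+1, MouseY.kr(0,1));
    }.play;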



    Use the mouse example with the previous sawtooth-filter example to control pitch and filter characteristics.

    FM Synthesis: Interactive Example

    The following example is a minimal FM synthesis with two operators - one modulator and one carrier:

    [Interactive example: controls for carrier frequency (Hz), modulator frequency (Hz) and modulation depth (Hz), with time-domain and frequency-domain plots]

    Audio Buffers

    Most systems for digital signal processing and music programming process audio in chunks, whose length is defined by a so-called buffer size. Buffer sizes are usually powers of 2, typically ranging from $16$ samples, which can be considered a small buffer size, to $2048$ samples (and more). Most applications, like DAWs and hardware interfaces, allow the user to select this parameter. Technically this means that a system collects (or buffers) single samples, for example from an ADC (analog-to-digital converter), until the buffer is filled. This compensates for irregularities in the execution speed of single operations and ensures jitter-free processing.


    The choice of the buffer size $N$ is usually a trade-off between processor load and system latency. Small buffers require faster processing whereas large buffers keep the user waiting until a buffer has been filled. In combination with the sampling rate $f_s$, the buffer-dependent latency can be calculated as follows:

    $$ \tau = \frac{N}{f_s} $$

    Round trip latency usually considers both the input and output buffers, thus doubling the latency. For a system running at $48\ \mathrm{kHz}$ with a buffer size of $128$ samples, a typical size for a decent prosumer setup, this results in a round trip latency of about $5.3\ \mathrm{ms}$. This value is low enough to allow a perceptually satisfying interaction with the system. When the $10\ \mathrm{ms}$ threshold is exceeded, it is likely that percussion and other timing-critical instruments experience disruptive latency.
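The latency formula can be checked with a few lines of code. This is a small sketch with hypothetical function names; it only accounts for the buffer-dependent latency and ignores any additional delay from converters or drivers.

```python
def buffer_latency_ms(buffer_size, sample_rate):
    """Latency of a single buffer in milliseconds: tau = N / f_s."""
    return 1000.0 * buffer_size / sample_rate

def round_trip_latency_ms(buffer_size, sample_rate):
    """Input buffer plus output buffer, i.e. twice the single-buffer latency."""
    return 2.0 * buffer_latency_ms(buffer_size, sample_rate)

# Compare common buffer sizes at a sampling rate of 48 kHz
for n in (16, 128, 512, 2048):
    print(n, round(buffer_latency_ms(n, 48000), 2), round(round_trip_latency_ms(n, 48000), 2))
```

For $128$ samples at $48\ \mathrm{kHz}$, this yields roughly $2.7\ \mathrm{ms}$ per buffer and $5.3\ \mathrm{ms}$ round trip, whereas $512$ samples already exceed $10\ \mathrm{ms}$ for a single buffer.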

    Buffers in Programming

    In higher-level programming environments like PD, MAX, SuperCollider or Faust (depending on the way it is used), users usually do not need to deal with the buffer size. When programming in C or C++, most frameworks and APIs offer a processing routine which is based on the buffer size. This applies to solutions like JUCE or the JACK API, but also to programming externals or extensions for the higher-level environments mentioned above. These processing routines, also referred to as callbacks, are called by an interrupt once the hardware is ready to process the next buffer.
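The callback principle can be illustrated without any audio hardware. In the following sketch, the `run_host` function is a hypothetical stand-in for the audio driver: it calls the processing routine once per buffer. The oscillator keeps its phase between calls so that the signal remains continuous across buffer boundaries. All names are assumptions and do not correspond to the JUCE or JACK APIs.

```python
import math

class SineOscillator:
    """Keeps its phase between callbacks so consecutive buffers join seamlessly."""

    def __init__(self, frequency, sample_rate):
        self.phase = 0.0
        self.increment = 2 * math.pi * frequency / sample_rate

    def process(self, out_buffer):
        """Processing callback: fill exactly one buffer with new samples."""
        for i in range(len(out_buffer)):
            out_buffer[i] = math.sin(self.phase)
            self.phase = (self.phase + self.increment) % (2 * math.pi)

def run_host(callback, buffer_size, num_buffers):
    """Hypothetical driver loop: requests one buffer at a time from the callback."""
    output = []
    buffer = [0.0] * buffer_size
    for _ in range(num_buffers):
        callback(buffer)
        output.extend(buffer)
    return output

osc = SineOscillator(frequency=440.0, sample_rate=48000)
samples = run_host(osc.process, buffer_size=128, num_buffers=4)
```

In a real system the loop in `run_host` is replaced by the audio driver, and the callback has to finish before the next buffer is due.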

    Contents © Henrik von Coler 2021 - Contact