Shepard Tones

A Shepard tone is an auditory illusion of endlessly ascending (or descending) pitch. Like a barber pole that appears to climb forever, a Shepard scale cycles through pitch steps while never actually getting higher or lower in any absolute sense. The illusion works because each “tone” is actually a stack of frequency components spread across many octaves, shaped by a fixed amplitude envelope that masks the recycling of components at the edges of the audible range.

🎯 Simple version: Imagine a spiral staircase where you keep climbing but never reach a higher floor. A Shepard tone does this with sound — it sounds like it’s always going up (or down), but it never actually gets higher or lower. The trick: it’s not one pitch, it’s many pitches spread across octaves, fading in at the bottom and out at the top so you never notice the recycling.

How Shepard Tones Work

Each “tone” in a Shepard scale is not a single frequency. It is a set of frequency components spaced exactly one octave apart — for example, 62.5 Hz, 125 Hz, 250 Hz, 500 Hz, 1000 Hz, 2000 Hz, 4000 Hz, and 8000 Hz. That is eight components, each double the frequency of the one below.
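As a quick illustration, the octave stack listed above can be generated in a few lines of Python. The helper name octave_stack is made up for this sketch; only the 62.5 Hz base and the count of eight come from the text.

```python
def octave_stack(base_hz, n_components):
    """Return n_components frequencies, each double the one below it."""
    return [base_hz * 2 ** k for k in range(n_components)]

print(octave_stack(62.5, 8))
# [62.5, 125.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0, 8000.0]
```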

The critical ingredient is a fixed amplitude envelope — a bell curve (Gaussian) centered on a constant frequency (typically around 500 Hz). Components near the center of the envelope are loud; components far from the center are quiet. The envelope does not move. Only the individual frequency components move.
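To make the two ingredients concrete, here is a minimal sketch of how one Shepard tone might be rendered as audio samples: a sum of sine waves at the octave-spaced frequencies, each weighted by the fixed envelope. The function name, the 0.1 s duration, the 44100 Hz sample rate, and the 2-octave envelope width are assumptions chosen for illustration, and the envelope is taken to be Gaussian on a log2-frequency axis.

```python
import math

def shepard_samples(base_hz=62.5, n_components=8, f_center=500.0,
                    sigma=2.0, duration_s=0.1, sample_rate=44100):
    """Render one static Shepard tone as a list of floats in [-1, 1]."""
    freqs = [base_hz * 2 ** k for k in range(n_components)]
    # Fixed envelope: the weight depends only on distance from f_center in octaves.
    amps = [math.exp(-(math.log2(f / f_center) / sigma) ** 2) for f in freqs]
    total = sum(amps)
    samples = []
    for i in range(int(duration_s * sample_rate)):
        t = i / sample_rate
        value = sum(a * math.sin(2 * math.pi * f * t) for f, a in zip(freqs, amps))
        samples.append(value / total)  # dividing by the amplitude sum bounds |value| by 1
    return samples

tone = shepard_samples()
print(len(tone), max(abs(s) for s in tone))
```

Changing base_hz moves the components while f_center stays put, which is the whole point of the fixed envelope.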

When the scale ascends by one chromatic step (multiply all frequencies by 2^(1/12) ≈ 1.0595):

  1. Every component shifts up slightly in frequency
  2. Components above the envelope center get quieter
  3. Components below the envelope center get louder
  4. The highest component eventually fades to silence and wraps around to the bottom
  5. A new low component fades in from silence at the bottom
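The five effects listed above can be observed numerically. The sketch below assumes a Gaussian envelope on a log2-frequency axis centered at 500 Hz with a width of 2 octaves, and an upper boundary of 8000 Hz; the names envelope and step_up are illustrative, not from any library.

```python
import math

F_CENTER = 500.0          # envelope center in Hz
SIGMA = 2.0               # envelope width in octaves (assumed)
SEMITONE = 2 ** (1 / 12)  # one chromatic step

def envelope(freq_hz):
    """Gaussian weight on a log2-frequency axis (illustrative helper)."""
    return math.exp(-(math.log2(freq_hz / F_CENTER) / SIGMA) ** 2)

def step_up(freqs, f_upper=8000.0):
    """Shift every component up one semitone, recycling any that pass f_upper."""
    n = len(freqs)
    stepped = []
    for f in freqs:
        f *= SEMITONE
        if f > f_upper:
            f /= 2 ** n  # drop n octaves, landing one octave below the rest of the stack
        stepped.append(f)
    return stepped  # same order as the input, so index i tracks one component

before = [62.5 * 2 ** k for k in range(8)]
after = step_up(before)
for old, new in zip(before, after):
    print(f"{old:8.1f} Hz ({envelope(old):.3f}) -> {new:8.1f} Hz ({envelope(new):.3f})")
```

In the printed pairs, components that start below 500 Hz gain amplitude, those at or above it lose amplitude, and the former 8000 Hz component reappears near 33 Hz at a low weight.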

Because the amplitude envelope stays fixed, the overall brightness (spectral centroid) of the sound stays essentially constant. Your ear detects that individual components are moving up, but the macro-level spectral shape is frozen. The result: perpetual ascent with no destination. A sonic barber pole.

The Mathematics

A Shepard tone at base frequency f consists of N frequency components at octave intervals:

Component frequencies:  f, 2f, 4f, 8f, 16f, ... , 2^(N-1) × f

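Each component’s amplitude is set by a Gaussian envelope centered on a fixed reference frequency f_center (typically ~500 Hz), with a width parameter sigma measured in octaves (typically ~2 octaves). Because the exponent below has no factor of 1/2, the statistical standard deviation of the curve is sigma/√2 rather than sigma itself: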

amplitude(freq) = exp( -( (log2(freq) - log2(f_center)) / sigma )^2 )

This means a component at exactly f_center has amplitude 1.0, and components further away (in octave distance) fall off as a Gaussian bell curve.
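A few sample values make the fall-off tangible. This sketch evaluates the formula exactly as written above, using the typical values f_center = 500 Hz and sigma = 2; the function name gaussian_amplitude is chosen here for illustration.

```python
import math

def gaussian_amplitude(freq_hz, f_center=500.0, sigma=2.0):
    """Envelope weight for one component; distance is measured in octaves."""
    distance = math.log2(freq_hz) - math.log2(f_center)
    return math.exp(-(distance / sigma) ** 2)

for freq in (62.5, 250.0, 500.0, 1000.0, 8000.0):
    print(freq, round(gaussian_amplitude(freq), 4))
# 62.5 0.1054
# 250.0 0.7788
# 500.0 1.0
# 1000.0 0.7788
# 8000.0 0.0183
```

The curve is symmetric in octaves: 250 Hz and 1000 Hz are each one octave from the center and receive the same weight.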

To ascend by one chromatic step, multiply the base frequency by 2^(1/12):

f_new = f × 2^(1/12) ≈ f × 1.05946

All N components shift accordingly. After 12 steps, f has doubled — but since octave-spaced components are perceptually equivalent (octave equivalence), and the amplitude envelope has not moved, the tone sounds identical to where it started. One full “revolution” of the barber pole.
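The claim that twelve steps return the stack to its starting point can be checked with integer bookkeeping. The sketch below tracks each component's position in semitones above an assumed floor of 31.25 Hz (one octave below the lowest starting component) and recycles anything that climbs past 8000 Hz; the constant and function names are hypothetical.

```python
N = 8            # components in the stack
F_FLOOR = 31.25  # assumed floor in Hz; 8 octaves below it sits the 8000 Hz ceiling

def stack_at(step):
    """Sorted component frequencies after `step` ascending semitone steps."""
    positions = []
    for k in range(1, N + 1):
        p = 12 * k + step    # semitones above F_FLOOR before wrapping
        while p > 12 * N:    # past 8000 Hz: recycle N octaves down
            p -= 12 * N
        positions.append(p)
    return [F_FLOOR * 2 ** (p / 12) for p in sorted(positions)]

print(stack_at(0) == stack_at(12))  # True: one full revolution
print(stack_at(0) == stack_at(1))   # False: a single step does change the stack
```

Working in whole semitones avoids the floating-point drift that would accumulate from multiplying by 2^(1/12) twelve times.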

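The frequency wrapping boundaries keep components in the audible range. Here num_octaves is the number of octaves a recycled component jumps; it must equal N, the number of components, so that the component lands exactly one octave beyond the opposite end of the stack: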

If freq > f_upper (e.g. 8000 Hz):  wrap to freq / 2^(num_octaves)
If freq < f_lower (e.g. 30 Hz):    wrap to freq × 2^(num_octaves)
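The two wrapping rules translate directly into a small function. This is a sketch under the example boundaries given above (30 Hz and 8000 Hz), with num_octaves assumed to be 8 to match an eight-component stack; the name wrap_component is illustrative.

```python
def wrap_component(freq_hz, num_octaves=8, f_lower=30.0, f_upper=8000.0):
    """Recycle a component that has drifted outside [f_lower, f_upper]."""
    shift = 2 ** num_octaves
    if freq_hz > f_upper:
        return freq_hz / shift  # ascending: re-enter at the bottom
    if freq_hz < f_lower:
        return freq_hz * shift  # descending: re-enter at the top
    return freq_hz              # still in range: unchanged

print(wrap_component(500.0))                   # 500.0
print(wrap_component(8000.0 * 2 ** (1 / 12)))  # about 33.1 Hz
print(wrap_component(25.0))                    # 6400.0
```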

Why It Works

The illusion exploits a conflict between two pitch cues the brain uses:

1. Local pitch movement (individual components). Each frequency component is clearly moving up (or down) by a semitone per step. The cochlea detects this unambiguously — the excitation pattern on the basilar membrane shifts.

2. Global spectral shape (overall brightness). The spectral centroid — the “center of mass” of the frequency spectrum — stays fixed because of the stationary Gaussian envelope. The brain also tracks spectral centroid as a cue for “how high” a sound is.

When these cues conflict — individual components rising but overall brightness unchanged — the brain resolves the ambiguity by prioritizing the local movement cue. You hear “going up” because the components you can hear clearly (near the center of the envelope) are all moving up. The recycling at the faint edges is below perceptual threshold.
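The conflict between the two cues can be put in numbers. The sketch below follows an eight-component stack through twelve ascending steps and reports both the lowest component frequency, which moves, and an amplitude-weighted centroid, which barely does. Measuring the centroid on a log-frequency axis in octaves relative to 500 Hz is an assumption made here for simplicity, as are the 2-octave envelope width and the helper names.

```python
import math

F_CENTER, SIGMA = 500.0, 2.0            # envelope center (Hz) and width (octaves)
N, F_FLOOR, F_UPPER = 8, 31.25, 8000.0  # stack size and assumed wrap window

def stack_at(step):
    """Component frequencies after `step` semitone steps (0 to 11), with recycling."""
    freqs = []
    for k in range(1, N + 1):
        f = F_FLOOR * 2 ** (k + step / 12)
        if f > F_UPPER:
            f /= 2 ** N
        freqs.append(f)
    return freqs

def centroid_octaves(freqs):
    """Amplitude-weighted mean position, in octaves relative to F_CENTER."""
    offsets = [math.log2(f / F_CENTER) for f in freqs]
    weights = [math.exp(-(d / SIGMA) ** 2) for d in offsets]
    return sum(w * d for w, d in zip(weights, offsets)) / sum(weights)

for step in range(12):
    freqs = stack_at(step)
    print(f"step {step:2d}: lowest {min(freqs):6.1f} Hz, "
          f"centroid {centroid_octaves(freqs):+.3f} oct")
```

Under these assumptions the centroid stays within a small fraction of an octave of the envelope center across the whole cycle, while each individual component travels nearly a full octave.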

This is closely related to harmonic template matching. The auditory system groups octave-spaced components and assigns a single pitch percept. Because all grouped components move in the same direction, the percept moves too — endlessly.

The phenomenon also depends on equal loudness perception: the Gaussian envelope must be calibrated so that the fade-in and fade-out regions are truly below audible threshold at their extremes, accounting for the ear’s frequency-dependent sensitivity.

Translation Table

PhizMusic | Western | Other Systems
Shepard tone | Shepard tone, Shepard scale | Same term used universally
Frequency components (octave-spaced) | Octave-spaced partials |
Gaussian amplitude envelope | Spectral bell curve, amplitude weighting |
Pitch circularity | Pitch paradox, auditory barber pole | Tritone paradox (Diana Deutsch), related
Chromatic step (×2^(1/12)) | Semitone, half step | 100 cents

Suggested References