Sound is a mechanical pressure wave propagating through a medium: air, water, steel, or anything with atoms close enough to push on each other. There is no sound in vacuum because there are no molecules to transmit the pressure variation.
🎯 Simple version: Sound is air vibrating. Fast vibration = high pitch. Big vibration = loud. Your ear hears ratios: the jump from 100 to 200 Hz sounds like the same “step” as 1000 to 2000 Hz.
At rest, air pressure is roughly uniform, about 101,325 Pa at sea level. A vibrating object (a guitar string, a speaker cone, your vocal cords) pushes adjacent air molecules together, then pulls back, creating alternating regions of compression (higher pressure) and rarefaction (lower pressure). These pressure variations propagate outward at the speed of sound:
v_sound ≈ 343 m/s in dry air at 20°C
The propagation speed depends on the medium’s density and elasticity, not on the wave’s frequency or amplitude. All audible frequencies travel at the same speed in a given medium.
Frequency is the number of complete pressure oscillation cycles per second, measured in hertz (Hz).
f = 1 / T_period
where T_period is the time for one complete cycle (not to be confused with pulse period T used in the rhythm system).
A vibrating string completing 440 cycles per second produces a wave at 440 Hz. Human hearing spans roughly 20 Hz to 20,000 Hz, though the upper limit decreases with age and noise exposure.
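The relation f = 1 / T_period can be checked numerically. The sketch below is illustrative only; the function names `frequency_from_period` and `period_from_frequency` are hypothetical and not part of any PhizMusic tooling.

```python
def frequency_from_period(t_period: float) -> float:
    """Return frequency in Hz for a cycle duration given in seconds."""
    if t_period <= 0:
        raise ValueError("period must be positive")
    return 1.0 / t_period


def period_from_frequency(f: float) -> float:
    """Return the duration of one cycle in seconds for a frequency in Hz."""
    if f <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / f


# Duration of one cycle of the 440 Hz string, in milliseconds.
print(f"{period_from_frequency(440.0) * 1000:.2f} ms")
```

For the 440 Hz string, one cycle lasts a little over 2 ms.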
Frequency determines pitch: the perceptual quality of “highness” or “lowness.” Higher frequency = higher pitch. But the relationship between physical frequency and perceived pitch is not linear; it is logarithmic. More on that below.
Amplitude is the maximum deviation of pressure from the equilibrium value. Measured in pascals (Pa), it determines how loud a sound is perceived to be.
Because the ear responds to an enormous range of pressures (from the threshold of hearing at ~20 µPa to the threshold of pain at ~20 Pa, a factor of one million), amplitude is commonly expressed on a logarithmic scale in decibels (dB SPL):
L = 20 × log₁₀(p / p_ref)
where p_ref = 20 µPa (the approximate threshold of hearing at 1 kHz).
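As a minimal sketch of the formula above, the following converts a sound pressure in pascals to dB SPL. The names `P_REF` and `spl_db` are chosen here for illustration.

```python
import math

P_REF = 20e-6  # reference pressure in pascals (20 µPa)


def spl_db(pressure_pa: float) -> float:
    """Convert a sound pressure in pascals to a level in dB SPL."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / P_REF)


print(round(spl_db(20e-6)))  # the reference pressure itself
print(round(spl_db(20.0)))   # one million times the reference pressure
```

The reference pressure maps to 0 dB SPL, and a factor of one million in pressure corresponds to 120 dB.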
| Sound | Approximate level |
|---|---|
| Threshold of hearing | 0 dB SPL |
| Whisper | 30 dB SPL |
| Normal conversation | 60 dB SPL |
| Rock concert | 110 dB SPL |
| Threshold of pain | 120–130 dB SPL |
Every +6 dB roughly doubles the pressure. Every +10 dB roughly doubles the perceived loudness (at moderate levels). The mapping from dB to perceived loudness also depends on frequency: the ear is less sensitive to very low and very high frequencies at moderate levels. That frequency-dependent sensitivity is covered in equal-loudness.md.
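The two rules of thumb above can be sketched as follows. The pressure conversion follows directly from the dB definition. The loudness function simply encodes the "doubles per +10 dB" approximation and is an assumption that holds only roughly, at moderate levels. Both function names are illustrative.

```python
import math


def pressure_ratio(delta_db: float) -> float:
    """Pressure ratio corresponding to a level change in dB."""
    return 10.0 ** (delta_db / 20.0)


def loudness_ratio(delta_db: float) -> float:
    """Rule-of-thumb perceived loudness ratio: doubles for every +10 dB.

    Assumption: the simple "x2 per 10 dB" approximation, moderate levels only.
    """
    return 2.0 ** (delta_db / 10.0)


print(f"{pressure_ratio(6.0):.3f}")      # slightly under 2
print(f"{20 * math.log10(2.0):.2f} dB")  # level change for exactly doubled pressure
print(f"{loudness_ratio(10.0):.1f}")     # doubled loudness under the rule of thumb
```

An exact doubling of pressure is about +6.02 dB, which is why "+6 dB" is only approximately a factor of two.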
A single sound can be viewed three ways. All contain the same physical information, just displayed differently.
A waveform (oscillogram) plots pressure variation versus time. The horizontal axis is time (seconds); the vertical axis is pressure deviation from equilibrium.

    Pressure
    ^
    |    /\      /\      /\
    |   /  \    /  \    /  \
    |--/----\--/----\--/----\---→ Time
    |        \/      \/      \/
    |

From a waveform you can read:
- Period (T_period): the time between repeated patterns → frequency = 1/T_period

A spectrum plots energy (or amplitude) versus frequency. Each vertical bar or peak represents a sine-wave component present in the sound.

    Amplitude
    ^
    |  |
    |  |     |
    |  |     |     |
    |  |     |     |     |
    +--+-----+-----+-----+----→ Frequency (Hz)
       f₁    2f₁   3f₁

A pure sine wave shows a single peak. A complex tone shows peaks at multiple frequencies: the fundamental and its overtones. The spectrum reveals the internal “recipe” of a sound in a way that the waveform hides.
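To make the "recipe" idea concrete, the sketch below builds a complex tone from three sine waves and recovers their amplitudes with a naive discrete Fourier transform. It is a minimal illustration, not an efficient implementation (real tools use an FFT). The sample rate, fundamental, and overtone amplitudes are hypothetical values chosen so that each frequency bin is exactly 1 Hz wide.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(N^2) discrete Fourier transform; returns one amplitude per bin.

    Assumes an even number of samples.
    """
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        # Scale so that a sine of amplitude A appears as a peak of height A.
        scale = 1.0 / n if k in (0, n // 2) else 2.0 / n
        mags.append(abs(acc) * scale)
    return mags


SAMPLE_RATE = 800  # samples per second (hypothetical, kept small for speed)
N = 800            # one second of signal, so bin k corresponds to k Hz
F1 = 100           # fundamental frequency in Hz

# Complex tone: fundamental plus two weaker overtones at 2*F1 and 3*F1.
tone = [
    1.0 * math.sin(2 * math.pi * F1 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2 * F1 * t / SAMPLE_RATE)
    + 0.25 * math.sin(2 * math.pi * 3 * F1 * t / SAMPLE_RATE)
    for t in range(N)
]

spectrum = dft_magnitudes(tone)
peaks = [(k, round(m, 2)) for k, m in enumerate(spectrum) if m > 0.1]
print(peaks)
```

The printed peaks should sit at 100, 200, and 300 Hz with heights close to the amplitudes used to build the tone, information that is hard to read directly off the waveform samples.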
An equalizer (EQ) display groups frequencies into bands (typically octave-wide or third-octave-wide) and shows the total energy in each band as a bar height. This is a coarsened, simplified spectrum.

    Energy
    |  █
    |  █  █
    |  █  █  █  █
    |  █  █  █  █  █
    +--+--+--+--+--+--→ Frequency band
      Low    Mid    High

The EQ display throws away detail (you can’t see individual harmonics) but gives a quick picture of overall spectral balance: “lots of bass, moderate mids, less treble.”
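The coarsening step can be sketched as summing the energy of spectral components that fall inside each band. In this illustration, energy is taken as squared amplitude, and the harmonic spectrum and band edges are hypothetical values, not measurements.

```python
def octave_band_energy(spectrum, band_edges):
    """Sum squared amplitudes of (frequency_hz, amplitude) pairs into bands.

    band_edges is an ascending list of boundaries in Hz; a component falls in
    band i when band_edges[i] <= frequency < band_edges[i + 1].
    """
    energy = [0.0] * (len(band_edges) - 1)
    for freq, amp in spectrum:
        for i in range(len(energy)):
            if band_edges[i] <= freq < band_edges[i + 1]:
                energy[i] += amp ** 2
                break
    return energy


# Hypothetical harmonic spectrum: fundamental at 110 Hz, overtones fading as 1/n.
harmonics = [(110.0 * n, 1.0 / n) for n in range(1, 9)]

# Octave-wide bands: each edge is double the previous one.
edges = [88.0 * 2 ** i for i in range(5)]  # 88, 176, 352, 704, 1408 Hz

band_energy = octave_band_energy(harmonics, edges)
for lo, hi, e in zip(edges, edges[1:], band_energy):
    print(f"{lo:6.0f}-{hi:6.0f} Hz: {e:.3f}")
```

Eight separate harmonics collapse into four bars. The individual peaks are no longer visible, but the downward tilt from low to high bands is.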
All three representations describe the same physical event. The waveform and spectrum are connected by a mathematical operation called the Fourier transform (see fourier-analysis.md). The EQ display is a further simplification of the spectrum.
Here is a critical fact about human hearing that shapes all of music: we perceive pitch as proportional to the logarithm of frequency, not frequency itself. This means we hear frequency ratios, not frequency differences.
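Consider:
- 100 Hz → 200 Hz: difference of 100 Hz, ratio 2:1
- 1000 Hz → 2000 Hz: difference of 1000 Hz, ratio 2:1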
The 100→200 jump and the 1000→2000 jump sound like the same musical distance because both are the same ratio (2:1). The additive 100 Hz difference is irrelevant to perception. This is a manifestation of the Weber-Fechner law, a general principle of sensory perception stating that perceived intensity is proportional to the logarithm of stimulus intensity.
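The ratio-versus-difference point can be illustrated with a short sketch that measures pitch distance in octaves (log base 2 of the frequency ratio). The function name `octaves_between` is illustrative, and the purely logarithmic model is an idealization of perception. The 1000 → 1100 Hz pair is added here for contrast.

```python
import math


def octaves_between(f1: float, f2: float) -> float:
    """Pitch distance in octaves under a purely logarithmic model."""
    return math.log2(f2 / f1)


for f1, f2 in [(100, 200), (1000, 2000), (1000, 1100)]:
    print(f"{f1} -> {f2} Hz: difference {f2 - f1} Hz, "
          f"ratio {f2 / f1:.2f}, {octaves_between(f1, f2):.3f} octaves")
```

The first two pairs come out as the same distance (one octave) despite very different Hz differences, while 1000 → 1100 Hz has the same 100 Hz difference as the first pair but is a much smaller step.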
This is why:
- Musical intervals are measured in cents, a logarithmic unit based on the frequency ratio: c = 1200 × log₂(f₂/f₁)

Logarithmic pitch perception is not arbitrary or cultural; it is a consequence of how the cochlea performs frequency analysis (see ear-cochlea.md). The basilar membrane maps frequency approximately logarithmically: equal distances along the membrane correspond to roughly equal frequency ratios.
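The cents formula c = 1200 × log₂(f₂/f₁) can be sketched in a few lines. The function name `cents` is chosen for illustration.

```python
import math


def cents(f1: float, f2: float) -> float:
    """Interval from f1 up to f2 in cents: 1200 * log2(f2 / f1)."""
    if f1 <= 0 or f2 <= 0:
        raise ValueError("frequencies must be positive")
    return 1200.0 * math.log2(f2 / f1)


print(round(cents(100.0, 200.0)))    # ratio 2:1
print(round(cents(1000.0, 2000.0)))  # ratio 2:1 again
print(round(cents(440.0, 660.0)))    # ratio 3:2
```

Any 2:1 ratio gives 1200 cents regardless of the starting frequency; the 3:2 ratio comes out at roughly 702 cents.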
For completeness: wavelength (λ) is the spatial distance between successive compressions. It is related to frequency and the speed of sound by:
λ = v / f
At 343 m/s:
- 20 Hz → λ ≈ 17 m
- 440 Hz → λ ≈ 0.78 m
- 20,000 Hz → λ ≈ 1.7 cm
Wavelength matters for acoustics and instrument design (see instrument-physics.md) but plays no direct role in pitch perception: we hear frequency, not wavelength.
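A minimal sketch of the wavelength relation, evaluated at the limits of human hearing and at 440 Hz. The names `SPEED_OF_SOUND` and `wavelength_m` are illustrative; the default speed assumes dry air at 20°C.

```python
SPEED_OF_SOUND = 343.0  # m/s, dry air at 20 °C


def wavelength_m(frequency_hz: float, speed: float = SPEED_OF_SOUND) -> float:
    """Wavelength in metres from lambda = v / f."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed / frequency_hz


for f in (20.0, 440.0, 20_000.0):
    print(f"{f:8.0f} Hz -> {wavelength_m(f):.4f} m")
```

Across the audible range, wavelengths span from many metres down to under two centimetres, which is why wavelength matters for room and instrument dimensions.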
| PhizMusic | Western | Notes |
|---|---|---|
| Frequency (Hz) | Pitch | PhizMusic uses the physical quantity directly |
| Amplitude (dB SPL) | Dynamics (pp, p, mf, f, ff) | Western uses subjective labels instead of measurements |
| Waveform | (none) | No standard Western music theory equivalent |
| Spectrum | (none) | No standard Western music theory equivalent |
| Pressure wave | Sound, tone | “Tone” implies pitched sound; pressure wave is more general |