Auditory masking is the phenomenon where one sound makes another sound inaudible or harder to detect. Human hearing is not a complete capture of acoustic reality; it is a selective system with limited resolution in frequency and time.
Simple version: A loud sound can hide a quieter sound so you cannot hear it, even if both are physically present. This is why MP3 files can remove some audio details with little perceived loss.
When a strong tone or noise component is present, components at nearby frequencies need a higher level to be audible. On the basilar membrane, the excitation patterns of the two sounds overlap, which reduces the detectability of the weaker neighbor.
Key properties:
Conceptual threshold relation:
audible if target_level > hearing_threshold(f) + masking_shift(f | masker)
The masking shift depends on frequency distance from the masker and masker intensity.
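The relation above can be sketched in a few lines of Python. This is a minimal illustration rather than a calibrated model: `hearing_threshold` uses Terhardt's well-known approximation of the threshold in quiet, `hz_to_bark` uses the Zwicker-Terhardt Bark approximation, and the triangular shape of `masking_shift` (its 10 dB offset and its 27 and 12 dB/Bark slopes) is an assumption chosen only to show the idea.

```python
import math

def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the Bark (critical-band) scale."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

def hearing_threshold(f_hz):
    """Terhardt's approximation of the threshold in quiet, in dB SPL."""
    k = f_hz / 1000.0
    return 3.64 * k ** -0.8 - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2) + 1e-3 * k ** 4

def masking_shift(f_hz, masker_hz, masker_db,
                  offset_db=10.0, lower_slope=27.0, upper_slope=12.0):
    """Threshold elevation (dB) at f_hz caused by one masker.

    The triangular shape, offset, and slopes are illustrative assumptions.
    """
    dz = hz_to_bark(f_hz) - hz_to_bark(masker_hz)    # distance in Bark
    slope = upper_slope if dz >= 0 else lower_slope  # shallower above the masker
    masked_threshold = masker_db - offset_db - slope * abs(dz)
    # The shift is never negative: far from the masker only the quiet threshold applies.
    return max(0.0, masked_threshold - hearing_threshold(f_hz))

def is_audible(target_db, f_hz, masker_hz, masker_db):
    """audible if target_level > hearing_threshold(f) + masking_shift(f | masker)"""
    return target_db > hearing_threshold(f_hz) + masking_shift(f_hz, masker_hz, masker_db)

# A 40 dB tone next to an 80 dB masker at 1 kHz, then far away from it.
print(is_audible(40.0, 1100.0, 1000.0, 80.0))
print(is_audible(40.0, 5000.0, 1000.0, 80.0))
```

Under these assumed parameters the 1100 Hz tone falls below the raised threshold, while the 5 kHz tone is far enough from the masker that only the threshold in quiet applies.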
Masking is also time-dependent.
Of the temporal effects, forward masking (a weaker sound that follows the masker) often dominates practical listening and coding applications, because a strong transient leaves a short-lived tail of neural suppression after it ends.
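As a rough sketch of that suppression tail, the function below lets the threshold elevation decay linearly in dB against the logarithm of the delay after the masker stops. The function name, the 5 ms hold time, and the 200 ms end point are illustrative assumptions, not measured values.

```python
import math

def forward_masking_shift(delay_ms, shift_at_offset_db, t0_ms=5.0, t_end_ms=200.0):
    """Residual threshold elevation (dB) delay_ms after the masker ends.

    Assumed model: full shift up to t0_ms, then a decay that is linear in dB
    versus log(delay), reaching zero at t_end_ms.
    """
    if delay_ms <= t0_ms:
        return shift_at_offset_db
    if delay_ms >= t_end_ms:
        return 0.0
    remaining = math.log(t_end_ms / delay_ms) / math.log(t_end_ms / t0_ms)
    return shift_at_offset_db * remaining

for delay in (1, 10, 50, 100, 200):
    print(delay, "ms ->", round(forward_masking_shift(delay, 40.0), 1), "dB")
```

A quiet target that arrives a few tens of milliseconds after a loud transient still faces a raised threshold in this model; by the assumed end point the effect is gone.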
A masking curve describes how the detection threshold is raised around a masker frequency.
    Threshold
       ^
       |            /\
       |           /  \____
       |__________/        \__________
       +---------------------------------> Frequency
                  masker
At higher masker levels:
This is one reason dense mixes can feel crowded even when no single source is extremely loud.
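The level dependence of the masking curve can be made concrete with a commonly cited approximation attributed to Terhardt, in which the slope on the high-frequency side of the masker is about 24 + 230/f - 0.2 L dB per Bark (f in Hz, L in dB SPL). The sketch below treats that formula as an assumption; the function names and the 30 dB drop are hypothetical choices for illustration.

```python
def upper_slope_db_per_bark(masker_hz, masker_db):
    """Approximate steepness of the masking curve above the masker (dB/Bark).

    Assumed formula: 24 + 230/f - 0.2*L. Louder maskers give shallower slopes.
    """
    return 24.0 + 230.0 / masker_hz - 0.2 * masker_db

def upward_reach_bark(masker_hz, masker_db, drop_db=30.0):
    """Bark distance above the masker at which the masked threshold has fallen by drop_db."""
    return drop_db / upper_slope_db_per_bark(masker_hz, masker_db)

for level in (40.0, 60.0, 80.0):
    print(level, "dB ->", round(upward_reach_bark(1000.0, level), 2), "Bark")
```

With a 1 kHz masker, the assumed slope falls from roughly 16 dB/Bark at 40 dB to roughly 8 dB/Bark at 80 dB, so the louder masker covers about twice as many critical bands above it.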
Lossy codecs estimate masking thresholds and spend fewer bits on components predicted to be inaudible because they are masked by stronger nearby components or recent transients. This enables large data reduction (often around 10:1) while preserving perceived quality for many listening contexts.
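A toy version of this bit-allocation idea is sketched below. It is not the algorithm of MP3, AAC, or any other specific codec: it assumes per-band signal levels and masking thresholds are already known in dB, uses the rule of thumb that each extra bit lowers quantization noise by about 6.02 dB, and greedily gives bits to the band whose noise is furthest above its masking threshold. The function name `allocate_bits` and the example numbers are hypothetical.

```python
def allocate_bits(signal_db, threshold_db, bit_budget, db_per_bit=6.02):
    """Greedy per-band bit allocation driven by the signal-to-mask ratio (SMR)."""
    smr = [s - t for s, t in zip(signal_db, threshold_db)]
    bits = [0] * len(smr)
    for _ in range(bit_budget):
        # Noise-to-mask ratio per band: how far quantization noise sits above the threshold.
        nmr = [m - db_per_bit * b for m, b in zip(smr, bits)]
        worst = max(range(len(nmr)), key=nmr.__getitem__)
        if nmr[worst] <= 0:
            break  # all remaining noise is predicted to be masked
        bits[worst] += 1
    return bits

signal = [70.0, 45.0, 60.0, 20.0]     # band levels in dB (made-up values)
threshold = [40.0, 50.0, 48.0, 25.0]  # masking thresholds in dB (made-up values)
print(allocate_bits(signal, threshold, bit_budget=10))  # -> [5, 0, 2, 0]
```

Bands two and four sit below their thresholds and receive no bits, and the loop stops after spending 7 of the 10 available bits because the remaining quantization noise is predicted to be inaudible.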
In arrangement and EQ decisions, overlapping sources can mask each other:
Practical fixes include spectral separation, dynamic control, and arrangement spacing.
Ambient and mechanical noise can mask useful sound cues in halls, classrooms, and public spaces, affecting clarity and intelligibility requirements.
Masking shows that perception is a reconstructed model, not an exhaustive measurement. The auditory system prioritizes salient structure under finite neural bandwidth. What we "hear" is therefore a filtered interpretation of the physical wavefield.
| PhizMusic | Western/Engineering | Notes |
|---|---|---|
| Simultaneous masking | Frequency masking | Same phenomenon |
| Forward masking | Post-masking | Target after masker |
| Backward masking | Pre-masking | Target before masker |
| Critical-band masking | Auditory filter overlap | Cochlear filter-bank framing |
| Perceptual bit allocation | Psychoacoustic coding | Basis of MP3/AAC efficiency |