I used to think that audio files were basically a collection of snapshots of a spectrum analyzer with various frequencies at various levels. Then I realized: that isn’t how audio works!
On any given track, all the various frequencies superimpose into a single oscillating voltage. Think about it: if you have only two frequencies (a very low bass and a high-mid, for example), the speaker will be moving slowly overall, but if you zoom in, it is also oscillating at the higher frequency even as the speaker moves relatively slowly. That was a terrible illustration.
There are two parts to a bitrate: sampling frequency and “word size.” The sample rate dictates the highest frequency that can be captured, and according to some law named after some dude (the Nyquist-Shannon sampling theorem), all you need is a sample rate that is a bit more than double the maximum frequency being sampled.
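To make the sample-rate rule concrete, here is a minimal Python sketch. The helper name `sample_sine` and the 30 kHz test tone are my own picks for illustration. It samples a tone above half the CD sample rate and shows that the samples match those of a lower, phase-inverted tone, which is why frequencies above half the sample rate cannot be stored.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples):
    """Return n_samples of a unit-amplitude sine sampled at sample_rate_hz."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

SAMPLE_RATE = 44_100        # CD sample rate, in Hz
NYQUIST = SAMPLE_RATE / 2   # half the sample rate: 22,050 Hz

# A 30 kHz tone lies above the Nyquist limit. Once sampled, it is
# indistinguishable from a phase-inverted 14.1 kHz tone (44,100 - 30,000).
too_high = sample_sine(30_000, SAMPLE_RATE, 64)
alias = sample_sine(SAMPLE_RATE - 30_000, SAMPLE_RATE, 64)

# Adding the two cancels out, apart from floating-point rounding.
max_diff = max(abs(a + b) for a, b in zip(too_high, alias))
print(f"Nyquist limit: {NYQUIST:.0f} Hz, largest mismatch: {max_diff:.2e}")
```

In a real converter, a low-pass filter removes anything above the limit before sampling so that this folding never reaches the recording.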
But the word size is the kicker. Even people who have lost higher frequencies in their hearing can still hear the difference between 16 and 24 bit. The bigger word size makes for a larger dynamic range. This is why REALLY compressed music doesn’t benefit from analog or higher bitrate recording.
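For a rough sense of scale, the textbook figure for an ideal N-bit quantizer with a full-scale sine wave is about 6.02 × N + 1.76 dB. The sketch below just evaluates that formula. The function name `dynamic_range_db` is made up for illustration, and real converters fall short of these ideal numbers.

```python
import math

def dynamic_range_db(bits):
    """Theoretical signal-to-quantization-noise ratio, in dB, of an
    ideal quantizer with the given word size (full-scale sine input)."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

for bits in (16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
```

That works out to roughly 98 dB for 16-bit and 146 dB for 24-bit, so each extra bit buys about 6 dB.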
It boils down to this: Higher bitrates allow WAY higher dynamic range, and they do not attenuate the highs as much. Anyone can hear the dynamic range differences. And if you are young and relatively unscathed-of-ear, listen to the drum metalwork for detail disparities.
Harangue over!
Wait, I am going to go on some more.
In continuing the drivel with my wife, I realized a couple ways of explaining it better.
First, the whole speaker thing can be simplified to this: even though a lot of instruments can be recorded on or mixed down to a single track, your speaker can only move in one direction at a time. Thus, no matter how many frequencies you are hearing in one ear at a time (with headphones, at least), only one speaker is making all of them, by superimposing the sounds.
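Here is the superimposing idea as a small Python sketch. The 60 Hz and 3 kHz tones and the `mix` helper are arbitrary choices of mine. The point is that the mixed track holds exactly one number per instant, which is the single position the speaker is asked to take.

```python
import math

SAMPLE_RATE = 44_100  # samples per second

def mix(freqs_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Sum equal-level sine waves into one stream of samples.

    Each output value is a single number: the one position the
    speaker is driven to at that instant.
    """
    scale = 1.0 / len(freqs_hz)  # keep the sum within -1.0..1.0
    return [
        scale * sum(math.sin(2 * math.pi * f * n / sample_rate)
                    for f in freqs_hz)
        for n in range(n_samples)
    ]

# 10 ms of a low bass note plus a high-mid tone.
signal = mix([60, 3000], 441)
print(signal[:5])
```

Plotting `signal` would show the fast 3 kHz wiggle riding on the slow 60 Hz swing.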
Also, the word size provides greater resolution for the dynamics. The difference between loud and soft can be more apparent and natural.
It is all about accuracy. If you think in terms of a photo, the sample rate is like the pixel size and the word size is like the color reproduction. There comes a point where the naked eye can’t see any smaller pixels, but the color reproduction can still greatly affect the “naturalness” of the image.
Ghetto (referring to ghetto-rigged quality; no commentary on any genre) electronic music tends to not have a lot of dynamic range, so higher bitrates don’t matter. But some of my friends who previously claimed to be oblivious to good audio quality were stunned when they heard just the drums and bass from a good recording on my record player (and, speaking of ghetto, I have that running mono through my tube guitar amp with 12-inch guitar-voiced speakers). They were shocked by the realism courtesy of the detail afforded by analog reproduction of sound. Dynamics and frequency response. Vinyl forever!
Harangue (part deux) over!
“Even people who have lost higher frequencies in their hearing can still hear the difference between 16 and 24 bit.”
I suspect that the author just plain ol’ made this up. All the studies and research papers I’ve read have repeatedly demonstrated that no difference can be detected between 16-bit and 24- or 32-bit recordings, and that very few people can tell a difference between 14- and 24-bit (if you want to know what 14-bit sounds like, albeit with more noise than “real” 14-bit, play back a 16-bit recording at exactly 1/4 volume using a 16-bit digital volume control and boost the levels back up with gain in the analog domain).
One study I’ve read (I think it was from the IEEE) demonstrated that, when presented with two recordings (one played directly from an LP and the other from a 16/44.1 recording of the same tracks), the participants (who were self-professed audiophiles) were not able to tell one from the other.
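The 1/4-volume trick can be modeled in a few lines of Python. This is only a sketch. It assumes the digital volume control simply truncates (a real one might round or add dither), and `quarter_volume` and `analog_boost` are hypothetical names.

```python
def quarter_volume(sample):
    """Attenuate a signed 16-bit sample to 1/4 level using integer math.

    Assumes the volume control truncates: shifting right by two
    discards the two lowest bits.
    """
    return sample >> 2

def analog_boost(sample):
    """Model the analog gain stage that restores the original level."""
    return sample * 4

original = [12345, -12345, 3, -3, 32767, -32768]
restored = [analog_boost(quarter_volume(s)) for s in original]
print(restored)

# Count how many distinct levels survive across the full 16-bit range.
levels = len({quarter_volume(s) for s in range(-32768, 32768)})
print(levels)
```

Every restored value lands on a multiple of 4, and only 2^14 distinct levels survive, which is why the result behaves like a 14-bit recording.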
The latter study is especially damning for vinyl fans, and I’m sure you can find it with a bit of googling.
So at first glance bit depth looks like the same thing as dynamic range. But thanks to the Nyquist theorem and the clever use of dithering, an analog signal can be recorded and reproduced EXACTLY as it is at any bit depth. The quantization error that arises from digitization can be spread evenly into a noise floor over all frequencies with dithering. Therefore bit depth translates into signal-to-noise ratio. In effect you don’t get more dynamic range with 24-bit audio, just the headroom to kill a human being with the audio energy from normal room noise. Does that make sense?
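A toy Python sketch of the dithering point, under stated assumptions: one quantization step is 1.0, the input is a constant level of 0.3 steps (smaller than the quantizer can represent), and the dither is triangular noise spanning plus or minus one step. The helper names are invented for illustration.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer step (one step = 1.0)."""
    return math.floor(x + 0.5)

def tpdf_dither(rng):
    """Triangular dither spanning +/-1 step: sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

rng = random.Random(0)   # fixed seed so runs are repeatable
level = 0.3              # a signal smaller than one quantization step
n = 100_000

# Without dither, the small signal always rounds to zero and is lost.
plain = sum(quantize(level) for _ in range(n)) / n

# With dither, individual samples are noisy, but their average
# recovers the original level.
dithered = sum(quantize(level + tpdf_dither(rng)) for _ in range(n)) / n

print(plain, round(dithered, 3))
```

The undithered average stays at zero while the dithered average settles near 0.3. The detail survives as signal plus a steady noise floor instead of being rounded away.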
Vinyl = overrated junk.