
Is bit depth about dynamic range or data?


audiojerry

4 hours ago, Teresa said:

 

That is also the way I understand it. For a 16-bit/44.1 kHz music file, samples are taken 44,100 times each second. In this case each sample is a 16-bit word represented by 1s and 0s, though the bits just need to be two distinguishable states, such as the lands and pits on a CD. That 16-bit word has to completely describe the music that occurred during that 1/44,100th of a second. If this is not correct, I would also like to know.

Maybe this will help:

 

Digitizing an analog function (like the sequential voltages that comprise the waveform of moving electrons coming from the recording console) is like making a movie by stringing still images together.  The first motion pictures could only show jerky motions because they included so few stills per second. The image quality was poor because optics and film technology were both crude. Stability and reproducibility of the film path were inconsistent, so action might look too fast or slow - and you can definitely see jitter :) 

 

Movie film got bigger, moved faster, and recorded better images as it evolved.  A 16mm frame had a fraction of the information contained in a 35mm frame of the same scene shot on the same emulsion, and film technology advances improved accuracy of individual images - so quality improved because the “bit depth” increased.  The mechanical speed of film’s advance through camera and projector smoothed motion by increasing the visual sampling rate.

 

As a kid, you probably drew a series of cartoon-like sketches of a stick figure on the pages of a small pad of paper, moving one of its arms up a bit in each successive sketch. When you flipped the pages, you saw what looked like a “motion picture”.  The more pages in your pad, the more smoothly you could portray the motion by making the successive position changes smaller and cramming more of them into the same flip time.  This is another example of sampling rate.

 

You could use a pencil to make crude stick figures, or you could make more refined drawings. The less information you put into the sketches, the less like a person and the more like a bunch of moving lines the moving image looked.  If you drew artful images of a boy waving, your “movie” looked more like a boy waving - it was more accurate because there was more information in it, i.e. it had greater “bit depth”.

 

Bit depth determines the accuracy of the instantaneous value of the analog function being represented - and here, accuracy means how little the stored value deviates from the actual one. That parameter is voltage at most stages of audio ahead of the analog renderer that turns the signal back into air pressure waves so you can hear it (the usual output power metric is based on current). More accurate instantaneous values in the string mean more accurate capture, storage, and reproduction of the analog input signal.
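To put a number on that accuracy, here's a small Python sketch (the test tone and bit depths are arbitrary choices for illustration, not anything prescribed): it stores the same waveform at several bit depths and reports how far a stored value can stray from the true one.

```python
import numpy as np

# One cycle of a 1 kHz tone sampled at 44.1 kHz, scaled to +/-1.0 "volts".
t = np.arange(0, 1 / 1000, 1 / 44100)
signal = np.sin(2 * np.pi * 1000 * t)

for bits in (4, 8, 16):
    levels = 2 ** bits                          # distinct values a sample can hold
    step = 2.0 / levels                         # spacing between adjacent stored values
    quantized = np.round(signal / step) * step  # snap each sample to the nearest value
    worst_error = np.max(np.abs(quantized - signal))
    print(f"{bits:2d} bits: {levels:6d} levels, worst-case error {worst_error:.6f} V")
```

Each added bit doubles the number of available levels and halves the worst-case error - the digital "still" gets sharper.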
 

One argument against going above 16 bits for consumer audio files is that we "can't hear" the difference. That would be analogous to limiting the display resolution of optical media to the highest we can see - a tricky path without definitive answers. This analogy may be too obscure or abstract to help visualize the concept, although I hope not. But the basics are simple: sampling rate is analogous to the number of stills in a second of movie film running at standard speed, and bit depth is a measure that reflects the accuracy of each still. The digital clock is analogous to the speed control of the motors that move film through the camera and projector. That speed has to be stable and exactly the same for both, or the movie won't look right. The same effects manifest in digital audio, for directly analogous reasons.

 

One major difference in this analogy is that a single static sample of the audio signal makes no sound - only the sequential confluence of all samples can generate audible output. But each frame of a movie is a picture in itself. So, unlike a single sample from an audio signal, it contains information usable in isolation from the rest of the reel.

1 hour ago, mansr said:

A sample of audio is more like a single pixel in an image than a still from a video.

I'm offering a loose functional analogy meant to be illustrative for Teresa and not a literal description.  I thought the movie analogy was more useful because each frame is an instantaneous sample of the changing visual "signal" and the end product is a dynamic sequence of these samples. As the dynamics of motion picture production and control are similar in many ways to those of an audio waveform, it just made a lot more sense to me than your example. I could be wrong - I look forward to feedback in this thread to help me improve my communication skills.

 

I suppose that pixels in an image can illustrate the same concept in a different representation, the main differences being that pixels are not samples or "complete" representations of anything. They're components that combine to form a static image, just as linked dyes combine to form the color image on emulsion-based film. And there are many different kinds of pixels that vary in shape, size, ability to display color, etc.

 

For Teresa et al: there are similarities that may help you understand the subject that started this discussion. Each pixel in an image can display multiple colors within a designated set, e.g. RGB or cyan-magenta-yellow-black (because not all pixels are functionally alike). An 8-bit color image can carry 2^8 = 256 colors - it's like having a box of 256 crayons that divide the entire visible spectrum into 256 parts by frequency. No matter how many shades of color are in the source, the screen will use the closest of the 256 colors it can display for each color in the source image. If the exact shade of red falls between two in the "crayon box", it will use the closer one. A 10-bit image can display 2^10 = 1024 different colors, so it can render an image closer to the original in color composition.

 

The accuracy of color rendition is somewhat analogous to the accuracy of voltage representation in a single sample of a digitized audio waveform, in that the exact value can only be stored to a limited precision. So it's "rounded" up or down to fit within the limits of that digital field. The more bits available per sample, the more accurately the value can be recorded (i.e. the more significant digits it contains and the smaller the potential difference - no pun intended - between the actual value and its digital approximation).
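A minimal sketch of that rounding, assuming a made-up exact value of 0.700001 V in a ±1 V range:

```python
def quantize(value, bits, full_scale=1.0):
    """Snap a value in [-full_scale, +full_scale] to the nearest of 2**bits levels."""
    step = 2 * full_scale / 2 ** bits
    return round(value / step) * step

exact = 0.700001  # an arbitrary "shade of red" / instantaneous voltage
for bits in (8, 10, 16, 24):
    approx = quantize(exact, bits)
    print(f"{bits:2d} bits -> {approx:.9f}  (error {abs(approx - exact):.2e})")
```

The stored approximation lands on whichever available level is closest, and the leftover error shrinks by half for every added bit.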

16 hours ago, mansr said:

Signal to noise ratio and dynamic range are the same thing.

That's true for the equipment but not the program material, which is an important functional difference.  Program material rarely has a DR equal to the SNR of the equipment through which it's being played.

 

Unless the DR of the program is equal to or greater than the SNR of the system (which is virtually unheard of today), the desired listening level will determine the effective SNR. If the recording is a quiet piece with limited DR, e.g. concerti for solo violin or guitar, the listener may turn the volume up enough to make system background noise intrusive. Tchaikovsky's 4th has a wide DR, so I set the volume control lower to avoid excessive peak SPL. This also lowers the background noise, so the audible SNR is higher.
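In rough numbers (every figure below is hypothetical, purely to show the dB bookkeeping):

```python
# Hypothetical figures, purely to illustrate the dB arithmetic.
SYSTEM_SNR = 90.0  # dB: noise floor below full-scale output at a given volume setting

def noise_spl(peak_spl):
    """SPL at which the system's background noise sits when peaks play at peak_spl."""
    return peak_spl - SYSTEM_SNR

# A quiet solo-guitar record turned up so its modest peaks hit 95 dB SPL:
print(noise_spl(95.0))  # 5 dB SPL of hiss - approaching audibility in a quiet room
# A wide-DR symphony played with the volume set lower, peaks at 85 dB SPL:
print(noise_spl(85.0))  # -5 dB SPL - below audibility
```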

40 minutes ago, davide256 said:

bit depth does determine dynamic range.

I'm not suggesting otherwise, and I apologize if I gave that impression. I was just trying to convey in simpler terms that the way it does this is by enabling more accurate capture of instantaneous signal levels in the source waveform, which obviously encompasses both the bottom and the top of the DR. And that accuracy is, in large part, determined by the errors that result from fitting each sample into a "word" of fixed size (i.e. the bit depth). This is quantization error, if I remember it all correctly.
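The standard textbook figure for that relationship - for ideal quantization, DR ≈ 6.02·N + 1.76 dB - puts numbers on it:

```python
for bits in (8, 16, 24):
    dr = 6.02 * bits + 1.76  # ideal dynamic range of N-bit quantization, in dB
    print(f"{bits} bits: ~{dr:.0f} dB")
# 8 bits: ~50 dB; 16 bits: ~98 dB; 24 bits: ~146 dB
```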

 

The other common confusion I see stemming from this is the failure to understand that the bit depth of the recorded file determines only the DR of the recording - the DR of the source file you're playing, not the SNR of your playback equipment.

4 minutes ago, mansr said:

I don't like the motion picture analogy since there each frame contains, in isolation, readily identified representations of the objects in the scene. In a single sample from an audio recording of an orchestra, there is no part for the violin, another for the oboe, etc. There is just a single number, the sum of the air pressures contributed by each instrument at one instant.

Very interesting thoughts - thanks!

 

I see the digitized waveform a bit differently, in that every instrument sounding at the moment of capture is present in every sample. The single instantaneous value being captured is the summation of all values for all parts being played. We can't separate them within an individual sample because there's no dynamic context - the samples by themselves contain data but no information, and are a perfect example of the difference between the two, in my opinion.

 

But sequenced as they were when captured, they define a complex waveform in which the individual parts can be identified by ear and in a Fourier transform. And with a little (OK, more than a little...) mathematical manipulation, we could determine the contribution of each instrument to the value of that sample. Of the 1.3 V in the 12,273,418th sample of a string trio piece, we might find that 0.2 V was the violin, 0.4 V the viola, 0.5 V the cello, 0.15 V the natural intermodulation of the three, and 0.05 V the cumulative noise.
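A toy version of that idea, with two pure tones standing in for the instruments (the frequencies, amplitudes, and sample index below are all made up): any single sample of the mix is just one number, but a Fourier transform over a run of samples pulls the parts back apart.

```python
import numpy as np

fs = 44100
t = np.arange(fs) / fs                      # one second of samples
violin = 0.2 * np.sin(2 * np.pi * 440 * t)  # stand-in "violin" at 440 Hz
cello = 0.5 * np.sin(2 * np.pi * 110 * t)   # stand-in "cello" at 110 Hz
mix = violin + cello                        # each sample is just the sum

print(mix[12345])                           # one number - no parts visible in isolation

spectrum = np.abs(np.fft.rfft(mix)) / (fs / 2)  # scale bins to peak amplitudes
print(spectrum[440], spectrum[110])         # ~0.2 and ~0.5: the parts reappear
```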

 

Just thinkin'...........😉

1 hour ago, The Computer Audiophile said:

 

The magic word in this is "available" (at 36 seconds into the video). Bit depth determines the dynamic range available for use during recording, i.e. the maximum DR of recordings captured by the system. This is independent of the source program itself and of playback equipment. I suspect you could record a rock band playing a song that has a DR of 4 dB (which is typical in some genres) at an average level near 0 dBFS with an 8-bit system and hear little if any difference compared to a 16- or 24-bit capture. The noise floor would be roughly 50 dB below the signal in an 8-bit file, and the signal would be sufficiently loud and sufficiently compressed to render any differences in accuracy inaudible.

 

Low bit depths create an artificially high noise floor by "compressing" all signal that falls within the lowest quantization step to the same amplitude, which makes the noise as loud as any musical content that's also within that range. Signals above the top of that range are unaffected by the noise, although they too are rounded within their own steps (which is not mentioned in the video).
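A rough simulation of that thought experiment, using a loud sine as a stand-in for heavily compressed program material (real music would differ in detail, but the noise floors come out about the same):

```python
import numpy as np

fs = 44100
t = np.arange(fs) / fs
loud = 0.9 * np.sin(2 * np.pi * 440 * t)  # stand-in for a loud, compressed track

for bits in (8, 16):
    step = 2.0 / 2 ** bits
    quantized = np.round(loud / step) * step  # "re-record" at this bit depth
    noise = quantized - loud                  # the quantization error is the added hiss
    snr = 10 * np.log10(np.mean(loud ** 2) / np.mean(noise ** 2))
    print(f"{bits} bits: measured SNR ~{snr:.0f} dB")
# ~49 dB at 8 bits and ~97 dB at 16: even at 8 bits the noise sits nearly 50 dB down.
```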

2 minutes ago, audiojerry said:

But so far, I don't feel like my original question has been answered - at least to my benefit. As best I can determine, bit depth captures both musical data and dynamic range - it is not exclusively one or the other.

I agree with you. Digital recording breaks the continuous range of analog signal levels into quanta and rounds every level within each quantum to a single value. This reduces dynamic contrast within each quantum and has to affect the liveliness of reproduction to some degree. Higher bit depth creates more (and smaller) quanta, so there should be less of this effect. I must admit that I'm not sure I can actually hear it in most recordings of the same material made at 16 and 24 bits - but theoretically, it makes sense to me.
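In concrete terms, assuming (purely for illustration) a 2 V peak-to-peak signal:

```python
for bits in (16, 24):
    levels = 2 ** bits
    step_uv = 2.0 / levels * 1e6  # spacing between adjacent quanta, in microvolts
    print(f"{bits} bits: {levels:,} quanta, {step_uv:.3f} uV apart")
# 16 bits: 65,536 quanta ~30.5 uV apart; 24 bits: 16,777,216 quanta ~0.12 uV apart
```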
