The relation between amplitude and sample rate

Audio data consists of samples (snapshots) of sound. Each sample represents the amplitude (level) of the sound at that instant. Sequences of samples in turn give rise to frequencies, but that is not important for this story (it can be worked out later, should it become important to the subject after all).

The sample rate determines how many snapshots are taken per second (remember, this is digital), and the bit depth determines how many variations can be stored for the level. For Redbook CD the sample rate is 44100 and the bit depth is 16. Since 16 bits can represent a decimal value between 0 and 65535, 65536 different levels may occur in the sample registration, and one sample carries one of these values.

Again not important is that in real life those values are spread over the plus and minus voltage they represent; for 16 bits this runs between decimal +32767 and -32768, the maximum values representing, in a normalized situation, 2 Vrms.

When the bit depth increases, each additional bit doubles the number of possible levels. We won't go into the bits and bytes, but for this subject it is important to know that an additional 8 bits create 256 times more level possibilities, compared to the 16 bits which is our base for today. This means that the output voltage (which is what it is all about in the end) is represented by 16,777,216 different values, or 8,388,607 for the plus voltage and 8,388,608 for the minus. Here too 2 Vrms would be the normal range, so keep in mind that only the number of different levels, hence the resolution, varies that much more (256 times) compared to our 16-bit base.
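As a rough numerical sketch of the counts above (in Python; the 2 Vrms full scale is the normalization assumed in this thread, not a universal standard):

```python
import math

levels_16 = 2 ** 16               # 65536 distinct codes
levels_24 = 2 ** 24               # 16777216 distinct codes
ratio = levels_24 // levels_16    # 256: each extra 8 bits multiplies the levels by 256

# Peak voltage of a 2 Vrms full-scale sine: 2 * sqrt(2) ≈ 2.83 V,
# so the full span is about 5.66 V peak-to-peak.
v_peak = 2.0 * math.sqrt(2.0)
step_16 = (2 * v_peak) / levels_16   # ≈ 86 µV per code
step_24 = (2 * v_peak) / levels_24   # ≈ 0.34 µV per code

print(levels_16, levels_24, ratio)
print(f"{step_16 * 1e6:.1f} uV vs {step_24 * 1e6:.2f} uV per step")
```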

The separation into plus and minus is not really important, but for visualizing things it is good to know that thousands of subsequent samples may stay in the plus range before crossing zero, then stay in the minus range for a longer time, and so back and forth. And no matter where the "wave" is, it is varying in amplitude all the time, and you may well say that each higher frequency modulates upon a lower one. No, this too is not important, until someone comes up and says it is.

For ease of thinking, let's visualize the samples on the x-axis (horizontal) and the level each sample represents on the y-axis (vertical). The x-axis represents the time domain, and the y-axis represents the amplitude domain.

(In the screenshot, only a fraction of the maximum amplitude range is used, representing a few mV; each sideways step represents one sample, while the smallest vertical steps represent the smallest possible amplitude steps.)

Now, down to the matter!

While our base is Redbook, thus 44100 samples per second and 65536 possible amplitude values, what happens if we upsample that to 88200 (twice the sample rate) and in the meantime start to use 24 bits (256 times more possible amplitude values)?

The question springs from here: http://www.computeraudiophile.com/content/Multi-Bit-DACs-vs-Delta-Sigma-DACs#comment-22682 and the gist of it is:

Note that due to the logarithmic nature of the ear, although 24 bits has 256 times as many "steps" as 16, it won't sound 256 times better...

and in response to that:

Any idea how to utilize those 256 times more resolution at a native sample rate that is 4 times higher than Redbook's, the latter being what I called the horizontal resolution?

... which is exactly what this thread is about (though here we upsample only two times).

I say that when the sample rate is doubled, which means we give ourselves an additional headroom on the x-axis of one sample, there is no room to utilize the additional 256 possible vertical steps. We can utilize only one of them!

Only if the sample rate were 256 times higher as well could the headroom of the 24 bits over the 16 be utilized.

While we are theoretically able to register the level information at a 256 times finer resolution, we can't put it anywhere, because we can only pick one value. And although that one value will be 256 times closer to reality than with 16 bits, the steps from one sample to the next are nearly as rough as before, though two times finer.

So there is a relation between the time domain and the amplitude domain, and it seems to say that when the time domain doubles its resolution, the amplitude domain also only "needs" a factor of two, which means one additional bit. BUT:

The latter would only be true for linear interpolation, or IOW if reality showed that each additional sample (for resolution) depicts an amplitude value exactly in the middle of the two adjacent (old) samples. And this is not true, because we know that linear interpolation creates piles of harmonic distortion.
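That claim can be checked numerically. The sketch below (my own illustration, not from the thread) linearly interpolates the 88.2 kHz midpoints of a 1 kHz sine sampled at 44.1 kHz and compares them with the true signal values at those instants; the residual is the distortion/imaging energy linear interpolation creates, and it grows with frequency:

```python
import math

fs = 44100          # original sample rate
f = 1000.0          # 1 kHz test tone
n = 4410            # 100 ms of samples

interp_mid, true_mid = [], []
for i in range(n - 1):
    s0 = math.sin(2 * math.pi * f * i / fs)
    s1 = math.sin(2 * math.pi * f * (i + 1) / fs)
    interp_mid.append((s0 + s1) / 2)            # linearly interpolated 88.2 kHz midpoint
    true_mid.append(math.sin(2 * math.pi * f * (i + 0.5) / fs))  # what the signal really does there

err = [a - b for a, b in zip(interp_mid, true_mid)]
rms_err = math.sqrt(sum(e * e for e in err) / len(err))
rel_db = 20 * math.log10(rms_err / math.sqrt(0.5))
print(f"interpolation error for a 1 kHz tone: {rel_db:.1f} dB")   # ≈ -52 dB
```

At -52 dB re full scale the midpoint guess is already far rougher than the 16-bit noise floor, and repeating this with f = 10000.0 makes it an order of magnitude worse.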

- Maybe if the linear interpolation were turned into a logarithmic approach, no HD would come from it?

-> Which surely needs more than 1 additional bit.

So there will be a relation between the sample rate and the number of levels needed to represent the jump from one sample to the next at an appropriate level. When 16/44100 is taken as a base which does this correctly (by the grace of the recording/ADC doing it correctly), how many additional bits will be needed when we make the sample rate twice as high, in order to retain the same level of harmonic distortion? "24 bits will do" is no answer, because there just will be a relation.
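For what it's worth, there is a standard textbook relation in this direction: with plain oversampling (no noise shaping), spreading the quantization noise over a wider band and filtering back to the audio band gains about 3 dB, i.e. half a bit, per doubling of the sample rate. A sketch of that relation (my own, assuming an ideal brick-wall filter at 22.05 kHz):

```python
import math

def effective_bits(n_bits, fs, audio_bw=22050.0):
    """Effective in-band resolution after ideal low-pass filtering."""
    osr = fs / (2 * audio_bw)          # oversampling ratio vs. the audio band
    return n_bits + 0.5 * math.log2(osr)

for fs in (44100, 88200, 176400, 352800):
    print(fs, round(effective_bits(16, fs), 2))
# 44100 -> 16.0, 88200 -> 16.5, 176400 -> 17.0, 352800 -> 17.5
```

So by this measure, doubling the rate buys only half a bit of in-band resolution, not the 8 bits the 16-to-24 step provides.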

(by now I have the feeling I can work it out myself, if only the logarithmic scale is used)

Robin

People will probably need to refer to the thread Peter has linked to here to understand some of the following.

Right, so if we sample something, how can the number of bits be related to how fast we do it? Well, consider that we are recreating a band-limited signal: for the reconstruction filter to work, the precision of the band-limited signal depends on both amplitude and timing (i.e. you can see how jitter can cause distortion in the amplitude domain, and how noise in the amplitude domain may cause problems in the time domain). This is why the source (e.g. the ADC) is incredibly important.

The next issue we have, at the DAC, is that even if the source is 16 bits, performing filtering on it digitally needs extra bits; my earlier examples showed how important the filter is. If your filter works at only 16 bits, you will introduce noise into the base band (the audio band), because the processing operates at that same 16-bit level. Additionally, you will not be able to provide enough rejection of the images.
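A small illustration of why the processing needs more bits than the source (the sample and coefficient values are made up for the example):

```python
# One multiply inside a digital filter, in 16-bit fixed point.
x = -23170                 # a 16-bit sample, about -3 dBFS
h = 19661                  # a 16-bit Q15 filter coefficient, about 0.6
y_full = x * h             # the exact product needs up to 32 bits
y_trunc = y_full >> 15     # scaled back to 16-bit Q15 precision
err = y_full - (y_trunc << 15)
print(y_full, y_trunc, err)   # err is the information the truncation discards
```

Every tap of the filter produces such a wide product, and throwing the low bits away at 16-bit precision is exactly the noise injected into the audio band.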

So how many bits do we need? The short answer is: as many as we can get in the time we have!

Peter, you are discussing what happens when you go up from 16/44 in either sample rate or bit depth. But why should we assume that 16/44 has an optimal relationship between time and amplitude? Remember, that standard was chosen as a balance between commercial concerns and what was actually possible at the time, or what looked like it would soon be possible.

So what is the optimal relation between sample rate and amplitude, in order that we are not wasting resolution on either axis?

Paul Stubblebine
Paul Stubblebine Mastering, San Francisco
The Tape Project, LLC
serious student of the audio arts

• 1 year later...

So how many bits do we need? The short answer is: as many as we can get in the time we have!

Reading this through again after wanting to link to it elsewhere: maybe, just maybe, I can explain the workings of this better today.

The above quote is correct, I think: the more (amplitude) levels we have available to us, the more accurately the next sample will register its amplitude level. But somehow my brain wants to see this from the other given base: the bit depth itself and what it allows for. Tough to explain ...

We have all these level possibilities, but we're not utilizing them:

Once a sample comes along, we make giant leaps through all that nice inherent granularity of level. All it says is that any sample rate is suitable for the levels we have. Thus, take a sample once per second, or 44100 times per second, or 192000 times: the level "picked" will always be accurate. Still, the jump from one picked level to the next is dictated by the sampler's speed. The higher the sample speed, the more of the available level granularity will be utilized, until the sample speed is so high that two adjacent samples pick the very same level, even on a steep slope. When that happens we need another bit again, so that a value in between can be chosen.
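The point about adjacent samples landing on the same level can be put in numbers. The sketch below (my own, for a full-scale 1 kHz sine in 16 bits) shows the largest sample-to-sample jump in LSB steps, and how high the sample rate would have to be before that jump drops below one LSB:

```python
import math

def max_step_lsb(freq, fs, bits=16):
    """Largest sample-to-sample change of a full-scale sine, in LSBs."""
    full_scale = 2 ** (bits - 1) - 1             # 32767 for 16 bits
    return full_scale * 2 * math.pi * freq / fs  # max slope over one sample interval

for fs in (44100, 88200, 192000, 384000):
    print(fs, round(max_step_lsb(1000.0, fs)))   # thousands of LSBs per sample

# Sample rate at which even a full-scale 1 kHz sine moves less than
# one 16-bit LSB between samples:
fs_needed = 32767 * 2 * math.pi * 1000.0
print(f"step < 1 LSB only above {fs_needed / 1e6:.0f} MHz")
```

So at any practical rate the per-sample jumps remain thousands of LSBs wide, which is the "giant leaps" picture above.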

Yes, by itself this is related to the logarithmic nature of it all, or IOW this won't be about "a value in between", but about a value which can be properly chosen with regard to that logarithmic nature. Now it becomes difficult for me, because suddenly 256 steps might be too few, even when we're already using those 256 additional steps (24 bits vs. 16).

I think this makes my whole subject rather moot, because increasing the sample rate by a factor of 2 (starting out from 44.1) may already need more than 256 times additional level granularity. This is quite the opposite of my thinking when starting this little thread.

There's also this angle :

When we look at the higher frequencies (like 18 kHz) of a 44.1 sampled wave, it is clear to me that 256 times more level granularity isn't going to help a thing to make that wave look better (mind you, before filtering). This is just the very poor ratio of the sample speed to the number of levels available. That 18 kHz is close to the two samples per cycle we have available at 22.05 kHz, and while 65536 levels are still available, they won't help a thing, relatively speaking: the picked levels can be off by up to 50% (maybe someone can make 100% of that), and whether we can choose a level with 65536-step granularity or 256 times finer, that 50% will really only change into 49.99% or so.
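A quick sketch of that 18 kHz case (my own numbers; "worst case" here means a crest landing midway between two samples, before any reconstruction filtering):

```python
import math

fs, f = 44100, 18000.0
print(f"samples per cycle: {fs / f:.2f}")          # ≈ 2.45

# Worst case for a single crest: the peak falls midway between two
# samples, so the nearest sample sits pi * f / fs radians off the crest.
worst = math.cos(math.pi * f / fs)
print(f"nearest sample reaches only {worst:.0%} of the true peak")

# More bits refine the vertical reading of that sample, but cannot
# move the sample closer to the crest in time:
print(f"16-bit reading error <= {1 / 2 ** 15:.1e}, 24-bit <= {1 / 2 ** 23:.1e}")
```

The vertical reading error is already tiny at 16 bits compared with the tens of percent lost to the sparse time grid, which is the 50% vs. 49.99% point above.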

The subject becomes really difficult when we see that the reconstruction filter may make a nice fluent sine of our test signal (which is periodic, and not music); now suddenly the bits will matter, but only when the filter is really, really long (and rings forever).

I give up for now ...

This thread would seem to provide the perfect prelude to a question I have thought about for years: why not change the base, to decrease the amplitude step between bits, when we add bit depth to the specification?

I know the practical answer (hardware fixes the step represented in base 2), but you can see that a change to base 1.9 (for example) would easily reduce the dB range represented by 24 bits toward something less radically in excess of our hearing, and provide finer granularity. In this case the first bit would represent 1, the second 1.9, the third 3.61, the fourth 6.859, and so on, using integer powers of a smaller base. The equivalent base-2 amplitudes would be 1, 2, 4, and 8. Those numbers are the base-10 equivalents, not the dB steps.

I realize that increased headroom is important to signal processing, meaning that greater bit depth than the output signal is beneficial, so that would be needed in the processing chain, though not in the final file.

I would like to hear why folks think this might not represent a path to increased sound quality.
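As a numerical sketch of the idea (my own interpretation: each "bit" k contributes a weight of 1.9^k when set), base 1.9 does shrink the 24-bit range, but the weights no longer space the levels evenly:

```python
import itertools
import math

def range_db(base, bits=24):
    """Dynamic range: largest representable sum over the smallest weight."""
    top = sum(base ** k for k in range(bits))
    return 20 * math.log10(top)

print(f"base 2.0: {range_db(2.0):.1f} dB")   # ≈ 144.5 dB
print(f"base 1.9: {range_db(1.9):.1f} dB")   # ≈ 134.7 dB

# But non-integer weights no longer tile the amplitude axis evenly.
# Enumerate every value the 8 weights 1.9^0 .. 1.9^7 can sum to and
# inspect the spacing between adjacent representable levels:
weights = [1.9 ** k for k in range(8)]
levels = sorted(set(round(sum(c), 9)
                    for r in range(9)
                    for c in itertools.combinations(weights, r)))
gaps = [b - a for a, b in zip(levels, levels[1:])]
print(f"{len(levels)} levels, gaps from {min(gaps):.3f} to {max(gaps):.3f}")
```

The irregular gaps hint at one practical objection: the codes stop mapping one-to-one onto a uniform quantizer grid, so conversion hardware and DSP would both need redesigning around a non-uniform number system.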

thanks,

Skip