
The Optimal Sample Rate for Quality Audio



Timing resolution is not the same thing as timing performance. Timing information is not restricted to ongoing phase differences at low frequencies, but can also be obtained from the envelopes of high-frequency sounds (Henning, 1974).

The localization of sound sources by humans has been studied a lot in auditory neuroscience. One thing that plays a very important role in sound localization is transient information (i.e. the initial attack of a sound), which can be smeared over time by filter ringing and which, especially if the transient is very steep, cannot always be accurately reproduced by low- or medium-resolution digital audio. As I already mentioned in another thread a while ago, the white paper on the HDCD standard shows very clearly how and why Redbook falls completely short on this.

On top of this, binaural hearing relies not only on interaural time differences (ITDs) but also on interaural level differences (ILDs), as well as on monaural spectral cues related to the cavities of the external ear, reflections and shadows from the head and torso, and reverberation from objects such as the walls of a room.

Sound localization is largely influenced by visual cues and other neural interactions, for example information processed in the cortex. Interaural differences also play a role in the identification of sounds against background noise. The more familiar (or the more natural) a sound, the more accurately the human brain can usually localize and detect it.

Simply due to the way psychoacoustics works, blind listening tests cannot provide any solid proof as to whether two sounds are completely identical. For example, if someone listens to music in Redbook format and then listens to the same music in Hi Res right after that, the Hi Res version might reveal subtle sounds that were inaudible in the Redbook version. But when the same person switches back from Hi Res to Redbook, these subtle sounds can suddenly be heard in Redbook even though they were inaudible at first. So, because psychoacoustics can never be ruled out of the equation, those who still claim measurements are the be-all and end-all in sound quality are ultimately wrong IMHO.

If you had the memory of a goldfish, maybe it would work.
Much of psychoacoustics was determined with blind tests.

That's true.

To say they can provide no proof of two sounds being identical is rather ridiculous. Perhaps some phenomena will require particular conditions, but that doesn't prevent a blind test being possible.
I never said a blind test would be impossible. It's the outcome of the test that would tell you the exact opposite of what you would expect it to be. Rule number one is everything audio can be steered. During a blind listening test, people always listen in a different way than they would listen to the same sounds under different circumstances. This has all been explained at the beginning of this video:
I am not sure I believe your redbook hi rez example of what might be called inoculation. Once inoculated with the sacred hirez you then hear more in redbook? Even that could be tested blind if desired. I do understand one could hear a clearer rendition of a sound and, having picked something out, then go back to a more muddled result and find it more easily, knowing what to listen for. But it has to be present in the less clear rendition at an audible level to be discerned. An interesting test would be a recording available in the same form and the same mastering, where you could hear something first in hi-rez and then pick out the previously inaudible artifact in the lo rez form. I think that is conjecture on your part, based upon the idea that hi rez is higher resolution and easier for the listener to hear into. Something which may or may not be true in my opinion.
See the bottom of this page: TAS 194: Meridian Audio's Bob Stuart Talks with Robert Harley | AVguide

...And then continue to read on the next page. People aren't delusional, it's just the way the human brain works.

As for measurements being the be-all and end-all, I don't know that that is really the aim. Listening itself will always be the final arbiter. But one must include all that is known about how listening can be fooled or, well, you will get fooled.

That was my whole point from the start. Well, actually... Even if you do include all that is known about how listening can be fooled, you will still be fooled. This is because if you knew everything about how listening can be fooled, it would have probably won you the Nobel prize. lol

Measurements can provide a consistent background in many areas. Measurements are an excellent shortcut for ruling out known factors. They can also usually be adapted to measure any new result coming to light, to help figure out just what is going on.
A shortcut inside the human brain? Hmmmm... :D
It can sometimes eliminate factors as well.

Meaning, it will usually eliminate the wrong factors until someone comes up with a whole new way of looking at things. That's science!

Intuitive leaps in science and engineering may have a grip on the public imagination, but to me, anyone who announces that there is a "whole new way of looking at things" has a virtually 100% chance of being a crackpot.

Again, my point exactly. The temporal resolution for monaural sound is about 2 ms, yes indeed, but for binaural sound you're looking at an entirely different picture. Binaural measurements were achieved using only narrowband tones. Complex, natural sounds are wideband. So then, who knows which way of looking at things will give you the least chance of being a crackpot?

Last time I checked, the Nyquist-Shannon theorem only applied to strictly band-limited signals, which are necessarily continuous and of infinite duration. With a time-limited, discrete signal you'll typically get both quantization and interpolation errors, which can have an audible impact. A modern filter not only causes post-ringing, but also changes the characteristics of background noise (i.e. noise that's already there before the filter is applied). Hence, the steepness of a filter does matter, even when using the best oversampling and upsampling techniques.
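
To put rough numbers on the filter-steepness point, here is a minimal Python sketch that compares the step response of a steep and a gentle linear-phase low-pass filter. Everything in it is an assumption made for illustration: the Blackman-windowed sinc design, the tap counts (101 vs. 21), the cutoff of 0.1 cycles per sample, and the helper names `blackman_sinc_lowpass` and `ringing`. It does not model the filter of any particular DAC, and it says nothing about audibility.

```python
import math
from itertools import accumulate


def blackman_sinc_lowpass(num_taps, cutoff):
    """Linear-phase low-pass FIR (num_taps odd, cutoff in cycles per sample)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal brick-wall impulse response, truncated to num_taps samples.
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        # Blackman window tapers the truncated response towards its ends.
        a = 2 * math.pi * n / (num_taps - 1)
        window = 0.42 - 0.5 * math.cos(a) + 0.08 * math.cos(2 * a)
        taps.append(ideal * window)
    dc_gain = sum(taps)
    return [t / dc_gain for t in taps]  # normalise to unity gain at DC


def ringing(num_taps, cutoff=0.1):
    """Return (overshoot, pre_ringing) of the unit-step response."""
    # The response to a unit step is the running sum of the taps.
    response = list(accumulate(blackman_sinc_lowpass(num_taps, cutoff)))
    overshoot = max(response) - 1.0   # how far it rings above the final value
    pre_ringing = -min(response)      # how far it dips below zero before the edge
    return overshoot, pre_ringing


steep_over, steep_pre = ringing(101)   # long filter, narrow transition band
gentle_over, gentle_pre = ringing(21)  # short filter, wide transition band
print(f"steep:  overshoot {steep_over:.3f}, pre-ringing {steep_pre:.3f}")
print(f"gentle: overshoot {gentle_over:.3f}, pre-ringing {gentle_pre:.3f}")
```

In this sketch the longer (steeper) filter rings noticeably more on both sides of the edge, while the shorter one rings less but rolls off far more slowly, so it rejects less of the content just above the cutoff. That is the usual trade-off, shown here only for this one assumed design.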

Sure, within those constraints it is impossible. Or, as a computer scientist would say, "hard" :)

 

But if you remove some of the constraints - allow massive oversampling and a practically-endless number of taps (both for filtering and for compensation), you have a situation where throwing CPU power at the problem is actually a solution.

Not exactly. Like I said, the problem is not just the post-ringing, but also the inaccurately reproduced steep transients and the way the filter will change the characteristics of the noise.

 

Then there is the entirely separate debate about how harmful the ringing actually is - as long as it is post- and not pre-ringing.

To my ears, it's very clearly audible each time I listen for pleasure and for lengthy periods of time (I listen to rock, prog rock, hard rock and heavy metal a lot and yes, I like it loud). But each time I listen just to try and tell the differences and to make a judgement about how harmful the ringing actually is, whether in blind, double-blind or sighted listening tests, the outcome is usually inconclusive. So, my personal listening experience matches precisely what Bob Stuart told Robert Harley in the interview that I linked earlier in the thread. By the way... I do not own, nor have I ever owned, any Meridian Audio products, so I am in no way biased here.

That interview covers a fair number of different topics, so I am not sure which part you are referring to.

It starts at the bottom of page 2:

Robert: There’s not a linear relationship between the objective magnitude of a change and the musical significance of that change.

And it continues onto the next page:

Robert: That brings to mind a conversation we had at CES about why blind listening tests may not be reliable. You said that when exposed to sound, our brain builds a model over time of what’s creating that sound. The rapid switching in blind testing doesn’t allow that natural process to occur, and we get confused.

The replies from Bob Stuart made so much sense to me that, recently, I started reading up on psychoacoustics myself ("Psychoacoustics - Facts and Models" 3rd ed. by Hugo Fastl & Eberhard Zwicker, and "Auditory Neuroscience" by Jan Schnupp, Israel Nelken & Andrew King). Especially the chapters on binaural hearing have been very informative to me, even though I have to admit my scientific and technical knowledge is mostly limited to the world of IT.

Almost all DACs do oversampling to reach some much-higher-than-RedBook rate (e.g., 352.8 or 384kHz)
So? The ESS SABRE³² Reference ES9018 chip can even upsample 24-bit 192 kHz material to no less than 1536 kHz (...and it uses a 32-bit internal data path to go with that).
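
For reference, the rates mentioned in these two posts are all the same integer multiple of their source rates. A small Python check, where the function name `oversampling_factor` is made up for this example and the source rates of 44.1 and 48 kHz for the 352.8 and 384 kHz figures are an assumption:

```python
def oversampling_factor(source_hz, target_hz):
    """Integer ratio between an oversampled rate and its source rate."""
    factor, remainder = divmod(target_hz, source_hz)
    if remainder:
        raise ValueError("target rate is not an integer multiple of the source rate")
    return factor


# (source rate, oversampled rate) in Hz, taken from the figures quoted above.
pairs = [(44_100, 352_800), (48_000, 384_000), (192_000, 1_536_000)]
for source, target in pairs:
    print(f"{source} Hz -> {target} Hz: {oversampling_factor(source, target)}x")
```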
Is that something special? ;)

Of course it isn't. That was actually my point, even my $1K DAC (which I consider relatively cheap) can do it. :)

I already support the same with 64-bit floating point internal data path and 32-bit integer output. Or alternatively up to 24.576 MHz 1-bit Delta-Sigma.

Yeah, but IMO that's overkill if your DAC is connected directly to your power amp, with no EQ, preamp or analog attenuation. The theoretical 144 dB of dynamic range you'll get with just 24-bit integer output already provides sufficient headroom, since thermal noise kicks in at around -120 dB, and a 32-bit float internal data path ought to be just as good as a 64-bit float internal data path for just upsampling 24-bit 192 kHz material. Or am I wrong?
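
The 144 dB figure follows from the usual rule of thumb of about 6.02 dB per bit, i.e. 20·log10(2^N) for N-bit integer samples. A short Python sketch of that arithmetic; the helper names are invented for the example, and the -120 dB noise floor is simply the figure from the post, not a measurement:

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit


def dynamic_range_db(bits):
    """Ratio of full scale to one quantisation step, in dB."""
    return bits * DB_PER_BIT


def equivalent_bits(noise_floor_db):
    """Number of bits whose theoretical range equals a given noise floor."""
    return abs(noise_floor_db) / DB_PER_BIT


for bits in (16, 24, 32):
    print(f"{bits}-bit integer: {dynamic_range_db(bits):.1f} dB")
print(f"-120 dB noise floor is roughly {equivalent_bits(-120):.1f} bits")
```

By this measure a noise floor around -120 dB corresponds to about 20 bits, which is why 24-bit output already sits below it.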

Of course. More is always better. Just as with cars - 6 cylinders is better than 4. 8 cylinders is better than 6. 12 cylinders is better than 8. 24 cylinders is better than 12. 32 cylinders...

I was talking about the internal upsampling from 192 kHz to 1536 kHz. As for car analogies, I think the best car should have no cylinders at all... just a single big jet turbine will nicely fit the job. =P

Absolutely - if you own your own oil well :)

 

I have seen the Rover effort - and have one of the offspring of that effort in my garage, but in hindsight, it was less than a resounding success. :)

Hehe.

I know of no converter in the $1000 range that does 4x rates. Not talking what's on the spec sheet but what comes out of my speakers. Even at $2000, the best I've heard isn't even spec'd for 4x rates - but does a wonderful job at 96k.

I wasn't trying to suggest my DAC sounds fully transparent at 192 kHz, just that I think it sounds slightly better at 192 kHz when compared to 96 kHz. So, I have no other choice but to disagree with Dan Lavry's paper.

It seems to me you prefer 192 over 96.

Did I misunderstand?

I would be exaggerating a lot by saying I can't enjoy listening to music in 96 but yeah, on my system it seems like 192 adds that extra hint of convincing realism. It sounds somewhat more involving, with slightly better perceived dynamics and detail, as well as an improved three-dimensionality of the soundstage.

I don't know which DAC spdif-usb has
It's the Eastern Electric MiniMax DAC Plus (with the vacuum tube removed and using its built-in M2Tech OEM async USB input). It's based on the ESS SABRE³² Reference ES9018 chip and, with the tube removed, IMO it sounds surprisingly closer to a Weiss DAC202 than the price difference would suggest. The Weiss is obviously more accurate and more analytical, but I'd say the Weiss can also be more fatiguing in the long run exactly because of that. lol There is absolutely nothing wrong with the 96 kHz performance of the Plus. In fact, I think it's better than, for example, the Wyred4Sound DAC2, by a mile or so. :)
I think the ES9018 internally upsamples everything to 192K.

I think it upsamples everything to 1536 kHz but I also think it's extremely difficult, if not completely impossible, to find an affordable DAC that can make 96 kHz sound better than 192 kHz sounds on my DAC.

All the other affordable DACs that use the ES9018 sounded harsh to me, or both harsh and clinical, but not the Plus. None of the affordable DACs that don't use the ES9018 sounded to me like they could even begin to compete with the Plus, simply because the ES9018 chip typically outputs such a maniacal amount of detail (true detail, not brightness) that it makes the dual mono Wolfsons sound completely muffled in comparison. For only $1K, that IMO is something really very special.

The Dagogo review Doug Schroeder wrote about the Plus was published after I had already purchased my Plus (I paid the introductory price for it). I still can't believe a single word of what he wrote in that review TBH. Increasing the value of a system by $50K? Go away, please... I keep telling my ears the cake is a lie, but my ears don't want to listen to me. Instead, they believe the review! :)

The latest I saw claims that faster sampling yields better stereo location (time resolution). The argument is false. Faster sampling offers the ability to process wider bandwidth, but has no impact whatsoever on stereo location!

8. Faster sampling for capturing bandwidth that we do not hear (ultrasonic) is not wise. If we did not hear it (or feel it) we don’t need it. If we did hear it (or feel it) it is not ultrasonic, it is audible bandwidth (by definition).

I disagree. As I stated earlier, temporal resolution is not the same thing as timing performance. While it is true we cannot hear ultrasonics (hence the term "ultra"), experiments in auditory neuroscience have indicated that the localization of sound objects (and their identification against a noise background) relies not only on interaural timing differences (ITDs), but also on interaural level differences (ILDs) and spatial cues in the spectrum.
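
The narrow timing claim in the quoted text can at least be checked numerically for the simplest case. The Python sketch below samples a steady 1 kHz tone at 44.1 kHz with a delay of 5 µs, which is much shorter than one sample period (about 22.7 µs), and recovers that delay from the samples by correlating against sine and cosine references. The tone frequency, the delay, the ideal unquantised samples and the name `estimate_delay` are all assumptions chosen for the example. It covers only a steady band-limited tone and does not address steep transients, filter ringing, or the ILD and spectral cues discussed above.

```python
import math


def estimate_delay(samples, freq_hz, fs_hz):
    """Estimate the delay of a sampled sine from its phase (quadrature correlation)."""
    in_phase = quadrature = 0.0
    for n, x in enumerate(samples):
        theta = 2 * math.pi * freq_hz * n / fs_hz
        in_phase += x * math.sin(theta)
        quadrature += x * math.cos(theta)
    # For x[n] = sin(theta - phi): in_phase ~ cos(phi), quadrature ~ -sin(phi).
    phase = math.atan2(-quadrature, in_phase)
    return phase / (2 * math.pi * freq_hz)


FS_HZ = 44_100        # Redbook sample rate
FREQ_HZ = 1_000       # test tone (assumed for the example)
TRUE_DELAY_S = 5e-6   # 5 microseconds, well under one sample period
NUM_SAMPLES = 441     # exactly 10 periods of the tone at this rate

samples = [
    math.sin(2 * math.pi * FREQ_HZ * (n / FS_HZ - TRUE_DELAY_S))
    for n in range(NUM_SAMPLES)
]
estimated_delay_s = estimate_delay(samples, FREQ_HZ, FS_HZ)
sample_period_s = 1 / FS_HZ

print(f"sample period:   {sample_period_s * 1e6:.2f} us")
print(f"estimated delay: {estimated_delay_s * 1e6:.3f} us")
```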

The energy leakage of a window function in a filter is the root cause of ringing. Apodizing can help to reduce pre-ringing to a minimum. However, it also introduces new problems, such as nonlinearity, and it doesn't fix the problem of post-ringing. The more familiar (or the more natural) the characteristics of a sound, the more susceptible we are to subtle differences. The measurable distortion in gear based on vacuum tubes is typically much bigger than in solid state gear. However, this does not necessarily mean vacuum tubes are "completely useless"...

Moreover, the flicker in the backlight of an LCD TV is invisible. However, this does not mean it cannot cause sore eyes or a headache in the long run. The impact of audio quality on humans stretches much farther than what the mathematicians and the electronics engineers think, and IMO the bottom line is people shouldn't mix psychoacoustics with marketing.
