
Has there ever been a conclusive listening test?


Norton

Question

Has there ever been a methodologically robust blind listening test of digital audio formats, codecs, etc. that resulted in subjects being able to distinguish between, identify, and/or express a preference for particular formats to a statistically significant degree? For example, has any test demonstrated conclusively that subjects could distinguish between 16/44 and 24/96, and/or were able to identify which was which, and/or expressed a preference for 24/96?


9 answers to this question


You can download the full text PDF here:

https://www.researchgate.net/publication/257068631_Sampling_Rate_Discrimination_441_kHz_vs_882_kHz

 

They were comparing 88.2 kHz against the same material downsampled to 44.1 kHz. All samples were 24 bit. The gear used was top quality; you will rarely buy recordings that are this pure and unmolested.

 

I might pick a few nits with a couple of statements in their conclusions, and there are a couple of oddities in the results.

 

None of this argues against my idea that 88 or 96 kHz might be audible, but not enough to make a big difference. If you did everything at 44 or 48 kHz you'd be losing very little, if anything. It is not as if a recording where the only difference is the sample rate produces one wonderful result at the higher rate and a horrid, hard-to-listen-to result at the lower one.

And always keep in mind: cognitive biases, like optical illusions, are a sign of a normally functioning brain. We all have them; it's nothing to be ashamed of, but it is something that affects our objective evaluation of reality.


Thanks, I thought I’d wait for any contrary replies before responding. My impression was the same: I’d never read of a conclusive listening test comparing formats, although it’s not a subject I’d particularly seek out.

 

But it does raise the question: is this because:

 

a. all formats beyond (and maybe including) RBCD are just a “confidence game” in which the consumer equates numbers with sound quality in the absence of a legitimate quality reference point, what the cultural theorist Roland Barthes called the quantification of quality; or

 

b. listening tests to date have been flawed? I’m guessing that many involve a small number of subjects having to make relatively quick decisions with unfamiliar (and maybe disliked) music and equipment. I wonder whether something more conclusive might result if subjects were allowed to thoroughly familiarise themselves with the samples on their home system for a week or two before being tested blind, again at home, with subjects and sample set broadly aligned by musical taste. Of course the system would then be a variable.

 

It does also point up that it is highly selective to use inconclusive listening tests to dismiss any one format in particular, when the same result would likely be obtained for any format beyond higher-quality MP3.

 

2 hours ago, Norton said:

Thanks, I thought I’d wait for any contrary replies before responding. My impression was the same: I’d never read of a conclusive listening test comparing formats, although it’s not a subject I’d particularly seek out.

 

But it does raise the question: is this because:

 

a. all formats beyond (and maybe including) RBCD are just a “confidence game” in which the consumer equates numbers with sound quality in the absence of a legitimate quality reference point, what the cultural theorist Roland Barthes called the quantification of quality; or

I think (a.) just about describes it as it is.  

2 hours ago, Norton said:

 

b. listening tests to date have been flawed? I’m guessing that many involve a small number of subjects having to make relatively quick decisions with unfamiliar (and maybe disliked) music and equipment. I wonder whether something more conclusive might result if subjects were allowed to thoroughly familiarise themselves with the samples on their home system for a week or two before being tested blind, again at home, with subjects and sample set broadly aligned by musical taste. Of course the system would then be a variable.

 

It does also point up that it is highly selective to use inconclusive listening tests to dismiss any one format in particular, when the same result would likely be obtained for any format beyond higher-quality MP3.

 

There have been some tests addressing your wondering whether spending more time would help, and the results, such as they are, have shown that shorter listening segments work more reliably and down to smaller levels of discernment.

 

On the flip side, detection of actual degradation does respond to training. MP3 is a good example. Artifacts of the encoding can be demonstrated to listeners at very low bitrates like 92 kbps, where they are obvious; then at 128 kbps, then 160 kbps, and so on. Listeners can then hear artifacts in some suitable material at the higher bitrates where they would have missed them before the training.

 

One could use that approach with sample rates. Start with an 8 kHz rate, which everyone could hear as different from 88.2 or 96 kHz. Then raise the rate a little at a time and see where people no longer hear a difference versus those hi-rez rates. Is it only at those high rates, or lower? I don't know of that having been done by somewhere like McGill University or another academic outfit. That would be a much better approach: test some young, trained listeners and see where they stop hearing a difference.
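That raise-the-rate-until-the-difference-vanishes idea is essentially an adaptive staircase procedure from psychophysics. As a rough sketch, here is a simulated 2-down/1-up staircase against a hypothetical listener model; the threshold, starting rate, and step size are invented for illustration, not taken from any real test:

```python
import random

def run_staircase(threshold_khz: float, start_khz: float = 8.0,
                  step_khz: float = 4.0, trials: int = 200,
                  seed: int = 1) -> float:
    """2-down/1-up staircase: raise the sample rate under test after two
    consecutive correct discriminations, lower it after any miss.
    The listener model is hypothetical: always hears the difference
    below `threshold_khz`, and guesses (50/50) at or above it."""
    rng = random.Random(seed)
    rate, correct_in_a_row, history = start_khz, 0, []
    for _ in range(trials):
        heard = rate < threshold_khz or rng.random() < 0.5
        if heard:
            correct_in_a_row += 1
            if correct_in_a_row == 2:   # two right in a row -> make it harder
                rate += step_khz
                correct_in_a_row = 0
        else:                           # one miss -> make it easier
            rate = max(start_khz, rate - step_khz)
            correct_in_a_row = 0
        history.append(rate)
    # the staircase ends up hovering around the listener's threshold
    return sum(history[-50:]) / 50

print(run_staircase(20.0))   # hovers near the simulated 20 kHz threshold
```

A real study would of course use actual resampled audio and real listeners; the point of the sketch is only that the procedure converges on the rate where discrimination falls apart, rather than testing one fixed pair of rates.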

 

I don't consider that I was using an inconclusive listening test to dismiss a format. There have been many inconclusive tests, which leads me to think the reason is that there is very little to no difference: a preponderance of evidence, if not proof. There have been many blind listening tests in which MP3 was detected versus 44.1 kHz uncompressed sound, so the method works. MP3 was developed by such testing.
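To put a number on "detected" in a blind test: for two-choice trials the usual yardstick is a one-sided binomial test against pure guessing. A minimal sketch (the trial counts below are made up for illustration):

```python
from math import comb

def blind_test_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value: the probability of scoring at least
    `correct` out of `trials` two-choice trials by guessing alone."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 12 right out of 16 is below the usual 0.05 bar; 10 out of 16 is
# entirely consistent with guessing.
print(round(blind_test_p_value(12, 16), 3))   # ~0.038
print(round(blind_test_p_value(10, 16), 3))   # ~0.227
```

This is also why a pile of inconclusive tests carries weight: each one that fails to beat the guessing baseline is another data point against an audible difference.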

 

To my knowledge, testing like that wasn't used to create the CD standard. Rather, experience with even earlier digital recording methods at sample rates of 30, 32, and 37 kHz showed they weren't enough, so the CD standard was designed by spec from what was known about human hearing limits and how PCM systems worked. It would appear that even at this late date they didn't choose too badly, whether it is capable of 100% audible fidelity or just 99%.

 

 


13 minutes ago, esldude said:

I don't consider that I was using an inconclusive listening test to dismiss a format

 

Thanks, my final comment above was not aimed at anything in this thread, but rather at the wider practice of pointing to a lack of discernment in listening tests as evidence that one specific format is not worthwhile, which carries the erroneous implication that such listening tests did validate other formats.


I had wanted to include this test, with its unusual method and results. I couldn't find it at the time, but tripped over someone else discussing it today.

http://www.extra.research.philips.com/hera/people/aarts/RMA_papers/aar07pu4.pdf

 

It is a surround-setup test of mechanical sounds, and it has even more curious results. Over both conditions of the test, the average results indicate no winner between DXD and 44.1 kHz when using direct analog as a reference. The two conditions were a system with 100 kHz bandwidth in the microphones and playback gear, and one with the microphones and playback gear limited to 20 kHz. DXD is a 352.8 or 384 kHz rate at 32 bits.

 

They had a good surround setup with really good gear. They put microphones in an anechoic chamber, one for each channel of the surround rig, available as a direct analog real-time feed with no processing. They concurrently had a DXD ADC/DAC chain and a 44.1 kHz ADC/DAC chain to listen to. Listeners could use the direct analog feed as a reference and had to choose which of the two unknown digital systems was closest to it.

In the 100 kHz system, listeners showed a statistically significant preference for 44.1 kHz as sounding closer to the direct analog than DXD. These were forced-choice tests: you had to select one or the other. In the 20 kHz-limited system, listeners showed a significant preference for the DXD system. Odd and curious results, don't you think? And not ones that unambiguously point to high sample rates being better in either paper. It would have been nice to include an 88.2 or 96 kHz shootout as well.

There is an edge case where the results make some sense. With digital filtering very near the audible band, and with microphones and speakers that also roll off at that point, you have in a sense compounded filters at that frequency, and there are interactions that can become audible when that happens, especially when part of the filtering is nonlinear, which with speakers, and maybe microphones, is the case. That would point to 88.2 or 96 kHz moving those effects further apart, so there may sometimes be a benefit to those rates. It doesn't necessarily indicate that more is better, or that there is any benefit beyond 96 kHz. And even then it may only sometimes make a difference, depending on the quality of the recorded sound and the playback gear.



Thanks, that certainly is an unusual test with a somewhat counterintuitive outcome, although the author still seems to consider that the results justify the use of DXD over 16/44 for archiving.

 

Overall, it doesn’t do much to dispel my impression that either 1. listening tests as conducted to date are simply not a meaningful tool for comparing digital formats, or 2. there really is no significant difference between formats. I tend to think the former is more likely; the author does note: “One can argue that providing unfamiliar music or sonic environment has made it more difficult for listeners to effectively judge audio quality”. I suspect that factor has played a significant role in the apparent lack of conclusive results from listening tests in general.


I have found that the server and DAC influence the sound far more than the source resolution. While I might think a high-resolution rip sounds slightly this way or that compared to the native CD rip, a better DAC or server upgrade always produces an obvious audible improvement, or shall I say an audible alteration in sound quality.


Norton,

About ten of us did a listening test earlier of the Portland State University Chamber Choir's latest, "The Doors of Heaven." All of us are Portland State alumni, like the music, and are familiar with Magnepan 1.7i speakers. The group preference, out of FLAC 24/88.2, CD, MP3 and iTunes, was the iTunes version. This album was recorded by Stereophile editor John Atkinson.

 

I've made and been around enough recordings to know that the extra headroom of 24 bits is nice to work with, but listening to good CD recordings has always been enough. I might be a bit biased, because the only way I'm going to listen to Americana and alt-country in high resolution is to record them myself.

 

And as I remind people always "You are going to find high resolution a very hard sell."
