
Does BIAS affect audio test results?


10 hours ago, pkane2001 said:

An attribute test should probably follow a discrimination test, since if the subjects can't tell the difference between DUTs, any attributes they rank for each one are probably not valid. But the real question in my mind is: what are the attributes that would be useful to rank in an audio attribute test? Here are some ideas, based on terms frequently thrown around in audiophile circles. Some of these need to be defined first, IMHO, and yet I think this would make for an interesting test. I'm listing them in no particular order, mixing well-defined attributes with those that may be completely undefined:

  • Treble response (brightness)
  • Bass response
  • Midrange, vocals
  • Full sound
  • Impact/Slam/Dynamics
  • Microdynamics
  • Upfront or laid back presentation
  • Depth of sound stage
  • Width of sound stage
  • Sound source separation
  • Air
  • Transparent/Veiled
  • Ambiance
  • Realism
  • ???? Any others ????
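
As a rough illustration of how such an attribute test could tally its results, here is a minimal sketch in Python - the 1-10 rating scale, the DUT labels and the example scores are assumptions for illustration, not anything specified above:

```python
# Minimal sketch of summarising a blind attribute-rating test.
# Attribute names follow the list above; the 1-10 scale, DUT labels
# and example scores are illustrative assumptions only.
from statistics import mean, stdev

# ratings[dut][attribute] -> one 1-10 score per listener
ratings = {
    "DUT A": {"realism": [7, 8, 6, 7], "bass response": [5, 6, 5, 6]},
    "DUT B": {"realism": [7, 7, 6, 8], "bass response": [8, 7, 8, 7]},
}

for dut, scores in ratings.items():
    for attribute, values in scores.items():
        print(f"{dut:6} {attribute:15} "
              f"mean={mean(values):.2f}  sd={stdev(values):.2f}")
```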

Nice one, Paul ... 🙂

 

I would rank them by their value for assessment, in this order:

 

  1. Realism
  2. Full sound
  3. Impact/Slam/Dynamics
  4. Transparent/Veiled
  5. Sound source separation
  6. Air
  7. Ambiance
  8. Depth of sound stage
  9. Width of sound stage
  10. Microdynamics
  11. Upfront or laid back presentation
  12. Treble response (brightness)
  13. Midrange, vocals
  14. Bass response

I ranked bass last because nothing I have come across suggests that achieving spectacularly flat bass response has anything to do with the item at the other end of the list: realism. Subjectively intense, powerful bass lines emerge automatically when the other attributes in the above list are at a high level, IME. A system can sound like it "has no bass!", or knock one over with the gutsiness of what's happening in the bottom end - yet the FR characteristics do not change one iota between those two states ...

11 hours ago, pkane2001 said:

 

Anecdotally, from studies (including those by Floyd Toole), and in my own personal testing, lack of bass leads to a less satisfying and less realistic presentation. Perhaps it's not true bass, but more the mid-bass, that's important. For example, a recording of a piano sounds unrealistic to me if I don't hear the resonance of the lower frequencies produced by a concert grand. A good piano is like a sports car engine: if it sounds thin, it isn't satisfying :) Your second most important attribute is 'full sound', which, to me, does seem to require a sufficiently competent lower-frequency presentation.

 

Having listened, somewhat recently, to a rig which used DEQX DSP, active amplification, and two extremely well-built subwoofers to guarantee that the FR was flat all the way down to 20Hz - a frequency sweep confirmed that the response down in the barely audible regions was definitely there - this did not do an ounce of good for bringing forth the richness of a pipe organ CD I brought along. The fact that all the bass frequencies were present didn't stop the overall sound from being "thin" and unsatisfying ... QED.
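
For reference, the kind of verification sweep described above is usually a logarithmic sine sweep. A minimal sketch of generating one - the sample rate, duration, level and output file name are my own assumptions, not details of that rig's setup:

```python
# Minimal sketch: generate a 20 Hz - 20 kHz logarithmic sine sweep for
# checking low-frequency extension. Sample rate, duration, level and
# file name are illustrative assumptions.
import numpy as np
from scipy.io import wavfile

fs = 48_000                      # sample rate, Hz
duration = 20.0                  # seconds
f_start, f_end = 20.0, 20_000.0  # sweep range, Hz

t = np.linspace(0, duration, int(fs * duration), endpoint=False)
k = np.log(f_end / f_start) / duration
# Exponential sweep: instantaneous frequency rises from f_start to f_end
phase = 2 * np.pi * f_start * (np.exp(k * t) - 1) / k
sweep = 0.5 * np.sin(phase)

wavfile.write("log_sweep_20Hz_20kHz.wav", fs, sweep.astype(np.float32))
```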

 

Quote

 

I think it's important to consider the relative scale that each of us assigns to various attributes, and I'm certain the scale is not the same for everyone. For me the tonal balance is more important than soundstage depth/width/separation, for example. As a long-time audiophile, I do enjoy 'spatial fireworks' in audio reproduction, but these are more often a curiosity, a nice-to-have. I notice them when they are present, but don't miss them that much when listening to good music. Proper tonal balance is something I simply couldn't live without, on the other hand. 

 

Agree with all of that, Paul 🙂.

  • 2 weeks later...
On 6/14/2020 at 8:38 AM, pkane2001 said:

 

This is my personal opinion, but it's hard to go from 'huge, obvious, night-and-day differences' to 'minor, often impossible to tell'. And yet, that's what blind tests often reveal: the differences are either extremely minor, or impossible to detect. Accepting results that so clearly clash with a person's life-long experience and many years of hobby pursuit is hard, and sometimes painful.

 

My experience has been that the circumstances under which blind tests are conducted are enough to completely undermine the optimisation the hobbyist has applied to his particular situation. IOW, the blind tests never actually test what they purport to test - the abyss in understanding between the two camps remains just as wide and deep as it ever has been, and will remain so until greater overall understanding evolves ...

 

11 hours ago, pkane2001 said:

 

You've obviously had a huge amount of experience conducting blind tests, Frank! Can you please share any of the results and procedures here, so we can try to reproduce your findings?

 

Paul, what I mean here is that the organisation of the gear so that a blind test can be carried out will often suffer from the observer effect - it will take a bit of care to ensure this is not the case.


Paul's doing an excellent job of developing tools that help with understanding what might be happening when the SQ is below par - I applaud what he has contributed to the community! 👍

 

Eventually, methods will be developed so that numbers can be applied, relatively straightforwardly, to what people are hearing - I'm always looking out for any serious research into getting closer to this goal; now and again something looks promising, but then it disappears into obscurity - there is little overall interest, it seems; far easier to keep beating the "it's all in their heads!" drum, 😜.

On 5/23/2020 at 12:25 AM, pkane2001 said:

This thread is specifically about bias in audio testing, how it may affect the results, and any mitigation strategies that can help deal with it.

Specifically about the last point, Paul, I'll mention the general methods I use when assessing:

 

First of all, the word "preference" is meaningless to me - that concept is alien to how I think ... either a system is acceptably accurate to the recording, or it's not. If the former, then when comparing two genuinely very highly performing systems I might favour one over the other - but this situation has never arisen for me.

 

What I find I'm always doing is determining whether a setup is acceptably accurate, and I go about that as follows. I have a range of recordings with very distinct characteristics; they have 'signatures' which may prove difficult for a particular rig to reproduce acceptably - or they may give it little trouble. When faced with a new replay setup, I'll throw an almost random set of recordings at it and see what first impressions tell me. Then, based on that feedback, I'll narrow down to just a few recordings which most clearly highlight where I feel the rig is not working at its best, and play these at various volumes to explore what further information that gives me. The better a particular system functions, the more 'difficult' the recordings I will use, the louder I will listen to them, and the 'deeper' into the sound image I will focus, to see if the finest details still come across correctly.

 

IOW, I'm always attentive to any signs of misbehaviour; any failures of the playback to retrieve what's on the recording to an acceptable standard ... by this approach I feel that bias plays no part - the aim is merely to identify what is incorrect in the playback.

2 hours ago, pkane2001 said:

 

While that may be true about you, Frank, it may not be true for most people :)

 

I agree it's not true for most audio enthusiasts - generally, the intent is for certain, prescribed recordings to sound impressive, and most everything else much less so - cutting myself off from so much good music is not particularly appealing, 😉.

 

2 hours ago, pkane2001 said:

 

When you say there is no preference except for what is accurate, this is certainly not true in all cases, and for one, it would be great if you could demonstrate that this is true with some objective evidence.

 

That it's true that most people would prefer, or not prefer, accurate playback?

 

2 hours ago, pkane2001 said:

Do we even know what "accurate reproduction" means for the whole audio system, including speakers, the room, and the listener? How can you be sure that what you think is "accurate reproduction," is not in fact, seriously distorted? Just because you think it is, doesn't make it so. As you can see in many of the studies cited here, even highly trained professionals fall for simple sighted bias and prefer something based on the appearance and expectation rather than on the sound, despite being aware of bias and being taught to avoid it.

 

Accurate reproduction can only be assessed at the interface between the speakers and the listening environment - headphones are of course the classic means for doing this, but this is not the ideal listening situation for many people. How I get around this is by what I've mentioned many times - listening to an individual speaker as if it were one half of a headphone ... strangely enough, the physical principles in place are actually rather similar ... 😜.

 

How it works in practice is, that if one can use a single speaker like a headphone driver, then the SQ is of a sufficient standard. Seriously distorted replay is completely obnoxious to listen to in this manner; makes it trivially obvious that the replay chain is faulty.

 

2 hours ago, pkane2001 said:

You may also find @Archimago's latest blind test results interesting. There's a result with a p-value better than 0.05 that points to the test subjects preferring a little distortion (-75dB THD) with their music over the completely undistorted version. This is not proof of anything, but it does point to the possibility that what sounds better to many isn't necessarily what is most accurate.

I mentioned in his thread on this forum about that test why something like this might happen - purely speculation on my part; I haven't listened to his samples on a well enough performing rig to check further.
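
For anyone wanting to sanity-check a "p better than 0.05" claim in a preference test of that sort, the usual tool is a two-sided binomial test against pure guessing. A minimal sketch - the listener and vote counts below are invented for illustration and are not Archimago's actual numbers:

```python
# Minimal sketch of testing whether a preference split beats chance.
# The counts are hypothetical, NOT Archimago's actual results.
from scipy.stats import binomtest

n_listeners = 100
prefer_distorted = 62    # hypothetical votes for the -75 dB THD version

result = binomtest(prefer_distorted, n_listeners, p=0.5, alternative="two-sided")
print(f"p-value = {result.pvalue:.4f}")   # < 0.05 would match the quoted claim

# For scale: -75 dB THD is a distortion amplitude ratio of
# 10 ** (-75 / 20), i.e. roughly 0.018% of the signal level.
print(f"-75 dB THD ~ {10 ** (-75 / 20):.6f} (linear ratio)")
```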

12 hours ago, Summit said:

 

Okay, before setting up and conducting a test, you have to decide what to measure and how it can be measured. Sound quality is subjective in nature and is therefore very difficult to measure, because people have preferences as well as biases. Sound quality also depends on many external factors, like the quality of the room, the audio system, the recordings, and how they are set up, etc.

Yes, decide what you want to measure ... IME many ambitious rigs are about developing various types of tonality seasoning, and then of course subjectivity is everything in the assessment of what one hears. But "sound quality" is not subjective, in my book - it is the degree to which there is no significant audible adulteration of what's on the recording by the playback chain - and how one assesses that is by listening for faults in the sound, clearly evident misbehaviour of the replay.

 

To use the dreaded car analogy, 😜, most audiophiles compare by saying things like, I prefer the MB ambience over the BMW variety ... I say, I note that car 1 develops a slightly annoying vibration while accelerating, at a certain engine speed; whereas car 2 doesn't. Therefore, car 2 is the better quality car.

5 hours ago, Summit said:

 

Sound quality is not subjective per se, but the listening evaluation normally is.

 

I thought tonality seasoning was your thing 9_9.  

 

Yes, it's all about how you listen ...

 

My thing? What I'm about is hearing what's on the recording; audiophiles as a group are deeply locked into a belief system that insists there is a hierarchy of "bad" to "good" recordings, where the worst of the former are completely unlistenable ... and so they instinctively gravitate to rigs with quality signatures which reinforce that thinking - the seasoning has a huge range of excuses for why it's done, and an obvious example held in high esteem is "tube rolling" ... don't like how the recording sounds? OK, change some active devices until the added distortion balances the nature of the recording ... the consumer becomes part of the mastering chain.

 

I won't be happy until the latest rig produces the same subjective presentation that I got 35 years ago, when the system back then was working at peak level - I'll know that I'm close to excluding all playback-chain personality when the experiences match - which, I would have thought, is a reasonable take on the situation, 😉.

6 hours ago, pkane2001 said:

The actual paper with analysis and detailed statistics is behind a paywall, but I don't think they analyzed false negatives. Since the result of their blind test was a clear, consistent preference that was statistically significant, I don't think a false negative was a possible outcome there:

 

Toole, F.E. and S. E. Olive, “Hearing is Believing vs. Believing is Hearing: Blind vs. Sighted Listening Tests and Other Interesting Things”, 97th Convention, Audio Eng. Soc., Preprint 3894 (1994 Nov.)

 

I found it to be available here, https://www.academia.edu/27512201/Hearing_is_Believing_vs._Believing_is_Hearing_Blind_vs._Sighted_Listening_Tests_and_Other_Interesting_Things?auto=download
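
On the false-negative point in the quote above: whether a miss was even plausible comes down to the statistical power of the test. A minimal sketch of that calculation for a simple correct/incorrect blind protocol - the trial count and the assumed true detection rate are illustrative assumptions, not figures from the Toole/Olive paper:

```python
# Minimal sketch of what a "false negative" means for a blind test scored
# as k consistent choices out of n trials: the chance a real effect fails
# to reach p < 0.05. Trial count and true detection rate are assumptions.
from scipy.stats import binom

n_trials = 40
alpha = 0.05
true_rate = 0.7   # assumed probability of a genuinely audible difference being picked

# Smallest number of consistent choices that reaches significance
# against the null hypothesis of guessing (p = 0.5), one-sided.
k_crit = next(k for k in range(n_trials + 1)
              if binom.sf(k - 1, n_trials, 0.5) <= alpha)

power = binom.sf(k_crit - 1, n_trials, true_rate)
print(f"need >= {k_crit}/{n_trials} correct to reach p <= {alpha}")
print(f"power = {power:.2f}, false-negative rate = {1 - power:.2f}")
```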

 

  • 3 weeks later...
2 hours ago, Chris987654321 said:

It's a really sad time in audio history, with high tech dominating the discussion, mainly by utilizing the double-blind test in favor of selling tech to maintain salaries worthy of the engineers' higher educations. Proclaiming that biases are at the root of enjoyment, so we must eliminate them to accurately know something, is short-sighted.

If one is 'biased' towards the subjective quality of live music, versus that of a hifi system, then indeed it's at the root of what's important ... 😉.

 

IME all the listening testing is completely off course - asking which sound "they prefer". What's actually needed is to ask which sound is closest to presenting the qualities that identify live music, irrespective of whether it sounds "nice", "not rough", or any of the other silly adjectives that seem to get used, 😁.

 

The science of audio has got it into its head that stereo reproduction is always going to be a lame duck, and therefore treats people's subjective reactions at a very coarse level - here, biases will be a huge factor, because the SQ in the testing is so low that people will grab onto anything to give themselves a hint as to whether one item is better than another, 😉.

