Summit Posted May 23, 2020

The short answer is yes: bias will affect the audio test result. But it is not so simple that bias tests should therefore be treated as some kind of litmus test for SQ. All types of subjective SQ tests* also affect the results, because the test methods force us to listen and reach conclusions about the quality of a product in a way that is unnatural for many. To be scientifically correct, the tests have to be done very differently from how many people normally listen to and evaluate audio gear**. IMO it is not so much the stress as the many repetitions and the short time available to hear how something sounds, compared with listening for a longer time in a familiar system.

Bias tests are good if, for example, we want to know whether one cable can sound different from another. They are not so good if we also want to know which cable's SQ people prefer. In other words, bias tests can be used for “discrimination testing” but not for “preference testing”. Threshold tests are not for audio gear; they are for testing human hearing.

*A/B test, ABX test, DBT, blind tests, etc.
**Familiar system, room, and many well-known but different-sounding records.
Summit Posted May 23, 2020

2 hours ago, pkane2001 said:
None of the studies I mentioned so far were used for discrimination testing: all were testing for listener preference. And none required short samples and fast switching during the test. As SAM said, this is an often used argument against blind testing, but perhaps that's an issue only with those who don't know how blind tests are conducted (and I don't mean just subjectivists here, objectivists often have their own misconceptions). This is why I'd like to have this discussion where actual findings and facts can be discussed and referenced instead of generally used but unsubstantiated arguments.

Testing for listener preference is one thing; it can be done with an A/B test, ABX test, DBT, blind tests, etc. Testing for bias is another. How would we know whether the test group, or some of its members, really liked one piece of gear over another, or whether it was confirmation bias? All types of subjective tests affect the results, not only blind tests.

I see that you consider my reasoning to be misconceptions and unsubstantiated arguments, so I will not post here anymore.
Summit Posted June 13, 2020

Big Sound 2015 blind listening test, with some participants that you may know.

https://www.innerfidelity.com/category/big-sound-2015
https://www.innerfidelity.com/content/big-sound-2015-wrap-what-i-learned
https://www.innerfidelity.com/content/big-sound-2015-roy-romaz-nails-it
https://www.head-fi.org/threads/schiit-happened-the-story-of-the-worlds-most-improbable-start-up.701900/page-516#post-11921090
Summit Posted June 27, 2020

23 hours ago, pkane2001 said:
There are often objections raised that blind audio test results are invalid because it's too difficult to do them right. That's often used by those who haven't tried blind tests or want to disprove the results of such tests, and usually without evidence.

You keep saying the same thing over and over again. I know of no genuine objection to the claim that blind audio tests are good for testing bias. The question is whether those bias tests, and the way they are conducted, are good for testing the subtle SQ differences of audio gear, gear that many people feel they need to evaluate for a long time to get to know how it sounds.

I have said it before and will tell you again: quick changes between gear are not how I and many other audiophiles listen to music, nor how we normally evaluate which gear to buy. Auditory memory is very short, so what those 5-30 second tests tell us is how one piece of gear sounds compared to another. That is completely different from listening for a long time, when your mind has time to calibrate and you can evaluate the sound of the gear against a reference, which is real, non-recorded sound.

None of the problems associated with blind tests arise because they are done blind; they arise because of how they are done. I have done a few blind tests with friends in the past, and the fast swapping of gear was no good. This is not limited to blind tests per se; the problem is the same whether the A/B test is sighted or blind. A blind test where we could first get to know the sound of the gear for a couple of hours, and then listen to a whole song before changing gear, was much more telling.
Summit Posted June 27, 2020

5 hours ago, pkane2001 said:
Your main objection seems to be that longer term audio evaluation is more sensitive than short term to small differences. That's an often brought out hypothesis, but I've yet to see any sort of objective evidence to support it. Can you cite some studies that demonstrate this increased sensitivity? By the way, your hypothesis in no way invalidates blind testing, it simply proposes a method of doing it.

That is correct; my problem has never been with testing SQ blind. I have done it myself a few times. The problem is the methodology commonly used in all types of tests to determine the sound quality of different gear. The biggest problem is that these tests are not conducted in the way that I and many audiophiles listen to and evaluate audio gear sighted. To be able to hear subtle differences between audio equipment, I need to be familiar with the room, the audio system and the recordings. A blind test or a sighted A/B test is only difficult to do if the point is to achieve statistically significant proof through many fast, repeated switches of gear.

Our auditory memory is very short, so a test where the gear is changed every 5-30 seconds won’t let us hear the SQ, just the change between the gear and, to some degree, their overall sound signature. To properly evaluate any audio gear, we first need to “calibrate” to the sound and then compare it to our long-term memory of real, non-recorded sound. Echoic memory and other short-term memories are not useful for this task.

When I test two pieces of audio gear in my stereo, I compare how the bass, drums, guitar, piano, voices, etc. sound with how they normally sound in real life, and to do that I need to use my long-term memory. Even when I compare two pieces of gear with each other, I use my memory of how those instruments sound, comparing each against both the other gear and my references, the memories I have of non-recorded sound.
I believe that to understand why most audio tests are flawed, we need to understand how we hear and compare sound. Here is a start.

“Each type of memory is tied to a particular type of brain function. Long-term memory, the class that we are most familiar with, is used to store facts, observations, and the stories of our lives. Working memory is used to hold the same kind of information for a much shorter amount of time, often just long enough for the information to be useful; for instance, working memory might hold the page number of a magazine article just long enough for you to turn to that page. Immediate memory is typically so short-lived that we don’t even think of it as memory; the brain uses immediate memory as a collecting bin, so that, for instance, when your eyes jump from point to point across a scene the individual snapshots are collected together into what seems like a smooth panorama.”

https://brainconnection.brainhq.com/2013/03/12/how-we-remember-and-why-we-forget/
Summit Posted June 28, 2020

15 hours ago, pkane2001 said:
Well, these are all conjectures or possible explanations for why it may work this way. And I don't necessarily disagree with anything you said here (I'm open to hearing evidence for one way or the other). But I still would like to see some studies or properly conducted blind tests that demonstrate that longer term listening can be more sensitive than shorter-term, 8-10 seconds switching when evaluating minor differences. That's certainly not the way the industry conducts subjective listening tests, although I've not found a clear indication of whether it's because short-term switching is just easier to conduct and more convenient, or because echoic memory limits our ability to evaluate minor audio differences beyond a few seconds. In conversations with some audio testing professionals, they did indicate that shorter-term, quick-switching was the way to detect minor SQ differences. But, again, that's just someone saying it, and what I'm looking for is objective evidence for whether it's true or not.

Okay. Before setting up and conducting a test, you have to decide what to measure and how it can be measured. Sound quality is subjective in nature and is therefore very difficult to measure, because people have preferences as well as bias. Sound quality also depends on many external factors, such as the quality of the room, the audio system, the recordings and how they are set up. So if the test method (time, place, audio system, etc.) is significantly different from how listening is normally done*, must we not question whether the test was conducted correctly, so that it actually measures the participants' ability to hear sound quality differences between the tested equipment, and the participants' bias?

I told you in my first post in this thread that bias tests can be used for “discrimination testing” but not for “preference testing”.
Preference testing has to be set up and conducted very differently from discrimination testing. The method used in all the published blind tests I have seen is perhaps suited to measuring discrimination, but not to measuring preference in sound quality or people’s ability to hear it.

“Discrimination testing is a technique employed in sensory analysis to determine whether there is a detectable difference among two or more products.”

Let me illustrate the type of hypotheses that can be tested with the method designed for discrimination testing.

Example 1: There is no audible difference between USB cables, regardless of price, design or model.

Example 2: People cannot distinguish a genuine native 96 kHz 24-bit audio recording from one that has been downsampled to 44.1 kHz 16-bit.

Detecting a difference and forming an opinion about which audio gear sounds best and most true are very different things. One is simple; the other is more complex. To detect a difference like those in the examples above, or in aspects like loudness, bass quantity, etc., we use our short-term memories, but when we evaluate SQ we use our long-term memory.

You ask for objective evidence. I have presented logical “evidence” that the commonly used test method is significantly different from how audiophiles normally evaluate sound quality between pieces of audio gear. IMO this alone should be enough to question all these sound quality tests, tests whose results go against the tests normally conducted at home by millions of audiophiles around the globe. To me it is clear that the researchers believe they can measure people’s subjective preference, and that this can be done by conducting the test as if it were a discrimination test. The reason for this (conjecture) is probably that they want their studies to be scientific, with statistically significant results. I have also pointed you to how our brain works and how it can affect audio tests.
Even if I am not an expert in how the brain and our hearing work, I know that the understanding of the brain and hearing has changed considerably in the last 10-15 years. It is true that I cannot quote a study that explicitly states that these tests are fundamentally flawed. My aim, however, is not to present scientific proof of why these studies are flawed. Everything that can affect the result of a study or test should be presented and explained, and if that effect will influence the result, the method has to be changed so that the test itself has no influence on the result.

*The best way to set up and conduct a test is to do it the way people normally listen to and evaluate sound quality. I know of no one who goes to a hi-fi shop and listens to a song while changing audio gear every 5-8 seconds. When I and many other audiophiles evaluate audio gear, we listen for hours, days or sometimes even weeks before buying it. Many audiophiles only buy gear that they can borrow and evaluate at home in their own audio system before buying. The same goes for reviewers: they often listen to the reviewed gear and compare it with their reference gear for weeks, and sometimes months, before publishing their verdicts.
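For the discrimination-style hypotheses in the post above (for example, whether listeners can tell two USB cables apart), the "statistically significant result" that such a test aims for can be sketched as an exact one-sided binomial test on ABX trial outcomes. This is only a minimal sketch and is not taken from any study mentioned in the thread: the function name `abx_p_value` and the conventional 0.05 threshold are assumptions made for illustration, and it says nothing about preference testing.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` right answers out of
    `trials` ABX trials by pure guessing (chance of a hit = 0.5)."""
    if trials <= 0 or not 0 <= correct <= trials:
        raise ValueError("need trials > 0 and 0 <= correct <= trials")
    # Sum the upper tail of the Binomial(trials, 0.5) distribution.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Hypothetical session: 12 correct identifications in 16 trials.
p = abx_p_value(12, 16)
print(f"p = {p:.4f}")  # compare against an assumed threshold such as 0.05
```

A small p-value only suggests that the listener heard some difference; it does not indicate which device the listener prefers, which is the distinction the post draws between discrimination and preference testing.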
Summit Posted June 29, 2020

18 hours ago, fas42 said:
Yes, decide what you want to measure ... IME many ambitious rigs are about developing various types of tonality seasoning, and then of course subjectivity is everything in the assessment of what one hears. But "sound quality" is not subjective, in my book - it is the degree to which there is no significant audible adulteration of what's on the recording by the playback chain - how one assesses is by listening for faults in the sound, clearly evident misbehaviour of the replay. To use the dreaded car analogy, 😜, most audiophiles compare by saying things like, I prefer the MB ambience over the BMW variety ... I say, I note that car 1 develops a slightly annoying vibration while accelerating, at a certain engine speed; whereas car 2 doesn't. Therefore, car 2 is the better quality car.

Sound quality is not subjective per se, but the listening evaluation normally is. I thought tonality seasoning was your thing.