
Audio Blind Testing


Recommended Posts

20 hours ago, GUTB said:

I have nothing to add to the topic of audio blind testing -- except to say that it won't work unless there are very large differences in sound. Not to any degree of acceptable mathematical rigor, anyway. This is due to the ear-brain issues that are difficult to control for.

 

I attempted to take the concept of ABX seriously some years ago, and found that the tools available were so poor, or simply unavailable, that I lost interest fast. Specifically, Foobar's ABX extension has so many issues that it was a waste of time - a badly made hammer is useless as a tool.

 

In other areas where hardware is involved the difficulties are horrendous, especially in the areas where I operate - altering part of a system where, say, hard wiring is essential means that the exercise is an impossibility.

 

The point, yet again, is to completely forget about whether something is different - that is useless as a means of advancing the status or performance of an audio system; a complete waste of time. The only criterion should be whether one can detect that the playback is audibly faulty or not - and then resolve any failings. If a luxury car has a rattle in it, the only concern is to get rid of the rattle - not "do you prefer this quality, or that quality, of annoying noise?", or "if you increase the thickness of the carpet it makes it harder to hear the rattle!" I shake my head a lot of the time when I read comments about audio ...

4 minutes ago, fas42 said:

 

I attempted to take the concept of ABX seriously some years ago, and found that the tools available were so poor, or simply unavailable, that I lost interest fast. Specifically, Foobar's ABX extension has so many issues that it was a waste of time - a badly made hammer is useless as a tool.

 

In other areas where hardware is involved the difficulties are horrendous, especially in the areas where I operate - altering part of a system where, say, hard wiring is essential means that the exercise is an impossibility.

 

The point, yet again, is to completely forget about whether something is different - that is useless as a means of advancing the status or performance of an audio system; a complete waste of time. The only criterion should be whether one can detect that the playback is audibly faulty or not - and then resolve any failings. If a luxury car has a rattle in it, the only concern is to get rid of the rattle - not "do you prefer this quality, or that quality, of annoying noise?", or "if you increase the thickness of the carpet it makes it harder to hear the rattle!" I shake my head a lot of the time when I read comments about audio ...

 

A very valid approach. You can look at your system as a constant journey to fix problems. In my case, the big problem to be addressed is lack of dynamic force. Other problems are a soundstage that isn’t fully unfurling, and limited bass response (the latter admittedly deliberate, to avoid destructive room modes).


I’ve recently conducted a personal “blind” A/B test of MQA vs. CD on Tidal. Here’s how I did it. In my iPad BlueSound app, I opened Tidal and looked up the CD and MQA versions of an album. I then added the same track from each version to the play queue, and kept adding that pair until I had a 20-track queue of alternating versions. Clicking on the album cover in the queue changes the display to a page that shows the queue scrolling horizontally across the bottom. When the queue is sequenced like this, all you see is a horizontal row of identical thumbnails of the album cover; you can’t tell from the thumbnails which individual track is the CD version and which is the MQA version. The only way to tell visually is to actually play one of the tracks - a little circle appears elsewhere on the page with either “CD” or the MQA symbol, and it can easily be covered by a Post-it.

 

To start the “randomization” I just scrolled through the queue at the bottom to approximately the midway point and selected one of the identical thumbnails. As long as I avoided scrolling to a starting point that showed either the first or last item in the queue, it was impossible to tell which version had been selected. After listening to the first randomly selected track, I then listened to its “neighbor” (which, because of the A/B sequencing of the queue, I knew was from the other format). I went back and forth several times until I was ready to pull the Post-it off and see which track was in which format. It worked like a charm for a “blinded” personal A/B comparison of the CD and MQA versions in Tidal!

 

In case anyone is interested, I tried this on four different tracks, two of which I had previously listened to “sighted” and two of which I had never before listened to on Tidal in either format. I was able to identify which was which 100% of the time. In the two previously “sighted” tests, I did not even need to do the “B” test to correctly identify which format the “A” test came from. In the two unsighted tests, I needed to repeat the A/B test twice on one of the tracks and three times on the other before I was sufficiently confident in my decision. Not statistically significant for you? You’ll get no argument from me on that. However, that was not my goal in doing this little experiment. I simply wanted to challenge the strong sense I had developed over the past month, from extensive “sighted” comparisons, that I could consistently recognize and distinguish between the CD and MQA versions of recordings. The experiment satisfied me that I’m indeed really hearing the difference and not just submitting to confirmation bias in my general preference for MQA recordings. (Apologies for introducing a very controversial topic in the middle of another very controversial topic!)
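
For anyone who wants to put a number on "not statistically significant": under the usual null hypothesis that every call is a coin flip, the chance of a perfect score follows the binomial distribution. A minimal sketch (the trial counts below are illustrations, not a claim about the exact procedure above):

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial p-value: probability of getting at least
    `correct` right out of `trials` by pure guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Four tracks, all identified correctly:
p = abx_p_value(4, 4)      # 1/16 = 0.0625, just short of the usual 0.05 cutoff
print(f"p = {p:.4f}")

# How many consecutive correct calls before p drops below 0.05?
n = 1
while abx_p_value(n, n) >= 0.05:
    n += 1
print(n)  # 5 trials: p = 1/32 = 0.03125
```

So a perfect 4/4 run is suggestive but one trial short of the conventional 5% threshold, which is consistent with the poster's own caveat.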

15 hours ago, sandyk said:

  I agree.

The only way is to have somebody else not connected with the listening decisions part, to manually swap out the devices under test behind the scenes.

 

And of course, that person shouldn't know which device is being "hooked up" any more than the listeners should. Otherwise it's not a true DBT.
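
A protocol along those lines can be kept honest with a pre-generated, sealed assignment list that is only revealed at scoring time. A minimal sketch, with hypothetical trial counts and placeholder guesses:

```python
import random

def make_schedule(trials=10, seed=None):
    """Hidden A/B assignment for each trial; generated up front and
    kept sealed until all of the listener's guesses are recorded."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(trials)]

def score(schedule, guesses):
    """Reveal phase: count how many guesses matched the hidden schedule."""
    return sum(s == g for s, g in zip(schedule, guesses))

schedule = make_schedule(trials=10, seed=2024)   # seed shown only for repeatability
guesses = list("ABABABABAB")                     # placeholder listener responses
print(score(schedule, guesses), "of", len(schedule), "correct")
```

The point of the seed is that the schedule can be regenerated and audited afterward, so neither the listener nor the person running the trials needs to see it in advance.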

 

The co-inventor (with Ben Muller) of the ABX comparator, one Arny Krueger, invented the thing (by his own admission) in order to prove that everything (except speakers) sounded the same. This is a point that he argued on Usenet for years against me and others (and notoriously against John Atkinson in a famous public debate). After debating his nonsense for years about amplifiers, DACs, disc players and vinyl setups, I came to the conclusion that he couldn't hear. He made so many ridiculous assertions that, as far as I'm concerned, the man has no credibility whatsoever!

Some of his assertions: the original Dynaco Stereo 120 solid-state amplifier from the 1960s sounded exactly like the then-latest amps from Krell, Pass, Audio Research, etc. This was at a time when it was common knowledge that anyone could see (and ostensibly hear) the nasty crossover notch on a sine wave on the oscilloscope, caused by the 2N3055 output transistors in the ST120 being too slow to switch cleanly and too fragile to be biased far enough into class AB to eliminate the notch. The amp sounded awful and was only tolerated by the audiophile public because it was cheap and powerful (for the day), and the audio press was touting the "transistor sound" as a good thing! He also asserted that the (then) latest $100 Japanese receivers from Costco sounded exactly like "so-called high-end amplifiers" costing upwards of one hundred times as much! He also maintained that he was still using the original Sony CDP-101 player from 1982, that it sounded exactly like the latest high-end players from MSB, dCS, etc., and that they were a rip-off! Another of Krueger's classic assertions was that the latest turntable/arm/cartridge combinations were no better than those of the 1960s and that absolutely no progress had been made in that field!
While it is true that some decks from those days can still be satisfying performers when restored (Garrard 301, 401, Thorens TD-124, TD-125, the AR turntable (sans arm), etc.), arms, cartridges and decks have improved in leaps and bounds. I've had turntables from the '60s, '70s and '80s up to the present, and I can tell that the best vinyl rigs of today will knock the socks off the best that any 20th-century playback rig had to offer (not to say that these older decks can't sound good, but they simply cannot retrieve from the grooves the level of SQ that today's best vinyl rigs can). It's an eye-opening experience to hear what even old LPs can sound like on a state-of-the-art rig from Walker, VPI, Air Force or Clearaudio (to name a few)!

My point is: how can an ABX comparator designer like Krueger make a totally transparent comparator when he can't hear the difference between transparent and non-transparent, or the differences between the equipment likely to be tested with it?

George

41 minutes ago, GUTB said:

 

A very valid approach. You can look at your system as a constant journey to fix problems. In my case, the big problem to be addressed is lack of dynamic force. Other problems are a soundstage that isn’t fully unfurling, and limited bass response (the latter admittedly deliberate, to avoid destructive room modes).

 

That's the idea! :P:D

 

Immediately one can consider what needs to be addressed, having stated those concerns - as an example, a lack of dynamic force implies that distortion levels are building too fast as the volume rises, which could be due to a number of factors, each of which can be looked at in turn. Personally, I would now consider whether the speakers are sufficiently stabilised in their location, whether the power supplies of the amplifier are sufficiently sorted, and how well the components are isolated from each other's impact on power supply noise - as a couple of starters.

15 hours ago, GUTB said:

... @gmgraves admits to having faced it himself, but he chooses to believe the differences he heard were in his head after being influenced by the stress of blind testing.

 

Actually, I came to the conclusion that these interconnect cable differences were imaginary after they disappeared in DBT after DBT. It was then that I realized what a strong influence expectation and confirmation bias play on the human brain. The senses are easily corrupted and can only be trusted to a certain degree (one example of a more trustworthy opinion is one that results from long-term listening over many different times of day and in different moods). The truth is that people see and hear what they expect to see and hear, and often what they want (consciously or subconsciously) to see and hear. Criminal science has determined that no evidence in a court of law is more unreliable than the well-meaning eyewitness. Unfortunately, the court systems (at least in the USA) have yet to evolve to the point where eyewitness testimony is accorded the diminished weight it deserves in an actual trial.

George

39 minutes ago, knickerhawk said:

To start the “randomization” I just scrolled through the queue at the bottom to approximately the midway point and selected one of the identical thumbnails. As long as I avoided scrolling to a starting point that showed either the first or last item in the queue, it was impossible to tell which version had been selected. After listening to the first randomly selected track, I then listened to its “neighbor” (which, because of the A/B sequencing of the queue, I knew was from the other format). I went back and forth several times until I was ready to pull the Post-it off and see which track was in which format. It worked like a charm for a “blinded” personal A/B comparison of the CD and MQA versions in Tidal!

 

Good one! I've had ideas of doing a variation of this, to satisfy those who just have to have "somethin' scientific" before they believe anything. There are clearly audible differences between MQA and without - the point then is whether MQA is just "distorting" the raw version to make it "nicer" for a high percentage of people.

53 minutes ago, knickerhawk said:

I’ve recently conducted a personal “blind” A/B test of MQA vs. CD on Tidal. Here’s how I did it. In my iPad BlueSound app, I opened Tidal and looked up the CD and MQA versions of an album. I then added the same track from each version to the play queue, and kept adding that pair until I had a 20-track queue of alternating versions. Clicking on the album cover in the queue changes the display to a page that shows the queue scrolling horizontally across the bottom. When the queue is sequenced like this, all you see is a horizontal row of identical thumbnails of the album cover; you can’t tell from the thumbnails which individual track is the CD version and which is the MQA version. The only way to tell visually is to actually play one of the tracks - a little circle appears elsewhere on the page with either “CD” or the MQA symbol, and it can easily be covered by a Post-it.

 

To start the “randomization” I just scrolled through the queue at the bottom to approximately the midway point and selected one of the identical thumbnails. As long as I avoided scrolling to a starting point that showed either the first or last item in the queue, it was impossible to tell which version had been selected. After listening to the first randomly selected track, I then listened to its “neighbor” (which, because of the A/B sequencing of the queue, I knew was from the other format). I went back and forth several times until I was ready to pull the Post-it off and see which track was in which format. It worked like a charm for a “blinded” personal A/B comparison of the CD and MQA versions in Tidal!

 

In case anyone is interested, I tried this on four different tracks, two of which I had previously listened to “sighted” and two of which I had never before listened to on Tidal in either format. I was able to identify which was which 100% of the time. In the two previously “sighted” tests, I did not even need to do the “B” test to correctly identify which format the “A” test came from. In the two unsighted tests, I needed to repeat the A/B test twice on one of the tracks and three times on the other before I was sufficiently confident in my decision. Not statistically significant for you? You’ll get no argument from me on that. However, that was not my goal in doing this little experiment. I simply wanted to challenge the strong sense I had developed over the past month, from extensive “sighted” comparisons, that I could consistently recognize and distinguish between the CD and MQA versions of recordings. The experiment satisfied me that I’m indeed really hearing the difference and not just submitting to confirmation bias in my general preference for MQA recordings. (Apologies for introducing a very controversial topic in the middle of another very controversial topic!)

 

First, that is not a blind test, the reason being you are comparing high-res vs. CD quality. Second, you do not know if the MQA and CD files are from the same master, which is a HUGE problem with this test. Before you do any test like this, please read up on how to actually perform a blind test. You have made at least four blunders that basically negate your conclusion.

Current:  Daphile on an AMD A10-9500 with 16 GB RAM

DAC - TEAC UD-501 DAC 

Pre-amp - Rotel RC-1590

Amplification - Benchmark AHB2 amplifier

Speakers - Revel M126Be with 2 REL 7/ti subwoofers

Cables - Tara Labs RSC Reference and Blue Jean Cable Balanced Interconnects

30 minutes ago, Ralf11 said:

claiming it DOES exist with no evidence is a LOT worse

 

 

 

Everyone accepts that speakers sound different, because different materials are used in the design and assembly of the drivers - no-one gets excited when someone claims that speaker A sounds different from speaker B. Yet, this all evaporates when we move from the nominally mechanical world of speakers, to the nominally electrical world of the other components - so, do we believe that electrical behaviour is a magical aspect of nature that behaves precisely as the textbooks state, by very simple rules, under all circumstances?

 

I don't - the world of audio is beset with parts that have all sorts of parasitic behaviours - they don't behave just like the textbook says - because, hey, they are made by humans, using imperfect manufacturing processes - I would be immensely surprised if some electrical part was "perfect" - which includes cable. The more I've gone into it, the messier it gets - a good system is always a balancing act of compromises: get right what is most important; be less fussy with the rest.

 

Claiming that some part of an audio system couldn't be better in how it functioned would be a true absurdity, IMO. The argument is really whether it always does its job well enough so that it never is audible, in any situation - my experience is that the better a system gets, the more a nuisance the residual behaviours become - because their impact is no longer hidden under the "noise" of the more obvious shortcomings of the rig.

 

Now, if someone claimed that a luxury vehicle couldn't develop an annoying rattle, under any circumstances - then I would be extremely skeptical ...

1 hour ago, fas42 said:

Things like interconnect cable differences will always fail in DBT - that's because the time factor aspect is never part of the test -

 The time factor is very relevant due to the length of time between cable changeovers.

The problem here is that even if you switch using a relay-type comparator, you are introducing other variables, such as plugs and sockets and additional length, compared with using just the cables themselves.

 

How a digital audio file sounds, or a digital video file looks, is governed to a large extent by the power supply area. All that identical checksums give you is the possibility of REGENERATING the file to something close to the original file.


25 minutes ago, sandyk said:

 The time factor is very relevant due to the length of time between cable changeovers.

The problem here is that even if you switch using a relay-type comparator, you are introducing other variables, such as plugs and sockets and additional length, compared with using just the cables themselves.

 

I'm thinking in terms of it taking time for the cable materials to settle down and stabilise after they've been manhandled (womanhandled?), and for the metal-to-metal contacts in the path to reach a long-term stable state - I hate this sort of thing, so I use cheap, everyday wire which I hardwire into the setup - and then leave it ... problem solved ...

2 hours ago, botrytis said:

 

First, that is not a blind test, the reason being you are comparing high-res vs. CD quality. Second, you do not know if the MQA and CD files are from the same master, which is a HUGE problem with this test. Before you do any test like this, please read up on how to actually perform a blind test. You have made at least four blunders that basically negate your conclusion.

 

Go buy a $400 Pro-Ject or even a $200 DragonFly Red, and listen to several good MQA recordings vs the non-MQA version. The difference is NOT slight. I find it 100% believable someone could ace a blind test.

24 minutes ago, GUTB said:

 

Go buy a $400 Pro-Ject or even a $200 DragonFly Red, and listen to several good MQA recordings vs the non-MQA version. The difference is NOT slight. I find it 100% believable someone could ace a blind test.

 

As I said, IT DEPENDS on the master. Unless you know the recordings are from the EXACT same master, it is utter nonsense to say one is better than the other. The problem is that even though MQA purports to be accurate with time delays, etc., it is not SONICALLY ACCURATE. The filtering adds noise above the Nyquist frequency used for the PCM files. This noise, when added to files above 48 kHz resolution, can actually add sympathetic noise at midrange or higher frequencies. This has been shown in numerous tests of MQA files; therefore, it doesn't matter if the files are time accurate when they are not sonically accurate. The amount the regular files are off can barely be detected by human hearing. So why push a file system like this? It is utterly flummoxing to me.
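
As a toy illustration of how careless processing can deposit energy above the original Nyquist frequency (a generic demonstration of imaging from naive zero-stuffed upsampling, not a model of MQA's actual filter):

```python
import cmath
import math

def dft_mag(x):
    """Naive O(N^2) DFT magnitude - slow, but dependency-free."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]

n, f = 64, 5                                # 64 samples of a bin-5 sine
sig = [math.sin(2 * math.pi * f * t / n) for t in range(n)]

# Naive 2x upsample: insert a zero after every sample, no reconstruction filter.
up = []
for s in sig:
    up.extend([s, 0.0])

mag = dft_mag(up)
# The tone remains at bin 5 (conjugate at 123), but images appear at bins
# 59 and 69 -- both above the original Nyquist, which sits at bin 32 of the
# new 128-point grid. A reconstruction filter must remove that energy.
peaks = sorted(range(len(mag)), key=mag.__getitem__, reverse=True)[:4]
print(sorted(peaks))  # [5, 59, 69, 123]
```

Whether any of this residue is audible in a given format is exactly what the thread is arguing about; the sketch only shows where out-of-band images come from.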


2 minutes ago, botrytis said:

 

As I said, IT DEPENDS on the master. Unless you know the recordings are from the EXACT same master, it is utter nonsense to say one is better than the other. The problem is that even though MQA purports to be accurate with time delays, etc., it is not SONICALLY ACCURATE. The filtering adds noise above the Nyquist frequency used for the PCM files. This noise, when added to files above 48 kHz resolution, can actually add sympathetic noise at midrange or higher frequencies. This has been shown in numerous tests of MQA files; therefore, it doesn't matter if the files are time accurate when they are not sonically accurate. The amount the regular files are off can barely be detected by human hearing. So why push a file system like this? It is utterly flummoxing to me.

 

Let's inject some reality for a second.

 

Everyone knows that SACDs sound better than CDs. It's widely postulated that the reason for this isn't because SACD as a format is that much better than CD, but because SACDs tend to be mastered better. If we accept the premise that mastering makes the difference and SACDs generally have better mastering -- then we can't escape the conclusion that SACDs are still generally better.

32 minutes ago, pkane2001 said:

 

 

What does time have to do with the blind aspect of the test? If you think that a blind test must only be done with a rapid A/B switching, then you might be right, but who said you have to do that?

 

If you can tell a difference in a fully sighted test, then try to repeat exactly the same test, blind. Same time frames, same comparisons, same material, except that you don't know which cable is in the system. If you fail to distinguish them while blind, you can be fairly confident that whatever differences you heard while testing sighted were not real. And if you're afraid that the pressure of a blind test or performance anxiety will skew the test -- take a Xanax.

 

 

 

As a following post emphasised, it's to do with all the materials in the path stabilising - as the most straightforward example, the metal to metal contacts at either end of the cable are initially "clean", from the wiping of the contact surfaces - then they steadily build up corrosion contaminants, which affects the sound. Slowly the construction of the cable comes into play, altering the spectrum of distortion artifacts.

 

Not something I've done myself, but an audio friend had both normal and pricey audiophile cable. There was a clear difference between the two used normally, he said; then he took up my suggestion to hardwire the links in the system ... and the differences between the everyday and "special" cable went away! That convinced him ... I'm not interested in listening for different types of distortion, I want no distortion - hence, create air-tight connections everywhere that matters.


not sure how time is being used in all the posts

 

but if the time difference between listening sessions is considered, that may well make it more difficult for a listener to compare - most sensory phenomena (and their processing in the brain or elsewhere) are 'designed' to do comparisons

 

One good way to do listening comparisons is listen to one short passage on A, and then on B

- this is trivial to set up using 2 different CD players; not so easy with 2 different speakers

 

*** ...not to rule out extended sessions

14 minutes ago, Ralf11 said:

One good way to do listening comparisons is listen to one short passage on A, and then on B

- this is trivial to set up using 2 different CD players; not so easy with 2 different speakers

 

*** ...not to rule out extended sessions

 

An easy method for picking audible variations between versions of source files, which I have used on occasion, is to bring both into an editing program like Audacity and make sure they are fairly closely synchronised - then pick a likely area that may vary, say a few seconds' worth. Set up a loop playing that spot continuously, soloing one track only - let it build up a rhythm of sound in your mind, almost like a meditative thing - and then switch the track being played in the loop. Often the difference will hit you like a solid thump - or it may not vary a beat in the sense of it ... you have an answer, just like that.
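
The same idea can be pushed further with a numerical null test: once the two versions are time-aligned and trimmed to equal length, subtracting them shows exactly where, and by how much, they differ. A minimal pure-Python sketch, assuming the samples have already been exported as aligned float lists (the alignment step done in the editor):

```python
def null_test(a, b):
    """Subtract two time-aligned sample sequences and report the peak
    residual. Near-zero means the two versions are effectively identical;
    a large residual marks a spot worth auditioning in the loop."""
    if len(a) != len(b):
        raise ValueError("align and trim the clips to equal length first")
    residual = [x - y for x, y in zip(a, b)]
    return max(abs(r) for r in residual)

# Toy example: identical clips null to zero; a small level difference does not.
clip = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
assert null_test(clip, clip) == 0.0
louder = [1.1 * s for s in clip]
print(null_test(clip, louder))  # peak residual of about 0.1
```

Note that any level mismatch dominates a null test, so gains must be matched before the residual says anything about real format differences.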

54 minutes ago, fas42 said:

 

As a following post emphasised, it's to do with all the materials in the path stabilising - as the most straightforward example, the metal to metal contacts at either end of the cable are initially "clean", from the wiping of the contact surfaces - then they steadily build up corrosion contaminants, which affects the sound. Slowly the construction of the cable comes into play, altering the spectrum of distortion artifacts.

 

Regardless of any supposed 'settling' or 'break-in' issues related to cables, the point is that a sighted test is subject to exactly the same constraints as a blind test as far as time is concerned. What would cause a cable to take longer to stabilize in a blind test than the same cable in a sighted test?

 

Out of curiosity, do you have any references to measurements or other objective studies of  cable stabilization time?

1 hour ago, pkane2001 said:

 

Regardless of any supposed 'settling' or 'break-in' issues related to cables, the point is that a sighted test is subject to exactly the same constraints as a blind test as far as time is concerned. What would cause a cable to take longer to stabilize in a blind test than the same cable in a sighted test?

 

Out of curiosity, do you have any references to measurements or other objective studies of  cable stabilization time?

 

We could be talking hours or days for conditions to stabilise - which makes it difficult to run ABX under those circumstances! Other times the variation is almost immediate - depending upon precisely what it is that's causing a 'problem'.

 

Things taking a long time to stabilise are a curse in any field, I used to run my system 24/7 decades ago, because of this behaviour - a waste of power, etc, but I hated the loss of quality that occurred every time there was a switch on from cold, which took ages to settle down.

 

I can't recall coming across any measurement data or other studies - it's usually anecdotal reports that draw an "Ah hah!" of recognition from me ... that's the best I can do here.

 

As I've said a number of times, my goal would be for a system, assembled from scratch, to reach acceptable quality about 5 minutes or so after power on - I've never achieved this ... perhaps down the track ...

18 minutes ago, fas42 said:

 

We could be talking hours or days for conditions to stabilise - which makes it difficult to run ABX under those circumstances! Other times the variation is almost immediate - depending upon precisely what it is that's causing a 'problem'.

 

Things taking a long time to stabilise are a curse in any field, I used to run my system 24/7 decades ago, because of this behaviour - a waste of power, etc, but I hated the loss of quality that occurred every time there was a switch on from cold, which took ages to settle down.

 

I can't recall coming across any measurement data or other studies - it's usually anecdotal reports that draw an "Ah hah!" of recognition from me ... that's the best I can do here.

 

As I've said a number of times, my goal would be for a system, assembled from scratch, to reach acceptable quality about 5 minutes or so after power on - I've never achieved this ... perhaps down the track ...

 

Frank, that's a very tenuous explanation for why a DBT might fail to detect a difference. You are basing this on anecdotal evidence of cable 'stabilization'. And you are missing the point, again: the same cable behavior applies whether you are doing sighted or unsighted testing. So any finding of different sound between cables (or no difference) is just as valid or invalid in both cases.

 

There's nothing special that makes a DBT more susceptible to timing effects from cable stabilization than a sighted test. Unless, of course, you claim some quantum-mechanical effect related to wave function collapse in sighted tests ;)

 

 

