
Who's afraid of DBTs


Guess you can either continue complaining or try to help. If you know a better way of going to the bottom of the issue just post.
I already did - I gave examples of the controls I would use in this post - maybe you missed it?
And _if_ it's true that the positives/negatives are so scarce the next question is somewhat harder. Do we really need those? You don't normally go into that stuff unless the 'normal' DBTs show serious issues. Like truly skewed and inconsistent results. Or it's a super duper important matter like people's lives. Or you are testing a fringe new science or theory. To my knowledge none of those applies to audio. But feel free to take care of that question.
Huh??? Your post is hard for me to tease apart - care to make your points step by step?
Or just keep insulting me & others for the n-th time. Which is already boring.
If pointing out the obvious is insulting, then sorry. I know it's embarrassing when an obvious problem is pointed out in something that you have been advocating so strongly, & this can be taken as an insult, but leave your emotions at the door - this is science, Yo!
Link to comment
This is a logical flaw often used by DBT advocates - it's the equivalent of saying, "hey, don't look at or examine the validity of our test, look at how bad the other test is"

 

Comparing the results of two flawed tests (one skewed towards false positives & one skewed towards false negatives) does not make a convincing argument for which test is giving the valid answer.

 

Oh come on. You do nothing but complain around here. Tests are impossible to setup. A formal test done by one of the most well known experts is wrong, useless and pointless. And so on. Why do you even hang around?

Some accuse me of high nose and complaining too much but I doubt I ever touched your level of professional complainer.

Link to comment
Guess you can either continue complaining or try to help. If you know a better way of going to the bottom of the issue just post.

And _if_ it's true that the positives/negatives are so scarce the next question is somewhat harder. Do we really need those? You don't normally go into that stuff unless the 'normal' DBTs show serious issues. Like truly skewed and inconsistent results.

 

How can you know whether the results of any measurement are skewed unless you independently evaluate the measuring tool(s)? So no, you don't need any evaluation of the efficacy of DBT in the audio context unless you're interested in determining whether the results are skewed.

One never knows, do one? - Fats Waller

The fairest thing we can experience is the mysterious. It is the fundamental emotion which stands at the cradle of true art and true science. - Einstein

Computer, Audirvana -> optical Ethernet to Fitlet3 -> Fibbr Alpha Optical USB -> iFi NEO iDSD DAC -> Apollon Audio 1ET400A Mini (Purifi based) -> Vandersteen 3A Signature.

Link to comment
By the way: In order to avoid garbage (things that are not studies published in peer reviewed journals), you will want to use Google Scholar, rather than just Google. Also, you will want to use the phrase "blind testing" rather than DBT, since the latter acronym has multiple meanings.

 

What you will find, given the reasonable assumption that the Internet works the same way for both of us, is that the sole reference to false negatives in DBT comes in a comment on the Meyer and Moran study I referred to earlier in the thread, not in the study itself. So no, no study devoted to the efficacy of DBT in audio that discusses and tries to measure the "false negative problem," or the protocol's "sensitivity."

 

Scholar is good if you get any results. But the selection is very limited in comparison to the 'big wild net'. Tried Scholar too, not much info.

My google search does filter out other DBT meanings quite well. At least on my device. If you have better suggestions post those instead of complaining. Otherwise my search should be a good start. Feel free to tweak it.

Link to comment

Just wanted to mention:

 

None of what I've said should be taken as criticism of DBT in audio that anyone is doing for purposes of eliminating bias in his or her decision making about purchases, for example, or as a matter of exploration and curiosity. The latter two are things I would especially encourage. If they are not actually "doing science," they are surely in the scientific spirit.

Link to comment
Scholar is good if you get any results. But the selection is very limited in comparison to the 'big wild net'. Tried Scholar too, not much info.

My google search does filter out other DBT meanings quite well. At least on my device. If you have better suggestions post those instead of complaining. Otherwise my search should be a good start. Feel free to tweak it.

 

Wasn't complaining at all, just trying to help focus the search.

Link to comment
Problems with Blind ABX testing - advice needed - Hydrogenaudio Forums

 

and specifically :

Problems with Blind ABX testing - advice needed - Hydrogenaudio Forums

(and beyond)

 

Edit : Oh f*ck. That's the one pointed out by trithio already.

 

You !

Well, just is so.

Haha, that specific link is to Arny Krueger's post who, despite what he says, demonstrated in his ABX null results just how concerned he is about false negatives - not one iota. His ABX results were random guesses (he didn't do a listening test), & had he not taken so little time for each trial, this fact would not have been picked up. How many others have not listened (due to fatigue, loss of focus, gaming)? There is usually no way to know this from examining ABX results - in this case the giveaway was the timing of each trial.

Link to comment
Oh come on. You do nothing but complain around here. Tests are impossible to setup. A formal test done by one of the most well known experts is wrong, useless and pointless. And so on. Why do you even hang around?

Some accuse me of high nose and complaining too much but I doubt I ever touched your level of professional complainer.

Thanks, I take that as a compliment - I analyse & find issues/flaws in things - I would call that the scientific approach - knowing the limitations/sensitivity of your test. If you find my analysis flawed then let's talk about that rather than your emotional charge resulting from my analysis.

 

Now what "formal test done by one of the most well known experts is wrong" are you talking about, DBTs?

 

If you consider that my referencing the BS ITU standard for how audio DBTs should be carried out & comparing those recommendations with how they "actually are" carried out - if you call that complaining, you have a strange lexicon - I call it stating the obvious. But if you have never read the BS ITU standards document entitled "Methods for the subjective assessment of small impairments in audio systems" ITU-R BS.1116-2 (2014) then I guess I can see how you might interpret my analysis as complaining. Have you ever read this or other BS ITU documents?

Link to comment
Haha, that specific link is to Arny Krueger's post who, despite what he says, demonstrated in his ABX null results just how concerned he is about false negatives - not one iota. His ABX results were random guesses (he didn't do a listening test), & had he not taken so little time for each trial, this fact would not have been picked up. How many others have not listened (due to fatigue, loss of focus, gaming)? There is usually no way to know this from examining ABX results - in this case the giveaway was the timing of each trial.

 

If one reads what he's written there, it's very good with respect to various ideas for doing well controlled DBT. But it says nothing about how one might go about evaluating false negative rates for well controlled DBT.

 

Pretty easy to tell the difference between ideas for doing well controlled DBT, and ideas for evaluating false negative rates for well controlled DBT - what one might call the "inherent" false negative rate for the protocol. The latter must propose some independent means of measuring positive and negative responses to the phenomenon of interest. This is elementary stuff. One can use various means to minimize the error of a clock, for example, but it's only by evaluating the clock against another timer that one can get information about the baseline error of the clock in excellent working condition. Or if you look at some of Peter or John Swenson's posts, or some of the posts in response to a thread about measuring jitter I originated a year or two back, see what they say about the desirable range of error/accuracy in test apparatus versus the error/accuracy of the equipment one is measuring.

Link to comment
Wasn't complaining at all, just trying to help focus the search.

 

Sorry just tired. And you already have the most informative posts here anyway. But since you got nada from scholar (presumably after trying hard) and I also got nada, we may have to go the wild net route.

 

And also, I am not doing any formal science. Just trying to get a feel for those FNs/FPs. Never really tried 'cause I don't think they are necessary. Audio is a well known and well studied science. All available devices are built on ~100 year old theories like electromagnetism and waves. And it's not life & death. So I don't really see why that kind of triple-sure assurance would be needed.

Some people just don't want to accept that 99.99% is more than enough for all practical purposes and spend all their energy looking for the mythical 100%. Maybe nobody told them that 100% hardly exists outside the math class.

 

But anyway, I'm all for looking. Bring those FNs.

Link to comment
Thanks, I take that as a compliment - I analyse & find issues/flaws in things - I would call that the scientific approach - knowing the limitations/sensitivity of your test. If you find my analysis flawed then let's talk about that rather than your emotional charge resulting from my analysis.

 

Now what "formal test done by one of the most well known experts is wrong" are you talking about, DBTs?

 

If you consider that my referencing the BS ITU standard for how audio DBTs should be carried out & comparing those recommendations with how they "actually are" carried out - if you call that complaining, you have a strange lexicon - I call it stating the obvious. But if you have never read the BS ITU standards document entitled "Methods for the subjective assessment of small impairments in audio systems" ITU-R BS.1116-2 (2014) then I guess I can see how you might interpret my analysis as complaining. Have you ever read this or other BS ITU documents?

 

http://www.itu.int/rec/R-REC-BS.1116-2-201406-I/en

Was it that hard to post a link for everyone?

Link to comment
Maybe it is time to more explicitly define what actually the HUGE difference can be between us all;

Everything I am going to say has already been said, but much of it has been between the lines.

 

Say that this thread has 40 different contributors. What I now dare say is that 37 of those use audio gear they struggle with. What do I mean? Well, that without even knowing it so explicitly, you are working on getting rid of the annoyances. Of course I have been there myself too, and say that maybe 8 years ago I could finally take that hurdle, into more nirvana - now trying to approach the real thing. When I say I took that hurdle 8 years ago, it means that for something like 35-40 years before that I lived with the annoyances. Of course I didn't even know what they were exactly, but say too harsh voices, better not play classical, and indeed, I can't even give more examples because it really is all "automatic". However, once that hurdle has been taken, you work on improving basses, cymbals, difficult voices, equally loud piano notes, squeezing out the very very real thing. I feel that the latter was reached only a few months ago, but I'm still not sure (and it is proven not to be so by cymbal levels, which luckily are not at real levels (ehm, that would be 110dBSPL with a piano playing at 90dBSPL, which is also the real level for that)).

 

If you read back the two quotes I dug out for this little story, then you can see where the differences can be, while the 37 mentioned actually can't get at all what I (the 3 others) am talking about :

 

The 3 out of 40 are driven by getting rid of the annoyances.

 

Yes, that sounds strange. I mean, the other 37 were just the same ? No. No, because I was trying to make clear that the 37 are not even aware of this. Now of course, all 37 will claim they are, but I tell you it is not so. There is so much wrong that you can't make head or tail of it, and all the annoyances together mean that you may have fine foot tapping music, but take so much for granted that the only focus will be on that foot tapping. And why not, because it is about that !! or ?

 

I can tell you, things change once you seem to have enabled a first real instrument. In hindsight I can't give examples I'm afraid, but say you now encounter the richness of a piano and you are sure that this is how a piano can be. Looking deeper we all know that there are almost as many different piano sounds as there are pianos, but still. You'll get the gist.

Then it starts to occur to you that whatever high C is always jumping out (it is louder). Oh notice, that too at first you won't notice at all, because you lived with that for years or decades. It is just "music reproduction". Oh is it. Well, it isn't once it starts to occur to you that nastiness is there in a voice at exactly that same high C. So sh*t, something must be going on there. And from there it can go fast.

Point is : your focus now moved to those elements of "wrong" and once you are in that stage, whoo, you are in trouble.

 

So now you are driven by getting rid of the annoyances and life is less easy than it was before. And I am very serious.

The more of these annoyances you get rid of, the more profound will be those remaining. This is logic.

 

If you're still there, then you may start to believe how it can happen that those 3 out of 40 can easily listen to whole albums and more, without even comparing anything. All they do is catch fewer or more annoyances, which, mind you, is where the focus very much is to begin with. Change an interlink, change the software player, change a filter setting, it ALL changes the albums in such a way that you don't even recognize them any more. And I am not making up anything here.

 

And this is opposed to a few out of that 40 who never even heard a single difference in whatever they do (but changing speakers and such will, so I am not talking about that).

 

I think this is the context of it all and it is good to recognize that these differences between our systems (it is just that) exist. So if I count myself in that 3 out of 40 then I really really don't need AB at all. Whatever the change, it is heard throughout and heard from second 2.

 

This is nice, but the whole point is and remains : is this change for the better. Remember, the whole perception has changed by the tweak concerned, and it is not so easy to detect whether it is all the way for the better. So now the "get annoyed ?" days start, which for me coincidentally are 5. These days are NOT to be happy about the better sound or detect where it is. They are for one thing only : do I get annoyed.

 

Lastly, and now I'm possibly in a league of 1 out of 40 : I already get annoyed by a sameness of flavor. Sometimes this is very hard to detect, given that a flavor is not an annoyance as such, like a piano note jumping out. It is another dimension instead. Do cymbals now sound better metallic ? yes, I think they do. But wait, don't they *all* sound smaller now ? yes they do.

So a cymbal has a size and what you perceive from it can be the exact size (just believe me).

 

I don't need any AB to detect such a difference. They are of the right size, are too small, went in the better direction for size, or all sound like plastic now while before they were metal. I am afraid that 37 of you have not a single clue how super easy this is.

But take that hurdle first.

Yes, that was easily said.

 

Like the way you phrased it. Those 'annoyances' with computer audio and trying to find solutions is the exact reason I stumbled on this site in the first place. Digititus by another name, and wow is it hard to fix. But many are not bothered by it for a variety of reasons, and hence the many heated debates.

 

Nearly threw in the towel many times to go back to turntables and yes even CD/SACD players. :)

Link to comment
If one reads what he's written there, it's very good with respect to various ideas for doing well controlled DBT. But it says nothing about how one might go about evaluating false negative rates for well controlled DBT.
Yes, as I implied - he pays lip-service to the idea, but his actions & his lack of any serious suggestions about how this could be incorporated in an ABX, the DBT he claims to have invented, show just how serious he is about this. His ABX results are the best demonstration of his lack of concern & bad intent, & patently throw the spotlight on the need for such negative controls - so inadvertently, he copper-fastened the case, if ever a case had to be made.

 

Pretty easy to tell the difference between ideas for doing well controlled DBT, and ideas for evaluating false negative rates for well controlled DBT - what one might call the "inherent" false negative rate for the protocol. The latter must propose some independent means of measuring positive and negative responses to the phenomenon of interest. This is elementary stuff. One can use various means to minimize the error of a clock, for example, but it's only by evaluating the clock against another timer that one can get information about the baseline error of the clock in excellent working condition. Or if you look at some of Peter or John Swenson's posts, or some of the posts in response to a thread about measuring jitter I originated a year or two back, see what they say about the desirable range of error/accuracy in test apparatus versus the error/accuracy of the equipment one is measuring.

 

Seeing as the DBT proselytisers aren't coming up with suggestions for how controls for false negatives could be incorporated in DBTs (due to either lack of interest, imagination or, as the thread title says, "fear"?), let's start a useful discussion (maybe in a separate thread?) of what such controls are & how they can practically be incorporated.

 

I've already given some thoughts in a previous post:

The way I see it you would do some pre-screening with controls using known audible differences to verify that the test subject, equipment, procedures, etc are sensitive enough to differentiate differences. This is just to "prove" the whole setup is capable of revealing small differences or to calibrate the test. If it's not, there's no point in doing the test.

 

The second thing is to use hidden controls at certain points within the test to check whether the test subject's level of discrimination has dropped below an acceptable level i.e they are no longer listening & simply guessing. I have ideas how this might be done but the important aspect of this is that the test subject doesn't know that this is a control.

 

I will expand on the second paragraph - the "hidden" controls have to be exactly that, "hidden" - they must not alert the listener to the fact that they are control signals. The idea is to test the listener's focus at various random points during the test (I believe it's well known that earlier results from such tests are more likely to find differences - it would be interesting to do a statistical analysis across many tests to establish this & its extent). So the control has to be hidden - the only way I can think of doing this is to add a certain level of known artefact to the actual signal being listened to the rest of the time & use this as the control. In other words, if two devices are being tested for audible differentiation, every now & then some agreed audible distortion would be added to one of these devices & it would be slotted into that test trial to check if the listener actually picked up an audible difference. Of course the tester would need to know which trials this applied to, & the results of those trials would be analysed separately. If the listener had a low rate of differentiation of those controls, it would show that the test was producing false negatives.

 

A similar approach could be taken if two audio samples were being compared - randomly a distortion would be introduced into one sample, unknown to the listener.
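To make the hidden-control idea a bit more concrete, here is a rough sketch in Python. It is only an illustration: the function names, the 0.8 pass mark and the idea of marking control trials with a simple True/False flag are my own assumptions, not part of any standard or existing ABX tool.

```python
import random

def schedule_trials(n_trials, n_controls, seed=None):
    """Return a list of booleans; True marks a hidden-control trial.

    Control positions are drawn at random so the listener cannot predict them.
    """
    if not 0 <= n_controls <= n_trials:
        raise ValueError("n_controls must be between 0 and n_trials")
    rng = random.Random(seed)
    control_positions = set(rng.sample(range(n_trials), n_controls))
    return [i in control_positions for i in range(n_trials)]

def control_hit_rate(is_control, correct):
    """Fraction of hidden-control trials the listener answered correctly."""
    hits = [c for flag, c in zip(is_control, correct) if flag]
    if not hits:
        raise ValueError("no control trials in this session")
    return sum(hits) / len(hits)

def session_is_trustworthy(is_control, correct, min_rate=0.8):
    """Hypothetical rule: accept the session only if the listener caught
    at least min_rate of the known-audible controls."""
    return control_hit_rate(is_control, correct) >= min_rate
```

The tester keeps the schedule private, scores the control trials separately from the real ones, and treats a null result as meaningful only if the controls were reliably caught. What pass mark is reasonable is exactly the kind of thing that would need agreeing on beforehand.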

Link to comment

So are you suggesting that you are familiar with the document & have read & absorbed it before this thread?

If so, the concept of hidden controls, false negatives, how to run a proper DBT, should not come as a surprise to you. But if you haven't read it ..........

Link to comment

Can I ask :

 

Wouldn't a negative be an annoyance ?

And if so, would a missed one be a false negative ?

 

Just asking because I really don't know.

Lush^3-e      Lush^2      Blaxius^2.5      Ethernet^3     HDMI^2     XLR^2

XXHighEnd (developer)

Phasure NOS1 24/768 Async USB DAC (manufacturer)

Phasure Mach III Audio PC with Linear PSU (manufacturer)

Orelino & Orelo MKII Speakers (designer/supplier)

Link to comment
Can I ask :

 

Wouldn't a negative be an annoyance ?

And if so, would a missed one be a false negative ?

 

Just asking because I really don't know.

Peter, a false negative is a result where there is a known, audible difference (between two samples) but the listener doesn't hear it.

 

It's very often the case that unless a listener knows what a particular distortion/artefact sounds like, he will not hear it in a piece of music while others can easily hear it. But tiredness, loss of focus, distraction - lots of things could result in missing real audible differences & produce a false negative result.

Link to comment
Can I ask :

 

Wouldn't a negative be an annoyance ?

And if so, would a missed one be a false negative ?

 

Just asking because I really don't know.

 

Actually, an annoyance would be a "positive." A "positive" is an "I hear it" if you are testing audibility of a phenomenon such as jitter, or a correct choice that "X" is "A," or "X" is "B," if we are speaking about the "ABX" form of double blind testing. (That is, you have two different known things - let's say two different manufacturers' speakers, call them A and B - then there is another speaker not shown to you but heard, which is either A or B, and you must identify which.) A "negative" is a response that "I don't hear it" when testing audibility of a phenomenon like jitter; or an incorrect response in an ABX test.

 

False positives automatically fall out of the mix because if you think you hear something when you actually do not, or you say you can hear something when you actually don't, your accuracy will be no better than chance in a properly constructed test; and the definition of a successful result is a level of positives *above* chance, at some agreed-upon level of confidence/significance.

 

Edit: So then what do I mean by "inherent level of false negatives"? I mean the degree, if any, to which the DBT protocol itself will automatically result in a certain number of subjects saying "I don't hear it," when in another reliable form of testing they would say "Yes, I do hear it."
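For anyone who wants to see what "above chance, at some agreed-upon level of significance" looks like in numbers, here is a small sketch. It assumes the simplest case, where each ABX trial is an independent coin flip for a listener who is only guessing; the function name is just something I made up for illustration.

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided probability of scoring at least `correct` out of `trials`
    by pure guessing (each trial is right with probability 0.5)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count the equally likely guess patterns with `correct` or more hits.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

print(abx_p_value(12, 16))  # about 0.038, i.e. below a 0.05 threshold
```

So 12 right out of 16 would count as a "positive" at the 5% level, while a guesser rarely gets that far. Note that this only guards against false positives; it says nothing by itself about how often a real difference goes undetected.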

Link to comment

Merrill and Jud, thank you. I will try to wrap my head around that when I'm fresh tomorrow.

So far I seem to have difficulty making both your responses consistent. Merrill's response on its own I could still understand. But together with Jud's I am not sure.

It is sufficient if one of you tells me that you are both actually saying the same thing.

Link to comment
Merrill and Jud, thank you. I will try to wrap my head around that when I'm fresh tomorrow.

So far I seem to have difficulty making both your responses consistent. Merrill's response on its own I could still understand. But together with Jud's I am not sure.

It is sufficient if one of you tells me that you are both actually saying the same thing.

Yes, we are both saying exactly the same thing - I'm talking about a single listening trial result & Jud is saying that we don't know how many of the trials in an overall null result are due to "false negatives".

Link to comment
Actually, an annoyance would be a "positive." A "positive" is an "I hear it" if you are testing audibility of a phenomenon such as jitter, or a correct choice that "X" is "A," or "X" is "B," if we are speaking about the "ABX" form of double blind testing. (That is, you have two different known things - let's say two different manufacturers' speakers, call them A and B - then there is another speaker not shown to you but heard, which is either A or B, and you must identify which.) A "negative" is a response that "I don't hear it" when testing audibility of a phenomenon like jitter; or an incorrect response in an ABX test.

 

False positives automatically fall out of the mix because if you think you hear something when you actually do not, or you say you can hear something when you actually don't, your accuracy will be no better than chance in a properly constructed test; and the definition of a successful result is a level of positives *above* chance, at some agreed-upon level of confidence/significance.

 

Edit: So then what do I mean by "inherent level of false negatives"? I mean the degree, if any, to which the DBT protocol itself will automatically result in a certain number of subjects saying "I don't hear it," when in another reliable form of testing they would say "Yes, I do hear it."

 

Or you can simply say:

False Positives (FP) = you heard differences that do not exist.

For example you blind tested two identical components and heard 'differences'. Or even better, there was absolutely no change in the audio chain between trials but you 'heard' some.

 

False Negatives (FN) = you did not hear differences that are known to exist.

For example someone introduced clear, measurable and normally audible forms of distortion in the playback chain and you failed to hear them.

 

Assuming of course that we are only talking about audio and testing for differences. Otherwise the FP / FN definitions may get quite complex.
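If you want a feel for how often a test of a given length would produce a false negative, a back-of-the-envelope calculation is possible. The sketch below assumes a simple binomial model: the listener gets each trial right with some fixed probability `p_true` (a made-up parameter, since nobody knows the real value), and the test "passes" only if the score is significant at level `alpha` against pure guessing.

```python
from math import comb

def min_correct_to_pass(trials, alpha=0.05):
    """Smallest score whose probability under pure guessing is at most alpha."""
    for k in range(trials + 1):
        tail = sum(comb(trials, j) for j in range(k, trials + 1)) / 2 ** trials
        if tail <= alpha:
            return k
    return trials + 1  # the test is too short for any score to pass

def false_negative_rate(trials, p_true, alpha=0.05):
    """Probability that a listener who is right with probability p_true on
    each trial still scores below the pass mark."""
    threshold = min_correct_to_pass(trials, alpha)
    return sum(
        comb(trials, j) * p_true ** j * (1 - p_true) ** (trials - j)
        for j in range(threshold)
    )
```

With 16 trials and alpha = 0.05 the pass mark comes out at 12 correct. A listener who hears the difference on most but not all trials can still fall short of that, which is the FN case described above. The model ignores fatigue and loss of focus, so treat it as a lower bound on the problem rather than a measurement of it.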

Link to comment
See, that was helpful. Thanks for saving me the time.

 

Small constructive suggestion from my side : And now let's not respond to this like "If you had known ... then " etc. etc.

Maybe this breaks through some negative circle (hope you understand).

Link to comment
