
MQA is Vaporware



12 hours ago, Rt66indierock said:

 

Don't forget to volume match your comparisons and check your masters. There is a discussion earlier about The Nightfly and different masters and of course a volume issue. 

 

Agreed. Especially in the remastered cases there are often quite audible differences. The Nightfly is an example where the MQA version is obviously louder. Joni Mitchell's Hejira and Herbie Hancock's Maiden Voyage are examples where the non-MQA version is obviously louder. I'm generally finding that in the new releases where track times are the same and there is otherwise no obvious reason to believe different masters were used, sound levels are too close to audibly distinguish. 

14 minutes ago, Rt66indierock said:

 

But you still need to measure the differences. A .2 dB difference would be hard to distinguish audibly without very specific training, and would still give you the impression that the louder track was better in many ways. 

 

And I have measured tracks I've used for my comparisons. I am not finding any pattern that consistently favors one format or the other using a spectrum analyzer app on my iPad. No doubt, you will dismiss my sound level checks using an iPad app as far too crude and use that as an excuse to reject my personal findings. So be it. I know what I hear and I know that the character of the differences I'm consistently detecting are not volume related. Since you previously stated in this thread that you haven't actually done any close comparisons yet, we'll have to table this discussion until you've done your own blinded tests. Then we can have a more interesting discussion. 

 

14 minutes ago, Rt66indierock said:

 

I spent 15 years (from 19 to 34) moonlighting as a consultant in the broadcasting industry because I had that training. 

 

Well, then, that settles it. We should all just fall silent until Rt66indierock returns his verdict. 

7 minutes ago, church_mouse said:

knickerhawk's recent postings of his experience of MQA via his Bluesound set up intrigued me enough to finally use the offer of a 3 month free trial of Tidal which came with Audirvana 3.  I have been a Qobuz Sublime subscriber for about 18 months, though I tend to use it for testing out music before I buy it from somewhere.

 

knickerhawk's postings were sufficiently enthusiastic about the improvement in sound he heard using MQA that it made me feel it was time I gave MQA a shot, though limited to the software unfolding only offered by Audirvana - my DAC is a non-MQA Mytek Manhattan 1.

 

For my listening, I tried to compare files I own with MQA matches on Tidal.  That was far from easy as it seems that very little of my music is yet available in MQA – identical masters, who knows?  After 5 albums (Max Richter – Vivaldi Four Seasons Recomposed, Talk Talk – Colour of Spring, Tori Amos - Unrepentant Geraldines, Fleetwood Mac – Tusk Remastered, Kraftwerk - 3-D Catalogue), the jury was in.

 

The result of my listening (both through speakers and headphones) was, to my surprise, the reverse of knickerhawk's experience.  In every instance, the MQA album sounded worse to my ears, in my set up.  A generalised conclusion is that I found MQA to be strangely overblown in the bass, splashy and harsh in the top, with a narrow, up-front presentation and a loss of detail.  Listening to the MQA versions became quite tiring and unenjoyable. (I even tried some non-MQA music through the Audirvana/Tidal interface to ensure what I was hearing was not due to the Tidal streaming process itself).

 

Obviously, this is an entirely personal assessment, based solely on my own listening to MQA in my set up, with my ears.  (If I had any expectation bias, I think it was to expect to hear an improvement with MQA, knickerhawk's comments being so positive.)

 

NOTE - My set up is very basic – Mac-mini feeding Manhattan via Firewire, feeding AVI DM5 active speakers, or a Stax SR 404 Signature + SRM 006tII headphone set up, or a pair of Oppo PM3 via the Manhattan's headphone amp.  Audirvana is configured to upsample only 44.1 by multiples of 2 (effectively to 176.4 due to the Apple Firewire limit) using the SoX converter. My music is held on a NAS.

 

Thanks, church_mouse, for sharing your experience. I hope you're not forever pissed at me now.  :$  We have some overlapping taste. I'm a Max Richter fan and like the Four Seasons album he did, so I'll go give that one a try and report back. Of course, this boils down to some mix of personal preference and system. YMMV as they say, and I respect that and applaud you for trying and sharing. One thing I will note is that your description of the signature of the MQA sound is not that different from what I'm experiencing as well. However, what is coming across to you as "splashy" and "harsh" and too "up-front" etc. is coming across to me as revealing and actually presenting more detail. The structure of our responses, so to speak, doesn't seem too far off, though. Thanks again, and sorry for getting your hopes up for a personal improvement.

29 minutes ago, esldude said:

 

Are you using a microphone on the ipad to check the spectrum while playing tracks?  Or are you downloading tracks and using the analyzer on the file?

 

If the former, then guess what, it is FAR TOO CRUDE.  And no amount of knowing what you hear will make it otherwise.  

 

I'm using the "FAR TOO CRUDE" approach.

 

29 minutes ago, esldude said:

 

You can't match to within .2 of a dB using such an app while playing back.  If you can't do that, then you can't match levels sufficiently.  And that is indeed that.  Neither you nor anyone else can get around being affected by liking louder as better, even in tiny amounts you don't notice.  You can't separate out the character of differences you are consistently detecting from a volume difference if one is there. You can claim it, but it is not so. 

 

Here's what I DO notice: my iPad app detects that sometimes the MQA version is slightly louder, sometimes the 16/44 is slightly louder, and sometimes it detects no difference. In these slight/no detected variations, I'm still pretty consistently hearing the MQA signature, and in my system I usually prefer that sound.
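For what it's worth, a file-based level check sidesteps the microphone problem entirely. Below is a minimal sketch, assuming both versions have already been decoded or captured to 16-bit PCM WAV at the same sample rate; the helper name `rms_dbfs` and the file names in the comment are invented for illustration, not anyone's actual tooling:

```python
# Minimal sketch: integrated RMS level of a decoded capture, stdlib only.
# Assumes 16-bit PCM WAV input; file names below are hypothetical.
import array
import math
import wave

def rms_dbfs(path):
    """Integrated RMS level of a 16-bit PCM WAV file, in dBFS."""
    with wave.open(path, "rb") as w:
        raw = w.readframes(w.getnframes())
    samples = array.array("h", raw)  # interleaved 16-bit signed samples
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    return 20 * math.log10(math.sqrt(mean_square) / 32768.0)

# diff = rms_dbfs("track_mqa.wav") - rms_dbfs("track_cd.wav")
# abs(diff) > 0.2 would flag the pair for attenuation before the A/B.
```

This is far more repeatable than holding a tablet microphone in front of speakers, and it resolves differences well below the .2 dB threshold being argued about.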

 

29 minutes ago, esldude said:

 

I don't know why this is so hard to get across to people.  No one is attacking you personally.  No one is impugning your hearing ability.  It is simply being pointed out what is needed to make a good comparison. Levels MUST be MATCHED.  

 

Here's my beef: Rt66indierock keeps referencing Fagen's The Nightfly and the fact that the MQA version is louder. It's obviously louder! And I mean OBVIOUSLY louder! If John Darko didn't immediately detect that and adjust levels accordingly before performing his in-depth comparison, then he's either got some major equipment issues or needs to visit an audiologist. Perhaps I'm being overly sensitive here, but I'm getting a little tired of the preaching about an issue that is well understood. Again, perhaps I'm just being overly sensitive, but I'm also sensing a veiled implication that ALL MQA tracks are louder and, therefore, that's what's misleading anyone who prefers the MQA sound. I disagree with that premise, at least based on the pretty extensive sampling I've done over the past couple of months.

3 minutes ago, esldude said:

 

Matching levels is step #1.  You have NOTHING worth reporting without it.  Nor does John Darko or anyone else.  So complain, moan whatever.  If you don't do it in a way that is accurate enough you are simply wandering in randomness. 

 

Yes, loudness differences that are obvious are one thing.  I do not know if anyone is saying all MQA is louder.  Some of it has been found to be so.  When one selection is so much louder like with the Nightfly I immediately wonder if it is the same master or not. 

 

How can you do the matching?  Using sound level meters on music itself is simply not good enough.  You even seem to be describing that difficulty.  

 

Matching by ear?  Simply not good enough.  Most people will get within 1 db by ear.  Some can get a little closer.  Unfortunately .2 db or more is enough to make something sound of better quality.  At such level differences you will not hear it as louder.  It will sound like higher quality, more detail, more bass, more space etc. etc.  Yet will be nothing except a tiny bit louder. 

 

This is a step you need to do.  Otherwise everyone listens, everyone comes to their conclusions that they heard, everyone feels better, and more than likely everyone is fooling themselves.  Sorry, I didn't make it this way.  It is the way it is.  

 

So when others complain of your approach it isn't personal.  It is a very valid complaint that very reasonably calls into question whichever conclusions you make. 

 

Now if you care about what I am going on about at all, you are waiting for me to tell you how to match levels.  In the case of MQA, it isn't easy.  You can't look at files and determine it without decoding them (if I am wrong someone please correct this).  You can't grab a digital out, because it isn't decoded.  You could use an ADC to record the output of a DAC, and likely get close enough.  This is highly inconvenient and most people don't have an ADC for doing this. 

 

Please turn down your preach level (it's quite audible) and I'll turn down my sarcasm level. Then maybe we can have a useful discussion about what I actually wrote: the range of sound level differences picked up by the crude methodology I used, and the fact that, even when the non-MQA version is measurably louder, it didn't adversely affect my ability to identify (and prefer) the MQA version.
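Just for concreteness about the number everyone keeps citing: a gain of g dB scales amplitude by 10^(g/20), so .2 dB is roughly a 2.3% amplitude change. That's why the usual fix is to attenuate the louder file digitally before the A/B. A stdlib-only sketch; the helper name `apply_gain_db` and the sample values are invented for illustration:

```python
# What a .2 dB offset means numerically, and how trivially it is removed
# in the digital domain. Hypothetical sketch; sample values are invented.

def apply_gain_db(samples, db):
    """Scale linear sample values by the given gain in decibels."""
    ratio = 10 ** (db / 20)
    return [s * ratio for s in samples]

ratio_02db = 10 ** (0.2 / 20)                       # ~1.0233, a ~2.3% change
matched = apply_gain_db([0.5, -0.25, 0.125], -0.2)  # attenuate the louder file
```

The point esldude is making is that a 2.3% amplitude change is far too small to hear *as loudness*, yet large enough to bias a preference.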

 

My premise here is that, after having listened carefully to dozens of tracks at the same volume setting for each format, I have established a consistent ability to detect the MQA version (when blinded). I happen to prefer that MQA sound signature in my system, but preference really isn't the determining issue here. Rather, it's the fact that I can consistently identify the MQA version. Now, consider that there are three possibilities here regarding sound levels:

  • All of the MQA tracks were louder
  • All of the non-MQA tracks were louder
  • Some MQA tracks were louder and some non-MQA tracks were louder

The first option would explain both why I could detect a difference and why I preferred the MQA tracks. The second option would explain why I could detect a difference but doesn't explain the preference (especially considering the reasons for my preference are those usually associated with louder versions). That result would be interesting and confounding to testing expectations. The third option would show that my ability to detect (and prefer) the MQA versions is not correlated with sound level differences between the formats. 

 

3 minutes ago, esldude said:

 

Now it is unfortunate I can't tell you how to do it.  I can tell you if you don't manage level matching you can't pass Go, you can't conclude much of anything.  Sorry, but those are just the facts. 

 

It is simply incorrect for you to proclaim that the only way to obtain valid test results is by level matching. That is the case when dealing with a single sample or a homogeneous population of samples in which the loudness of one set is greater than the other and you do not know which is louder. It is not the case when dealing with a random sampling population. There is more than one way to skin this cat...

2 hours ago, firedog said:

 

Knickerhawk, your “test” method is simply invalid. The fact that you don’t recognize it doesn’t validate your mistaken method. If you start with an invalid method of testing, none of your conclusions mean much. 

 

Both you and esldude are demanding precision in my testing, but neither of you is being very precise in your criticism of it. You are broadly claiming that no listening test can be valid unless sound level matching within .2 dB is enforced, based on the fact that subjective preference can be influenced by as little as a .2 dB difference in playback of the same track. Therefore, we know that there is a danger zone between .2 dB and the normal human threshold of audibility that needs to be controlled for. You and esldude seem to be arguing that the only valid way to control for this subliminal zone of influence is to measure the sound levels of the A and B samples to within .2 dB accuracy. Anything less accurate than that is, as you put it, "simply invalid" or, as esldude put it, "fantasy." 

 

My contention is that you do not need that level of accuracy to obtain significant results that can prove listener preference is based on something other than sound level. How? Well, let's consider what should be a statistically significant way of achieving valid listening results without accuracy to .2dB. Let's say we start with a population of 125 tracks to be tested. We have an MQA version and a non-MQA version for each track. We first want to toss out tracks with differences in loudness between the two formats that are audible to our human subject. One way to do this is to run a repeated random blind A/B test where the subject is asked to pick the louder version. We can throw in a control with slight volume attenuation applied at random to confirm the accuracy with which our subject is listening for loudness. Let's say that the subject can't detect sound level differences in 100 of the tracks. Now we need to determine if there are any subliminal sound level differences within the remaining 100 tracks. Our problem is that we only have a sound level measuring methodology accurate to, let's say, .8dB. If it turns out that some of the tracks are louder on the MQA side and some are louder on the non-MQA side and maybe some are equal, then we're in luck. Let's say it breaks down nicely to 1/3 for each of these possibilities. Now we run the subject through the main blind A/B tests for identifying preference.

 

Do you now see where this is going??? If preference is based primarily or exclusively on loudness, then the known louder tracks should be statistically preferred to the known quieter tracks and the preference within the unknown cohort will be mixed. In this scenario we can't conclude anything meaningful about preference when loudness is completely eliminated as a variable (i.e., reduced to no more than .2dB difference). However, if preference does not follow loudness to a statistically significant degree in the known loudness cohort, then we know with statistical significance that something other than loudness is dominating preference - the point being that we've achieved this result even though we haven't level matched all the way to .2dB.
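The cohort argument above can be made concrete with an exact binomial test: if preference tracked loudness, the listener should prefer the measurably louder version well over half the time in the known-louder cohorts. This is a hypothetical, stdlib-only sketch; the function name `binom_p_at_least` and the trial counts are invented for illustration, not actual test data:

```python
# Exact binomial tail probability, used to ask: given n trials where one
# version measured louder, is the number of "preferred the louder one"
# picks consistent with chance? Counts below are invented.
from math import comb

def binom_p_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Invented example: 66 trials where one version measured at least .8 dB
# louder, and the listener preferred the louder version only 26 times.
p_value = binom_p_at_least(26, 66)
# A large tail probability here means preference is NOT skewed toward the
# louder version, i.e. something other than loudness drives the choice.
```

With real counts in hand, this is exactly the "does preference follow loudness in the known cohort" check the design depends on.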

 

Now, if we can't agree on the validity of the approach described above, there's no point in moving on to a discussion of how close my personal testing comes to being "valid" or merely "fantasy." I'll willingly slink away with my tail between my legs when you demonstrate the error in the test design described above.

1 minute ago, Samuel T Cogley said:

FWIW, my attempts at something like ABX for MQA were not successful.  With all the MQA DACs that I've tested, there is a "tell" with MQA material where the DAC doesn't emit sound until a full second or two into a song.  Sorry if I missed a testing scenario that accounts for this.

 

That's interesting. I haven't detected that "tell" with my Bluesound streamer/dac. In fact, run-in times can vary either way by a fair amount and I'm suspecting it's sometimes related to the differences in listed track times as well (i.e., maybe the track time differences aren't always signs of a different master being used, just different padding at the beginning/end of tracks???). For instance, check out the first track run-in time on Nik Bartsch's Ronin album "Llyria". In the MQA version the track time is listed as 6:50 and in the CD version it's listed as 6:58. That difference is clearly at the beginning of the track. The MQA version starts the music immediately and the CD version starts with about 8 seconds of silence. The first time I launched the CD version it took so long that I mistakenly hit the play button again!  

1 hour ago, Ralf11 said:

knick - it is true that statistical analysis can sometimes compensate for weaknesses in experimental design

 

but how many replicates did you perform?

 

and what statistical design did you use?

 

Let's see if we can establish consensus about the testing scenario I outlined in my previous post before we wade into the details of what I did in my personal testing. Do you agree that the scenario I described is an example of how it is possible to obtain valid preference results from a blind A/B test even though you have not verified sound leveling to within .2dB?

12 minutes ago, esldude said:

You are making an assumption the loudness between .2 and .8 db differences will be random and evenly distributed between MQA and non-MQA.  That is a supposition.  We don't know if it is true.  

 

It is supposition but a reasonable one if you have a (statistically large enough) population that is falling into a normal distribution for the samples that vary by more than the .8 dB measurable limit in my hypothetical. Regardless, it still doesn't matter if you have enough samples outside of the unknown cohort. With a large enough sampling outside of the unknown cohort and the preference results obtained from inside of the unknown cohort you should be able to statistically determine the distribution of louder MQA and louder non-MQA tracks within the unknown cohort. In the unlikely case of a distribution of actual preference results from within the unknown cohort that's inconsistent with the predicted normal distribution, you would need to further investigate the cause.
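esldude's objection can be stress-tested directly. This is a hypothetical Monte Carlo sketch (the function name `simulate`, the cohort size of 34, and the 80/20 skew are all invented for illustration): if the unknown .2-.8 dB cohort were secretly skewed toward MQA-louder rather than evenly split, how often would a purely loudness-driven listener *look* like they had a format preference?

```python
# Monte Carlo sketch of the even-split assumption. Each simulated track is
# MQA-louder with probability frac_mqa_louder, and the loudness-driven
# listener always prefers the louder version. Parameters are invented.
import random

def simulate(n_tracks, frac_mqa_louder, trials=2000, seed=1):
    rng = random.Random(seed)
    looks_format_driven = 0
    for _ in range(trials):
        prefers_mqa = sum(rng.random() < frac_mqa_louder
                          for _ in range(n_tracks))
        if prefers_mqa >= 0.75 * n_tracks:  # reads as a strong MQA preference
            looks_format_driven += 1
    return looks_format_driven / trials

# With an even split the false "format preference" almost never appears;
# with a strong hidden skew it appears often, which is why the even-split
# assumption has to be checked rather than assumed.
```

That cuts both ways: it quantifies esldude's worry, and it also shows that a measured (even if crude) estimate of the split can bound how much damage a hidden skew could do.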

