
MQA technical analysis


mansr


The difference being that the renderer has to upsample for Coltrane, but seemingly has to downsample for the 2L track. This downsampling is a surprise...it seems to me that the software decoder should not upsample 44.1 or 48kHz material. Perhaps a bug in the software decoder?

 

Would be good to get Bob's input on this.

 

It could also be that the MQA DAC displays the original rate regardless of how it actually goes about playing it.

Madonna's Like a Virgin album also seems a bit odd: the Explorer2 LEDs indicate 176.4 or 192 kHz playback, but according to this it was recorded in 16/44.1 or 16/48 format. I thought the name Tidal Masters referred to offering original masters :)

 

Archimago has a few words about Tidal's Like a Virgin "master": Archimago's Musings: COMPARISON: TIDAL / MQA stream & high-resolution downloads; impressions & thoughts...

That makes sense to explain that one situation, but the rest all follow a logical pattern. Regardless, the hardware MQA is not 'downsampling' in the Nielsen case (not that upsampling is the correct term either for what MQA is doing, or at least purporting to do); rather, the software is decoding/unfolding/whatever to 88.2 while the DACs are not.

 

We don't know that. All we know is what the indicators say. The metadata in the MQA stream probably (I haven't deciphered it yet) includes the original sample rate fed into the encoder. The DAC could be indicating this even if it upsamples/unfolds it to double rate or higher before sending to the DAC chip. After all, something has to upsample it eventually.

 

For the Nielsen MQA album, the software decoders (both Tidal's and Bluesound's, which could be the same) and the hardware decoders (in both the Meridian and Mytek DACs) are interpreting the metadata differently. One is wrong. I've now checked 20 different tracks (from different albums); none do what the Nielsen tracks do. Considering the 'noise' that started this whole discussion, maybe the difference is due to error handling between the software/hardware decoders? Maybe the hardware decoders fault out if presented with faulty data while the software decoder forges ahead, filling in the gaps where errors occur?

 

MQA tracks recorded in 44.1k or 48k are probably quite rare. The Nielsen album is the only one I know of for sure.

I had not checked similar recordings, but yes, those all have the same behavior. Tidal's software decoder/renderer plays them at 88.2/96 while the Brooklyn in passthrough (with blue light) plays them at the original sample rate. So either the software decoder or the DACs are handling these files incorrectly.

No, the Brooklyn displays original rate. It is barely even meaningful to talk of what it plays at since everything is upsampled to several MHz eventually.

I had thought about this, but the second-stage upsampling that DACs do is normally pretty basic (zero-padding, I thought?). I have to admit I know very little about this. But hence the popularity of HQPlayer.

Most basic DAC chips do band-limited interpolation (usually cascaded doubling) up to 352.8 kHz followed by a 16x zero-order hold (repeating samples) to a final rate of about 5.6 MHz, which forms the input to the sigma-delta modulator. High-end chips such as ESS are considerably more sophisticated. Audiophile DAC boxes often upsample the input to whatever the chip accepts in a separate DSP using their own filter designs. Either way, while upsampling is always inherent in the conversion process (except on NOS DACs), the indicators reflect the rate of the external input. If MQA decoding is viewed as part of the D/A conversion process, just as regular upsampling is, it makes sense to indicate the original sample rate. Alternatively, consider playing a 320 kbps mp3. You'd still display 44.1 kHz if that is the sample rate of the encoded audio. MQA could sensibly be regarded the same way.
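As a rough illustration of that two-stage structure (a hypothetical sketch using generic windowed-sinc filters, not the coefficients of any actual chip), cascaded 2x doubling followed by a zero-order hold can be written in a few lines:

```python
import numpy as np

def doubler(x, taps=31):
    """2x band-limited interpolation: zero-stuff, then windowed-sinc low-pass."""
    up = np.zeros(2 * len(x))
    up[::2] = x                             # insert a zero between samples
    n = np.arange(taps) - (taps - 1) / 2
    h = np.sinc(n / 2) * np.hamming(taps)   # cutoff at the old Nyquist
    h *= 2 / h.sum()                        # gain of 2 compensates the zero-stuffing
    return np.convolve(up, h, mode="same")

def zero_order_hold(x, factor):
    """Repeat each sample `factor` times (no filtering at all)."""
    return np.repeat(x, factor)

# 44.1 kHz -> 352.8 kHz via three cascaded doublers (8x), then a 16x hold
# to 5.6448 MHz, the sort of rate a sigma-delta modulator runs at
x = np.sin(2 * np.pi * 1000 * np.arange(441) / 44100)   # 1 kHz test tone
y = x
for _ in range(3):
    y = doubler(y)
y = zero_order_hold(y, 16)
print(len(x), len(y))
```

The point of the sketch is only the shape of the pipeline: the sample rate the chip ultimately runs at has nothing to do with what the front panel displays.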

I think it shouldn't be doing it... But maybe the decoder always outputs 2x rate, regardless of input. But the hardware decoder knows from the stream metadata that it originated from 44.1 and displays that in the DAC display while still in reality putting out 88.2.

 

How many times do I have to repeat that something in the DAC will always be upsampling? Displaying the original rate is really the only logical choice. If some part of the processing produces a double-rate stream, that's just an implementation detail.

 

Decoding (and encoding) H.264 video involves interpolating the image to 8x resolution (though only a small part at a time). One would still expect video players to report the actual pixel resolution that went into the encoder. The MQA situation is no different.

Agreed. Everything in the patents suggests that deblur DSP is applied at the very start of the chain. What the patents didn't mention is this upsampling for 44.1/48k material. Additional deblur is also done at the final DAC stage; I'm not sure I saw anything in the patents about this.

 

Nothing prevents them doing things not in the patents.

I've been saying this all along. Why not provide a simple white paper with a high level overview of the process? Save us from reading patents and Q&A and technical analysis and reverse engineering....90% of what I have stated above is all out there, but spread across many different sites and documents!

 

But reverse engineering is so much fun.

And even the MQA upsampling ratio is pathetically low. I'm running at least 256x upsampling filters and for half of the cases 512x. MQA deals with tiny numbers like 2x or 4x.

 

The MQA resampler supports factors 1/2, 2, and 4. The decoder typically does a 2x upsampling. The renderer can then do another 4x for a total of 8x. Low indeed. The filters are also unimpressive. They are all minimum-phase-ish FIR filters, the longest with 65 taps.
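To put 65 taps in perspective, here is a hypothetical comparison (generic linear-phase Hamming-windowed sinc designs, not MQA's actual minimum-phase coefficients) of the worst-case stopband leakage of a 65-tap 2x filter against a conventionally long one:

```python
import numpy as np

def windowed_sinc(taps, cutoff):
    """Linear-phase low-pass; cutoff as a fraction of the sample rate."""
    n = np.arange(taps) - (taps - 1) / 2
    h = 2 * cutoff * np.sinc(2 * cutoff * n) * np.hamming(taps)
    return h / h.sum()                       # unity gain at DC

def worst_leakage_db(h, stop_edge, nfft=8192):
    """Largest response magnitude above stop_edge (fraction of sample rate)."""
    H = np.abs(np.fft.rfft(h, nfft))
    return 20 * np.log10(H[int(stop_edge * nfft):].max())

# worst leakage above 0.26 of the sample rate for a 2x filter (cutoff 0.25)
levels = {taps: worst_leakage_db(windowed_sinc(taps, 0.25), 0.26)
          for taps in (65, 1025)}
print({k: round(v, 1) for k, v in levels.items()})
```

With these made-up designs the short filter leaves tens of dB more image energy just above the cutoff than the long one, which is the sense in which 65 taps is unimpressive.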

This is interesting:

 

"Just pulling out the flowchart from Bob's answer, it would appear that as long as the software doing the decoding (decoder 1) inserts the DSP after the decoding but before sending to the DAC for its final hardware MQA voodoo (decoder 2), then everything should be good."

 

[attachment 32619: flowchart]

 

But still I don't understand that 'Optional intermediate processing'. If that DSP touches all 24 bits then it will also change the MQA metadata, and the resulting PCM stream will no longer be 'authenticated' as MQA in the DAC. Did I miss something?

 

The decoder places metadata in the LSB of the 24-bit output. If any processing preserves the LSB, the renderer should still recognise the stream. This metadata includes filter selection and some things I haven't figured out.
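A toy model of LSB-carried metadata makes the point concrete. This is purely illustrative (the real MQA signaling is an encoded, modulated stream, not one free bit per sample, and the helper names here are made up), but it shows why any intermediate processing that preserves the LSB leaves the stream recognisable:

```python
import numpy as np

def embed_lsb(samples, bits):
    """Overwrite the LSB of each 24-bit sample with a metadata bit."""
    return (samples & ~1) | bits

def extract_lsb(samples):
    return samples & 1

def lsb_preserving_gain(samples, gain):
    """Toy 'intermediate processing': scale the audio, put the LSBs back."""
    meta = extract_lsb(samples)
    audio = samples >> 1                       # the top 23 bits
    out = np.round(audio * gain).astype(np.int64)
    return (out << 1) | meta                   # re-attach the metadata bits

rng = np.random.default_rng(1)
x = rng.integers(0, 2**24, 64, dtype=np.int64)    # fake 24-bit samples
bits = rng.integers(0, 2, 64, dtype=np.int64)     # fake metadata stream
y = lsb_preserving_gain(embed_lsb(x, bits), 0.7)
print((extract_lsb(y) == bits).all())             # metadata survives the DSP
```

Any DSP that instead rounds or dithers across the full 24 bits would scramble the LSB stream and the renderer would no longer sync to it.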

I was thinking the same, looking at that diagram. The side-chain processing (i.e. custom DSP) must be run from an MQA-controlled function, which must save the MQA metadata, apply the DSP, and then re-apply the metadata. I can't think of any other way to do it.

 

The Bluesound player does that (though not quite the way you're envisioning) in order to apply digital volume and bass/treble controls.


Remember that high-frequency noise? The cause of it is the renderer dithering the output to no more than 20 bits (depending on parameters in the LSB metadata) with a 5-tap filter. This is a frequency response plot of that filter when applied at 192 kHz:

 

mqa-ns-filter.png

 

Look familiar?
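For anyone wanting to reproduce such a plot: with a hypothetical set of 5 taps (the real coefficients are not published; these are made up just to show a rising, noise-shaping-style response), the curve falls out of a single FFT:

```python
import numpy as np

fs = 192000
h = np.array([1.0, -2.0, 1.5, -0.5, 0.1])   # hypothetical 5-tap filter

nfft = 4096
f = np.fft.rfftfreq(nfft, d=1 / fs)         # frequency axis in Hz
H = 20 * np.log10(np.abs(np.fft.rfft(h, nfft)) + 1e-12)

# a response like this is small at DC and rises toward Nyquist,
# pushing the dither energy up into the ultrasonic region
print(round(float(H[0]), 1), round(float(H[-1]), 1))
```

Plotting `H` against `f` (e.g. with matplotlib) gives the kind of tilted noise floor seen in the measurement above.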

I think it shouldn't be doing it... But maybe the decoder always outputs 2x rate, regardless of input. But the hardware decoder knows from the stream metadata that it originated from 44.1 and displays that in the DAC display while still in reality putting out 88.2.

The decoded stream metadata includes the original sample rate (prior to encoding). Clearly it must also be present in the metadata of the undecoded file.

Interesting. We are agreed, aren't we, that the overriding design goal of MQA is claimed to be to introduce no more temporal blur (whatever that means) than 10 m of air. Taken literally as an aim to mimic 10 m of air as a filter, it would imply no material filtering at all until the frequencies got pretty high.

It's a while since I looked at the MQA blurb/patents, so I can't remember exactly how they calculated the equivalence, but I seem to remember that it involved having a very small number of filter taps. I also vaguely recall a filter width of possibly 10 µs, although that seems impossible to me in PCM. I can't find the direct link, but this article in Sound On Sound copies some of the MQA graphs and quotes the claims as being to introduce no more than 10 µs of time smear, possibly less. I don't really understand how this is possible for any meaningful filter, even at 192 kHz.

 

MQA Time-domain Accuracy & Digital Audio Quality

 

In any event, I think MQA would cheerfully admit to having narrow filters as a design goal.

 

How did you calculate 25 cm? That sounds more like the latency of 10 m of air.

 

The trouble with short filters is that they are not steep enough to avoid serious amounts of aliasing. What air has to do with anything completely escapes me. We're talking about a digital signal.
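The aliasing from a short filter is easy to demonstrate numerically (generic windowed-sinc filters here; the tap counts are arbitrary, not MQA's): decimate a 30 kHz tone from 96 kHz to 48 kHz and measure the aliased product, which folds down to 18 kHz:

```python
import numpy as np

fs = 96000
x = np.sin(2 * np.pi * 30000 * np.arange(9600) / fs)  # above the 24 kHz target Nyquist

def lowpass(taps):
    """Windowed-sinc anti-aliasing filter, cutoff at 24 kHz (0.25 of fs)."""
    n = np.arange(taps) - (taps - 1) / 2
    h = 0.5 * np.sinc(0.5 * n) * np.hamming(taps)
    return h / h.sum()

def alias_level_db(taps):
    y = np.convolve(x, lowpass(taps), mode="same")[::2]  # filter, then 2x decimate
    core = y[200:-200]                                   # ignore edge transients
    return 20 * np.log10(np.abs(core).max() + 1e-12)

short_db = alias_level_db(9)      # a very short filter
long_db = alias_level_db(255)     # a conventional long one
print(round(short_db, 1), round(long_db, 1))
```

With the short filter the 30 kHz tone survives almost unattenuated and appears as a spurious 18 kHz tone in the 48 kHz stream; the long filter pushes it tens of dB further down.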

I did not read the texts referred to, but what I get from it is that air itself will cause said blur, and probably MQA is (or should be) designed such that it does not cause more blur than 10 m of air implies. A sort of: we sit closer than 10 m to our speakers anyway.

 

That's flawed thinking. The distortions caused by digital filters are completely different from those caused by air.

I'm not intending to advocate MQA's position, but I have spent/squandered enough time to vaguely grasp what they are saying. One has to start from the premise that there exists something called time blur/smear, which is crudely understood as corresponding to the impulse response of a filter (which, as far as I am concerned, has little meaning outside the mathematical equivalence of time and frequency domains).

They argue that the ultimate aim of an end-to-end recording/reproduction chain should be to have no more time smear than 10 m of air (i.e. the inevitable time smear experienced by a person 10 m away from the source of sound). The problem of course is that this implies, in audio terms, virtually no filtering at all, which is going to give one real problems in satisfying the requirements of the sampling theorem with a sample rate less than 500 kHz? 1 MHz? Probably more.

The underpinning of the system, as I understand it, involves accepting aliasing. It advocates one form of perfectionism at the expense of accuracy in the conventional sense.

 

A perfect reconstruction filter is an infinite sinc function. Any finite-length filter is an approximation.

 

Air, on the other hand, is a lossy medium for sound waves with both linear and non-linear distortions, although at frequencies, distances, and levels involved in music reproduction, both are negligible (it starts getting interesting at a few hundred kHz).

 

Equating the distance through air with a corresponding filter length is simply preposterous. If one wished to model the effects of air on a sound wave, a longer filter would be more accurate regardless of the distance modelled.

I quite agree. But it is fair to say that a perfect filter at an audible corner frequency would be a real problem. This is well known by those who devise perceptual codecs. If the Nyquist frequency of the system is treated as being audible in some way, then there has to be a trade-off between the band-limiting requirements of the sampling theorem and the practical problem of audibility of filter ringing.

 

So what is the cut-off point: 20 kHz, 40 kHz, 80 kHz? The first is rational and based on considerable evidence. Above that, it's conjectural at best.

 

Indeed, the idea with digital audio is to place the Nyquist frequency above the audible range. The CD standard assumed an upper limit to audibility of 20 kHz and allowed ~2 kHz for the filter transition band. Now audiophiles argue that higher frequencies are in fact audible in some way or other and must thus be preserved. If one believes this to be true, one should not ever resample to a rate where the Nyquist frequency is within the range one deems important. It is mathematically impossible to do so with a finite filter without introducing ringing at the lower Nyquist frequency. Using an aliasing filter to avoid ringing will only push artefacts even further down into the audible range. If you think high frequencies are important, leave the sampling rate high. If you want to use lossy compression to reduce the file size, there are ways of doing that without resampling to a lower rate.
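The ringing claim is easy to verify numerically: any finite approximation to the ideal band-limiting (sinc) filter has an oscillating impulse response around its peak. A minimal sketch (generic 255-tap windowed sinc, counting the sign alternations of the lobes after the main peak):

```python
import numpy as np

taps = 255
n = np.arange(taps) - (taps - 1) / 2
h = 0.5 * np.sinc(0.5 * n) * np.hamming(taps)   # low-pass at half the Nyquist
h /= h.sum()

# the lobe peaks sit at odd offsets from the centre and alternate in sign;
# each alternation is one cycle of ringing at the filter's cutoff
peak = int(np.argmax(h))
tail = h[peak + 3 : peak + 61 : 2]
sign_changes = int(np.count_nonzero(np.diff(np.sign(tail))))
print(sign_changes)
```

Shortening the filter reduces the duration of this ringing but widens the transition band, which is exactly the trade against aliasing discussed above; no finite filter escapes both.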

So "fools rush in," but:

 

Since the relationship of time domain and frequency domain accuracy is that of conjugate variables, wouldn't inaccuracy in one domain increase or decrease opposite to inaccuracy in the other? Thus Fokus' description of allowing frequency domain inaccuracy (aliasing) to some extent that is hopefully masked, in order to allow what is thought to be better time domain accuracy (I suppose intended for the sake of transients, percussion, instrumental and vocal attacks - all the inharmonic stuff).

 

You're thinking of a different kind of inaccuracy. Given a signal, its value is more well-defined over a narrower time interval, while its frequency content is better defined over a longer period. In the extreme, for an infinitesimal instant (a single sample in discrete time), the frequency spectrum is undefined while the value is exact. Over an infinite interval, the spectrum is exact while the value reduces to an average.

 

Now if we settle on some balance between these opposing goals, an error in one domain translates to an equivalent error in the other. I think this was what adamdea was getting at.
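The tradeoff described above is the Gabor uncertainty relation. In its standard form, with the RMS duration and RMS bandwidth of a signal defined as second moments of its energy distributions:

```latex
% Gabor (time-frequency) uncertainty: \sigma_t is the RMS duration,
% \sigma_f the RMS bandwidth of the signal
\sigma_t \, \sigma_f \;\ge\; \frac{1}{4\pi}
% with equality only for a Gaussian envelope
```

This bounds how sharply a signal can be localised in both domains at once; it says nothing about errors, which is the distinction being drawn here.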

I quite agree with you about the time/frequency uncertainty problem. But that wasn't what I was getting at. The uncertainty tradeoff would have increasing precision in one domain leading to reduced precision in the other.

My point was simply that you cannot get something wrong in the frequency domain without also getting it wrong in the time domain (this is trite: if it is right in one, it must be right in the other). If you allow aliasing (or unsuppressed spectral imaging), you must be creating some sort of error in the time domain. However, you might in fact be happy to have that error if it allows you to be more accurate in another respect. The NOS DAC does not (just) trade off time domain against frequency domain; it trades off one sort of time-domain error against another. And so must MQA.

 

I think we're saying the same thing. The important distinction is between (in)accuracy and error.
