
The Ear - DSD under Fire


Nikhil


I always think that approaching a problem from different perspectives is the most effective way to solve it.

 

Approaching from different perspectives requires the different perspectives to be kept separately in one's mind. Collaboration only works when the people communicate with consistent terminology, otherwise they won't know whether they are agreeing or disagreeing.

Link to comment
The use of Sigma Delta modulators to create one bit pulse streams has a long history going back to the 1950's. The only thing that Yamasaki "invented" was the idea of not doing conversions to/from multi-bit formats. All of the elements of his invention existed previously. In any event, a trade name does not characterize a mathematical format.

 

Yes, that was precisely what he invented - a direct SDM capture and playback (without PCM decimation and oversampling filters) which is the essence of DSD format.

Link to comment
Using the same logic, it is not possible with PCM either. When you add two 24-bit values you get a 25-bit result. To get back to a 24-bit signal you need to get rid of the LSB, which adds noise and/or distortion depending on how you do it.

 

So if you mix values 2 and 3 you should get (2 + 3) / 2 = 2.5; the closest integers are 2 and 3, both with the same 0.5 quantization error. It becomes much worse when you mix eight channels, for example (2 + 3 + 5 + 1 + 7 + 9 + 5 + 11) / 8 = 5.375, where 5 is the closest integer and still leaves an error of 0.375. Adding eight 24-bit samples gives a 27-bit result, so you need to get rid of three bits of precision. IOW, you lose 18 dB.
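The arithmetic in this post can be checked with a short Python sketch. The function names `mix_average` and `bits_for_sum` are made up for illustration, and the round-half-up rule is an assumption; the post does not say how a mixer would round.

```python
import math

def mix_average(samples):
    """Average integer samples; return (exact, nearest integer, absolute error)."""
    exact = sum(samples) / len(samples)
    rounded = math.floor(exact + 0.5)  # assumed rule: round half up
    return exact, rounded, abs(exact - rounded)

def bits_for_sum(bit_depth, channels):
    """Signed word length needed to hold the sum of `channels` full-scale samples."""
    # Worst case: every channel sits at the most negative value, -2**(bit_depth - 1).
    worst_case = channels * 2 ** (bit_depth - 1)
    # A signed n-bit word reaches down to -2**(n - 1).
    return (worst_case - 1).bit_length() + 1

two = mix_average([2, 3])
eight = mix_average([2, 3, 5, 1, 7, 9, 5, 11])
extra_bits = bits_for_sum(24, 8) - 24
loss_db = 20 * math.log10(2 ** extra_bits)

print(two)                  # exact 2.5, error 0.5
print(eight)                # exact 5.375, error 0.375
print(bits_for_sum(24, 2))  # 25
print(bits_for_sum(24, 8))  # 27
print(round(loss_db, 2))    # about 18 dB for three discarded bits
```

The 18 dB figure here is just three bits at roughly 6.02 dB per bit; whether those bits carry audible information is a separate question.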

 

When you mix eight channels in DSD, the noise drops 9 dB, and even in the worst case remodulation would only put 3 dB back, so the SNR would still improve by 6 dB.
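One way to arrive at the 9 dB and 3 dB figures is sketched below. It assumes the noise in each channel is uncorrelated and of equal power, and that remodulation adds one more uncorrelated noise source of that same power. The post states neither assumption explicitly, and the function names are illustrative.

```python
import math

def averaging_gain_db(channels):
    """Noise reduction from averaging channels with uncorrelated, equal-power noise."""
    return 10 * math.log10(channels)

def equal_noise_penalty_db():
    """Noise rise when a second uncorrelated source of equal power is added."""
    return 10 * math.log10(2)

gain = averaging_gain_db(8)         # about 9 dB
penalty = equal_noise_penalty_db()  # about 3 dB
net = gain - penalty                # about 6 dB
print(round(gain, 2), round(penalty, 2), round(net, 2))
```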

 

 

 

Again with PCM, when you multiply two 24-bit numbers, the result is 48-bit and to go back to 24-bit you need to get rid of bits. In addition, when you reduce volume of 24-bit PCM by 6 dB you lose one bit, so the result has at most 23-bit worth of relevant data. Reducing volume by 12 dB you lose two bits so now you have just 22-bit worth of relevant data.
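As a rough check of the word-length figures, here is a minimal sketch. It uses the usual rule of thumb of about 6.02 dB per bit; `product_bits` and `relevant_bits` are hypothetical helper names, not anything from the thread.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit

def product_bits(bits_a, bits_b):
    """Signed word length needed for the product of two signed integers."""
    # Worst case is (most negative) * (most negative), which is positive.
    worst_case = 2 ** (bits_a - 1) * 2 ** (bits_b - 1)
    return worst_case.bit_length() + 1  # magnitude bits plus a sign bit

def relevant_bits(bit_depth, attenuation_db):
    """Bits still above the LSB after attenuating a full-scale signal."""
    return bit_depth - attenuation_db / DB_PER_BIT

print(product_bits(24, 24))           # 48
print(round(relevant_bits(24, 6)))    # 23
print(round(relevant_bits(24, 12)))   # 22
```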

 

However, the worst part of PCM is that you may have a lot of redundant value space. For example, if you have a song with only a single peak reaching 0 dBFS and all other samples below -6 dBFS, you are not utilizing half of the value space! IOW, the MSB may be constantly 0 except for a single sample. This is not the case with SDM, which always utilizes all of the value space!

 

 

 

Certainly not. Even with DSD64 you can get much better SNR in audio band than you get with 24-bit PCM. At DSD128 you can already significantly exceed SNR of 32-bit PCM in audio band.

 

 

 

A few too many conditions... Now here's your first problem: design a pure PCM ADC-DAC chain that has your >136 dB dynamic range and won't have time/phase distortions vs. the analog input signal.

 

PCM may seem all nice and dandy when you look at it in pure digital domain. But when you begin to involve analog domain filters and AD/DA converters things change and DSD becomes much better. So it is much better to do everything in SDM.

 

Like I said in the first sentence of my post, you have to have some headroom if you want to do mixing or any DSP other than reducing volume. (Even reversing polarity can cause an overflow.) Having a few bits of headroom is customary when recording live music so as to avoid the risk of clipping and to get cleaner sound by not overloading circuitry. The same thing applies in reverse with DACs. In practice, if people do mixing digitally what they do is to convert the 24 bit input into a higher resolution format, such as 32 bit floating point, thereby avoiding all possibility of overflows while preserving distortion at the 25 bit level (IEEE floating point).
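Two of the points in this paragraph can be illustrated with a small sketch: polarity reversal overflowing a 24-bit two's complement word, and 32-bit floating point holding every 24-bit sample exactly. The helper names are invented for the example; `struct` is used only to round-trip a value through IEEE 754 single precision.

```python
import struct

INT24_MIN = -(2 ** 23)
INT24_MAX = 2 ** 23 - 1

def fits_int24(value):
    """True if value is representable as a 24-bit two's complement integer."""
    return INT24_MIN <= value <= INT24_MAX

def through_float32(value):
    """Round-trip a number through IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]

# Reversing the polarity of the most negative sample overflows 24 bits...
inverted = -INT24_MIN
print(fits_int24(INT24_MIN), fits_int24(inverted))

# ...but the same value survives a trip through 32-bit float unchanged.
print(through_float32(inverted) == inverted)

# The 24-bit significand plus sign covers integers up to 2**24 exactly;
# one step beyond that no longer round-trips.
print(through_float32(2 ** 24 + 1) == 2 ** 24 + 1)
```

The last line is where the "25 bit level" comes from: a sign bit plus a 24-bit significand.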

 

I challenge you to publish some test files of the output of your modulators with various signals so that they can be analyzed by others. In the absence of this, I do not believe your claims of signal to noise ratios. Your numbers correspond to the Shannon bound, which is an upper limit given by the pigeon-hole principle and represents a limit of performance from an ideal encoder that is allowed to output any possible bit-stream. There are constraints on the output of a modulator that produces a bit stream that is decoded by a conventional low pass filter, and hence the likelihood of a significant coding loss with attendant loss of signal to noise ratio.

Link to comment
DSD is mathematically a form of PCM. DSD64 is 2822.4/1 PCM. "DSD" is a Sony trademark.

 

The theoretical benefits of DSD come from its high sampling rate. The practical benefits of DSD include the high sampling rate, low bit budget for a given sampling rate, and the use of a single switching element which makes low-level linearity possible without precise tolerances on physical components.

 

Tony,

 

I have to say I have been reading your posts with a lot of interest. I had some questions and I am glad to see the answers in your very informative posts. I like your description of function and mechanism - very nicely put. Your commentary on the math behind all this has been very informative.

Custom Win10 Server | Mutec MC-3+ USB | Lampizator Amber | Job INT | ATC SCM20PSL + JL Audio E-Sub e110

 

 

Link to comment
In practice, if people do mixing digitally what they do is to convert the 24 bit input into a higher resolution format, such as 32 bit floating point, thereby avoiding all possibility of overflows while preserving distortion at the 25 bit level (IEEE floating point).

 

But I rarely see such delivered to the end customers. So far it has been at most 24-bit integer PCM. So everything gets squeezed back to the 24-bit pipe.

 

I challenge you to publish some test files of the output of your modulators with various signals so that they can be analyzed by others.

 

I've done it on a couple of occasions already. I would also like to make a similar challenge to ADC and DAC chip manufacturers.

 

I challenge you to publish measurement results of real world DAC outputs, like I've been doing. That's what really matters. It also puts the on-chip processing on the same footing as the processing performed in a computer.

 

Your numbers correspond to the Shannon bound, which is an upper limit given by the pigeon-hole principle and represents a limit of performance from an ideal encoder that is allowed to output any possible bit-stream. There are constraints on the output of a modulator that produces a bit stream that is decoded by a conventional low pass filter, and hence the likelihood of a significant coding loss with attendant loss of signal to noise ratio.

 

No, the Shannon bound for DSD64 is the equivalent of 64-bit integer PCM, that is 385.32 dB for 22.05 kHz bandwidth, or 424.81 dB for 20 kHz bandwidth... So what I am talking about is far from it.
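The two figures can be reproduced as follows. The sketch assumes the bound is computed by spreading the DSD64 bit rate over the Nyquist rate for the stated bandwidth and counting about 6.02 dB per resulting bit, which matches the numbers quoted. The function names are illustrative.

```python
import math

DSD64_BIT_RATE = 64 * 44_100  # 2,822,400 bits per second per channel

def equivalent_pcm_bits(bit_rate, bandwidth_hz):
    """Bits available per sample if the stream were PCM at the Nyquist rate."""
    return bit_rate / (2 * bandwidth_hz)

def bound_db(bits):
    """Dynamic range of an ideal integer format, about 6.02 dB per bit."""
    return 20 * math.log10(2) * bits

bits_22k = equivalent_pcm_bits(DSD64_BIT_RATE, 22_050)
bits_20k = equivalent_pcm_bits(DSD64_BIT_RATE, 20_000)
print(bits_22k, round(bound_db(bits_22k), 2))  # 64 bits
print(bits_20k, round(bound_db(bits_20k), 2))  # 70.56 bits
```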

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment

Mixing theory and practicality may be confusing to a layman, but it should not be confusing to a competent practicing engineer. Also, the essential difference between function and mechanism is one that will be understood by lawyers who have familiarity with patent law.

 

Tony,

 

I have to say I have been reading your posts with a lot of interest. I had some questions and I am glad to see the answers in your very informative posts. I like your description of function and mechanism - very nicely put.

 

I hadn't planned to respond, since it's off topic, but is your function versus mechanism statement meant to refer to Section 112 indefiniteness and means-plus-function language, a fraught area of the law on which the federal circuits are split, or something else?

One never knows, do one? - Fats Waller

The fairest thing we can experience is the mysterious. It is the fundamental emotion which stands at the cradle of true art and true science. - Einstein

Computer, Audirvana -> optical Ethernet to Fitlet3 -> Fibbr Alpha Optical USB -> iFi NEO iDSD DAC -> Apollon Audio 1ET400A Mini (Purifi based) -> Vandersteen 3A Signature.

Link to comment
I hadn't planned to respond, since it's off topic, but is your function versus mechanism statement meant to refer to Section 112 indefiniteness and means-plus-function language, a fraught area of the law on which the federal circuits are split, or something else?

 

If the question was for me then no - I was only referring to Tony's earlier post.

 

The essential point of this discussion concerns the difference between "function" and "mechanism". The format of DSD as described mathematically defines the function provided by an ADC or a DAC. A sigma delta modulator is one mechanism for generating a one bit signal that will decode back into a suitable analog signal.

 

But to answer your question - I think all Tony meant was an appreciation for subtle distinctions in language/terminology.

Custom Win10 Server | Mutec MC-3+ USB | Lampizator Amber | Job INT | ATC SCM20PSL + JL Audio E-Sub e110

 

 

Link to comment
If the question was for me then no - I was only referring to Tony's earlier post.

 

 

 

But to answer your question - I think all Tony meant was an appreciation for subtle distinctions in language/terminology.

 

Whoops - Sorry Nikhil, it was clear in *my* mind who I was asking, so I forgot to expressly say it was Tony. :)

One never knows, do one? - Fats Waller

The fairest thing we can experience is the mysterious. It is the fundamental emotion which stands at the cradle of true art and true science. - Einstein

Computer, Audirvana -> optical Ethernet to Fitlet3 -> Fibbr Alpha Optical USB -> iFi NEO iDSD DAC -> Apollon Audio 1ET400A Mini (Purifi based) -> Vandersteen 3A Signature.

Link to comment
Your first comment is incorrect. The mathematical analysis of DSD signals is identical to that for other PCM formats. The signal consists of a set of coded pulses (impulses) at the sampling rate, with the coding using a mapping between the bit(s) assigned to each sample which controls the amplitude of the pulse. The resulting sequence of pulses is then low pass filtered to produce the desired continuous analog signal. The difference in playback between 1 bit and multibit formats concerns the particular mapping between the coded bits and the amplitude of the resulting impulses. However, this is true as well for other forms of PCM, e.g. different rounding methods and ranges are used, e.g. 2's complement vs. 1's complement arithmetic.

 

Are you implying that if one implemented a DSD chain with 24-bits instead of 1-bit, and lowered the sampling rate to, say, 96 kHz, one would end up with exactly 24/96 PCM? And likewise if one implemented a PCM chain with 1-bit instead of 24-bits and raised the sampling rate to 5.6448 MHz, one would end up with exactly DSD128? Are so many of us really mistaken in thinking that there is something really so fundamentally different about DSD/PDM?

Link to comment
Are so many of us really mistaken in thinking that there is something really so fundamentally different about DSD/PDM?

 

The mistake was evidently in your own mind in writing « about DSD/PDM » :)

 

Try reading Wikipedia entries for Pulse-density modulation and Pulse-code modulation before...

 

«

an accurate picture

Sono pessimista con l'intelligenza,

 

ma ottimista per la volontà.

severe loudspeaker alignment »

 

 

 

Link to comment
Are you implying that if one implemented a DSD chain with 24-bits instead of 1-bit, and lowered the sampling rate to, say, 96 kHz, one would end up with exactly 24/96 PCM? And likewise if one implemented a PCM chain with 1-bit instead of 24-bits and raised the sampling rate to 5.6448 MHz, one would end up with exactly DSD128? Are so many of us really mistaken in thinking that there is something really so fundamentally different about DSD/PDM?

 

Numbers are numbers. All the rest is implementation detail :)

Custom room treatments for headphone users.

Link to comment
The mistake was evidently in your own mind in writing « about DSD/PDM » :)

 

Try reading Wikipedia entries for Pulse-density modulation and Pulse-code modulation before...

 

You are mistaking me for the guy who appears to be claiming that PDM and PCM are fundamentally the same. From those articles (which I've indeed read before) it does not seem to be the case, but I don't know enough of the math to be sure. I am trying to conduct a thought experiment to get to the root of the matter. I am not a math whiz but am far from stupid, so I'm confident the issue has not been explained cogently enough for the edification of the majority of audiophiles interested in the topic.

Link to comment
You are mistaking me for the guy who appears to be claiming that PDM and PCM are fundamentally the same. From those articles (which I've indeed read before) it does not seem to be the case, but I don't know enough of the math to be sure. I am trying to conduct a thought experiment to get to the root of the matter. I am not a math whiz but am far from stupid, so I'm confident the issue has not been explained cogently enough for the edification of the majority of audiophiles interested in the topic.

 

Just so there will be no doubt about my position: Pulse code modulation is a family of formats, typically described by a sampling rate and bit depth. This description includes the commonly used formats such as 44/16, 44/24, 96/24, 176/24, 192/24, 352.8/24, 2822.4/1 and 5644.8/1. The same mathematics can be used to describe the signals in all of these formats independently of the mechanisms used to encode them from an analog signal and independently from the mechanisms used to decode them back to an analog signal.
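To put the listed formats side by side, the raw bit rate per channel is simply sampling rate times bit depth. The sketch below assumes "44" and "176" are shorthand for 44.1 kHz and 176.4 kHz; the table and the `bit_rate` helper are illustrative only.

```python
# (label, sampling rate in Hz, bits per sample)
FORMATS = [
    ("44.1/16", 44_100, 16),
    ("44.1/24", 44_100, 24),
    ("96/24", 96_000, 24),
    ("176.4/24", 176_400, 24),
    ("192/24", 192_000, 24),
    ("352.8/24", 352_800, 24),
    ("2822.4/1", 2_822_400, 1),   # DSD64
    ("5644.8/1", 5_644_800, 1),   # DSD128
]

def bit_rate(sample_rate_hz, bits_per_sample):
    """Raw bits per second for one channel."""
    return sample_rate_hz * bits_per_sample

rates = {name: bit_rate(fs, bits) for name, fs, bits in FORMATS}
for name, rate in rates.items():
    print(f"{name:>9}: {rate / 1000:8.1f} kbit/s")
```

By this measure 2822.4/1 carries exactly four times the raw bits of 44.1/16 and fewer than 176.4/24. Raw bit rate says nothing by itself about how usefully those bits are spent.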

 

The possible signals that can be represented in a given format put a limit on the resolution (and type of lack of resolution) of the format itself, assuming an ideal converter. Real converters can, at best, approach the limits of these formats.

Link to comment
You are mistaking me for the guy who appears to be claiming that PDM and PCM are fundamentally the same. From those articles (which I've indeed read before) it does not seem to be the case, but I don't know enough of the math to be sure. I am trying to conduct a thought experiment to get to the root of the matter. I am not a math whiz but am far from stupid, so I'm confident the issue has not been explained cogently enough for the edification of the majority of audiophiles interested in the topic.

 

But, leaving aside your idiosyncratic manner of writing, you come across as the guy, perhaps it's your chosen Avatar and history, who having done the math within a 16:9 monitor proposes that 4:3 and 2.35:1 are fundamentally the same because there's equal black...

 

I trust the above inference was explanatory, cogent and succinct.

 

«

an accurate picture

Sono pessimista con l'intelligenza,

 

ma ottimista per la volontà.

severe loudspeaker alignment »

 

 

 

Link to comment
Just so there will be no doubt about my position: Pulse code modulation is a family of formats, typically described by a sampling rate and bit depth. This description includes the commonly used formats such as 44/16, 44/24, 96/24, 176/24, 192/24, 352.8/24, 2822.4/1 and 5644.8/1. The same mathematics can be used to describe the signals in all of these formats independently of the mechanisms used to encode them from an analog signal and independently from the mechanisms used to decode them back to an analog signal.

 

Put differently, both PCM and PDM are a sequence of pulses, equally spaced in time. Each pulse has a value in the range [-1,1]. In the theoretical realm, we can conceive of these values as continuous. In practice, the precision is limited by the chosen representation. One common representation is 16-bit two's complement binary. Back in the dark ages, 8-bit unsigned was common as was 8-bit μ-law/A-law. The more bits we use, the closer we get to the continuous ideal. At the other extreme, using a single bit allows only the max/min values of 1 and -1. Regardless of the representation used for transmission/storage, once presented with such a signal, the mathematics for analysing, manipulating, or transforming it are exactly the same. We can add signals sample-wise, we can apply Fourier transforms or FIR/IIR filters, etc. Mathematically speaking, the output values from any of these operations are continuous. In a real system, we will use an internal format as accurate as is practical, most often 32-bit or 64-bit floating-point.

 

A sampled signal is a sampled signal and mathematically behaves like a sampled signal regardless of the representation used to encode the samples.

 

Obtaining a sampled signal in a given representation is another matter. If the precision of the sample values is sufficient, simply rounding the real value to the closest representable value is good enough, the rounding error being a form of non-linear distortion. As the precision is decreased, this distortion increases. At 8-bit precision, simple rounding still produces recognisable audio although the distortion is also obvious. At 1-bit precision, a sine wave is turned into a square wave, and any finer details are mangled beyond recognition.

 

This is where noise shaping enters the picture. Noise shaping, in this context, is any technique which turns the non-linear distortion into uncorrelated noise and moves it to an uninteresting part of the spectrum. A dumping ground for this noise is easy to obtain simply by raising the sampling rate. This way we can acquire as much additional spectrum as we need in order to obtain the desired noise level within the band of interest. The noise shaping principles work at any sample precision. The lower the precision, the more noise we need to get rid of, and the more extra bandwidth is required to do so. This is true regardless of exactly how the noise shaping is performed.

 

The most common method of obtaining a noise-shaped low-precision sampled signal is sigma-delta modulation (SDM), typically with an output precision of 1-5 bits. The input can be either an analogue signal or a high-precision digital signal. The implementations obviously differ, but the mathematical analysis is essentially the same in both cases. Once the digital output has been produced, the mechanism that produced it is no longer relevant. It is thus meaningless to talk of "SDM format" data.
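The difference between plain 1-bit rounding and a noise-shaped 1-bit signal can be seen in a toy experiment. This is only a sketch under stated assumptions: a first-order modulator (real converters use higher orders), a moving average standing in for the lowpass filter, and arbitrary test-signal parameters. All function names are made up for the example.

```python
import math

def sine(n, cycles, amplitude=0.5):
    """n samples of a sine wave with the given number of cycles."""
    return [amplitude * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]

def one_bit_round(signal):
    """Plain 1-bit rounding: keep only the sign, giving a square wave."""
    return [1.0 if x >= 0 else -1.0 for x in signal]

def first_order_sdm(signal):
    """First-order sigma-delta: integrate input minus the previous output bit."""
    integrator = 0.0
    previous = 0.0
    out = []
    for x in signal:
        integrator += x - previous
        previous = 1.0 if integrator >= 0 else -1.0
        out.append(previous)
    return out

def moving_average(signal, width):
    """Crude lowpass filter: average over each full window of `width` samples."""
    return [sum(signal[i:i + width]) / width
            for i in range(len(signal) - width + 1)]

def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

WIDTH = 64
x = sine(4096, 4)
reference = moving_average(x, WIDTH)
rounding_error = rms_error(moving_average(one_bit_round(x), WIDTH), reference)
sdm_error = rms_error(moving_average(first_order_sdm(x), WIDTH), reference)
print(f"plain rounding: {rounding_error:.3f}, sigma-delta: {sdm_error:.3f}")
```

Both outputs use exactly one bit per sample. After the same lowpass filter, the sigma-delta stream tracks the input far more closely because its error has been pushed to high frequencies, where the filter removes it.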

 

A sampled signal of a given precision can be trivially converted to a signal of higher precision and lower sample rate simply by lowpass filtering. The filter can be digital or analogue, again the mathematics are essentially the same.
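A minimal illustration of this trade: averaging non-overlapping blocks of a 1-bit stream yields fewer samples, each with more possible values. Block averaging is a very crude stand-in for a proper decimation filter, and the real in-band resolution depends on the noise shaping rather than on the count of output levels, so treat this strictly as a sketch with hypothetical names.

```python
def block_average(bits, factor):
    """Crude lowpass-and-decimate: average each full block of `factor` samples."""
    usable = len(bits) - len(bits) % factor
    return [sum(bits[i:i + factor]) / factor for i in range(0, usable, factor)]

def output_levels(factor):
    """Distinct values the average of `factor` samples of +1/-1 can take."""
    return factor + 1

# Two blocks with 48 ones and 16 minus-ones, then one balanced block.
stream = ([1] * 48 + [-1] * 16) * 2 + ([1] * 32 + [-1] * 32)
decimated = block_average(stream, 64)
print(decimated)          # [0.5, 0.5, 0.0]
print(output_levels(64))  # 65 levels, a little over 6 bits
```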

 

As we observed earlier, any mathematical operation on a sampled signal produces an output with (effectively) infinite precision. At some point we will want to coerce this back into a limited-precision format suitable for storage or transmission. Now we have exactly the same problem as when sampling an analogue signal, only this time everything happens entirely in the digital domain. If we choose as output a high-precision format, simple rounding will suffice. For a low-precision output, we will need to apply noise shaping, typically in the form of a sigma-delta modulator. Either way, every such operation introduces a little additional noise.

 

In the event that we need to perform multiple mathematical operations on the signal, we must choose an intermediate storage format with sufficient headroom that the added noise can be ignored. A common choice is to use a few more bits per sample than the final product will have, e.g. using 24-bit intermediates when producing a 16-bit CD. If we insist on using 1-bit intermediates, this too is possible by using a higher sampling rate than intended for the final product. Done right, the end result can be as accurate as we wish whichever route is chosen. The only real difference is that 1-bit intermediates require substantially more computational effort to produce an equivalent result.

Link to comment
For practical purposes, DSD can be considered PWM. All real world PDM converters I know operate in PWM. IOW, adjacent bits of same value just lengthen the pulse but don't cause state transition.

 

Yes, and both PDM and PWM are in the same family, the latter is a special case of the former.

 

It seems that PCM folks want to appropriate DSD as one of their own, but guys please keep your PCM decimation filters and oversampling filters away! :)

Link to comment

From a high level view if the number of bits covering a defined time period is the same then the two formats have the same total information capability and it is left to the implementation to efficiently use this capability. Under ideal conditions there exists a transform that maps the two information spaces.

Custom room treatments for headphone users.

Link to comment
From a high level view if the number of bits covering a defined time period is the same then the two formats have the same total information capability and it is left to the implementation to efficiently use this capability.

 

But the number of bits per unit time does not define the same total information in PDM and PCM, as applied to their use in audio. From each PCM sample to the next, all the constant information in addition to the change information is carried forward. That's conservatively at least 80% redundant data, depending on the signal's rate of change. Probably more like the high 90s percent. PDM only describes the change information, in effect the derivative of the analog signal in pulse density form, which theoretically simply needs integration to recover the original.

 

The mathematical/DSP explanations put forward in these threads, while theoretically interesting, only confuse the majority, like me, as to the application of these formats to audio reproduction, IMO.

Link to comment
But the number of bits per unit time does not define the same total information in PDM and PCM, as applied to their use in audio. From each PCM sample to the next, all the constant information in addition to the change information is carried forward. That's conservatively at least 80% redundant data, depending on the signal's rate of change. Probably more like the high 90s percent. PDM only describes the change information, in effect the derivative of the analog signal in pulse density form, which theoretically simply needs integration to recover the original.

 

That's not the whole story. It is correct that PCM samples carry a lot of redundant information. However, with PDM, you are necessarily encoding a vast amount of unwanted information, aka noise. The amount of useful information is comparable between the two encodings.

Link to comment

That's true, which I've pointed out before. I have no knowledge of the percentage of each in terms of unwanted data, and in my business it's not very relevant to me. What is relevant is the necessity of converting from PDM front ended A/D converters into PCM for much of anything more complex than pure editing. It's the conversions from one format to the other that damage the sound quality of either, and the requirement to do so in order to do post processing is dumb.

Link to comment
That's not the whole story. It is correct that PCM samples carry a lot of redundant information. However, with PDM, you are necessarily encoding a vast amount of unwanted information, aka noise. The amount of useful information is comparable between the two encodings.

 

Not following you along these lines. With PCM at an equivalent sample rate, you still encode an enormous amount of unwanted information.

 

It is a cheap price to pay to get the advantages of DSD or high sample rate PCM. It's just that as a listener, you actually pay fewer $$. :)

 

-Paul

Anyone who considers protocol unimportant has never dealt with a cat DAC.

Robert A. Heinlein

Link to comment
That's true, which I've pointed out before. I have no knowledge of the percentage of each in terms of unwanted data, and in my business it's not very relevant to me. What is relevant is the necessity of converting from PDM front ended A/D converters into PCM for much of anything more complex than pure editing. It's the conversions from one format to the other that damage the sound quality of either, and the requirement to do so in order to do post processing is dumb.

 

You hit the nail on the head Tom, it's easy to forget about rounding, ringing, and aliasing introduced by intermediary PCM filters, when one is shifting the discussion to format definitions and such.

Link to comment
