
DSD encoding with SoX


mansr


When I recently decided to take a closer look at the DSD phenomenon to see for myself whether it lived up to the hype, I noticed a dearth of open source tools with support for this format, so I set about rectifying this situation by adding the necessary functionality to SoX. The resulting code is available from https://github.com/mansr/sox with the following features:

 

- DSF read/write

- DFF read-only

- DSD to PCM conversion using the existing resampler

- PCM to DSD conversion

 

The DSD encoding supports four settings with increasing quality and decreasing speed: fast, hq, audiophile, goldenear. Below are RightMark Audio Analyzer results showing the performance. I encoded the 96/24 test signal to DSD64 and back to PCM at each of the quality levels. For reference, JRiver is also included in the analysis.

 

First the tabular report:

rmaa.png

 

Frequency response:

fr.png

 

Noise level:

noise.png

 

Dynamic range:

dr.png

 

Harmonic distortion:

thd.png

Link to comment

Hi, what's the reason for the spike in harmonic distortion at 1 kHz? (Sorry, I'm just ignorant of this stuff but happy to learn.)

 

And yes, thanks for the contribution to SoX/open source.

One never knows, do one? - Fats Waller

The fairest thing we can experience is the mysterious. It is the fundamental emotion which stands at the cradle of true art and true science. - Einstein

Computer, Audirvana -> optical Ethernet to Fitlet3 -> Fibbr Alpha Optical USB -> iFi NEO iDSD DAC -> Apollon Audio 1ET400A Mini (Purifi based) -> Vandersteen 3A Signature.

Link to comment
1kHz is the test tone. Harmonic distortion is visible as blips at multiples thereof, i.e. 2kHz, 3kHz etc.
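To make the point about harmonics concrete, here is a minimal sketch in Python. It is not how RightMark Audio Analyzer measures anything; the "device" is a hypothetical polynomial nonlinearity, and the sample rate and coefficients are arbitrary choices for illustration. It passes a pure 1 kHz tone through the nonlinearity and measures the level at a few frequencies.

```python
import math

def tone_amplitude(signal, freq_hz, sample_rate):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n

SAMPLE_RATE = 48000   # assumed for the example
N = 4800              # exactly 100 cycles of the 1 kHz tone

# Pure 1 kHz test tone.
tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(N)]

# Hypothetical, slightly nonlinear "device": small square and cube terms.
distorted = [x + 0.01 * x**2 + 0.005 * x**3 for x in tone]

# Measure the test tone, its 2nd and 3rd harmonics, and one in-between frequency.
levels = {f: tone_amplitude(distorted, f, SAMPLE_RATE)
          for f in (1000, 2000, 2500, 3000)}
for f, a in levels.items():
    print(f"{f} Hz: {a:.6f}")
```

The energy added by the nonlinearity lands at 2 kHz and 3 kHz, while 2.5 kHz stays empty, which is why the distortion plot shows the big peak at the test tone and smaller blips at its multiples.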

 

Got it. (Wow, that *was* an ignorant question!)

 

:)


Link to comment

Thanks mansr, this looks to be very useful, but I have one too [an ignorant question, I mean]...it compiles well enough and:

 

sox filename.flac -r 2822400 -b 1 filename.dsf

 

produces, after thinking about it for a few minutes, a file that plays, but can you say what the syntax for the 'hq', 'audiophile' and 'goldenear' parameters would be? I have had no luck getting any of them accepted...

 

 

thanks again,

 

 

rikm

 

 

 

 


Link to comment

Add "sdm -f hq" to the end of that command. You might also want to increase the quality setting of the resampler, like this:

 

sox file.flac file.dsf rate -v 2822400 sdm -f hq

Link to comment
Can I also do a DSD64 to a DSD256?

What command?

 

Yes. To do that you need to remove the high-frequency noise from the original, resample to 256x rate, and quantise to 1-bit. Like this:

 

sox in.dsf out.dsf rate -v 88200 gain 6 rate -v 11289600 sdm

 

The "gain 6" is needed to restore the volume level from the -6dB reference level for DSD/SACD. However, some DSD files (e.g. from dsdfile.com / Opus3) are encoded at a higher (nonstandard) level, and those require a lower gain setting to avoid clipping. You can also omit the "gain" step entirely and simply get a quieter output file.
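The numbers in that command can be checked with a little arithmetic. This Python sketch only relies on the facts that DSD rates are multiples of the 44.1 kHz CD rate and that decibels convert to an amplitude ratio as 10^(dB/20); the function names are made up for the example.

```python
BASE_RATE = 44100  # CD sample rate in Hz; DSD rates are multiples of it

def dsd_rate(multiple):
    """Sample rate in Hz of DSD at the given multiple (64, 128, 256, ...)."""
    return multiple * BASE_RATE

def db_to_amplitude(db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

print(dsd_rate(64))                   # 2822400
print(dsd_rate(256))                  # 11289600
print(88200 // BASE_RATE)             # 2
print(round(db_to_amplitude(6), 3))   # 1.995
```

So the intermediate PCM step runs at twice the CD rate, the final rate is 256 times the CD rate, and "gain 6" roughly doubles the amplitude.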

 

Note that although the command above will produce valid output, the filters are optimised for DSD64. Using them for higher rates simply raises the frequency where the noise level starts rising without improving anything in the low part of the spectrum. With filters tuned for DSD128 or DSD256, other characteristics would be possible. I simply haven't got around to exploring those options yet.
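For readers wondering what "quantise to 1-bit" means in practice, below is a minimal first-order sigma-delta modulator in Python. This is a textbook sketch and an assumption on my part, not the modulator in the SoX patch, whose filters are not shown in this thread and are certainly of higher order. It only illustrates the basic idea: the quantisation error is fed back so that the average of the 1-bit output tracks the input.

```python
def sdm_first_order(samples):
    """Quantise samples in [-1, 1] to a +1/-1 stream with a first-order loop."""
    integrator = 0.0
    feedback = 0.0   # previous output value, 0 before the first sample
    bits = []
    for x in samples:
        integrator += x - feedback          # accumulate input minus last output
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)
    return bits

# A constant input of 0.25 comes out as a stream of +1/-1 values
# whose long-run average is close to 0.25.
dc = sdm_first_order([0.25] * 10000)
dc_mean = sum(dc) / len(dc)
print(dc_mean)
```

In a loop like this the quantisation noise is pushed towards high frequencies, which is the noise the earlier "rate -v 88200" step removes before re-encoding.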

Link to comment

mansr: Great job!

 

Wondering, is there any move towards a better file format for DSD? Something that can handle both tagging and (DST-like?) compression for DSD64/128/256. Having software like SoX capable of converting .dff/.dsf to this new format, MP3Tag to allow easy tagging, and JRiver for playback would, I suspect, be a great step forward!

 

We need a new "FLAC" but for DSD :-).

 

Archimago's Musings: A "more objective" take for the Rational Audiophile.

Beyond mere fidelity, into immersion and realism.

:nomqa: R.I.P. MQA 2014-2023: Hyped product thanks to uneducated, uncritical advocates & captured press.

 

 

Link to comment

 


Not quite sure what to make of this post. MP3Tag can edit metadata in a DSF file just as easily as in a FLAC. If you wish that SoX had these features, that's another story.

 

Metadata edits made to DSF files in JRiver don't stick for some reason. No big loss, as I don't use the program much.

 

We need another audio format like a hole in the head unless there's REALLY good justification. Now if what you're proposing is that a 4min DSD file can be squeezed to <100kB with the same resolution and SQ, there would be interest and would save me copious $ from buying RAID.


Link to comment
We need another audio format like a hole in the head unless there's REALLY good justification. Now if what you're proposing is that a 4min DSD file can be squeezed to <100kB with the same resolution and SQ, there would be interest and would save me copious $ from buying RAID.

 

I agree, file format proliferation is bad enough as it is. The main problem with the currently popular DSF is the size. DSD is inherently large, and since much of the information contained is noise, it is hard to compress well. DST still manages pretty well, and doing better would likely be very difficult. The trouble with DST is its complexity. It's not easy to write a decoder, much less an encoder, and making it fast is a challenge on top of that. I wonder if there might be a use for a simpler compression scheme that (obviously) didn't compress as well but was easy to implement efficiently. Extending DSF to accommodate compressed data would be trivial, or some other well-known container format could be used.

Link to comment
The main problem with the currently popular DSF is the size. DSD is inherently large

 

I don't think there's any real problem with the size. 6 TB of disk space costs peanuts these days.

 

I wonder if there might be a use for a simpler compression scheme that (obviously) didn't compress as well but was easy to implement efficiently. Extending DSF to accommodate compressed data would be trivial, or some other well-known container format could be used.

 

ZIP and other similar ones work fairly OK. The compression ratio is not high, but it still makes a difference compared to uncompressed.

I've thought about adding gzip compression support to DSF, and why not to DSDIFF too. That would be a bit like TIFF for image data.

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment


In some cases ZIP can save about 25% of the space (depending on the content). However, it is outside the DSF and DFF standards, so it needs support on the audio player side, in both software and hardware.

AuI ConverteR 48x44 - HD audio converter/optimizer for DAC of high resolution files

ISO, DSF, DFF (1-bit/D64/128/256/512/1024), wav, flac, aiff, alac,  safe CD ripper to PCM/DSF,

Seamless Album Conversion, AIFF, WAV, FLAC, DSF metadata editor, Mac & Windows
Offline conversion save energy and nature

Link to comment

Sure, it is non-standard, but it would be fairly easy to establish as an extra, just like ID3v2 tags on WAV or DSDIFF. Although it naturally wouldn't play on software that doesn't support it. Given how DSDIFF is designed, it would be less likely to cause conflicts there.

But overall, I don't feel a pressing need for such things given how cheap storage is. If storage space is an issue for someone, time will fix that...


Link to comment
I don't think there's any real problem with the size. 6 TB of disk space costs peanuts these days.

 

True, size isn't much of an issue these days, but it's still the biggest drawback of DSF.

 

ZIP and other similar ones work fairly OK. The compression ratio is not high, but it still makes a difference compared to uncompressed.

 

The problem with standard zip is that it operates on bytes whereas a bit-oriented compression may perform better on DSD data.

Link to comment
The problem with standard zip is that it operates on bytes whereas a bit-oriented compression may perform better on DSD data.

 

Sure, but it (zlib to be specific) is a fairly widely supported, generic compression standard that is also fairly fast to decompress even on lighter CPUs. And it achieves some space savings on DSD, depending on the content.

It doesn't really matter if the compression is byte based or something else, as a general-purpose compressor cannot make any assumptions about the alignment or word lengths of the data being compressed.

It would be easy enough to add, without a big negative impact on performance; that's why I've considered it. Although I don't have a big "need" for it.


Link to comment
Sure, but it (zlib to be specific) is a fairly widely supported, generic compression standard that is also fairly fast to decompress even on lighter CPUs. And it achieves some space savings on DSD, depending on the content.

It doesn't really matter if the compression is byte based or something else, as a general-purpose compressor cannot make any assumptions about the alignment or word lengths of the data being compressed.

 

It's not a question of alignment, it's that a compressor probably has a better chance of finding patterns at the bit level. As a stupid example, consider a sequence like 000111000111... repeating. Viewed as bits, the pattern is obvious. When bundled into bytes, this sequence looks like 00011100 01110001 11000111 etc. It eventually repeats, but the period is much longer.
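The 000111 example is easy to verify. This small Python sketch packs the repeating bit pattern into bytes (most significant bit first, which is just a choice for the illustration and says nothing about how any DSD file format orders its bits) and finds the shortest repeat period at each level.

```python
def smallest_period(seq):
    """Smallest p such that seq[i] == seq[i + p] for every valid i."""
    for p in range(1, len(seq)):
        if all(seq[i] == seq[i + p] for i in range(len(seq) - p)):
            return p
    return len(seq)

def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes, MSB first."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

bits = "000111" * 40          # 240 bits, a whole number of bytes
packed = pack_bits(bits)

bit_period = smallest_period(bits)
byte_period = smallest_period(packed)
print(bit_period)                                   # 6
print(byte_period)                                  # 3
print([format(b, "08b") for b in packed[:3]])
```

Seen as bits, the pattern repeats every 6 symbols; seen as bytes, it repeats every 3 bytes, i.e. every 24 bits, and the three byte values look unrelated to each other.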

 

On the other hand, there's so much randomness (noise) inherent in a DSD bitstream that it perhaps doesn't make much difference.
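The two extremes can be compared with zlib from the Python standard library. This is only a sketch: the "noise" here is pseudo-random bytes standing in for the worst case, not a real DSD recording, and the sizes it prints say nothing about what actual DSF files achieve.

```python
import random
import zlib

def compressed_size(data):
    """Size in bytes of data after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))

N = 100_000

# The repeating 000111 bit pattern, already packed into its three byte values.
pattern = bytes([0b00011100, 0b01110001, 0b11000111]) * (N // 3)

# Pseudo-random bytes as a stand-in for a completely noise-like stream.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(pattern)))

pattern_size = compressed_size(pattern)
noise_size = compressed_size(noise)
print(len(pattern), pattern_size, noise_size)
```

The regular pattern shrinks to a tiny fraction of its size even at byte level, while the random data does not shrink at all. Real DSD sits somewhere in between, which fits the modest savings reported earlier in the thread.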

 

It would be easy enough to add, without a big negative impact on performance; that's why I've considered it. Although I don't have a big "need" for it.

 

Oh sure. The ubiquity of zlib is a compelling argument in its favour.

Link to comment
It's not a question of alignment, it's that a compressor probably has a better chance of finding patterns at the bit level. As a stupid example, consider a sequence like 000111000111... repeating. Viewed as bits, the pattern is obvious. When bundled into bytes, this sequence looks like 00011100 01110001 11000111 etc. It eventually repeats, but the period is much longer.

 

Usually you can certainly get better compression ratios when you have some knowledge about the content. For example, if the compressor knew that the data contains 16-, 24- or 32-bit words instead of operating on bytes or something else.

Huffman coding does a pretty good job with DSD, just like it does with JPEG and MP3, which also don't have byte-oriented data streams but can use exactly the number of bits they need for a given entity.


Link to comment
Huffman coding does a pretty good job with DSD, just like it does with JPEG and MP3, which also don't have byte-oriented data streams but can use exactly the number of bits they need for a given entity.

 

The Huffman coding in JPEG and MP3 doesn't operate on bytes.

 

Anyhow, this discussion is veering into the pointless...

Link to comment
DST still manages pretty well, and doing better would likely be very difficult. The trouble with DST is its complexity. It's not easy to write a decoder, much less an encoder, and making it fast is a challenge on top of that.

I wouldn't agree about the complexity in the algorithmic sense. It's a kind of "traditional" FIR filter prediction plus arithmetic encoding of the residuals. The actual problem is performance, because DST tries to predict each bit of the DSD stream, and at high DSD sample rates that is a hard job even for modern CPUs. In the same way, an SDM that is only a dozen lines of code can be a performance issue. I will probably do a DST encoder as well, but I don't see where it could be used.
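The "predict each bit, then code the residual" idea can be sketched in a few lines of Python. To be clear, this is a toy and not DST: the fixed coefficients are hypothetical and hand-picked for the test pattern, and the arithmetic coding stage is left out entirely. It only shows that a good predictor turns the bitstream into a residual that is almost all zeros, and that the decoder can undo it exactly by running the same predictor.

```python
def predict(history, coeffs):
    """Predict the next bit (0/1) from past +1/-1 samples with an FIR filter."""
    # coeffs[0] applies to the most recent sample, coeffs[1] to the one before...
    acc = sum(c * s for c, s in zip(coeffs, reversed(history)))
    return 1 if acc >= 0 else 0

def encode(bits, coeffs):
    """Residual bit is 1 wherever the prediction was wrong."""
    history, residual = [], []
    for b in bits:
        residual.append(b ^ predict(history, coeffs))
        history.append(1 if b else -1)
    return residual

def decode(residual, coeffs):
    """Rebuild the original bits by repeating the encoder's predictions."""
    history, bits = [], []
    for r in residual:
        b = r ^ predict(history, coeffs)
        bits.append(b)
        history.append(1 if b else -1)
    return bits

# Hypothetical filter: "the next bit is the opposite of the bit three back".
COEFFS = [0, 0, -1]
bits = [0, 0, 0, 1, 1, 1] * 20

residual = encode(bits, COEFFS)
print(residual[:9])
print(sum(residual))
```

Even in this toy, every single output bit costs one filter evaluation, which hints at why doing this with long filters at DSD256 rates gets expensive.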

Link to comment
