Did We Overhype USB Audio And Overlook Possible Pitfalls?

mansr · January 21, 2016

If anything is wrong with USB, it's the fact that the high-end audio industry seems to be stuck at USB 2.0. The problem is obviously not the interface itself which continues to evolve. Going to USB 3.1 (Superspeed) would solve a lot of problems, make interfaces more reliable, obviate the need for DAC specific asynchronous drivers (in Windows and class drivers on the Mac), and open the door for new Internet protocols such as MQA and the myriad of high-resolution formats that MQA will make available via USB, not to mention the possibility of streaming 24/192/384 and perhaps higher and DSD without any data compression of any kind.

I don't see how USB 3 can solve any of the problems you mention. The data rate of USB 2.0 is more than enough for audio, and the driver situation is exactly the same whatever the physical link.

mansr · January 22, 2016

Not a single mention of i2s/dsd:)

I2S is a trivial clock+data serial interface only suited for short links, typically on the same PCB or over a very short cable, due to the lack of any error correction. For transmission from a computer to a different device, you really want something using differential signalling and at least some amount of error correction.

mansr · January 22, 2016

There is no error correction in USB audio.

USB packets have checksums to detect any single or double bit errors, and since it uses differential signalling, it is more robust against interference than I2S to begin with.
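
To illustrate what "detect" means here, below is a minimal sketch of the CRC16 used on USB data packets. The parameters (generator x^16 + x^15 + x^2 + 1, register initialised to all ones, result inverted, bits processed LSB first) follow the commonly catalogued "CRC-16/USB" definition; the function name and the sample payload are made up for illustration. It only shows detection, not any recovery.

```python
def crc16_usb(data: bytes) -> int:
    """CRC16 as used on USB data packets (bit-reflected form of poly 0x8005)."""
    crc = 0xFFFF                      # register starts as all ones
    for byte in data:
        crc ^= byte
        for _ in range(8):            # process one bit at a time, LSB first
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001   # 0xA001 is 0x8005 bit-reversed
            else:
                crc >>= 1
    return crc ^ 0xFFFF               # final inversion


payload = bytes(range(64))            # hypothetical packet payload
good = crc16_usb(payload)

# Flip a single bit in transit and recompute on the receiving side.
damaged = bytearray(payload)
damaged[10] ^= 0x04
print(good != crc16_usb(bytes(damaged)))   # True: the error is detected
```

The receiver learns that the packet is damaged. What happens next (retry, drop, conceal) is up to the transfer type and the layers above.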

mansr · January 22, 2016

Yes and no:
no - why would you need LVDS if you have a network DAC that accepts Ethernet input and outputs i2s/DSD to your circuit.

Ethernet is differential. I2S is fine within a single device, not between different devices.

yes - LVDS i2s for transmission.

I2S is by definition single-ended. You can of course send the same signals over a differential link using suitable transceivers, but then it's not really I2S any more.

BTW have you seen the inside of audio gear lately....nothing short about the audio signal paths;)

We're still talking about inches. The distance from a computer to a DAC is typically several metres.

mansr · January 22, 2016

I'll repeat, USB audio has no error correction (as told to me by the Godfather of USB audio, Gordon Rankin).

I just checked the USB 2.0 spec. It has CRCs for each packet. The USB audio layer doesn't add any additional error correction.

mansr · January 22, 2016

I have no quantitative data but to force a bad situation try a USB 1.1 cable and listen for the difference.

You mean forcibly run USB2.0 over a USB1.1 cable? Then you're using an out-of-spec cable, and of course there may be errors.

mansr · January 22, 2016

Good, we agree on this as well. I only raised the error correction issue because a previous comment implied that USB audio has error correction.

I apologise for the confusion. Although the basic USB layer provides error detection, the USB audio protocol does not request retransmission of damaged packets, even though it could have been designed to do this. That said, my point that USB is a more robust interface than I2S still stands.

mansr · January 22, 2016

So, yes USB is not a packet based delivery as normally defined (as for example Ethernet {part of the TCP/IP stack})

USB is packet based just like Ethernet, but USB audio is more like UDP than TCP.

mansr · January 23, 2016

Two different purposes.

All DACs have input modules that convert external signals into something the DAC module itself accepts. I2S is a standard 'interboard' format that the actual DAC board itself accepts.

USB is an 'interbox' format. I2S over LVDS is being promoted as an interbox format.

I2S over LVDS is obviously more robust than single-ended signals, but it still has problems for use as an inter-box format. To see why, look at the signals I2S uses:

- BCLK: bit clock

- FRAME or LRCLK: toggles between words, indicates left/right for stereo

- DATA: serial data, sampled on BCLK rising edge

As we can see, it is a synchronous interface, meaning that if there is too much skew between the signals, the transmission breaks down. 384 kHz 24-bit stereo translates to a BCLK of roughly 18.4 MHz. Maintaining timing integrity between multiple signals at these rates over long wires is certainly possible (HDMI has much higher rates) but it is also not entirely trivial (and HDMI cables are rarely very long). An embedded clock encoding such as Manchester (Ethernet) or NRZI (USB) is easier to handle and also uses fewer wires.

Then there is the issue of flow control, and I2S has none; one end is picked as master and that's it. From a signal timing point of view, having the source as master is easier, but it is worse with regard to sound quality. The DAC chip requires an operational clock of typically 256x Fs. For a 384 kHz sample rate, that's close to 100 MHz. Normally (when everything is on the same PCB), this clock is provided by a crystal oscillator and the bit clock is derived through a divider. For an inter-box link, we probably don't want to send a 100 MHz clock if we can avoid it (yes, HDMI does this), so we're left with either synthesising it from the bit clock using a PLL or using a local oscillator and an asynchronous sample rate converter, neither of which is ideal. Having the DAC as master using a local oscillator gives the best jitter performance, but then meeting timing constraints over a long cable becomes much harder (an HDMI source is always the master).
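
For reference, the clock figures quoted above work out as follows. This is just the arithmetic, assuming BCLK carries exactly the sample bits with no padding (many implementations use 32-bit slots, which raises it) and a 256x Fs master clock; the function name is made up for illustration.

```python
def i2s_clocks(sample_rate_hz, bits_per_sample, channels=2, mclk_multiple=256):
    """Return (bit clock, master clock) in Hz for an I2S link."""
    bclk = sample_rate_hz * bits_per_sample * channels   # one BCLK cycle per bit
    mclk = sample_rate_hz * mclk_multiple                # typical DAC chip requirement
    return bclk, mclk


bclk, mclk = i2s_clocks(384_000, 24)
print(f"BCLK = {bclk / 1e6:.3f} MHz")   # 18.432 MHz
print(f"MCLK = {mclk / 1e6:.3f} MHz")   # 98.304 MHz
```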

Timing and jitter aside, I2S also provides no information on signal parameters: word length, sample rate, etc. These must always be supplied through a side channel. Most DAC chips have I2C or SPI interfaces for this purpose, and there are no protocol standards here. For an interoperable link based on I2S to be viable, a configuration channel must also be specified. This is not hard to do per se, but it still has to be done.

In summary, for an inter-box link, I2S is not sufficient by itself, nor is it a particularly good choice as a foundation for a complete solution.

mansr · January 23, 2016

With the speed and capacity of today's networks, UDP is not really needed as much to save on bandwidth and connections.

Saving bandwidth is usually not the main reason for using UDP. For real-time streaming, TCP can actually be a problem because of its reliability guarantees. If a packet is dropped, TCP will stall the transfer and request retransmission, which ends up causing a much longer drop-out than a lone lost packet would. By the time you get the retransmission, the packet is too old anyway and you have to discard a bunch of following ones too in order to catch up. For non-real-time streaming (e.g. audio/video on demand), TCP together with a rather large local buffer tends to be the simplest to implement. Provided packet loss is not too severe, the buffer will hide the interruptions in the data stream whenever TCP needs to retransmit.

mansr · January 23, 2016

I must be missing something, why does this not work for audio (a buffer of sufficient size that TCP re-transmissions could get put back in order in time for processing by the DAC)? Are you talking about assuming buffer in the > 1 second range? I would think even in a "mediocre" home wifi situation (where TCP re-transmissions are of a non trivial amount) this would work.

For something like a phone call a buffer of that size would add too much latency. It also doesn't work for any kind of live transmission since every lost packet will make you fall a little more behind the source and eventually a buffer of any size will run out.

mansr · January 23, 2016

Above is our hypothesis.

It is a statement of facts. You may of course disagree on the implication of those facts.

Let's perform a thought experiment and test our hypothesis. I sell my Rendu series to a lot of PS Audio PerfectWave DS DAC owners. Set aside the fact that reports on this forum and via email direct to me prove that these customers overwhelmingly prefer this i2s inter-box combination because it does not validate or invalidate our hypothesis. This circumstantial evidence does provide some insight though on the outcome of our experiment because it appears to be a good solution for many. It turns out that the Rendu / PS Audio PerfectWave DS inter-box solution does not use the Rendu's master clock (even though it is a very good one) and it doesn't use a PLL. In fact, the PS Audio PerfectWave DS ignores the Rendu's master clock altogether and it derives its own clock via FPGA.

IIRC, the PerfectWave uses an asynchronous sample rate converter. In fact, the only way you can have an independent clock in a DAC without any flow control is by the use of an ASRC. While I have no doubt it is a very good one, having none at all would be better still.

Man that Ted S. guy is really clever. In short, you did not consider this option and therefore the hypothesis is wrong. Experiment over.

Oh, but I did.

In addition, you did not consider other factors that could be a benefit to this or another combination. For example, you did not consider that when you place a USB receiver (AKA a processor) in a DAC that some companies do not isolate that processor from the DAC section. The Rendu series has a properly isolated output section that can be a benefit to these devices and it may supersede the importance of some other factor.

Who said anything about USB?

PS the PS Audio's HDMI specification does support i2C communication, but I do not support it because it's not needed for playback.

Do not confuse I2S with I2C, they are entirely different interfaces used for entirely different purposes. The I2C link in HDMI is also known as DDC (Display Data Channel) and is used by computers to retrieve the supported resolutions etc of a display (EDID). For pure audio applications you can probably ignore it and hope for the best, although it would be able to tell you things like supported sample rates and number of channels.

mansr · January 24, 2016

Well that is not what Ted S. is saying in an article on audiophilia.com. This is what he said in double quotes no less:
"Smith: ‘I do not believe there are other DACs that use the FPGA for everything from “parsing” the input bits to outputting single bit DSD. There are a few other DACs that use sample rate conversion of all inputs to a single output frequency, but they use asynchronous sample rate conversion to convert from the incoming clock to the local clock or use a phase-lock loop (PLL) to drive the local clock. We use no input PLLs or FLLs (frequency-locked loops) and we use one master clock (and derive all other needed clock rates synchronously from that). As far as I know we are the only ones that do that.’"

It doesn't matter what they call it, if the output clock is not somehow tied to the source clock (and there is no flow control as in USB Audio 2.0), there must be a conversion going on somewhere. Otherwise there will be clock drift and skips which sound absolutely terrible.

I did not confuse i2s and i2c and I'm not talking about HDMI for video use. The PS Audio LVDS i2s spec uses pin 15 and 16 for i2c. I mentioned usb as an example to show why one input might be better than another on the same device.

BTW I use a Signature Series Rendu LVDS i2s out via a 12" HDMI cable into my Buffalo IIISE DAC. The DAC has a LVDS receiver and very short U.FL cables connecting the resulting i2s signals to the main DAC board. The Buffalo IIISE DAC is synchronously clocked.

Oh, you're just using an HDMI cable, not the HDMI signal protocol? That makes sense.

mansr · January 24, 2016

They resample everything in the FPGA to a common rate. The local clock syncs the analog section, the FPGA and the USB receiver.

Source: http://www.psaudio.com/wp-content/uploads/2014/02/DirectStream-DAC-white-paper.pdf

From the rather vague description given there, it appears to be an asynchronous sample rate converter of sorts.

mansr · January 24, 2016

In the general case, the USB or Ethernet (and possibly the LVDS) stream is read into a buffer (FIFO) and then read back out. It is not, in the general case, sample rate converted, although it is clearly reclocked/dejittered and certainly depacketized.

Of course the DirectStream does upsampling and so sample rate conversion, but that isn't necessary to the approach.

Absent a return channel for flow control, the FIFO will eventually overflow or underflow without an ASRC. Standard asynchronous USB audio has flow control, while S/PDIF, Toslink, and AES do not. Ethernet certainly can since it's a two-way interface but it depends on the high-level protocol, and there are many of those. The LVDS/I2S input requires an additional side channel, so it too could in theory do flow control although I2S as such is a synchronous interface.
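
A toy simulation makes the point. This is only a sketch under simplifying assumptions: fixed clock rates, 1 ms steps, a FIFO that starts half full, and "feedback" modelled as the source sending exactly what the DAC consumes. All names and numbers are made up for illustration and do not describe any particular product.

```python
def run_fifo(seconds, source_hz, dac_hz, capacity, feedback):
    """Step a FIFO in 1 ms ticks and report how it ends."""
    fill = capacity / 2                       # start half full
    for tick in range(seconds * 1000):
        consumed = dac_hz / 1000              # samples the DAC reads per tick
        # With a return channel, the source is paced by the DAC clock.
        produced = consumed if feedback else source_hz / 1000
        fill += produced - consumed
        if fill >= capacity:
            return f"overflow after {tick / 1000:.0f} s"
        if fill <= 0:
            return f"underflow after {tick / 1000:.0f} s"
    return "ok"


# 100 ms FIFO at 44.1 kHz, source clock running 100 ppm fast.
print(run_fifo(600, 44_104.41, 44_100.0, 4410, feedback=False))
print(run_fifo(600, 44_104.41, 44_100.0, 4410, feedback=True))
```

Without feedback the FIFO overflows after roughly 500 seconds in this setup; with feedback it stays at its starting level indefinitely.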

mansr · January 24, 2016

Actually the DS does not asynchronously resample. All incoming rates are converted to DSD x10, which is an integer relative to both sample rate bases, hence they can use a single clock, without asynchronously resampling the data.

If the DS master clock is not locked to the source clock, they are by definition asynchronous.

mansr · January 24, 2016

Right, my understanding is that the masterclock is appropriately divided and sent back to the USB receiver, so the data is all locked to the same masterclock source.

Yes, that's how asynchronous USB always works, and it's the reason this is often preferred over synchronous interfaces. S/PDIF and similar interfaces don't have a return channel and so cannot possibly work this way.

Ted Smith specifically replied that the DS does not asynchronously resample the data when I asked him that question directly.

With one-way communication from an audio source to a DAC, there are exactly two options: 1) use a PLL (or similar) to lock the DAC clock to the average rate of the incoming data stream, or 2) use an adaptive asynchronous sample rate converter to add or remove samples depending on the relative rates of the clocks. Purists tend to frown upon both.

mansr · January 24, 2016

All implementation dependent. USB and Ethernet do have flow control and I have not thought out the I2S/LVDS situation. Since the implications of buffer overflow here are not life threatening,

No, but you'll get a nasty pop from your speakers.

I might be happy to accept the need to limit continuous play to xxx hours,

If placing an upper bound on the continuous playback time is acceptable, you can of course fill your FIFO to 50% at startup and hope for the best. The time limit depends on the size of your buffer and the accuracy of the clocks. I assumed that such arbitrary limits would not be involved.
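
As a rough estimate of that time limit, assuming a constant frequency offset between the two clocks and a FIFO that starts exactly half full (the function name and example figures are hypothetical):

```python
def seconds_until_fifo_fault(fifo_samples, sample_rate_hz, clock_error_ppm):
    """Time until a half-full FIFO overflows or underflows from clock drift."""
    headroom = fifo_samples / 2                              # samples of slack either way
    drift_per_second = sample_rate_hz * clock_error_ppm * 1e-6
    return headroom / drift_per_second


# One second of buffering at 44.1 kHz with clocks 100 ppm apart.
t = seconds_until_fifo_fault(44_100, 44_100, 100)
print(f"{t / 60:.0f} minutes")   # about 83 minutes
```

Doubling the buffer or halving the clock error doubles the playing time, but no finite buffer removes the limit.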

and assume that a song change will restart the buffer, but that's just me;)

That assumption is wrong if you do gapless playback.

mansr · January 28, 2016

Seems like Altera might even be ahead of Xilinx regarding this? https://www.altera.com/solutions/technology/transceiver/protocols/pro-sgmii.html , well when my snickerdoodle+gigglebits arrives in a few months I'll have to start playing around with this. Any inexpensive Altera option that is recommended?

Depends on how good you are at getting stuff for free. If you're not, some of the simpler boards can be had for less than $100. There's a decent selection here: Terasic - All FPGA Main Boards

mansr · January 29, 2016

Any advice on getting dev board for free would be welcome!

Establish a reputation as a kernel developer.

mansr · October 21, 2016

Jud, agreed! One well known case of this to me is that of Jack Bybee's products. BTW, as Jud points out, this approach to marketing is not limited to audio. I recently saw an ad for Samsung's new TV which refers to Quantum tech or some such thing. This kind of pseudo technical marketing language is used in many different product fields, but people only seem to complain about it in high end audio...

Quantum dots are a real thing, and they are actually based on quantum mechanical effects.

mansr · October 21, 2016

By no means did I mean to criticize Jack's products, I sometimes use them myself. But Jack is well known for not actually revealing anything about how his products work, or what they even are, and giving "explanations" which appear to be intended to protect his IP. Check out the myriad threads at Audio Asylum or DIY Audio if you are interested in more information.

I was referring to the Samsung TVs. Should've made that clearer.

mansr · October 22, 2016

Actually we have quantum dots for our eyes: https://en.m.wikipedia.org/wiki/Quantum_dot_display ?

Real.

And for our ears: http://www.stereotimes.com/acc091812a.shtml ?

Bullshit.
