
The Optimal Sample Rate for Quality Audio



Interesting post, interesting comments. As is often the case in the world today, everything comes down to "objective data" and to conjecture and disagreement as to what it all "means". This "means" comes about because you always end up with a pesky, variable human being at the end of the data stream.

 

Absolutely. But the measurements exist pretty much exactly to take that pesky human out of the loop :). There will always be disagreements about what sounds good and what doesn't, and people will hear things differently. But what a design/electronics engineer has to do is try to find the common ground - because without common, objective criteria, every piece of equipment would have to be a bespoke design for a specific customer.

 

Somebody around here said you have your system just right when you start tapping your foot.

 

That's a great criterion for music, but not a very good one for equipment - because the right music gets my foot tapping even when played through an old transistor radio, a cheap car stereo or a portable PA speaker.

 

Now, I like and enjoy objective analysis, but don't get so wound up in it that you forget to tap your foot... :0)

 

My problem is trying to type while tapping my foot. I would never have been any good as a drummer. :)

Sure, within those constraints it is impossible. Or, as a computer scientist would say, "hard" :)

 

But if you remove some of the constraints - allow massive oversampling and a practically-endless number of taps (both for filtering and for compensation), you have a situation where throwing CPU power at the problem is actually a solution.

Not exactly. Like I said, the problem is not just the post-ringing, but also the inaccurately reproduced steep transients and the way the filter will change the characteristics of the noise.
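To make the ringing side of this concrete, here is a rough Python sketch (assuming numpy and scipy are installed; the filter is a generic windowed-sinc lowpass, not any particular DAC's) of how a steep linear-phase filter smears an idealized transient, ringing both before and after the edge:

```python
import numpy as np
from scipy.signal import firwin, lfilter

fs = 44100
# A steep "brick wall"-style lowpass near Nyquist: 20 kHz passband edge.
taps = firwin(1023, 20000, fs=fs)                 # linear-phase FIR, 1023 taps
step = np.concatenate([np.zeros(2000), np.ones(2000)])  # idealized transient

out = lfilter(taps, 1.0, step)

# A linear-phase filter delays by (N-1)/2 samples and rings symmetrically,
# i.e. both *before* and *after* the edge once that delay is accounted for.
delay = (len(taps) - 1) // 2
edge = 2000 + delay
print("pre-ringing peak: ", np.max(np.abs(out[edge - 100:edge - 5])))
print("post-ringing peak:", np.max(np.abs(out[edge + 5:edge + 100] - 1.0)))
```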

 

Then there is the entirely separate debate about how harmful the ringing actually is - as long as it is post- and not pre-ringing.

To my ears, it's clearly audible every time I listen for pleasure and for lengthy periods of time (I listen to rock, prog rock, hard rock and heavy metal a lot, and yes, I like it loud). Yet every time I listen just to try to tell the differences and to make a judgement about how harmful the ringing actually is, whether in blind, double-blind or sighted listening tests, the outcome is usually inconclusive. So, my personal listening experience matches precisely what Bob Stuart told Robert Harley in the interview that I linked earlier in the thread. By the way, I do not own, nor have I ever owned, any Meridian Audio products, and I am in no way being biased here.

If you had the memory of a goldfish, maybe it would work.
So, my personal listening experience matches precisely what Bob Stuart told Robert Harley in the interview that I linked earlier in the thread.

 

That interview covers a fair number of different topics, so I am not sure which part you are referring to.

That interview covers a fair number of different topics, so I am not sure which part you are referring to.

It starts at the bottom of page 2:

Robert: There’s not a linear relationship between the objective magnitude of a change and the musical significance of that change.

And it continues onto the next page:

Robert: That brings to mind a conversation we had at CES about why blind listening tests may not be reliable. You said that when exposed to sound, our brain builds a model over time of what’s creating that sound. The rapid switching in blind testing doesn’t allow that natural process to occur, and we get confused.

The replies from Bob Stuart made so much sense to me that, recently, I started reading up on psychoacoustics myself ("Psychoacoustics - Facts and Models" 3rd ed. by Hugo Fastl & Eberhard Zwicker, and "Auditory Neuroscience" by Jan Schnupp, Israel Nelken & Andrew King). Especially the chapters on binaural hearing have been very informative to me, even though I have to admit my scientific and technical knowledge is mostly limited to the world of IT.

If you had the memory of a goldfish, maybe it would work.
allow massive oversampling and a practically-endless number of taps (both for filtering and for compensation)

 

A massive number of taps means a massive amount of ringing. The art is to minimize taps while maximizing filter performance.

 

Ringing is introduced when you decimate the oversampled ADC output to RedBook rates. That's where the constraints are introduced. With "apodizing" upsampling you can modify the ringing behavior, but the constraints are already defined by the lowest sampling rate in the chain...
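As a toy illustration of the tap-count/ringing trade-off (Python, scipy assumed; real production filters are designed far more carefully than this generic windowed-sinc):

```python
import numpy as np
from scipy.signal import firwin

fs = 44100
for ntaps in (127, 1023, 8191):
    h = firwin(ntaps, 20000, fs=fs)   # generic linear-phase lowpass
    # The impulse response *is* the tap vector, so however good the stopband
    # gets, the ringing lasts on the order of ntaps / fs seconds.
    above = np.where(np.abs(h) > np.max(np.abs(h)) * 1e-5)[0]
    ms = (above[-1] - above[0]) / fs * 1e3
    print(f"{ntaps:5d} taps -> response spans ~{ms:6.1f} ms")
```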

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

It starts at the bottom of page 2:

 

And it continues onto the next page:

 

Ah, OK! My bad - I was looking for comments on ringing, whereas what you were talking about was listening tests.

 

The replies from Bob Stuart made so much sense to me that, recently, I started reading up on psychoacoustics myself ("Psychoacoustics - Facts and Models" 3rd ed. by Hugo Fastl & Eberhard Zwicker, and "Auditory Neuroscience" by Jan Schnupp, Israel Nelken & Andrew King). Especially the chapters on binaural hearing have been very informative to me, even though I have to admit my scientific and technical knowledge is mostly limited to the world of IT.

 

Have to agree, the psychoacoustics are fascinating reading - and give a very different perspective on sound reproduction. I am also impressed by how all the work that has gone into lossy encodings has actually helped further our understanding of how both our ears and our brains work.


It occurs to me to look at this question a slightly different way (thanks Barry for planting the germ of the idea).

 

Almost all DACs do oversampling to reach some much-higher-than-RedBook rate (e.g., 352.8 or 384kHz) before filtering. Doesn't this constitute at least a tacit, if not explicit, admission by most audio engineers who've worked on the problem that filtering at these much higher rates produces better results than filtering at RedBook sample rates?

 

This being so, what would be the preference of most on this board for attaining these higher rates - start at RedBook rates and interpolate most values through sample rate conversion, or start at rates that are as high as possible and interpolate few or even no values?

One never knows, do one? - Fats Waller

The fairest thing we can experience is the mysterious. It is the fundamental emotion which stands at the cradle of true art and true science. - Einstein

Computer, Audirvana -> optical Ethernet to Fitlet3 -> Fibbr Alpha Optical USB -> iFi NEO iDSD DAC -> Apollon Audio 1ET400A Mini (Purifi based) -> Vandersteen 3A Signature.

Almost all DACs do oversampling to reach some much-higher-than-RedBook rate (e.g., 352.8 or 384kHz) before filtering. Doesn't this constitute at least a tacit, if not explicit, admission by most audio engineers who've worked on the problem that filtering at these much higher rates produces better results than filtering at RedBook sample rates?

 

Not sure I would call it an "admission". Yes, oversampling is the simplest and most common solution to the steep filtering issue. I don't think anyone would claim the steep filtering isn't an issue.
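A quick back-of-the-envelope view of why oversampling helps (plain Python; the numbers follow directly from the sampling theorem): the steep cut is done digitally, and the analog reconstruction filter only has to reach full attenuation at the first image left after interpolation, which oversampling pushes far away.

```python
# Lowest image frequency the analog filter must remove, for a 22.05 kHz band:
for fs_dac in (44100, 176400, 352800):
    first_image = fs_dac - 22050
    print(f"DAC running at {fs_dac:6d} Hz: analog filter can roll off "
          f"between 20 kHz and {first_image / 1000:.1f} kHz")
```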

 

This being so, what would be the preference of most on this board for attaining these higher rates - start at RedBook rates and interpolate most values through sample rate conversion, or start at rates that are as high as possible and interpolate few or even no values?

 

If we can start out with a high-sample-rate recording, and disk space and network bandwidth aren't an issue, it is probably best to use a consistently high sample rate instead of upsampling - with the small "but" that some amplifiers and speakers might not handle the high frequencies very gracefully, and might actually show a decrease in sound quality. In real life, the additional disk space and bandwidth have to be weighed against the possible difference in sound quality.

 

If, on the other hand, we start out with a recording that has been made at 44 or 48 kHz, 16-bit, there is only a very small benefit in upsampling earlier in the chain: the possibility of using more advanced upsampling algorithms than the processing power in the DAC allows.
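For what it's worth, "upsampling earlier in the chain" can be as simple as the sketch below (Python, assuming scipy; `audio` is just a placeholder for the decoded 44.1 kHz samples):

```python
import numpy as np
from scipy.signal import resample_poly

fs_in, fs_out = 44100, 176400            # a clean 4x integer ratio
audio = np.random.randn(fs_in)           # placeholder for one second of audio

# Polyphase interpolation; a player application could afford a far more
# elaborate (e.g. much longer) filter than a DAC chip has room for.
upsampled = resample_poly(audio, up=fs_out // fs_in, down=1)
print(len(audio), "->", len(upsampled), "samples")
```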

Almost all DACs do oversampling to reach some much-higher-than-RedBook rate (e.g., 352.8 or 384kHz)
So? The ESS SABRE³² Reference ES9018 chip can upsample 24-bit 192 kHz material to no less than 1536 kHz, even (...and it uses a 32-bit internal data path to go with that).
If you had the memory of a goldfish, maybe it would work.
So? The ESS SABRE³² Reference ES9018 chip can upsample 24-bit 192 kHz material to no less than 1536 kHz, even (...and it uses a 32-bit internal data path to go with that).

 

Yes. The point is that since nearly all DACs already take sample rates of at least 352.8/384 into the filtering process, why not feed them native high sample rates, rather than feeding RedBook (or 96 kHz, as Dan Lavry urges) and then having the computer or DAC upsample by 4x or 8x just to reach the rate at which you're going to filter?

One never knows, do one? - Fats Waller

The fairest thing we can experience is the mysterious. It is the fundamental emotion which stands at the cradle of true art and true science. - Einstein

Computer, Audirvana -> optical Ethernet to Fitlet3 -> Fibbr Alpha Optical USB -> iFi NEO iDSD DAC -> Apollon Audio 1ET400A Mini (Purifi based) -> Vandersteen 3A Signature.


This is a very interesting way to look at it, Jud.

 

My speculation as to why this might not be the best way is just that - speculation. Is it more difficult to use 8x sample rates with traditional interfaces like AES/EBU? More jitter? Is sending an AES stream from something like a Mykerinos card more difficult at 8x? I also wonder if async USB is a completely different story when it comes to 8x.

 

Again, I have no idea - just some thoughts.

Founder of Audiophile Style | My Audio Systems

Is it more difficult to use 8x sample rates with traditional interfaces like AES/EBU? More jitter? Is sending an AES stream from something like a Mykerinos card more difficult at 8x? I also wonder if async USB is a completely different story when it comes to 8x.

 

A very good point, Chris. RedBook only needs 1.4 Mbit/s, while 384/24 requires 18.4 Mbit/s - pretty serious transmission speeds. Remember, original Ethernet was 10 Mbit/s. Another issue is disk space - a RedBook CD is 0.6 GB, while the same album in 384/24 is 8 GB. Yes, disk capacities are constantly increasing and prices are decreasing, but still... Again, perhaps justifiable if those extra bits actually contain real information, but if the music is just upsampled, it is just fluff - better to do the upsampling at the DAC instead of wasting bandwidth and disk space.
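The arithmetic behind those numbers, for anyone who wants to check (plain Python; stereo LPCM with no container overhead assumed):

```python
def lpcm_mbit_per_s(fs, bits, channels=2):
    return fs * bits * channels / 1e6

print(lpcm_mbit_per_s(44100, 16))    # ~1.41 Mbit/s (RedBook)
print(lpcm_mbit_per_s(384000, 24))   # ~18.4 Mbit/s (384/24)

seconds = 60 * 60                    # a 60-minute album
print(lpcm_mbit_per_s(44100, 16) * seconds / 8 / 1e3, "GB")    # ~0.6 GB
print(lpcm_mbit_per_s(384000, 24) * seconds / 8 / 1e3, "GB")   # ~8.3 GB
```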


Protocol-wise, sample rates above 192 kHz are not a problem with AES/EBU -- it can theoretically support any sample rate, as long as the clock can be recovered from the data. Bandwidth is usually not an issue, as it is common to run AES/EBU over good old 75 ohm RG6U coax or Cat 5e (350 MHz) cables. The current spec allows for a maximum of 24-bit audio data, though.

 

You do have some issues: the more common 110 ohm characteristic impedance allows a pretty loose +/-20% tolerance. The common XLR connector is not impedance matched. At faster rates, impedance matching between transmitter and receiver is a must. To keep the eye opening big and jitter low, faster rise times are needed, so one must also consider EMC/radiated emissions at higher rates, given the biphase signaling (not very efficient, and quite a noise polluter). Commercial audio equipment still has to pass FCC Class A certification. AES/EBU is almost always transformer-coupled, so at higher rates it will require better and more expensive isolation.

 

In the pro audio world where AES/EBU is prevalent, 96 kHz/24-bit is common these days, although you'll often see 192 kHz in the studio environment. I don't think I've ever seen anything higher, though. Probably not a whole lot of equipment handles LPCM at greater than a 192 kHz sample rate anyway. AES/EBU was designed to handle long-distance transmission in less-than-ideal environments, rather than sheer transmission speed. With an active equalizer (distribution amp), you can easily go 2-3 football fields over Cat 5.
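For those unfamiliar with it, the biphase-mark line coding mentioned above is simple enough to show in a toy Python encoder (this is only the line-coding idea; a real AES3 frame adds preambles, channel status and parity):

```python
def biphase_mark(bits, level=0):
    """Encode data bits into biphase-mark half-cells (2 per bit)."""
    cells = []
    for b in bits:
        level ^= 1            # mandatory transition at every bit boundary
        cells.append(level)
        if b:
            level ^= 1        # a '1' gets a second, mid-cell transition
        cells.append(level)
    return cells

print(biphase_mark([1, 0, 1, 1, 0]))

# Two half-cells per payload bit is why the wire rate is roughly double the
# data rate: at 192 kHz, 64 bits/frame x 192000 frames/s = ~12.3 Mbit/s of
# data, i.e. ~24.6 Mbaud of half-cells on the cable.
```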

Oppo UDP-205/Topping D90 MQA/eBay HDMI->I2S/Gallo Reference 3.5/Hsu Research VTF-3HO/APB Pro Rack House/LEA C352 amp/laser printer 14AWG power cords/good but cheap pro audio XLR cables.

Protocol-wise, sample rates above 192 kHz are not a problem with AES/EBU -- it can theoretically support any sample rate, as long as the clock can be recovered from the data. Bandwidth is usually not an issue, as it is common to run AES/EBU over good old 75 ohm RG6U coax or Cat 5e (350 MHz) cables.

 

Yes, I should have been more precise. Raw analog bandwidth is not the issue; reliably achievable digital bandwidth is. On the other hand, USB 3.0 is supposed to give us 5 Gbit/s.

 

AES/EBU was designed to handle long-distance transmission in less-than-ideal environments, rather than sheer transmission speed. With an active equalizer (distribution amp), you can easily go 2-3 football fields over Cat 5.

 

It is amazing what you can do with properly balanced, impedance-matched and equalized stuff - a lot of that technology came out of the work done for transatlantic cables. And then we agonize over 1.5 m interconnects... :)

This is a very interesting way to look at it, Jud.

 

My speculation as to why this might not be the best way is just that - speculation. Is it more difficult to use 8x sample rates with traditional interfaces like AES/EBU? More jitter? Is sending an AES stream from something like a Mykerinos card more difficult at 8x? I also wonder if async USB is a completely different story when it comes to 8x.

 

Again, I have no idea - just some thoughts.

 

I don't know either, though goldenpiggy and Julf have provided helpful responses.

 

Looking toward the near future, it seems to me we might point toward a "have our cake and eat it too" world, where we would no longer have to choose which we preferred: remaining "bit perfect" or feeding 192 kHz (or 352.8/384, or DSD) to the DAC. Unless there is some very good technical reason why 4x/8x/DSD rates are and will remain audibly inferior to RedBook, it seems to me the higher rates are what we ought to be aiming for as much as possible in recording, and then staying with those rates all the way into the DAC. Seeing the measurable differences between sample rate converters at SRC Comparisons, eliminating SRC in the chain to the extent possible may be a worthwhile goal.

One never knows, do one? - Fats Waller

The fairest thing we can experience is the mysterious. It is the fundamental emotion which stands at the cradle of true art and true science. - Einstein

Computer, Audirvana -> optical Ethernet to Fitlet3 -> Fibbr Alpha Optical USB -> iFi NEO iDSD DAC -> Apollon Audio 1ET400A Mini (Purifi based) -> Vandersteen 3A Signature.

Seeing the measurable differences between sample rate converters at SRC Comparisons

 

That link seems to illustrate a downsampling from 96 to 44.1 kHz. I am not sure that is a valid illustration of the possible harmful effects of a factor-of-2 upsample.

 

eliminating SRC in the chain to the extent possible may be a worthwhile goal.

 

It may be a worthwhile goal, but I am afraid it would require getting the recording industry to record *and distribute* all the music we want in true hi-res. Not sure there are enough of us to make it financially viable for the record industry.


Hi goldenpiggy,

 

...In the pro audio world where AES/EBU is prevalent, 96 kHz/24-bit is common these days, although you'll often see 192 kHz in the studio environment...

 

While I'd love to see more work being done at higher rates, the overwhelming majority of multitrack recordings and mixes I've seen from most studios have been 24/44.1. While the interfaces can often handle higher rates (most up to 24/96 and some up to 192 - though fewer do this well than have that number in their spec sheets), it appears to be computer power that is lacking most often.

 

When I ask clients why they didn't work at rates higher than 44.1, the most common answer is that the computer system chokes with many tracks. (It should be remembered that most studio recordings use *dozens* of tracks.)

 

As more and more mixes are being done "in the box", i.e. not being converted to analog, passed through a mixing console and re-converted back to digital for the stereo mix, I recommend clients deliver their mixes for mastering at the same sample rate as the multitrack originals. (No point in adding a sample rate conversion - particularly typical SRC - to the other things that will occur on the way to the stereo mix.)

 

So it isn't the interface protocol (AES/EBU and Firewire seem to be the ones I see most); it is computer power that seems to limit the sample rate of what I see coming from many studios. However, since this is not 100% across the board - I know folks doing high-res multichannel with some Firewire interfaces, at least one with several dozen tracks at a time - perhaps more investment is needed in faster computer systems.

 

Best regards,

Barry

Soundkeeper Recordings

Barry Diament Audio


Hi Jud,

 

...Seeing the measurable differences between sample rate converters at SRC Comparisons, eliminating SRC in the chain to the extent possible may be a worthwhile goal.

 

Having a few dozen SRC algorithms to compare and having heard others (some of which are in software I've beta tested) over the years, I've found some very interesting correlations between what I see on the infinitewave site and what I hear in the studio.

 

Of course, the site will show that a great many SRC algorithms, including some quite popular ones, are riddled with issues. The brightening and hardening of timbres I hear with those seems to track what is displayed visually. But the most interesting test for me - because of the very fast correlation between what I see and what I hear - is the 1 kHz tone. Interesting because the harmonic artifacts (i.e. distortion) are *very* low in measured level for a great many of the algorithms BUT they *still* correspond with what I hear in the different algorithms I've tried. (I'm not suggesting I hear 160 dB down in the signal. I'm saying that *some* algorithms, which are well below this in their artifacts, just happen to sound - to my ears - quite clean.)
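The 1 kHz test is easy to reproduce in rough form (Python sketch, assuming numpy/scipy; infinitewave measures commercial SRCs, and here scipy's own resampler merely stands in as the device under test):

```python
import numpy as np
from scipy.signal import resample_poly

fs_in, fs_out, f0 = 96000, 44100, 1000.0
t = np.arange(fs_in * 4) / fs_in
tone = np.sin(2 * np.pi * f0 * t)

converted = resample_poly(tone, up=147, down=320)   # 96k -> 44.1k

spectrum = np.abs(np.fft.rfft(converted * np.hanning(len(converted))))
spectrum /= spectrum.max()
freqs = np.fft.rfftfreq(len(converted), 1 / fs_out)

# Everything well away from the 1 kHz tone is an artifact of the conversion.
mask = np.abs(freqs - f0) > 100
print("worst artifact: %.1f dB below the tone" %
      (20 * np.log10(spectrum[mask].max())))
```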

 

As to your last point, about eliminating SRC in the chain being worthwhile, I think this is *very* dependent on the individual algorithm. As I mentioned above, I've heard algorithms that sound quite transparent to my ears and perhaps coincidentally, perhaps not, there are no measurable harmonics showing in the 1 kHz test. (See "iZotope 64-bit SRC, steep, no alias".) To my ears, the results, unlike most of the competition, are *not* brightened and hardened. They sound much like the unconverted original. (As an aside, many other algorithms seem to have an easier time of it with integer conversion, such as 88.2 to 44.1 instead of 96 to 44.1. This has led many to mistakenly think integer conversion is "better". With those algorithms, it is less *bad*. The best algorithms I've heard can do non-integer conversion more cleanly than others do the easier, integer conversion. The best algorithms "don't care" as they seem to be able to handle the math in stride. To Julf's point, in my experience, they also don't seem to "care" whether they are downconverting or upconverting.)
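To put a number on the integer/non-integer point, the "difficulty" is essentially the size of the rational factor the resampler has to implement (plain Python, standard library only):

```python
from fractions import Fraction

for src, dst in [(88200, 44100), (96000, 44100),
                 (176400, 44100), (192000, 44100)]:
    r = Fraction(dst, src)
    print(f"{src} -> {dst}: interpolate by {r.numerator}, "
          f"decimate by {r.denominator}")
```

So 88.2 to 44.1 is a simple 1:2 decimation, while 96 to 44.1 forces a 147:320 polyphase structure - which is consistent with lesser algorithms finding the integer case "less bad", and with a well-implemented one genuinely not caring.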

 

Why use SRC? Well for me, since I record at 192k, I need it to create 96k and CD versions.

I use it in mastering too. Even when a mix comes in at 44.1k, one of the first things I'll do is create copies at a higher sample rate. The reason for this is that when applying EQ or other processing, I find the results sound better at higher rates. Further, if done at higher rates and the results later converted to 44.1 with a high quality algorithm (such as iZotope's), *some* of the benefits of the higher rates are preserved. In other words, I've found it creates a better sounding 44.1 version than if the SRC was eliminated and all mastering done at 44.1.
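A skeletal version of that workflow, for illustration only (Python with scipy assumed; `mix` is a placeholder for the 44.1k master input, and the EQ here is a made-up treble shelf, not anything Barry actually uses):

```python
import numpy as np
from scipy.signal import resample_poly, butter, sosfilt

mix = np.random.randn(44100)                   # placeholder for the 44.1k mix
hires = resample_poly(mix, up=4, down=1)       # 44.1k -> 176.4k first

# Hypothetical processing step: a gentle top-end lift built from a highpass,
# done at 176.4 kHz where frequency warping near 20 kHz is a non-issue.
sos = butter(2, 8000, btype="highpass", fs=176400, output="sos")
processed = hires + 0.1 * sosfilt(sos, hires)  # roughly +0.8 dB up top

final = resample_poly(processed, up=1, down=4) # back to 44.1k, last of all
```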

 

Best regards,

Barry

Soundkeeper Recordings

Barry Diament Audio


Barry,

 

While I'd love to see more work being done at higher rates, the overwhelming majority of multitrack recordings and mixes I've seen from most studios have been 24/44.1. While the interfaces can often handle higher rates (most up to 24/96 and some up to 192 - though fewer do this well than have that number in their spec sheets), it appears to be computer power that is lacking most often.

 

That is sad to hear - I was actually under the (clearly false) impression that most studios had upgraded. There really is no excuse - I can understand computing power being a limitation 5 years ago, but with the latest generations of (multicore) CPUs, the required power has become really affordable.

 

Sure, I still (vaguely) remember how proud we were of being able to do 2 channels in real time at 12 bit / 32 kHz, but that was 25 years ago :)

 

perhaps more investment is needed in faster computer systems.

 

Perhaps :)

 

But that would require there to be a "business case" for it (= demand).

when applying EQ or other processing, I find the results sound better at higher rates.

 

Absolutely - it definitely makes sense to do as much of the processing as possible at higher resolution (in order not to throw away information through lack of precision and "headroom"), and to do the sample rate conversion only at the last possible stage.
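A tiny demonstration of the precision/headroom point (plain Python with numpy; a -12 dB gain ride approximated as divide-by-4):

```python
import numpy as np

x = (np.random.randn(10000) * 3000).astype(np.int16)   # fake 16-bit audio

# Attenuate then restore entirely in 16-bit: the low-order bits are gone
# for good, so the error never comes back.
cut = (x // 4).astype(np.int16)
restored_int = (cut * 4).astype(np.int16)

# The same gain ride carried out in float64 is exact.
restored_f = ((x.astype(np.float64) / 4.0) * 4.0).astype(np.int16)

print("int16 path max error:  ", np.abs(restored_int - x).max())  # up to 3
print("float64 path max error:", np.abs(restored_f - x).max())    # 0
```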

Hi Jud,

 

As to your last point, about eliminating SRC in the chain being worthwhile, I think this is *very* dependent on the individual algorithm. As I mentioned above, I've heard algorithms that sound quite transparent to my ears and perhaps coincidentally, perhaps not, there are no measurable harmonics showing in the 1 kHz test. (See "iZotope 64-bit SRC, steep, no alias".) To my ears, the results, unlike most of the competition, are *not* brightened and hardened. They sound much like the unconverted original.

 

Hey, Barry. There are two reasons I was thinking eliminating SRC to the extent possible would be a good thing. The first was the one you say in the paragraph above should not be too much of a concern, at least with the better SRCs available. (Nice to hear that, and it kind of confirms my listening preference to upsample to 192 kHz with iZotope 64-bit.)

 

Why use SRC? * * * Further, if done at higher rates and the results later converted to 44.1 with a high quality algorithm (such as iZotope's), *some* of the benefits of the higher rates are preserved. In other words, I've found it creates a better sounding 44.1 version than if the SRC was eliminated and all mastering done at 44.1.

 

That's the other reason for staying at higher rates - not losing sample points and then having to interpolate them back. As well as some SRCs can do that second step (interpolating), why take the first (throwing away actual sample data) if you don't have to?

 

There will nearly always be some necessity for converting rates - as you mentioned, for the sake of EQ and other processing. But to the extent it can reasonably be avoided, and original data in the recording preserved, why not?

 

Julf - I'm not such a Pollyanna as to imagine the music distribution companies will suddenly all decide RedBook (let alone mp3) should go the way of the dodo. I'm just suggesting that when we're thinking of what our computers should feed our DACs, and what our DACs should be able to process, to the extent practical we ought to aim for equipment and software that can preserve hi-res material in its original format all the way through the DAC; and furthermore, that we ought to aim for material that needs little or no oversampling once it hits the DAC, since I'm guessing SRC in the DAC may not work as well as something like, e.g., iZotope.

One never knows, do one? - Fats Waller

The fairest thing we can experience is the mysterious. It is the fundamental emotion which stands at the cradle of true art and true science. - Einstein

Computer, Audirvana -> optical Ethernet to Fitlet3 -> Fibbr Alpha Optical USB -> iFi NEO iDSD DAC -> Apollon Audio 1ET400A Mini (Purifi based) -> Vandersteen 3A Signature.

Even when a mix comes in at 44.1k, one of the first things I'll do is create copies at a higher sample rate. The reason for this is that when applying EQ or other processing, I find the results sound better at higher rates.

 

Barry, a question for you -- just something I'm curious about: when you upsample material for mastering, do you use "power-of-two" or "integer" resampling (i.e., 44.1 goes to 88.2 or 176.4), or do you just resample everything to 192?

 

--David

Listening Room: Mac mini (Roon Core) > iMac (HQP) > exaSound PlayPoint (as NAA) > exaSound e32 > W4S STP-SE > Benchmark AHB2 > Wilson Sophia Series 2 (Details)

Office: Mac Pro >  AudioQuest DragonFly Red > JBL LSR305

Mobile: iPhone 6S > AudioQuest DragonFly Black > JH Audio JH5

Jud - I'm just suggesting that when we're thinking of what our computers should feed our DACs, and what our DACs should be able to process, to the extent practical we ought to aim for equipment and software that can preserve hi-res material in its original format all the way through the DAC;

 

That is definitely something I agree with. Preserve the original format as far as possible.

 

and furthermore, that we ought to aim for material that needs little or no oversampling once it hits the DAC, since I'm guessing SRC in the DAC may not work as well as something like, e.g., iZotope.

 

If that implies upsampling everything to the highest rate the DAC is capable of, I don't see much point in just adding empty air into the audio files. And anyway, if you remove the role of the upsampling algorithm in the DAC, what can we then spend our time tweaking and arguing about? :)

If that implies upsampling everything to the highest rate the DAC is capable of, I don't see much point in just adding empty air into the audio files.

 

That wasn't actually what I was trying to say. Whether to upsample or stay at the native rate is a subject of numerous lively discussions here and elsewhere with people whose opinions I respect on both sides. I personally like upsampling in the computer, but that is with my current DAC. Whether it would be the case with other DACs (at least those that do the typical oversampling) I don't know. (N.B., I think the interpolation facilities of the best SRC software are good enough that while the additional samples are not actual recorded samples, they are not quite "empty air" either.)

 

What I was referring to is that it would be quite nice to have material recorded in 176.4/192 or even 352.8/384, or DSD, that would require only 2x (in the case of 176.4/192) or no oversampling at all in either the computer or the DAC. There is at least some non-negligible amount of material available in 176.4/192 and DSD, though one could wish for both more and cheaper.

One never knows, do one? - Fats Waller

The fairest thing we can experience is the mysterious. It is the fundamental emotion which stands at the cradle of true art and true science. - Einstein

Computer, Audirvana -> optical Ethernet to Fitlet3 -> Fibbr Alpha Optical USB -> iFi NEO iDSD DAC -> Apollon Audio 1ET400A Mini (Purifi based) -> Vandersteen 3A Signature.

