
Audibility of digital reconstruction filters



Testing with pure sines is completely useless, because this is not about hearing tones; it is about hearing the waveform shape of a rising transient. Transients naturally alert humans to incoming danger, which is why human hearing has developed to be especially good at detecting transients rather than steady tones. The brain automatically tries to filter out steady tones (background noise) in order to detect any changes/transients.

 

So you need instead to compare band-limited and band-unlimited snaps, pops and crackles, where the spectrum spreads from low frequencies to high frequencies.

 

Miska: This makes logical sense (points in the same direction as some of the comments about pre-ringing due to filters), but it in essence asserts that we are separately sensitive to frequency and rise time through two different mechanisms. How would you conduct a test of the ability to distinguish between two levels of transient attack while holding frequency constant?

 

Already been done in the academic literature.

One never knows, do one? - Fats Waller

The fairest thing we can experience is the mysterious. It is the fundamental emotion which stands at the cradle of true art and true science. - Einstein

Computer, Audirvana -> optical Ethernet to Fitlet3 -> Fibbr Alpha Optical USB -> iFi NEO iDSD DAC -> Apollon Audio 1ET400A Mini (Purifi based) -> Vandersteen 3A Signature.

It's not the transducer, it's the detector software behind it... ;)

 

Yes, but the detector behind the transducer cannot work with a signal it never gets. So the transducer has to respond to those transients in some manner. I don't know the transient limits of the eardrum. I would find it unlikely it responds to steep transients above the steepness it can respond to on a continuous sine wave. Much like a microphone capsule. But does the eardrum respond to much higher frequencies than total hearing does?

And always keep in mind: Cognitive biases, like seeing optical illusions are a sign of a normally functioning brain. We all have them, it’s nothing to be ashamed about, but it is something that affects our objective evaluation of reality. 

Miska: This makes logical sense (points in the same direction as some of the comments about pre-ringing due to filters), but it in essence asserts that we are separately sensitive to frequency and rise time through two different mechanisms.

 

That is the case, because evolution has developed the hearing system to alert us to approaching danger, such as the breaking of a stick and similar sounds. We can immediately sense the direction with good accuracy and also detect what kind of event caused the transient, while our hearing filters out the noise of the wind at the same time (constant wideband noise).

 

That's the evolutionary reasoning behind it:

Human Hearing Outsmarts Physical Limits - Evolution News & Views

Human hearing is highly nonlinear - physicsworld.com

 

And most of the filtering theory revolves around Fourier/Gabor rules and cannot escape those boundaries.

 

Interesting coincidence is that PCM is a linear system, while DSD is a non-linear system... :)

 

With advanced wavelet-based algorithms it is possible to approach what hearing does, but it is still really, really hard to beat. Especially if you want enough accuracy to be able to tell two otherwise very similar transients A and B apart and match those against a huge database of learned transients to be able to tell what it means...

 

(same applies to image/video compression, DCT based transforms are really poor compared to wavelet ones, like JPEG 2000)

 

What tends to sound especially bad is an abrupt change in frequency/phase response from high-order filters.

 

How would you conduct a test of the ability to distinguish between two levels of transient attack while holding frequency constant?

 

I could make some. I have quite a bunch of transients I have recorded myself: coated metal claves, wood claves, wood block, castanets, wooden maracas and glockenspiel, for example. And I could also make artificial test tones.

 

Let's see if I have time to make something. But overall, I'd rather leave it to the universities to find proofs for things everybody knows to happen, but nobody yet has a mathematical formula to prove (there are quite a number of such things in the world). Before someone can mathematically prove something about human hearing, you first need to have a mathematical model of human hearing, and such a thing doesn't exist yet.

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers


To the extent that "transient attack" and frequency response are treated separately, wouldn't the things you raise here similarly apply in terms of amplifier design/capabilities and speaker design/capabilities?

 

Also, if our brains process these two separately/differently, wouldn't it imply that some of us are more sensitive to one than the other and that might explain both our equipment preferences and the differences expressed about the audibility/inaudibility of certain changes?

Synology NAS>i7-6700/32GB/NVIDIA QUADRO P4000 Win10>Qobuz+Tidal>Roon>HQPlayer>DSD512> Fiber Switch>Ultrarendu (NAA)>Holo Audio May KTE DAC> Bryston SP3 pre>Levinson No. 432 amps>Magnepan (MG20.1x2, CCR and MMC2x6)

To the extent that "transient attack" and frequency response are treated separately, wouldn't the things you raise here similarly apply in terms of amplifier design/capabilities and speaker design/capabilities?

 

Also, if our brains process these two separately/differently, wouldn't it imply that some of us are more sensitive to one than the other and that might explain both our equipment preferences and the differences expressed about the audibility/inaudibility of certain changes?

 

The entire chain would be implicated, which of course raises the inquiry as to how capable the rest of the chain is of reproducing these attacks, and the degree of variation that might be found among components.

 

Re sensitivity, all the academic papers I've seen certainly show considerable variation among individuals.

Exactly, that's what I've been saying all the time. It is completely different to have a constant ultrasonic sine tone (without a fundamental in the audible range) or to have a transient with ultrasonic harmonics (noise-like, or with a fundamental in the audible range).

 

So this is not about sines alone, but about the combination of N sines with content IN the audible band AND in the ultrasonic range, strongly related to each other because of the steep rise...

 

OK, I understand now. My point about testing with constant ultrasonic signal was to determine if the effect (brain activity) occurred when any ultrasonic energy was present, or only when the ultrasonic energy was correlated with the audible signal. For example, if it was shown that a constant ultrasonic tone had the same effect as correlated ultrasonic energy, and if it was further proven that listeners preferred the ultrasonic presence, it would only be necessary to inject some ultrasonic energy after the DAC rather than go to the trouble of recording it.

 

I still remain unconvinced of the requirement. If ultrasonic perception were a survival trait, I would think we would have evolved it (or not lost it, if we originally had it). Many animals do have the ability to hear further into the infrasonic and ultrasonic ranges than we do, but in most cases the abilities have more to do with size than survival. For example, elephants use infrasonics to communicate, and small creatures use higher frequencies because they are too small to generate lower frequencies. (Some small birds' lower hearing cutoff is apparently as high as 3 kHz.)

 

I'm not against the use of higher sampling rates, though. I'm quite happy to see 24/96 used, with a gentle cutoff starting around 30 kHz. I don't see the sense in going flat to 48 kHz, because you are back to the problem of having a sharp filter. Simpler filters usually mean lower latency, if it matters.

"People hear what they see." - Doris Day

The forum would be a much better place if everyone were less convinced of how right they were.

Hi Dennis:

I'm not feeling well tonight so I am having a hard time following this conversation properly. Can either of you comment on how tinnitus figures into this? From what I have read and heard (I have mild tinnitus myself), tinnitus is actually the ear/brain self-generating at the very frequencies where hearing loss has occurred--and for many, myself included, that is right in the 12-16kHz range.

 

I have tested myself and can still hear (up to maybe 15kHz), but it is interesting to start to think about the ringing in my head (some times worse than others) as bone conduction. I always thought of bone conduction just at very low frequencies. BTW, I bet a lot of people have some tinnitus and don't even realize it because they live in a city. Out here in the country it is eerily (no pun intended) quiet. My studio/office has a noise floor of about 36dB. At times I have wanted to stick my measurement mic in my ear to measure the frequency/amplitude of the ringing! (Just kidding; of course that would not work, though the small diameter capsule would fit snugly :). )

 

Goodnight,

AJC

 

Alex, I intended to respond to this last night. Hopefully you are feeling better now.

 

Yes, tinnitus is self generated on the nerves from damaged hair cells connected to those nerves as I understand it. The bone conduction is masking to lessen how noticeable the tinnitus is. Much like a fan running will make it less noticeable than being in a quiet rural area with little noise to mask it, and all you hear is ringing in your ears. There is also some research showing applying high frequency maskers via bone conduction will reduce the tinnitus for long periods after the masker is turned off. Results are somewhat variable so far however.

 

The bone conduction study Miska linked to applied an ultrasonic sound to resonate the brain. This resonance causes a slight pulsing of the blood vessels in the head. Since blood vessels run to the tympanic membrane, it gets pulsed as a side effect, and this is translated into the perception of a tone about an octave lower than the brain resonance, which varies a bit but apparently is around the upper 30 kHz range. The false perception in the ear is not well tuned and serves as a low-level masker. It may also get picked up in an ear that is not suffering from tinnitus, if the problem affects only one of the ears. After a time the brain responds by somewhat ignoring this noise, or so some of the hypotheses go.

 

I had read of such in the past. Don't know if it is a method available in current therapy or merely at the research stage.

 

But no, the ringing in your ears isn't likely to come from your head ringing at ultrasonic frequencies. Bone conduction, as I stated, is about 1/1000th as sensitive as air conduction to your ears.


If we assume 20 kHz upper hearing limit and want to be certain about preserved temporal resolution, 1/20e3 is the time constant we should be looking at.

 

You keep asserting this, quite out of the blue, yet you ignore the mechanics of the ear. Isn't it time for a bit of self-education then? You might find interesting things.

 

Do you really think that a massively over-engineered digital audio system like you propose is going to sound significantly different from what we have today?

You keep asserting this, quite out of the blue, yet you ignore the mechanics of the ear. Isn't it time for a bit of self-education then? You might find interesting things.

 

I could say the same to you. If you can hear to 20 kHz, you also have to honor the corresponding time constant 1/20000.

 

Do you really think that a massively over-engineered digital audio system like you propose is going to sound significantly different from what we have today?

 

I have it, although it achieves an even better 1/50e3 timing, and it does sound significantly different.

For example, if it was shown that a constant ultrasonic tone had the same effect as correlated ultrasonic energy, and if it was further proven that listeners preferred the ultrasonic presence, it would only be necessary to inject some ultrasonic energy after the DAC rather than go to the trouble of recording it.

 

It seems that it needs to be transient-correlated, but not necessarily correctly. Many people prefer short leaky oversampling filters.

 

I still remain unconvinced of the requirement. If ultrasonic perception were a survival trait, I would think we would have evolved it (or not lost it, if we originally had it).

 

We still marginally have it, but we are losing it all the time. Chimpanzees already have better ultrasonic hearing.

 

This is one of the more recent papers:

http://psychology.utoledo.edu/images/users/74/High_Freq_Hearing_preprint_version.pdf

 

And this is a really old one, but it has some data about other mammals:

http://psychology.utoledo.edu/images/users/74/Evolution_of_Human_Hearing_1969.pdf

 

In the end we have to admit that we still don't fully understand human hearing. I'd rather develop systems based on what is both technically correct and what people feel is correct when listening, rather than narrow-mindedly overruling what people hear on the basis of limited scientific knowledge of all the details of human hearing.

 

I'm not against the use of higher sampling rates, though. I'm quite happy to see 24/96 used, with a gentle cutoff starting around 30 kHz. I don't see the sense in going flat to 48 kHz, because you are back to the problem of having a sharp filter. Simpler filters usually mean lower latency, if it matters.

 

At 96k there's still not enough space for a gentle cutoff to satisfy the 1/20e3 temporal envelope, but 384k gets you close. Of course with DSD you can easily reach a 1/50e3 temporal envelope, which I consider enough.

 

I am pretty certain that the reason why the gentle cutoff sounds better than brickwall is temporal resolution. In the optimal case, you would have at most a first-order lowpass.
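
As a rough illustration of the first-order lowpass mentioned above, here is a minimal sketch that computes the time constant and the 10-90% step-response rise time of an ideal single-pole filter. The function names and the example cutoff frequencies are assumptions chosen for illustration, not values taken from any particular product; the relation rise time = tau * ln(9), roughly 0.35 / fc, is the standard single-pole result.

```python
import math

def time_constant_s(cutoff_hz: float) -> float:
    """Time constant tau = 1 / (2*pi*fc) of a single-pole low-pass."""
    return 1.0 / (2.0 * math.pi * cutoff_hz)

def rise_time_10_90_s(cutoff_hz: float) -> float:
    """10-90% rise time of the step response 1 - exp(-t/tau), i.e. tau * ln(9)."""
    return time_constant_s(cutoff_hz) * math.log(9.0)

# Example cutoffs are illustrative assumptions only.
for fc in (20e3, 50e3, 100e3):
    tau_us = time_constant_s(fc) * 1e6
    rise_us = rise_time_10_90_s(fc) * 1e6
    print(f"fc = {fc / 1e3:5.0f} kHz  tau = {tau_us:5.2f} us  10-90% rise = {rise_us:5.2f} us")
```

This only describes an ideal single-pole response; it says nothing about what is audible.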

I don't know the transient limits of the eardrum. I would find it unlikely it responds to steep transients above the steepness it can respond to on a continuous sine wave.

 

Again, if we conclude that your eardrum can only hear up to 20 kHz, that is 1/20e3 = 50 µs in time. So we can further conclude that your temporal resolution envelope should be at least 50 µs, meaning that the step response transition should stay within a 50 µs window, and that your filter's impulse response shouldn't be longer than 50 µs.
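
The arithmetic behind the 1/20e3 figure can be sketched as follows. This is only a unit conversion, assuming that "time constant" here means one period of the upper frequency limit; the helper names are made up for illustration. It also prints how many sample periods fit into that window at a few common sample rates.

```python
def window_us(freq_hz: float) -> float:
    """One period of freq_hz in microseconds (the 1/f figure used in the post)."""
    return 1e6 / freq_hz

def samples_in_window(rate_hz: float, freq_hz: float) -> float:
    """How many sample periods at rate_hz fit into one period of freq_hz."""
    return rate_hz / freq_hz

print(f"20 kHz -> {window_us(20e3):.1f} us, 50 kHz -> {window_us(50e3):.1f} us")

# Sample rates below are common PCM rates, listed here as examples.
for rate in (44_100, 96_000, 384_000):
    n = samples_in_window(rate, 20e3)
    print(f"{rate} Hz: {n:.2f} sample periods per {window_us(20e3):.0f} us window")
```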

 

Edit: note my earlier links regarding human time-frequency capabilities in this context.

To the extent that "transient attack" and frequency response are treated separately, wouldn't the things you raise here similarly apply in terms of amplifier design/capabilities and speaker design/capabilities?

 

Yes, that's why I'm trying to pick loudspeakers and headphones that are capable of at least 50 kHz bandwidth and have preferably first order cross-overs. My amplifier goes flat to 100 kHz and has only first order roll-off.

 

I also had a nice amp in the past that went flat to 1 MHz and sounded awesome, but unfortunately I broke it... :(

 

Also, if our brains process these two separately/differently, wouldn't it imply that some of us are more sensitive to one than the other and that might explain both our equipment preferences and the differences expressed about the audibility/inaudibility of certain changes?

 

Yes, definitely. :)


So you are not acquainted with the school of thought that looks at the temporal spread of the highest ERB in the cochlea, and derives from this a requirement for 250 µs, or a transition band of 4 kHz?

 

When translated to midrange frequencies the same reasoning yields filters with demonstrably inaudible preringing. Interesting, not?

So you are not acquainted with the school of thought that looks at the temporal spread of the highest ERB in the cochlea, and derives from this a requirement for 250 µs, or a transition band of 4 kHz?

 

When translated to midrange frequencies the same reasoning yields filters with demonstrably inaudible preringing. Interesting, not?

 

Thanks for the very interesting information. It makes a lot of sense to me that the frequency sensitivity of the ear is higher for continuous sine waves than for transients. To put it differently, to be sensitive to 50µs transients, the ear would have to be sensitive to sine waves - or wavelets - of much higher frequencies (since this transient is a linear combination of such higher-frequency sine waves or wavelets).

So you are not acquainted with the school of thought that looks at the temporal spread of the highest ERB in the cochlea, and derives from this a requirement for 250 µs, or a transition band of 4 kHz?

 

When translated to midrange frequencies the same reasoning yields filters with demonstrably inaudible preringing. Interesting, not?

 

Thanks for the very interesting information. It makes a lot of sense to me that the frequency sensitivity of the ear is higher for continuous sine waves than for transients. To put it differently, to be sensitive to 50µs transients, the ear would have to be sensitive to sine waves - or wavelets - of much higher frequencies (since this transient is a linear combination of such higher-frequency sine waves or wavelets).

 

It may make sense, but it doesn't comport with the (small amount of) academic literature I've read. Fokus, are there some references you can pass along?


If there is a single paper or book that makes a nice synthesis of things like these, then I haven't found it. Bits of knowledge are all over the place, spread over different disciplines that do not always communicate well with each other. Cochlear modelling is one way to start, perceptual coding (gulp) another.

Thanks for the very interesting information. It makes a lot of sense to me that the frequency sensitivity of the ear is higher for continuous sine waves than for transients. To put it differently, to be sensitive to 50µs transients, the ear would have to be sensitive to sine waves - or wavelets - of much higher frequencies (since this transient is a linear combination of such higher-frequency sine waves or wavelets).

 

No, a 20 kHz tone has a rise time of 25 µs. And the magic of hearing is that it can detect the frequency content of a transient in a much shorter period than you can represent with an FFT.

 

This allows you to recognize and locate short transients very accurately.

So you are not acquainted with the school of thought that looks at the temporal spread of the highest ERB in the cochlea, and derives from this a requirement for 250 µs, or a transition band of 4 kHz?

 

When translated to midrange frequencies the same reasoning yields filters with demonstrably inaudible preringing. Interesting, not?

 

Are you trying to say transients shorter than 250 µs are not audible?

 

I created one 60-second file with a 200 µs long transient every second, with 100 µs rise and fall times. I also made the right channel inverted and delayed by 200 µs, so you also get an interesting stereo effect at the same time.

http://www2.signalyst.com/tmp/gg2-16.flac

 

Sounds very loud and clear to me.
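
For readers who want to build a comparable test signal themselves, here is a minimal sketch of the kind of file described above: one 200 µs pulse per second (100 µs rise, 100 µs fall), with the right channel inverted and delayed by 200 µs. This is not the code used to make the linked file. The sample rate, the linear ramp shape, the output level and all function names are assumptions, since the post gives only the durations.

```python
import struct
import wave

RATE = 96_000      # assumed sample rate; the post does not state one
RISE_S = 100e-6    # rise time from the post
FALL_S = 100e-6    # fall time from the post
DELAY_S = 200e-6   # right-channel delay from the post

def pulse(t: float) -> float:
    """Triangular pulse: linear 0 -> 1 over RISE_S, then 1 -> 0 over FALL_S.
    The linear shape is an assumption; only the durations come from the post."""
    if 0.0 <= t < RISE_S:
        return t / RISE_S
    if RISE_S <= t < RISE_S + FALL_S:
        return 1.0 - (t - RISE_S) / FALL_S
    return 0.0

def make_frames(seconds: int, rate: int = RATE):
    """Return (left, right) float pairs with one pulse at the start of each second."""
    frames = []
    for n in range(seconds * rate):
        t = (n % rate) / rate          # time since the start of the current second
        left = pulse(t)
        right = -pulse(t - DELAY_S)    # inverted copy, delayed by DELAY_S
        frames.append((left, right))
    return frames

def write_wav(path: str, frames, rate: int = RATE, peak: float = 0.5) -> None:
    """Write frames as 16-bit stereo WAV, scaled to `peak` of full scale (assumed level)."""
    scale = peak * 32767
    data = b"".join(
        struct.pack("<hh", int(round(l * scale)), int(round(r * scale)))
        for l, r in frames
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(data)
```

Calling `write_wav("clicks.wav", make_frames(60))` would write a 60-second file; the file name is hypothetical.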

 

Edit: here's another, every second impulse now has different spectral content, you should be able to clearly hear tonal difference between the two

http://www2.signalyst.com/tmp/gg3-16.flac

No, a 20 kHz tone has a rise time of 25 µs. And the magic of hearing is that it can detect the frequency content of a transient in a much shorter period than you can represent with an FFT.

 

This allows you to recognize and locate short transients very accurately.

 

Actually the rise time from zero to max would be 12.5 microseconds for a 20 kHz sine wave. If you care to look at max rise time as from most negative to most positive, then it would be 25 microseconds.
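
The two figures are simply a quarter and a half of the 50 µs period. A small sketch of that arithmetic, with a helper name made up for illustration:

```python
def sine_rise_times_us(freq_hz: float):
    """Return (zero-to-peak, trough-to-peak) times of a sine in microseconds.

    Zero to peak is a quarter period; most negative to most positive is half a period.
    """
    period_us = 1e6 / freq_hz
    return period_us / 4.0, period_us / 2.0

quarter, half = sine_rise_times_us(20e3)
print(f"20 kHz: zero-to-peak {quarter} us, trough-to-peak {half} us")
```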

 

Now, exactly what is the fastest transient the ear can detect? And how do you know that the ear detects starting transients that exceed the transient in a steady 20 kHz wave?


In an attempt to see if transient differences are obviously audible, I made the following sound. I took one tenth of a second of a 4410 Hz sine wave and faded it from the beginning to the end of that tenth of a second. It makes a 'tink' sound that is almost metallic. I then took that and lopped off the first quarter of the wave. In other words, rather than ramping up as a sine wave at the rate of 4410 Hz, the initial sample goes from zero to the max for the waveform, which should be like a max transient vs. a sine transient. I did this at both 44 kHz and 176 kHz sample rates. The two waves do sound obviously different. You will have no trouble hearing it.
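
A minimal sketch of how such a pair of tinks could be generated is shown here. It is not the code used for the attached files. It assumes a linear fade-out, exact rates of 44100 Hz and 176400 Hz, and it models "lopping off the first quarter of the wave" by starting the sine at its peak (a phase offset of a quarter cycle); the function name and parameters are hypothetical.

```python
import math

def tink(rate: int, freq: float = 4410.0, dur: float = 0.1,
         start_at_peak: bool = False):
    """Return a faded sine burst as a list of floats in [-1, 1].

    start_at_peak=False: the burst starts at a zero crossing (sine transient).
    start_at_peak=True:  the burst starts at full amplitude (max transient),
                         an assumed stand-in for removing the first quarter cycle.
    """
    n_total = int(round(rate * dur))
    phase = math.pi / 2 if start_at_peak else 0.0
    samples = []
    for n in range(n_total):
        fade = 1.0 - n / n_total   # assumed linear fade from 1 down to 0
        samples.append(fade * math.sin(2 * math.pi * freq * n / rate + phase))
    return samples

for rate in (44_100, 176_400):
    soft = tink(rate)
    hard = tink(rate, start_at_peak=True)
    print(rate, "first samples:", soft[0], hard[0])
```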

 

I am attaching a zip file which contains three .wav files. Two files have a sine-transient tink followed by a max-transient tink, and then the pair repeats, for a total of four tinks per file. One file is at 44 kHz and one is at 176 kHz.

 

The third file, named "tink 176 multiple transients", has seven tinks over about 10 seconds. The first is the regular sine wave, fully intact. The last six each have one of three transients at the beginning: some have a max transient for 176 kHz, some have a transient that would be max for 88 kHz, and some have a transient that would be max for 44 kHz. See if you can differentiate those.

 

I would think if the different transient rates in the 176 kHz file sound different to you, then we need at least 176 kHz to keep transients intact. If they sound the same to you, then I think 44.1 kHz, despite what the digital reconstruction filters do, is good enough on transients that we need no more. Those of you with software or hardware having more than one digital output filter might wish to try it with the different filters and see if some differentiate this third file better than others.

Tink test files.zip

Are you trying to say transients shorter than 250 µs are not audible?

 

Not at all. Of course these are audible.

 

Do you have any idea how the cochlea works? Do you know what happens above, say, 12kHz (for most adults)? What the mechanism behind our temporal location facility of ~5-10 microseconds is?

 

I have the feeling that you looked at a 20 kHz signal and from this derived requirements for a system. Wouldn't it be more efficient, and more interesting, to look at the actual detector you want to optimise the system for?


I would think if the different transient rates in the 176 kHz file sound different to you, then we need at least 176 kHz to keep transients intact.

 

Thank you for the effort.

 

If you were correct then this test would make it trivially easy to demonstrate the need for super-44.1kHz sampling rates. Yet, academia have been trying this for decades, without any convincing result. Should make you think ...

 

If you look at the four basic signals in the 176k file, then you'll see that their spectra differ significantly below 1kHz. This is likely to be audible. If you downsample the file to 44.1kHz, these differences will survive and remain as audible.
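
One way to check a claim like this numerically is to evaluate single DFT bins at a few low frequencies for the two kinds of tink. The sketch below does that for synthetic signals, not for the attached files: it assumes a linear fade, a 176400 Hz rate, and a burst that starts at its peak as a stand-in for the truncated one. All names are hypothetical.

```python
import cmath
import math

def tink(rate: int, freq: float = 4410.0, dur: float = 0.1,
         start_at_peak: bool = False):
    """Faded sine burst; start_at_peak=True begins at full amplitude (assumed model)."""
    n_total = int(round(rate * dur))
    phase = math.pi / 2 if start_at_peak else 0.0
    return [(1.0 - n / n_total) * math.sin(2 * math.pi * freq * n / rate + phase)
            for n in range(n_total)]

def dft_mag(signal, rate: int, freq_hz: float) -> float:
    """Magnitude of a single DFT bin at freq_hz (unnormalised sum over samples)."""
    w = -2j * math.pi * freq_hz / rate
    return abs(sum(x * cmath.exp(w * n) for n, x in enumerate(signal)))

RATE = 176_400  # assumed exact rate for the "176 kHz" file
soft = tink(RATE)
hard = tink(RATE, start_at_peak=True)

for f in (100.0, 500.0, 1000.0):
    print(f"{f:6.0f} Hz  sine start: {dft_mag(soft, RATE, f):8.3f}  "
          f"peak start: {dft_mag(hard, RATE, f):8.3f}")
```

The same comparison could be repeated at 44100 Hz to see whether the low-frequency differences survive the lower rate.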

Actually the rise time from zero to max would be 12.5 microseconds for a 20 kHz sine wave. If you care to look at max rise time as from most negative to most positive, then it would be 25 microseconds.

 

With 12.5 µs you would cut the rise waveform in half, and then it would contain frequencies higher than 20 kHz. If you offset the 25 µs half cycle to start from zero and end at 1.0, you have a neat, clean transient rise.

Do you know what happens above, say, 12kHz (for most adults)? What the mechanism behind our temporal location facility of ~5-10 microseconds is?

 

Yes.

 

I have the feeling that you looked at a 20 kHz signal and from this derived requirements for a system. Wouldn't it be more efficient, and more interesting, to look at the actual detector you want to optimise the system for?

 

The scientific information is inadequate, so I set the limits at least to what I can hear and then use a very generous safety margin (say 10x) for people who have better hearing than I do.

Thank you for the effort.

 

If you were correct then this test would make it trivially easy to demonstrate the need for super-44.1kHz sampling rates. Yet, academia have been trying this for decades, without any convincing result. Should make you think ...

 

If you look at the four basic signals in the 176k file, then you'll see that their spectra differ significantly below 1kHz. This is likely to be audible. If you downsample the file to 44.1kHz, these differences will survive and remain as audible.

 

Yes, all you wrote is so. I haven't said how they sound to me. Only that a steep wavefront sounds different than a sine wavefront. It sounds different in both sample rates. But the third file is the important one. Give it a listen and tell me if the tinks after the first one sound the same or different to you. I am not saying how they sound to me yet. I will after a few people have tried it and reported what they heard.

