Sound decomposition

audiobomber · July 15, 2023

This article explains the composition of sound, and why a Fourier Transform is insufficient to describe sound. The video shows how this technology can be used in recording.

https://phys.org/news/2023-07-acoustics-decompose-accurately-basic-components.html

botrytis · July 15, 2023

Not what they are saying.

audiobomber · July 15, 2023

4 hours ago, botrytis said:

Not what they are saying.

Yes, it is. They said Fourier only models sine waves, not transient behaviour, such as occurs with breathy voice or drums. The transient component has only been identified in this century, long after Fourier.

Jud · July 15, 2023

1 hour ago, audiobomber said:

Yes, it is. They said Fourier only models sine waves, not transient behaviour, such as occurs with breathy voice or drums. The transient component has only been identified in this century, long after Fourier.

The article is symptomatic of what sometimes happens in various areas of science. I've seen it reasonably often in evolutionary biology and the physics of high temperature superconductivity.

Labs are in competition for money, and they sometimes make claims about their work that are a little strong. Then the PR office of the university or business where the lab is located gets hold of this claim and dresses it up some more. Then the press takes up the story and perhaps strengthens it a bit to get more eyeballs for advertisers. And so we are inundated with news of "breakthroughs" that aren't, really.

I've discussed the reality of the situation - that this is a decades-old issue in digital audio that does not supersede or violate Fourier's mathematical proof - here:

JoeWhip · July 20, 2023

As someone who had success as a trial lawyer both working with and cross-examining expert scientific witnesses, I can tell you there is a ton of bias in this stuff. Follow the money.

yamamoto2002 · August 2, 2023

From the phrase "sound decomposition" alone, I imagined a technology that converts a string quartet or four-part choir recording into four part scores and then analyzes the composition automatically: detecting themes and their development, counterpoint, and chord progressions.

Where is the original paper?

audiobomber · August 2, 2023

4 hours ago, yamamoto2002 said:

Where is the original paper?

More information: Leonardo Fierro et al, Enhanced Fuzzy Decomposition of Sound Into Sines, Transients, and Noise, Journal of the Audio Engineering Society (2023). DOI: 10.17743/jaes.2022.0077

yamamoto2002 · August 2, 2023

This is the link to the preprint:

https://arxiv.org/abs/2210.14041

yamamoto2002 · August 3, 2023

The actual decomposed sound files are under "Listening test sounds": http://research.spa.aalto.fi/publications/papers/jaes-stn/main.htm

It seems each original Orig.wav file is decomposed into three WAV files: S.wav, T.wav, and N.wav.

Melody instruments are stored in the S file, but their attack is lost and the sound lacks punch.
Snare drum strikes are stored in the T file; some older algorithms store only the beats.
Cymbals and snare wire sounds are stored in the N file.

They use the STFT, a variant of the Fourier transform, to perform this separation, and the approach is analytic and practical.
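To make the S/T/N split a bit more concrete, here is a minimal sketch of the general idea behind spectrogram-based separation: steady tones show up as horizontal ridges in an STFT magnitude spectrogram, transients as vertical ridges, and everything else is treated as noise. This is a simplified illustration using median filtering and soft masks, not the paper's actual algorithm. The function names, the filter width, and the thresholds `lo` and `hi` are all hypothetical choices made for this example.

```python
from statistics import median


def median_filter(values, width):
    """Median-filter a list with a sliding window, shrinking the window at the edges."""
    half = width // 2
    return [median(values[max(0, i - half):i + half + 1]) for i in range(len(values))]


def ramp(x, lo, hi):
    """Soft threshold: 0 below lo, 1 above hi, linear in between."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def stn_masks(mag, width=5, lo=0.6, hi=0.8):
    """Return soft masks (S, T, N) for a magnitude spectrogram mag[f][t].

    Rows are frequency bins, columns are time frames.
    The three masks sum to 1 in every bin. Keep lo >= 0.5 so that the
    sine and transient masks never overlap.
    """
    n_freq, n_time = len(mag), len(mag[0])
    # Median along time keeps horizontal ridges (steady sinusoids).
    horiz = [median_filter(row, width) for row in mag]
    # Median along frequency keeps vertical ridges (broadband transients).
    vert_cols = [median_filter([mag[f][t] for f in range(n_freq)], width)
                 for t in range(n_time)]

    S = [[0.0] * n_time for _ in range(n_freq)]
    T = [[0.0] * n_time for _ in range(n_freq)]
    N = [[0.0] * n_time for _ in range(n_freq)]
    for f in range(n_freq):
        for t in range(n_time):
            h, v = horiz[f][t], vert_cols[t][f]
            # "Tonalness" in [0, 1]: near 1 for sines, near 0 for transients.
            r = h / (h + v) if h + v > 0 else 0.5
            s = ramp(r, lo, hi)
            tr = ramp(1.0 - r, lo, hi)
            S[f][t], T[f][t], N[f][t] = s, tr, 1.0 - s - tr
    return S, T, N
```

Each mask would then be multiplied bin by bin with the complex STFT and inverted back to audio. Because the masks sum to 1 in every bin, the three outputs add back up to the original signal.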

Jud · August 3, 2023

37 minutes ago, yamamoto2002 said:

The actual decomposed sound files are under "Listening test sounds": http://research.spa.aalto.fi/publications/papers/jaes-stn/main.htm

It seems each original Orig.wav file is decomposed into three WAV files: S.wav, T.wav, and N.wav.

Melody instruments are stored in the S file, but their attack is lost and the sound lacks punch.

Snare drum strikes are stored in the T file; some algorithms store only the beats.

Cymbals and snare wire sounds are stored in the N file.

They use the STFT, a variant of the Fourier transform, to perform this separation, and the approach is analytic and practical.

In theory it's a nice approach: Overcome the problem of Fourier analysis-based filtering (the better the frequency-based performance the worse the time-based performance, and vice versa) by using two filters that emphasize opposite aspects of performance and combine the results (plus add back at least some of what both filters eliminate).

But:

- The study was low-powered (only 19 participants), and so far in my reading I haven't seen how much time elapsed between samples, so I don't know how much of a problem echoic memory might have been.

- MUSHRA or other officially sanctioned testing techniques were not used, though the experimenters were aware of them.

- Testing was done with Redbook files, with decent Sennheiser headphones. So that's what the participants had for comparison, Redbook run through whatever chip was in the DAC. And they wouldn't have been able to hear effects on things like soundstage.

So all in all, it sounds like a nice approach (though I don't know how these filtering results interfere/intermodulate with each other when combined), but this testing isn't adequate to prove whether it actually works.
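The time/frequency tradeoff mentioned in this post can be put into numbers. For an STFT, the spacing between frequency bins is the sample rate divided by the window length, while the time span of one window is the window length divided by the sample rate, so improving one worsens the other. The small sketch below assumes the Redbook rate of 44.1 kHz; the window lengths are arbitrary illustrative values, not ones taken from the paper.

```python
def stft_resolution(sample_rate_hz, window_len):
    """Return (frequency bin spacing in Hz, window duration in ms) for an STFT window."""
    bin_spacing_hz = sample_rate_hz / window_len
    window_ms = 1000.0 * window_len / sample_rate_hz
    return bin_spacing_hz, window_ms


if __name__ == "__main__":
    # Hypothetical window lengths: short favors timing, long favors pitch.
    for n in (256, 1024, 8192):
        df, dt = stft_resolution(44100, n)
        print(f"window {n:5d} samples: {df:8.2f} Hz per bin, {dt:7.2f} ms per frame")
```

The product of the two numbers is constant, which is why a method like this runs more than one analysis: a long window to pin down the sines and a short one to pin down the transients.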

yamamoto2002 · August 3, 2023

1 hour ago, Jud said:

So all in all, it sounds like a nice approach (though I don't know how these filtering results interfere/intermodulate with each other when combined),

I tried it. I added up S+T+N to create mix.wav, then computed orig.wav - mix.wav to create diff.wav. I found that diff.wav is silence; only the very last part has an error (perhaps an implementation bug in the overlap-add, or a fade-out was applied).

Jud · August 3, 2023

23 minutes ago, yamamoto2002 said:

I tried it. I added up S+T+N to create mix.wav, then computed orig.wav - mix.wav to create diff.wav. I found that diff.wav is silence; only the very last part has an error (perhaps an implementation bug in the overlap-add).

Do you use Audirvana and/or HQPlayer? I'd be curious how some of their filters and modulators at DSD256 or DSD512 (with volume equalized) would compare to the Redbook orig.wav.

yamamoto2002 · August 3, 2023

24 minutes ago, Jud said:

Do you use Audirvana and/or HQPlayer? I'd be curious how some of their filters and modulators at DSD256 or DSD512 (with volume equalized) would compare to the Redbook orig.wav.

Unfortunately no. The experiment above was done with Audacity (free software):

Select the S.wav, T.wav and N.wav tracks, then Tracks → Mix → “Mix and render to new track” to create mix.wav
Select mix.wav, then Effect → Invert for polarity inversion (to calculate orig.wav minus mix.wav)
Select orig.wav and mix.wav, then Tracks → Mix → “Mix and render to new track” to create diff.wav
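For anyone who prefers a script to the Audacity steps, the same null test can be sketched in Python with only the standard library. This assumes the files are 16-bit PCM WAVs with the same channel count and sample rate, which I have not verified for the files on the page. `read_pcm16` and `null_test` are names made up for this example.

```python
import array
import sys
import wave


def read_pcm16(path):
    """Read a 16-bit PCM WAV file into a flat list of ints (channels interleaved)."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array.array("h")
        samples.frombytes(w.readframes(w.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return list(samples)


def null_test(orig, parts):
    """Subtract the sum of the parts from the original, sample by sample.

    Returns (diff, peak), where peak is the largest absolute difference.
    Python ints do not overflow, so the sum cannot clip.
    """
    n = min([len(orig)] + [len(p) for p in parts])
    diff = [orig[i] - sum(p[i] for p in parts) for i in range(n)]
    peak = max((abs(d) for d in diff), default=0)
    return diff, peak


# Usage with the file names mentioned in the thread (paths are assumptions):
# orig = read_pcm16("Orig.wav")
# parts = [read_pcm16(name) for name in ("S.wav", "T.wav", "N.wav")]
# diff, peak = null_test(orig, parts)
# print("peak difference:", peak)
```

A peak of 0 means the three components sum back to the original exactly. A small nonzero peak confined to the end of the file would match the error described above.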
