infwonder

Testing and feedback of new “Hi-Res” digital remastering algorithm


Hi, I am new to this forum, and I am looking for testers and feedback on a digital remastering algorithm I built.


Introduction

While the term "hi-res audio" (that golden logo) has largely become marketing buzz used to sell more expensive digital music, I have also met many people in person who clearly appreciate the difference. Personally, I think the mixed results originate from inconsistent remastering processes.

The conventional way of remastering from tapes with new digital exports increases the cost for record companies, who then pass it on to consumers.

For the past few years I have been working on how to programmatically remaster (not just upsample) CD-quality WAV into a "hi-res" format matching the JAS (Japan Audio Society) description. While I started as a skeptic, my curiosity was largely provoked by the article "The world beyond 20kHz" by David E. Blackmer. It may also be due to my background working on tomographic algorithms, which try to reconstruct complete information from limited measurements, similar to the algorithm I built for audio.

Comparison

Today, anyone can use software to upsample a CD WAV to a higher resolution. The problem is that, apart from a new format that supports higher sample rates, nothing is really gained, as the following two spectrograms show: picture (1) is the original CD WAV file (44.1 kHz / 16 bits), and picture (2) is the same file after SoX rate conversion to a higher resolution (192 kHz / 24 bits).


Picture 1



Picture 2

As far as I know, many music tracks sold online today as "HD" have actually just gone through the same process. Here we can clearly see that none of the extra frequency range provided by the higher-resolution format is actually used. In fact, if we change the scale of the CD graph (pic 1), we get the same result as the upsampled one (pic 2).
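
The point about plain upsampling can be checked numerically. The following is a minimal sketch, not SoX's actual resampler: it assumes a simple Hann-windowed sinc interpolator and an integer 4x ratio (44.1 kHz to 176.4 kHz) instead of 192 kHz to keep the code short. The helper names `upsample` and `tone_power` are made up for this illustration.

```python
import math


def upsample(samples, factor, taps_per_side=32):
    """Zero-stuff by `factor`, then low-pass at the original Nyquist frequency."""
    half = taps_per_side * factor
    # Hann-windowed sinc kernel; its cutoff sits at the input Nyquist frequency.
    kernel = []
    for n in range(-half, half + 1):
        t = n / factor
        sinc = 1.0 if n == 0 else math.sin(math.pi * t) / (math.pi * t)
        hann = 0.5 * (1.0 + math.cos(math.pi * n / half))
        kernel.append(sinc * hann)

    out = []
    last = len(samples) - 1
    for m in range(len(samples) * factor):
        # Only input samples within `half` output steps of m contribute.
        k_min = max(0, -((half - m) // factor))
        k_max = min(last, (m + half) // factor)
        acc = 0.0
        for k in range(k_min, k_max + 1):
            acc += samples[k] * kernel[m - k * factor + half]
        out.append(acc)
    return out


def tone_power(samples, freq, rate):
    """Hann-windowed power of `samples` at one frequency (single-bin DFT)."""
    n_total = len(samples)
    re = im = 0.0
    for n, value in enumerate(samples):
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_total)
        angle = 2.0 * math.pi * freq * n / rate
        re += window * value * math.cos(angle)
        im -= window * value * math.sin(angle)
    return (re * re + im * im) / (n_total * n_total)


RATE = 44100
FACTOR = 4  # 176.4 kHz; an assumption made to keep the ratio an integer
# Synthetic "CD" signal: 1 kHz and 15 kHz tones, 10 ms long.
cd = [0.5 * math.sin(2 * math.pi * 1000 * n / RATE)
      + 0.5 * math.sin(2 * math.pi * 15000 * n / RATE) for n in range(441)]
hi = upsample(cd, FACTOR)

# Compare power inside the CD band with power above 22.05 kHz.
for freq in (1000, 15000, 29100, 40000):
    print(freq, "Hz:", tone_power(hi, freq, RATE * FACTOR))
```

The two tones that were in the input survive, while the probes above 22.05 kHz show only filter residue, which is the numerical version of the empty upper half of picture 2.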

So, what would my algorithm do to the same song that I just showed? Please take a look at the following spectrogram:

 


Picture 3

A few things to point out: 
1. The image scale is the same as in the SoX upsample graph shown previously, and the resolution is also 192 kHz / 24 bits.
2. The algorithm reconstructed frequencies up to around 48 kHz, which covers the 40 kHz range suggested in both David's article and the JAS definition of "Hi-Res".
3. High-pass triangular dither is used in the final down-mix.
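
For readers unfamiliar with point 3, here is a minimal sketch of one common way to build high-pass triangular (TPDF) dither: take the difference of two successive uniform random values, which gives a triangular distribution spanning about 2 LSB with its noise pushed toward high frequencies. This is an assumption about what "high pass triangular dither" means here; the post does not show the actual implementation, and the function name `quantize_hp_tpdf` is hypothetical.

```python
import random


def quantize_hp_tpdf(samples, bits=24, seed=0):
    """Quantize floats in [-1.0, 1.0] to signed integers with high-pass TPDF dither."""
    rng = random.Random(seed)
    full_scale = 2 ** (bits - 1) - 1
    previous = rng.random() - 0.5
    out = []
    for sample in samples:
        current = rng.random() - 0.5
        # Difference of successive uniform values: triangular PDF in (-1, 1) LSB,
        # shaped by a first-difference (high-pass) filter.
        dither = current - previous
        previous = current
        value = round(sample * full_scale + dither)
        # Clip to the representable integer range.
        out.append(max(-full_scale - 1, min(full_scale, value)))
    return out


print(quantize_hp_tpdf([0.0, 0.25, -0.25, 1.0], bits=16, seed=1))
```

A fixed `seed` is used only so that the sketch is repeatable; a real converter would not reseed per file.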

Sound

So, how does it sound?

I have been using the same algorithm for the past three years and have converted hundreds of CDs from different genres. I am pleased to say that, when paired with capable equipment, the results are unprecedentedly satisfying: not just because I enjoy them, but because guests and family visiting my place were sometimes shocked at how much better familiar songs now sound!


Caveats

Like any algorithm, of course, this approach has its drawbacks:
1. It is slow and resource hungry: as expected with reconstruction from finite data, it takes a long time. A typical song takes about 4-5 hours to complete.
2. It is, at the end of the day, a purely computational audio process that does not involve "remastering" in the traditional sense.

One of the reasons I waited 3 years before seeking public feedback is that I believe modern cloud infrastructure is finally ready to run this algorithm at a reasonable cost.


Call for testing & feedback

As mentioned earlier, I am looking for people who are interested in testing this algorithm. If you are interested, please reply below or join our discussion on Reddit (link below).

Thank you for your attention.


Hello, welcome to AS!

 

Yes, I would be interested in your algorithm - I have done experiments where I merely upsampled CD material in the conventional way, and found that it allowed the playback hardware to deliver better subjective sound quality - because the input to the circuitry was a "better fit".

 

Your work is an extension to that, and I would be very curious how that processing altered the waveform, and whether I would see it as an improvement, or not.

 

Thanks for joining!


Frank

 

http://artofaudioconjuring.blogspot.com/

 

 

Ahhh, Mankind ... Porsche intellect, Trabant emotions ...


Interested.


Main listening (small home office):

Surge protector +_iFi  AC iPurifiers >Isol-8 Mini sub Axis Power Conditioning+Isolation>CAPS IV Pipeline Server + Sonore 12V PS>Kii Control>Audiolense DRC>Kii Three >GIK Room Treatments.
 

Secondary Listening: CAPS Pipeline>IFi iOne DAC>Schiit Freya>Kii Three . Also an SBT and a RB Pi 3B+ running piCorePlayer as an SBT emulator. 

All absolute statements about audio are false :)


I'd like to know more about what your algorithm does.  Where is the ultrasonic information coming from?  How is it related to the music which in the source doesn't have any information at those frequencies?


To paraphrase Rick James, "sighted listening is a helluva drug".


I just had a track processed by infwonder's algorithm, and noted immediately that it had adjusted the level of the HF content, in the normal audible range, by a significant amount - he agrees that this should not be happening, and is investigating.


Frank

 

http://artofaudioconjuring.blogspot.com/

 

 

Ahhh, Mankind ... Porsche intellect, Trabant emotions ...


Hi all, I will take the feedback I got and think about a solution during my family vacation trip (Thanksgiving week). I hope to bring back a solution soon after I return. :D

 

Stay tuned!


I would like to thank @fas42 for all the help that let me improve the algorithm. The main thing I learned from our discussions is that, since the algorithm involves "reconstruction" of music (as data), which touches all frequency ranges, it may not be for everyone.

 

I was hoping to find testers who can comment on how the reconstruction sounds. There is now a revised version of the algorithm that hopefully addresses some of the concerns about HF roll-off. I will keep working with @fas42 and another tester, @firedog, and hope to get more feedback on how to keep improving the algorithm.

 

Thanks!

 

 


Looks like a leaky / slow roll-off filter, similar to the MQA ones?

 

I'm curious what the spectrogram of a 0 - 22.05 kHz sweep processed through the algorithm looks like.
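
A test signal like the one asked for here can be generated directly rather than converted from an existing file. The following is a minimal sketch using only the Python standard library; the function names, the 5-second default length, and the half-scale amplitude are assumptions chosen for illustration, not anything specified in the thread.

```python
import io
import math
import struct
import wave


def linear_sweep(rate=44100, seconds=5.0, f_start=0.0, f_end=22050.0, amplitude=0.5):
    """Return 16-bit samples of a linear sine sweep from f_start to f_end."""
    total = int(rate * seconds)
    slope = (f_end - f_start) / seconds  # sweep rate in Hz per second
    samples = []
    for n in range(total):
        t = n / rate
        # Phase is the integral of the instantaneous frequency f_start + slope * t.
        phase = 2.0 * math.pi * (f_start * t + 0.5 * slope * t * t)
        samples.append(int(round(amplitude * 32767 * math.sin(phase))))
    return samples


def wav_bytes(samples, rate=44100):
    """Encode mono 16-bit samples as an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buffer.getvalue()


data = wav_bytes(linear_sweep(seconds=1.0))
print(len(data), "bytes")
```

Writing `data` to a file opened in binary mode gives a clean 44.1 kHz / 16-bit mono sweep with no codec noise, which makes anything added by later processing easier to see in a spectrogram.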

 


Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers


:D I do not know much about MQA, but from my understanding (from studying many spectrograms of other hi-res files), there is always a slow roll-off toward the higher frequency ranges.

 

So, the algorithm tries to mimic that behavior by learning from these files until the reconstructed files are "similar" enough to the original hi-res files. That "empirical" information is then reused for other songs... I guess that, ideally, I should not reuse such info across genres or recording eras.
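
One simple way to put a number on such a roll-off is to fit a straight line to spectrum levels against log-frequency, which gives a slope in dB per octave. This is only a sketch of that measurement, not the learning procedure described above (which the thread does not detail); the function name and the sample points are made up for illustration and are not taken from any real file.

```python
import math


def rolloff_slope(points):
    """Least-squares fit of level (dB) against log2(frequency).

    `points` is a list of (frequency_hz, level_db) pairs.
    Returns (slope in dB per octave, intercept in dB).
    """
    xs = [math.log2(freq) for freq, _ in points]
    ys = [level for _, level in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up example levels, one octave apart.
example = [(10000.0, -60.0), (20000.0, -72.0), (40000.0, -84.0)]
slope, _ = rolloff_slope(example)
print(round(slope, 3), "dB per octave")
```

Comparing slopes fitted per genre or per recording era would be one way to test whether the reused "empirical" information really transfers between them.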

 

Also, in order to properly reconstruct "music", and not noise, I have to process the original files to distinguish between the two. As a result, the reconstructed song is actually very quiet and needs a gain increase before it sounds similar to the recorded songs ...


OK, so if you take a 44.1/16 0 - 22.05 kHz frequency sweep and convert it to 192/24, what does it look like? Or a 1/3rd octave multitone?
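
For the multitone variant, a minimal sketch follows. It assumes the usual base-2 convention of 1/3-octave centers spaced by a factor of 2^(1/3) around 1 kHz, with equal amplitude per tone and zero starting phase; those choices and the function names are assumptions for illustration only.

```python
import math


def third_octave_frequencies(f_min=19.0, f_max=20500.0, f_ref=1000.0):
    """1/3-octave center frequencies f_ref * 2**(k/3) that fall inside [f_min, f_max]."""
    freqs = []
    for k in range(-30, 31):
        freq = f_ref * 2.0 ** (k / 3.0)
        if f_min <= freq <= f_max:
            freqs.append(freq)
    return freqs


def multitone(freqs, rate=44100, seconds=1.0, peak=0.9):
    """Sum of equal-amplitude sines, scaled so the total can never exceed `peak`."""
    amplitude = peak / len(freqs)
    total = int(rate * seconds)
    return [
        sum(amplitude * math.sin(2.0 * math.pi * freq * n / rate) for freq in freqs)
        for n in range(total)
    ]


freqs = third_octave_frequencies()
signal = multitone(freqs, seconds=0.05)
print(len(freqs), "tones from", round(freqs[0], 1), "to", round(freqs[-1], 1), "Hz")
```

Because the tones sit at known frequencies, any new spectral lines between them after processing point to distortion products rather than reconstructed content.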

 


Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers


:D I recall someone suggesting the same years ago, but it was never done ... So here we go:

 

The original file (frequency sweep; some noise was introduced during ffmpeg conversion from an mp3 file I found ... )

[Spectrogram: original frequency sweep]

 

and the reconstructed version ... ( I don't like it... to be honest .. :D)

 

[Spectrogram: reconstructed frequency sweep]


In the track I tested the "remastered" version is much louder than the original file, which makes comparison difficult. 
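
A level difference like this can be removed before listening by matching RMS levels. The following is a minimal sketch assuming both versions are available as lists of float samples; RMS matching is a simple stand-in for proper loudness normalization, and the function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gain_to_match_db(reference, processed):
    """Gain in dB to apply to `processed` so its RMS equals that of `reference`."""
    return 20.0 * math.log10(rms(reference) / rms(processed))


def match_level(reference, processed):
    """Scale `processed` so that its RMS level equals that of `reference`."""
    gain = rms(reference) / rms(processed)
    return [s * gain for s in processed]


# Synthetic example: the "processed" tone is twice as loud as the original.
original = [0.25 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
louder = [0.5 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
print(round(gain_to_match_db(original, louder), 2), "dB")
```

If the processing also changes the spectral balance, equal RMS does not guarantee equal perceived loudness, so this only removes the gross level offset.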


Main listening (small home office):

Surge protector +_iFi  AC iPurifiers >Isol-8 Mini sub Axis Power Conditioning+Isolation>CAPS IV Pipeline Server + Sonore 12V PS>Kii Control>Audiolense DRC>Kii Three >GIK Room Treatments.
 

Secondary Listening: CAPS Pipeline>IFi iOne DAC>Schiit Freya>Kii Three . Also an SBT and a RB Pi 3B+ running piCorePlayer as an SBT emulator. 

All absolute statements about audio are false :)

26 minutes ago, firedog said:

In the track I tested the "remastered" version is much louder than the original file, which makes comparison difficult. 

 

Yes, :D I have addressed that in the fix... but I have to schedule it for another time since I stopped the job to do the frequency sweep thing ... :P

 

Thanks!

1 hour ago, mansr said:

Looks like it adds noise and odd-order harmonic distortion.

 

😮 what does it mean? I did not add the odd-order harmonic in the code ...

9 hours ago, infwonder said:

😮 what does it mean? I did not add the odd-order harmonic in the code ...

Even if you didn't set out specifically to add harmonic distortion, it can still occur as a side effect of what you were trying to do.
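
As an illustration of how that can happen, any processing step that behaves like a symmetric nonlinearity (soft clipping, for example) creates odd-order harmonics without any explicit "add harmonics" code. The sketch below uses `tanh` as a stand-in nonlinearity; it is not a claim about what the algorithm in this thread does internally, and the helper name is made up.

```python
import math


def harmonic_amplitude(samples, freq, rate):
    """Amplitude of the sine component at `freq` (single-bin DFT)."""
    re = im = 0.0
    for n, value in enumerate(samples):
        angle = 2.0 * math.pi * freq * n / rate
        re += value * math.cos(angle)
        im -= value * math.sin(angle)
    return 2.0 * math.hypot(re, im) / len(samples)


RATE = 48000
F0 = 1000
# Ten full cycles of a pure 1 kHz tone.
clean = [math.sin(2.0 * math.pi * F0 * n / RATE) for n in range(480)]
# A symmetric (odd) nonlinearity, standing in for any unintended soft clipping.
processed = [math.tanh(2.0 * v) for v in clean]

for order in range(1, 6):
    level = harmonic_amplitude(processed, order * F0, RATE)
    print("harmonic", order, "amplitude", level)
```

The even harmonics stay at numerical zero while the 3rd and 5th are clearly present, which is the signature described in the quoted post.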

11 minutes ago, infwonder said:

What does it (the odd-number harmonics) do to the music? 

 

Distorts it :) You are adding new frequencies in the audible range that were not in the original recording. At the very least, you should understand why this is being done and if it's really necessary. Adding noise and distortion below 20kHz is usually not the desired outcome of remastering.

 

2 minutes ago, pkane2001 said:

Distorts it :) You are adding new frequencies in the audible range that were not in the original recording. At the very least, you should understand why this is being done and if it's really necessary. Adding noise and distortion below 20kHz is usually not the desired outcome of remastering.

Certainly not if the goal is to create a plausible version of the high-frequency content filtered out in a CD conversion.


Note: I found a how-to and generated a cleaner, shorter frequency sweep to allow faster reconstruction...

 

This is the original freq. sweep (CD, WAV mono):

[Spectrogram: original frequency sweep, CD-quality mono WAV]

 

Note: some other bugs still exist... the time is different...

