
Testing and feedback of new “Hi-Res” digital remastering algorithm


infwonder


Hi, I am new to this forum, and I am looking for testers and feedback on a digital remastering algorithm I built.


Introduction

While the term "hi-res audio" (that golden logo) is largely marketing buzz used to sell more expensive digital music, I have also met many people in person who clearly appreciate the difference. Personally, I think the mixed results come from inconsistent remastering processes.

The conventional way of remastering from tapes with new digital exports increases costs for record companies, who then pass them on to consumers.

For the past few years I have been working on how to programmatically remaster (not just upsample) CD-quality WAV into a "hi-res" format matching the JAS (Japan Audio Society) description. Although I started as a skeptic, my curiosity was largely provoked by the article "The world beyond 20kHz" by David E Blackmer. It may also be due to my background working on tomographic algorithms, which try to reconstruct complete information from limited measurements, similar to the algorithm I built for audio.

Comparison

Today, anyone can use software to upsample a CD WAV to a higher resolution, but the problem is that, apart from the change to a format that supports higher sample rates, nothing is really gained, as the following two spectrograms show: picture (1) shows the original CD WAV file (44.1 kHz / 16 bits), and picture (2) shows the result of SoX rate conversion to a higher resolution (192 kHz / 24 bits).


Picture 1



Picture 2

As far as I know, many music tracks sold online today as "HD" have just gone through the same process. Here we can clearly see that none of the extra frequency range provided by the higher-resolution format is actually used. In fact, if we change the scale of the CD graph (pic 1), we get the same result as the upsampled one (pic 2).
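To illustrate why plain upsampling adds nothing above the original Nyquist frequency, here is a minimal sketch using only the Python standard library. It is not SoX's actual filter: it uses ideal band-limited interpolation (zero-padding the spectrum) as a stand-in, a short two-tone test signal instead of real music, and a factor of 4 (which would correspond to 44.1 kHz to 176.4 kHz rather than 192 kHz). All function names are made up for this example.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for short demos)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse DFT."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def upsample_bandlimited(x, factor):
    """Upsample by zero-padding the spectrum.

    Assumes len(x) is even and the input has no energy exactly at Nyquist
    (that single bin is dropped to keep the sketch short).
    """
    n, m = len(x), len(x) * factor
    spec = dft(x)
    padded = [0j] * m
    for k in range(n // 2):           # DC and positive frequencies
        padded[k] = spec[k] * factor
    for k in range(1, n // 2):        # negative frequencies
        padded[m - k] = spec[n - k] * factor
    return [v.real for v in idft(padded)]

def energy_fraction_above(y, first_bin):
    """Fraction of spectral energy in bins at or above first_bin."""
    spec = dft(y)
    m = len(spec)
    total = sum(abs(v) ** 2 for v in spec)
    high = sum(abs(spec[k]) ** 2 for k in range(first_bin, m - first_bin + 1))
    return high / total

n = 64
cd_like = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 10 * t / n)
           for t in range(n)]
hires_like = upsample_bandlimited(cd_like, 4)

# Bin n // 2 of the upsampled spectrum sits at the original Nyquist frequency.
fraction = energy_fraction_above(hires_like, n // 2)
print(fraction)  # only floating-point rounding noise remains up there
```

The upsampled signal has four times as many samples, but every original sample is reproduced and the region above the old Nyquist frequency stays empty, which is what the two spectrograms show.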

So, what would my algorithm do to the same song that I just showed? Please take a look at the following spectrogram:

 


Picture 3

A few things to point out: 
1. The image scale is the same as in the SoX upsample graph shown previously, and the resolution is also 192 kHz / 24 bits.
2. The algorithm reconstructed frequencies up to around 48 kHz, which covers the 40 kHz range suggested in both David's article and the JAS definition of "Hi-Res".
3. High-pass triangular dither is used in the final down-mix.
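For readers unfamiliar with point 3, here is a minimal sketch of high-pass triangular (TPDF) dither applied while quantizing floating-point samples to 24-bit integers. It is a generic textbook construction (the difference of two successive uniform random values), not necessarily what my pipeline or any particular tool does internally; the function name and parameters are made up for this example.

```python
import random

def quantize_hp_tpdf(samples, bits=24, seed=0):
    """Quantize floats in [-1.0, 1.0] to signed integers with high-pass TPDF dither.

    The dither is the difference of successive uniform values in [-0.5, 0.5) LSB.
    That gives a triangular distribution over (-1, 1) LSB, and the first
    difference pushes the dither energy toward high frequencies.
    """
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    scale = 2 ** (bits - 1) - 1
    prev = rng.random() - 0.5
    out = []
    for s in samples:
        cur = rng.random() - 0.5
        dither = cur - prev
        prev = cur
        q = round(s * scale + dither)
        q = max(-scale - 1, min(scale, q))   # clamp to the integer range
        out.append(q)
    return out

silence = quantize_hp_tpdf([0.0] * 1000)
print(sorted(set(silence)))  # digital silence toggles among a few LSB values
```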

Sound

So, how does it sound?

I have been using the same algorithm for the past three years and have converted hundreds of CDs from different genres, and I am pleased to say that when paired with capable equipment, the results are unprecedentedly satisfying: not just because I enjoy them, but because guests and family visiting my place were sometimes shocked at how much better familiar songs now sound!


Caveats

Like any algorithm, of course, this approach has its drawbacks:
1. It is slow and resource-hungry: as expected with reconstruction from finite data, a typical song takes about 4-5 hours to complete.
2. It is, at the end of the day, a purely computational audio process that does not involve "remastering" in the traditional sense.

One of the reasons I waited 3 years before seeking public feedback is that I believe modern cloud infrastructure is finally ready to run this algorithm at a reasonable cost.


Call for testing & feedback

As mentioned earlier, I am looking for people who are interested in testing this algorithm. If you are interested, please reply below or join our discussion on Reddit (link below).

Thank you for your attention.

Link to comment

Hello, welcome to AS!

 

Yes, I would be interested in your algorithm - I have done experiments where I merely upsampled CD material in the conventional way, and found that it allowed the playback hardware to deliver better subjective sound quality - because the input to the circuitry was a "better fit".

 

Your work is an extension to that, and I would be very curious how that processing altered the waveform, and whether I would see it as an improvement, or not.

 

Thanks for joining!

Link to comment

Interested.

Main listening (small home office):

Main setup: Surge protector +>Isol-8 Mini sub Axis Power Strip/Isolation>QuietPC Low Noise Server>Roon (Audiolense DRC)>Stack Audio Link II>Kii Control>Kii Three (on their own electric circuit) >GIK Room Treatments.

Secondary Path: Server with Audiolense RC>RPi4 or analog>Cayin iDAC6 MKII (tube mode) (XLR)>Kii Three .

Bedroom: SBTouch to Cambridge Soundworks Desktop Setup.
Living Room/Kitchen: Ropieee (RPi3b+ with touchscreen) + Schiit Modi3E to a pair of Morel Hogtalare. 

All absolute statements about audio are false :)

Link to comment

I'd like to know more about what your algorithm does.  Where is the ultrasonic information coming from?  How is it related to the music which in the source doesn't have any information at those frequencies?

And always keep in mind: Cognitive biases, like seeing optical illusions are a sign of a normally functioning brain. We all have them, it’s nothing to be ashamed about, but it is something that affects our objective evaluation of reality. 

Link to comment

I would like to thank @fas42 for all the help in improving the algorithm. The main thing I learned from our discussions is that, since the algorithm involves "reconstruction" of music (as data), which touches all frequency ranges, it may not be for everyone.

 

I was hoping to find testers who can comment on how the reconstruction sounds. There is now a revised version of the algorithm that hopefully addresses some concerns about HF roll-off. I will keep working with @fas42 and another tester, @firedog, and hope to get more feedback on how to keep improving the algorithm.

 

Thanks!

 

 

Link to comment

:D I do not know much about MQA, but from my understanding (from studying many spectrograms of other hi-res files), there is always a slow roll-off toward higher frequencies.

 

So the algorithm tries to mimic that behavior by learning from these files until the reconstructed files are "similar" enough to the original hi-res files. That "empirical" information is then reused for other songs... I guess ideally I should not reuse such info across genres or recording eras.
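To make "slow roll-off" a bit more concrete, here is one simple way such a roll-off could be measured from a spectrum: a least-squares line fit of level in dB against frequency in octaves. This is only an illustrative sketch, not the learning step my algorithm uses; the function name and the sample numbers are invented for the example.

```python
import math

def rolloff_db_per_octave(freqs_hz, magnitudes):
    """Least-squares slope of level (dB) versus log2(frequency)."""
    xs = [math.log2(f) for f in freqs_hz]
    ys = [20 * math.log10(m) for m in magnitudes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Made-up spectrum whose magnitude halves with every doubling of frequency.
slope = rolloff_db_per_octave([20000, 40000, 80000], [1.0, 0.5, 0.25])
print(round(slope, 2))  # about -6.02 dB per octave
```

A slope like this, measured on several genuine hi-res files, is the kind of "empirical" number that could be compared against the reconstructed output.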

 

Also, in order to properly reconstruct "music" and not noise, I have to process the original files to distinguish between the two. As a result, the reconstructed song is actually very quiet and needs a gain increase before it sounds similar to recorded songs ...
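Since the gain has to be restored afterwards, a fair A/B comparison needs both versions at the same level. Here is a minimal sketch of level matching by RMS. It is a crude stand-in for a proper loudness measurement such as ITU-R BS.1770, and the function names are made up for this example.

```python
import math

def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def match_level(reference, candidate):
    """Scale candidate so its RMS equals the reference RMS.

    Returns the scaled samples and the applied gain in dB.
    """
    gain = rms(reference) / rms(candidate)
    return [s * gain for s in candidate], 20 * math.log10(gain)

original = [0.25 * math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
processed = [0.5 * math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
matched, gain_db = match_level(original, processed)
print(round(gain_db, 2))  # the louder version is turned down by about 6 dB
```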

Link to comment

:D I recall someone suggested the same years ago but it was never done ... So here we go:

 

The original file (a frequency sweep; some noise was introduced during ffmpeg conversion from an mp3 file I found ... )

spacer.png

 

and the reconstructed version ... ( I don't like it... to be honest .. :D)

 

spacer.png
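The conversion noise in the sweep above could be avoided by generating the test signal directly instead of converting an mp3. Here is a minimal sketch using only the Python standard library that builds a linear sine sweep and writes it as 16-bit mono WAV. The function names, the 20 Hz to 20 kHz range, and the short duration are assumptions for illustration, not the file I actually used.

```python
import io
import math
import struct
import wave

def linear_sweep(f0, f1, duration, rate=44100, amplitude=0.5):
    """Sine sweep whose frequency rises linearly from f0 to f1 (Hz)."""
    n = int(duration * rate)
    k = (f1 - f0) / duration                      # sweep rate in Hz per second
    samples = []
    for i in range(n):
        t = i / rate
        phase = 2 * math.pi * (f0 * t + 0.5 * k * t * t)
        samples.append(amplitude * math.sin(phase))
    return samples

def write_wav16(samples, rate, fileobj):
    """Write mono 16-bit PCM to a path or file-like object."""
    with wave.open(fileobj, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"".join(struct.pack("<h", int(round(s * 32767)))
                               for s in samples))

sweep = linear_sweep(20.0, 20000.0, duration=0.1)
buffer = io.BytesIO()            # pass a filename instead to write to disk
write_wav16(sweep, 44100, buffer)
```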

Link to comment

In the track I tested the "remastered" version is much louder than the original file, which makes comparison difficult. 


Link to comment
26 minutes ago, firedog said:

In the track I tested the "remastered" version is much louder than the original file, which makes comparison difficult. 

 

Yes, :D I have addressed that in the fix... but I have to schedule it for another time since I stopped the job to do the frequency sweep thing ... :P

 

Thanks!

Link to comment
11 minutes ago, infwonder said:

What does it (the odd-number harmonics) do to the music? 

 

Distorts it :) You are adding new frequencies in the audible range that were not in the original recording. At the very least, you should understand why this is being done and if it's really necessary. Adding noise and distortion below 20kHz is usually not the desired outcome of remastering.
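To show where odd harmonics come from, here is a small sketch: a pure tone passed through a symmetric nonlinearity (a soft cubic curve, chosen arbitrarily for illustration and not taken from the algorithm under discussion) picks up a third harmonic but no second. The helper name is made up for this example.

```python
import cmath
import math

def bin_amplitude(x, k):
    """Amplitude of the sinusoid at DFT bin k (for 0 < k < len(x) / 2)."""
    n = len(x)
    acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
    return abs(acc) / (n / 2)

n, k0 = 256, 5
tone = [math.sin(2 * math.pi * k0 * t / n) for t in range(n)]
shaped = [s - s ** 3 / 3 for s in tone]   # odd-symmetric: f(-x) == -f(x)

fundamental = bin_amplitude(shaped, k0)      # 0.75
second = bin_amplitude(shaped, 2 * k0)       # essentially zero
third = bin_amplitude(shaped, 3 * k0)        # 1/12, the new odd harmonic
print(fundamental, second, third)
```

For a 5 kHz tone the new component lands at 15 kHz, inside the audible band, which is the concern here.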

 

Link to comment
2 minutes ago, pkane2001 said:

Distorts it :) You are adding new frequencies in the audible range that were not in the original recording. At the very least, you should understand why this is being done and if it's really necessary. Adding noise and distortion below 20kHz is usually not the desired outcome of remastering.

Certainly not if the goal is to create a plausible version of the high-frequency content filtered out in a CD conversion.

Link to comment
