Julf's Blog

PeterSt · March 20, 2012

First off, thanks for all the effort;

I hope to make it somewhat more interesting by means of a next post. But I can't finish it, because something is not clear ...

"Next I did the same 24-to-16-to-24-bit downgrade as with the 96 kHz file, resulting in the 48/16 track B."

You may have made a few mistakes here, or otherwise pasted this sentence in the middle of some context which doesn't allow me to understand this.

Can you briefly explain -now in absolute sense- what happened to track B ?

Thanks,

Peter

Julf · March 20, 2012

"I hope to make it somewhat more interesting by means of a next post."

Looking forward to that

"Can you briefly explain -now in absolute sense- what happened to track B ?"

Sure. What I wrote was "Next I did the same 24-to-16-to-24-bit downgrade as with the 96 kHz file, resulting in the 48/16 track B."

So what I did was take the 48/24 downsampled copy (with the masking noise added), track D, and then did a similar "convert to 16 bits and convert back to 24 again" operation (same as what I did with the 96 kHz one) resulting in track B.

PeterSt · March 20, 2012

"(same as what I did with the 96 kHz one)"

Call me thick this morning, but ... in which track did *that* end up (knowing that will make it clear to me for sure).

Thanks.

Julf · March 20, 2012

"in which track did *that* end up"

"My next step was converting the [96/24] file to 16 bit and then upconverting it back to 24 bit (effectively leaving the bottom 8 bits as zeroes). This resulted in track E."

So E.

Just to be sure, here's the whole list in condensed form:

A 44.1 kHz / 16 bit

B 48 kHz / 16 bit

C 96 kHz / 24 bit

D 48 kHz / 24 bit

E 96 kHz / 16 bit

F mp3 (VBR, lame --preset insane)

G 96 kHz / 24 bit, converted and copied back and forth

H 48 kHz / 24 bit + 1 dB extra gain

I 48 kHz / 16 bit, "raw" resample, no masking noise added
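For readers who want the "convert to 16 bits and convert back to 24 again" step spelled out, here is a minimal sketch. It assumes plain truncation of signed integer samples with no dither; the thread does not say which tool or rounding mode was actually used, so the function name `to_16_and_back` and the sample values are purely illustrative.

```python
def to_16_and_back(sample_24: int) -> int:
    """Reduce a signed 24-bit sample to 16 bits, then pad it back to 24 bits.

    Assumption: plain truncation, no dither or rounding.
    """
    sample_16 = sample_24 >> 8   # drop the bottom 8 bits (arithmetic shift keeps the sign)
    return sample_16 << 8        # widen again; the bottom 8 bits are now zero


# Illustrative sample values, not taken from the test tracks.
original = [0x123456, -0x123456, 0x7FFFFF, -0x800000, 255]
round_trip = [to_16_and_back(s) for s in original]

for before, after in zip(original, round_trip):
    print(f"{before:>9d} -> {after:>9d}  (low byte: {after & 0xFF})")
```

Every value that comes out has a zero low byte, which is what "effectively leaving the bottom 8 bits as zeroes" means: the file is 24 bit in format but carries only 16 bits of information.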

manisandher · March 20, 2012

Hey Julf, thanks for taking the effort to set this up. And the results are interesting, no?

For my part, I found it incredibly difficult to hear differences between all 8 tracks (playing one straight after another over a period of an hour or so). But it seems that "Whiskey" really did identify 'C' as being the original track, no?

Also, just for my understanding, track B was derived from track D with an extra 24-16-24 conversion, right?

Yes, we all know the test was ultimately 'flawed'. But I'm glad I took part anyway and look forward to another more rigorous test. Next time I think I'll allocate one day for each track - I suspect I'd get 'better' results this way.

Cheers, Mani.

Julf · March 20, 2012

"it seems that "Whiskey" really did identify 'C' as being the original track, no?"

Yes, kind of. It was a "could well be" on one of his two attempts, but I definitely give "Whiskey" credit for that one.

The important thing is that with so few responses, I don't think we can draw any conclusions one way or the other - maybe apart from the fact that if you want to improve your sound quality, just turn up the volume.

"Also, just for my understanding, track B was derived from track D with an extra 24-16-24 conversion, right?"

That is correct.

PeterSt · March 20, 2012

No, not the Uriah Heep album, but just me.

All right. Before typing this post, it looked interesting to me to see what I have done, and mainly : why. Not sure what I can make of it, but let's see.

Btw, this can be approached from more angles, like comparing with the others (and why the results may differ). Maybe that can be done in a later stage.

The following is to be noticed from my listening to this :

a. I can't A-B. Not only because I can't but because I don't think it can work (hear something here, and you can't avoid it there).

b. I listened to everything one time only. Just let it all play (from A to H).

c. Being bored, somewhere into track D or something I started cooking. In this case this implied the first stage of beef needing a couple of hours to boil, that first stage being on a high fire with a lot of noise. Not that I wasn't serious, but I really can't spend the time for so long otherwise.

d. My second remarks behind the "/" are from another run, which was 2 weeks or so after the fact. However, I listened to the first 10 seconds of each track only, implying a hopefully better concentration.

e. It is to be noted that generally I like 16/44.1 better than any Hires. Reasons are numerous, but with my general idea that there's also a real technical merit in it, when the Hires was done right.

Track A - 44.1 kHz / 16 bit, average: 4.9

Emphasized small S-es. Notice that this could happen in anything, but I noticed it as unnatural / Strange

---

This is relative to nothing of course, because it was the first track I listened to, and never listened again. So, no reference at all.

Referring to my remark above that I tend to like 16/44.1 better, it is to be noted that this will be (only !) about my own upsampling/filtering mechanisms, and through the Phasure NOS1 DAC which does nothing to the sound (were it about filtering - which is not in there). Thus, *this* 16/44.1 is not about this "better liking" because it is played natively as 24/96 because it was (made) just that.

Track B - 48 kHz / 16 bit, average: 5.3

Seems to sound more comfortable (relative to A) / Normal

---

Can be because 96 to 48 was "better" (hence more easy to convert to) than 96 to 44.1 ?

Track C - 96 kHz / 24 bit, average: 6.5

More spacious. More natural (after the fact ... this could well be "the one") / Flanging

---

Mind you, this "after the fact" is after listening to all of the 9 tracks, with some "absolute memory" that this one possibly sounded better. So, this round (before the "/") I went for E, and while denoting this in the remark for C (which *is* the right one) I should have compared E to at least this one. But I really felt no need to spend more time on it.

Track D - 48 kHz / 24 bit (white noise at -120dB), average: 3.75

Sounds strange / Flanging

---

The important thing here is my "Flanging" remark. It cannot be a coincidence that in both of the most normal 24 bit tracks (C being the real one, this one being 48 kHz) the flanging occurred to me. It should indicate that the bit depth is doing a few things to a really existing flanging, which btw is a slower variation in level.

What is important to others is that this flanging can easily be brought out by BETTER playback means. So either it gets lost in the further anomalies (like noise), or it is just there.

I did NOT at all rate the "flanging" I noticed here in both 24 bit occasions as a good thing. I just noticed it.

Track E - 96 kHz / 16 bit

No S-es ? I listened to this one after the happening once again.

Then I noted : OK, more normal S-es here / Too metallish

---

Apparently the 24 bit emphasizes something I don't like. This can be the recording. Obviously it looks like 24 bits should emphasize the betterness of the S-es, but instead it emphasized the wrongness ?

The "Too metallish" from that second 10 seconds round will not immediately say that 16 bit is wrong, but merely that the decimation wasn't right.

In addition I must say that I found the sound on all tracks "strange" in the voices, and like one of the others said "too much instrumentationed" or something like that.

I think in the main thread I may have talked about microphones or something.

Track F - mp3 (VBR, lame --preset insane), average: 6.75

S-es / Strange

---

No further comments.

Track G - 96 kHz / 24 bit, converted flac-wav-flac 300 times, copied between computers 100 times

Wrong S-es / Bad

---

Dangerous; I am NOT in that league of "being able" to say that anything like copying a 100 times etc. can ever change the sound. I am accused though, of perceiving differences from something like this.

It is remarkable that I denote this one as "bad" for the 10 seconds round, where I denoted nothing "as bad" as this.

How ?

Track H - 48 kHz / 24 bit + 1 dB extra gain

Sounds strange. S-es buzz. (I noticed the "buzz" earlier on, but didn't write that down, so I don't know anymore where it was) / Wrong

---

One of the others noticed the same, but described the "buzz" differently. So, a poor means of digital gain must be at work here.

Also notice the explicit "Wrong" I judged for the 10 seconds round. I didn't do that anywhere either.

And let's keep in mind : any digital attenuation is still underestimated; attenuation or gain doesn't differ much, but it has to be good. This one clearly is not.

And no, I did not notice the extra gain. The fan above the stove was louder anyway ...
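To put the "+1 dB" in numbers: a gain in decibels maps to a linear amplitude factor of 10^(dB/20), so 1 dB is roughly a 12% increase in sample values. The sketch below is only an illustration under stated assumptions: signed 24-bit integer samples, simple rounding without dither, and hard clipping at full scale. How the gain for track H was actually applied is not stated in the thread, and `db_to_linear` and `apply_gain` are hypothetical helper names.

```python
FULL_SCALE_24 = 2**23 - 1  # largest positive signed 24-bit sample value


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db):
    """Scale integer samples by gain_db, round to integers, clip to the 24-bit range.

    Assumption: plain rounding with no dither; a careful implementation
    would typically dither after scaling.
    """
    factor = db_to_linear(gain_db)
    out = []
    for s in samples:
        scaled = round(s * factor)
        out.append(max(-FULL_SCALE_24 - 1, min(FULL_SCALE_24, scaled)))
    return out


print(db_to_linear(1.0))                           # linear factor for +1 dB
print(apply_gain([1000, -1000, 8_000_000], 1.0))   # last value exceeds full scale and is clipped
```

Two things can go wrong in a step like this: samples near full scale clip once the gain is added, and the rounding back to integers adds a small error to every sample.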

Track I - 48 kHz / 16 bit, "raw" resample, no masking noise added, average: 5.25

Too high pitched S-es; furthermore quite normal / Rather normal, but edgy

This is remarkable just the same. How in the world can I judge with two weeks in between - never looking back at my earlier comments - the second round listening for 10 seconds only ... judge exactly the same ??

To keep in mind : No S-es are there in those first 10 seconds, but the "edgy" of it seem to resemble the "high pitched".

Okay, I merely started writing down the above to see if I could see patterns related to my general thinking / observations from music formats. I feel it is in there, but I think my means of writing it down doesn't allow for conclusions easily. So I'll leave it at that for now.

But now let's see whether we can make some more out of my own judgements ...

So, in the first round I chose for track E, so that was my "submission". Now, the only thing wrong with that, were some chopped off least significant bits. So, they are not the most important ? (no judgement, just posing something).

Along with this "E" submission, after the fact, and thinking back how things sounded (so, over 50 minutes later that was), I actually said that track C should be the one.

Well, it was the one. Still failed, because I officially said "E". But ...

Julf forgot to mention something (he *really* forgot, I'm sure) because it was at the end of my list of judgements: although I had chosen track E, it could not be the one because upsampling it to 768 made it sound worse. And this should not happen (as I said in that submission).

Thus, I submitted E as the winner, but only because I was fed up with it.

From the 10 seconds round, I didn't really pick one, but said that I played B for a next time (sort of having chosen that one), and that "I had no problems with it". I played it again because I had denoted "normal" to it, as the only one. Btw, here too I said that I played it upsampled, and that it still didn't work out, implying that this one could not be the one either.

As you can see in the questions from me and the answers from Julf by now, this is the actual way Track B emerged :

"My next step was converting the [96/24] file to 16 bit and then upconverting it back to 24 bit (effectively leaving the bottom 8 bits as zeroes). This resulted in track E."

... with the difference that track B is 48 kHz.

Amazing ...

So, in two subsequent sessions I chose the two tracks of the same kind.

For myself it is as amazing that unconsciously I seem to be able to detect 24 bits over 16 bits (the flanging thing).

[yes, I am getting crazy myself by now, about these comparisons / seeking for the common denominators]

Trying to summarize :

In two sessions -and listening to very different things obviously- I "formally" preferred the 16 bit versions; the two I chose were the best genuine ones of those (E better than B though).

In both of these cases I could prove that neither could be the right one, because upsampling them to 768 would not work out for the better, which it otherwise would (as far as my experience goes).

After the session I "formally" preferred E, I said I better had chosen C. But I did not (formally).

I do not like Hires, generally. So, I shouldn't have chosen C.

And I didn't.

Instead I chose a 16 bit version, although still 96 kHz on the one occasion (listening throughout) and another 16 bit version of 48 kHz on the other - that being the 10 seconds session.

I regard the 10 seconds session as having been more serious, if only because I wasn't cooking and making a lot of noise at the same time. The fact that a. the a cappella material seems to me the worst for ever doing a proper job and b. the first 10 seconds of it should be totally undoable (for myself, in my view), didn't prevent me from choosing exactly what I always say I like best : "low-res".

In the mean time I seem to have been able to pick the proper Hires ... (but this wasn't my formal choice).

More low-res would have been 16/44.1 of which I said this :

Emphasized small S-es. Notice that this could happen in anything, but I noticed it as unnatural. / Strange

and which I attribute to the downconversion not working as decently as it does when going from 96 to half of that.

(btw, I added a period after unnatural, which is not there in Julf's quotes. So, the Strange is from the second session.)

I think I did well. Fairly well.

Now I may wonder, why didn't I like both 24 bit versions ?

The second, 10 seconds session, told me "flanging". Of course I didn't rate this as good. Maybe not as bad either, but I noticed it, and it can't be a coincidence that I noticed this with the two 24 bit versions only. Yes, the 24 bits will have dug that up. But what if it wasn't normal, and it was created in the process. I mean, that too would require 24 bits to get it in, right ?

The flanging in this case is NOT normal at all. It is no Leslie you know. It is a slow frequency flanger "over" the voices, and it can't be, unless from wrong processing. This, while the whole lot clearly showed as processed to me.

The first "throughout" session made me say of C that it sounds spacious and "more natural" (not "natural" !!) and of D (the other 24 bit) I said "sounds strange".

Of course I don't know anymore, but the "strange" from the first session can just as well have come from the "flanging" I heard in the second, and where this just will be about which part your attention goes.

But, this should prove (to me) that

1. The flanger is just in there (because I heard it in C);

2. It is wrong (because I told so from D).

3. And of course that only 24 bits can unveil it.

Nice.

Add to this that the only two other contenders had the same exact remark *AND* it is my own remark ever about Hires :

It doesn't grab your attention. No way to get into the music.

HOW ?

My idea about it : the recording is not good enough.

Or :

Our systems are not on par.

Or :

Something else is wrong we don't quite know about yet.

Ok, I hope you all can see that I didn't just say this out of the blue - at least not for this post (and listening efforts).

But I say it all the time ...

Peter

Jud · March 20, 2012

Don't know if I'm included in your "official" results, but you did allow me to take the test twice, once with the AQ Carbon cable in my system, then, after scrambling the choices, again with the AQ Coffee. I spent longer on the test the first time through; due to pressing errands, I wasn't able to take as much time to compare the second time through. However, as I told you when I sent in my results, the second time I thought I heard a clearer difference between my preferred selection and all the others.

For both sets of results I assigned scores of 10 through 2, from most preferred to least.

If I am reading correctly your post here and your e-mail to me about how the order was switched between tests, it turns out that in both instances I did what it appears many people did, preferred the +1 dB track. The first time through I selected the original as my next favorite track, and the 96/24 converted/copied version as the next after that. The second time through I gave the original a 7 (that is, ranked it fourth), and again ranked the 96/24 converted/recopied version next.

So what can be said about these results? Well, no surprise that we humans are apparently quite sensitive to level changes. Since I thought I could hear a difference more clearly in the second test, perhaps the Coffee actually was better in bringing out the level difference. And I also found it interesting that I ranked the original and its full resolution converted/recopied version together in both tests.

By the way, speaking of converting/recopying - I converted all files to AIFF with XLD before listening to them, because that is how I listen to nearly all my music, and I wanted listening conditions to be as close to normal as possible. I also used some software upsampling to 24/192 in the following way: In the first test, I went through the selections first without using any upsampling, then again with it. My order of preference did not change. Since it did not change my preferences the first time around, and I had less time, in the second test I listened using upsampling exclusively, because that is the way I normally listen to most of my music these days.

So in the second test, given that I selected the +1 dB version as the best, and the original and its converted/recopied version fourth and fifth best, respectively, which tracks "sneaked in" for second and third? If I'm reading your post and e-mail correctly, they would be E (96/16) as second preference, and A (good old Redbook, 44.1/16) as third.

PeterSt · March 20, 2012

Track G - 96 kHz / 24 bit, converted flac-wav-flac 300 times, copied between computers 100 times

"Uniform": 5

Very much like C

There is much much more in this all.

Maybe *because* so few attended, it is more easy to make a few important conclusions later.

PeterSt · March 20, 2012

I think I can look at this all forever and have new "amazing" conclusions ...

So, despite my thinking that C was the original, but not submitting that one as the best sounding (and of course I was seeking the best sounding - what else to do) ... WHY was it so necessary that I said this :

Track G - 96 kHz / 24 bit, converted flac-wav-flac 300 times, copied between computers 100 times

Wrong S-es / Bad

---

Dangerous; I am NOT in that league of "being able" to say that anything like copying a 100 times etc. can ever change the sound. I am accused though, of perceiving differences from something like this.

It is remarkable that I denote this one as "bad" for the 10 seconds round, where I denoted nothing "as bad" as this.

How ?

So, why should I not see this in the realm of my general thinking ? eh ... Hires sounds bad ?

And thus, I highly prefer to state this after all, over suggesting that 100 (okay, 300) copies deteriorate the sound, right ?

Jud · March 20, 2012

A number of people have stated that the a cappella material doesn't really allow for the benefit of hi-res to be heard, as it doesn't have any significant high frequency content, and that is definitely a valid point. On the other hand, at least one person stated that unaccompanied, natural human voice is the best test material.

I do think that unaccompanied voice, or voice with instrumentation that doesn't overwhelm it, is the material with which I can best discern differences. However, the choral style in this case intentionally de-emphasizes much of what I personally listen for as natural in a singing voice: small volume changes, phrasing, breathing, all helping to create drama and project emotion. So I'm fine with vocals, but believe I personally at least would be better off with more modern vocal styles. (Some examples of artists I often listen to when comparing gear, to give a feeling for what I'm talking about: Gillian Welch, Rosanne Cash, Alison Krauss, Jakob Dylan, Ryan Bingham (his ballads).)

PeterSt · March 20, 2012

Track I - 48 kHz / 16 bit, "raw" resample, no masking noise added

My comment :

Too high pitched S-es; furthermore quite normal / Rather normal, but edgy

I only now realize that some things are not fair, maybe ...

If this track is to be seen as 16/48 indeed, my NOS1 won't do a thing with it by guarantee. This means I would be listening to material which still needs filtering. No wonder that I "am able to" perceive this as distortion (see judgement).

On the other hand, someone with a normal(ly) filtering DAC - what would happen there ?

What officially comes in there, is 24/96. So, it shouldn't do a thing either.

Shouldn't.

But this is no guarantee at all. Thus :

When such a DAC smears a nice filter over it again, it will have solved the Nyquist "problem" and judgement like I could do it, would not be possible anymore. Other things may happen, but not this.

I came to this because I saw Jud writing about his "standard upsampling". Well, hey, that would be illegal also of course. Or at least not convenient for yourself, because it could 100% equalize two different formats (depending on the format).

But what I merely start to see is that the test itself is not much valid, when the *objective* is taken in mind : can you perceive Hires or whatever it was. So :

No no, this now suddenly is (also) about whether you can perceive Nyquist violating tracks, which would be all of them under 96KHz and the 16 bits ones, just because your DAC can't see that; only with luck it overrules a few things.

Now what ?

PS: Of course now I should wonder why a 24 bit file but violating Nyquist for the sample rate, didn't come across to me as distortion (as bad as the 16/48 from track I).

Oh boy, now I must look again at the results ? maybe I better stop ...

manisandher · March 20, 2012

Well, I've looked long and hard at my results (with a short break in between to hear our Queen's speech today - when I was young, I was totally against our Monarchy, but nowadays I can see the immense value in having had a non-elected head of state for the last sixty years!)... and I can't really see a pattern.

But for what it's worth, I thought A, C, E, G and I all sounded similar - and B, D, F and H all sounded similar. The first group seemed a little more forward sounding than the latter group.

I really can't see any pattern here. But just in case it's important, I was upsampling all tracks to 768 kHz in XXHighEnd.

Mani.

Jud · March 20, 2012

"I came to this because I saw Jud writing about his "standard upsampling". Well, hey, that would be illegal also of course. Or at least not convenient for yourself, because it could 100% equalize two different formats (depending on the format)."

I was concerned about upsampling masking differences myself, so I initially listened without any software upsampling. When I listened again with upsampling, it did not alter my preferences, so I felt OK to listen exclusively with software upsampling in the second trial.

It is at least conceivable that some software upsampling might mask differences in the original material. (For those who want measurements, I hasten to add that different sample rate converters have differing measured performance.)

Julf · March 20, 2012

Thanks for taking the time to write up your observations - some of them definitely make sense.

"I submitted E as the winner, but only because I was fed up with it."

That's as good a reason as any

Julf · March 20, 2012

"Don't know if I'm included in your "official" results, but you did allow me to take the test twice"

Indeed - unfortunately I didn't have time to redo the plots etc. to include your second run, so only the first round is included.

"no surprise that we humans are apparently quite sensitive to level changes."

I guess Spinal Tap was onto something with the "they go to 11" amps...

"I also found it interesting that I ranked the original and its full resolution converted/recopied version together in both tests."

I agree - but I still maintain that we don't have enough data to draw any real conclusions.

Julf · March 20, 2012

"I personally at least would be better off with more modern vocal styles. (Some examples of artists I often listen to when comparing gear, to give a feeling for what I'm talking about: Gillian Welch, Rosanne Cash, Alison Krauss, Jakob Dylan, Ryan Bingham (his ballads).)"

So would I - but it can be a bit tricky getting permission to use (preferably unpublished) material from any of those... :-/

Julf · March 20, 2012

"If this track is to be seen as 16/48 indeed, my NOS1 won't do a thing with it by guarantee. "

So are you saying that the NOS1 ignores the physical format of the track (96/24) and somehow "guesses" that it really is 48/16?

"No no, this now suddenly is (also) about whether you can perceive Nyquist violating tracks, which would be all of them under 96KHz and the 16 bits ones, just because your DAC can't see that; only with luck it overrules a few things."

How would they violate Nyquist? Upsampling is Nyquist-safe.
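In numbers: a file sampled at a given rate can only carry content up to half that rate (its Nyquist frequency), so anything a 44.1 or 48 kHz file contains already fits below the 48 kHz limit of a 96 kHz file. The sketch below only does that arithmetic. It assumes an ideal resampler that adds no images of its own, which real converters only approximate, and `nyquist_hz` and `fits_after_rate_change` are illustrative names.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2.0


def fits_after_rate_change(source_rate_hz: float, target_rate_hz: float) -> bool:
    """True if everything the source rate can carry fits below the target's Nyquist.

    Assumption: an ideal resampler that introduces no imaging or aliasing itself.
    """
    return nyquist_hz(source_rate_hz) <= nyquist_hz(target_rate_hz)


for src, dst in [(44_100, 96_000), (48_000, 96_000), (96_000, 48_000)]:
    verdict = "fits" if fits_after_rate_change(src, dst) else "needs low-pass filtering first"
    print(f"{src} Hz -> {dst} Hz: {verdict}")
```

Only the downward direction (96 to 48 kHz) needs the content above the new limit removed before resampling; going up does not create a conflict with the sampling theorem.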

Julf · March 20, 2012

"when I was young, I was totally against our Monarchy, but nowadays I can see the immense value in having had a non-elected head of state for the last sixty years!"

Then there is the Belgian experiment - doing without government for a year

Jud · March 20, 2012

"I still maintain that we don't have enough data to draw any real conclusions."

Yes, with you completely there.

"So would I - but it can be a bit tricky getting permission to use (preferably unpublished) material from any of those... :-/"

Right, well aware of the problem. I was mentioning these folks not as a request to have them be the ones to provide music samples (though wouldn't it be nice?), but simply to give a general idea of the type of vocal music I often find useful for testing.

PeterSt · March 20, 2012

"So are you saying that the NOS1 ignores the physical format of the track (96/24) and somehow "guesses" that it really is 48/16?"

No. But I think :-) I made a thinking mistake ...

Earlier I found myself in some sort of spiral where I couldn't make up my mind much about upsampling from 16 bits, that resulting in 16 bits ...

This *will* be violating Nyquist. Eh, formally.

And next I got into some wrong thinking, because you could just as well have presented a 48 kHz file as a 96 kHz one, without real upsampling (which XXHighEnd really can do, so it's easy for me to think about this by accident).

But you didn't of course, and I just thought the wrong way.

Apologies.

But now we're at it anyway ... think about the difference in decimating to 16 bits (after normal upsampling if you want) and upsampling from 16/48 to 24/96. There really will be a difference ...

But it is not important !

Julf · March 20, 2012

"I just thought the wrong way"

I have days like that. Just ask my wife!

Julf · March 20, 2012

"I was mentioning these folks not as a request to have them be the ones to provide music samples (though wouldn't it be nice?), but simply to give a general idea of the type of vocal music I often find useful for testing."

Yes - and I agree with you. So now we just need to find the next Gillian Welch, Rosanne Cash, Alison Krauss, Jakob Dylan or Ryan Bingham (before they have become famous) and ask for a demo tape (but in hi-res!)

Jud · March 20, 2012

"So now we just need to find the next Gillian Welch, Rosanne Cash, Alison Krauss, Jakob Dylan or Ryan Bingham (before they have become famous) and ask for a demo tape (but in hi-res!)"

I am guessing it would actually not be insuperably difficult to find an aspiring singer-songwriter of acceptable (or perhaps better) quality willing to provide sample material in exchange for a good recording studio providing him or her with 24/96 or 24/192 demos. Or (this is the shoot-for-the-stars alternative) ask someone like Neil Young or T-Bone Burnett (Burnett is a performer as well as producer), who have come out in favor of high-res and against the "loudness wars," if they might be willing to provide test/demo material.

Julf · March 20, 2012

"I am guessing it would actually not be insuperably difficult to find an aspiring singer-songwriter of acceptable (or perhaps better) quality willing to provide sample material in exchange for a good recording studio providing him or her with 24/96 or 24/192 demos."

It would certainly be worth a try - but let's first wait for everybody to tell what was wrong with my test and why it was flawed, so that the next round will be a better one.
