
Can two audio files with the same checksum, played from the same devices, sound different?


Well, can they?  

43 members have voted


It has been proposed that two computer files that encode digital audio (eg, AIFF, Wav, etc) and possess identical checksums, played from the same disk through the same computer, cables, etc, can nonetheless sound different, given their different download history, copying history, or other paths through the space-time continuum.

 

What say you?

 

(For reference, the number of grains of sand on the planet has been estimated to be 10^20).

Link to comment

Natch everyone knows my opinion on it, and that is no.

 

It *is* possible, but about as unlikely, in a mathematical sense, as one's hand passing through a solid wall without touching the wall. Possible, but it has probably never happened in the life of the universe.

 

I do think people are hearing something different, but as to what that is, well - nobody seems all that willing to pay the cost in time and money to find out.

 

Paul

 

P.S. - you can hear a difference if you take a terabyte disk, and write each segment of the file to the disk in such a way that the disk has to seek over the entire disk to retrieve each block. You won't be able to do that on a normal system. Even when you do, the difference obligingly vanishes when you read the entire file into memory before playing it.

Anyone who considers protocol unimportant has never dealt with a cat DAC.

Robert A. Heinlein

Link to comment

I say yes. I read the definition of checksums and based on the definition I'm not sure they guarantee identical files.

 

George said that "checksums mean identical data - bit-for-bit" in another thread and since I have kept reading this claim here on CA I looked up: Checksum

 

Here are some quotes with what I think is important bolded:

 

"A checksum or hash sum is a small-size datum computed from an arbitrary block of digital data for the purpose of detecting errors which may have been introduced during its transmission or storage."

 

...if the computed checksum for the current data input matches the stored value of a previously computed checksum, there is a very high probability the data has not been accidentally altered or corrupted.

 

...By themselves, checksums are often used to verify data integrity, but should not be relied upon to also verify data authentication."

 

My question was not answered so I am asking it again. Do these checked blocks of digital data just add up the sum of the 1's and 0's and make sure the total is the same for both files, or do they actually make sure the 1's and 0's are in the exact same order in each file, and do they check the entire file bit by bit? Also, do they check for timing information, etc.?

 

In the end, what sounds different, sounds different and what sounds the same, sounds the same.

I have dementia. I save all my posts in a text file I call Forums.  I do a search in that file to find out what I said or did in the past.

 

I still love music.

 

Teresa

Link to comment

 

But if the files are read into memory, why would fragmentation matter?

 

PS: Excellent sig line.

 

Not saying it would, just trying to eliminate variables :-)

 

Also, if, by some very slim chance, undetected errors got through the checksumming process, wouldn't they affect individual chunks (if it's a wav) and thus manifest as the equivalent of clicks on a vinyl record rather than a more general sonic signature?

 

If it's any consolation, this place don't make sense to me either.

Link to comment

Hi Teresa-

 

What we call "Checksums" can be figured out using several different algorithms, dozens of them in fact. What they all do is mathematically provide a very high degree of certainty that if the data in two files has the same checksum, they are bit for bit identical. Cyclic Redundancy Checks for example. We often use them as a shortcut way to ensure data that is being transmitted over a communications facility, like the Internet, a phone line, or even a tape, is transmitted correctly.

 

A simple variant is the "check digit" printed on your checks. It is an example of where adding up the numbers is used, to make sure the account number a machine read is correct. What we are talking about with computer files is more sophisticated, and even less likely to miss a mistake.
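To make the "adding up" versus "more sophisticated" distinction concrete, here is a minimal Python sketch. It is only an illustration: the sample bytes and the `naive_sum` helper are made up, and CRC-32 and SHA-256 are simply two algorithms that happen to ship with Python's standard library. It shows that a plain sum cannot tell when the same values appear in a different order, while the other two can.

```python
import hashlib
import zlib

def naive_sum(data: bytes) -> int:
    """Hypothetical 'just add everything up' checksum; it ignores ordering."""
    return sum(data) % 256

a = bytes([1, 2, 3, 4])
b = bytes([4, 3, 2, 1])  # same values as `a`, in a different order

same_naive = naive_sum(a) == naive_sum(b)                             # order is invisible
same_crc = zlib.crc32(a) == zlib.crc32(b)                             # order changes the CRC
same_sha = hashlib.sha256(a).digest() == hashlib.sha256(b).digest()   # and the hash

print(same_naive, same_crc, same_sha)  # prints: True False False
```

So the answer to "do they only add up the 1's and 0's" depends on the algorithm. The ones normally used for files are sensitive to order and cover every byte of the file, but none of them look at timing, because a file on disk carries no timing information to check.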

 

And often, we avoid the issue entirely and do a bit for bit comparison of the data in two files. This is just reading a character from each file, and comparing them one at a time. Exactly the same way a human would compare two documents to ensure they were the same.
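A bit for bit comparison of that kind can be sketched in a few lines of Python. This is an illustrative version, not any particular tool's implementation: the function name `files_identical`, the chunk size, and the throwaway file names are all assumptions.

```python
import os
import tempfile

def files_identical(path_a: str, path_b: str, chunk_size: int = 65536) -> bool:
    """Return True only if both files hold the same bytes in the same order."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False  # different lengths can never be identical
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files reached the end together
                return True

# Demo with two throwaway files (names and contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.wav")
    copy = os.path.join(tmp, "copy.wav")
    for path in (original, copy):
        with open(path, "wb") as f:
            f.write(b"RIFF" + bytes(1000))
    result_same = files_identical(original, copy)
    with open(copy, "r+b") as f:
        f.seek(500)
        f.write(b"\x01")  # change a single byte in the copy
    result_changed = files_identical(original, copy)

print(result_same, result_changed)  # prints: True False
```

Python's standard library offers the same kind of check as `filecmp.cmp(path_a, path_b, shallow=False)`.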

 

In your original query, I believe you mentioned FLAC and WAV files. The same song encoded as a WAV file will have a very different Checksum, by any method, than the file encoded as a FLAC file - even though when processed, they might send the exact same information (ones and zeros) to your DAC. Thus, it is quite reasonable that a FLAC file might sound different from a WAV file.
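The point that the files differ while the decoded data does not can be shown with a rough analogy. Python's standard library has no FLAC codec, so in this sketch `zlib` stands in as a generic lossless compressor, and the "audio" is made-up bytes. It is an analogy for the container-versus-payload distinction, not a FLAC demonstration.

```python
import hashlib
import zlib

pcm = bytes(range(256)) * 64   # stand-in for raw audio samples (made-up data)
packed = zlib.compress(pcm)    # stand-in for a losslessly compressed file

# The two "files" are different byte sequences, so their checksums differ...
file_hashes_match = hashlib.sha256(pcm).hexdigest() == hashlib.sha256(packed).hexdigest()
# ...yet decoding the compressed one gives back exactly the original samples.
decoded_matches = zlib.decompress(packed) == pcm

print(file_hashes_match, decoded_matches)  # prints: False True
```

To compare a WAV with a FLAC in a meaningful way, you would checksum the decoded audio rather than the files themselves.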

 

But if you copied one WAV file to a new name, the two files would be identical, bit for bit and with matching checksums. At that point, it is reasonable to assume the files would sound the same, at least when played through the exact same playback chain.

 

It is also impossible for the data in each file to degrade in any way, so long as they are bit for bit identical, meaning they will have the same checksums. If not, then an error occurred in the copying.

 

Now, if you were to listen to two bit identical WAV files on your system, and tell me they sound different, I would believe you. But the reason would not be that the data is different or has noise embedded in it, nor anything like that. I do not know what the reason would be, but I would believe you hear it.

 

Remember, the data on the disk is not music, like on an LP. It is the same kind of file your bank uses to track your checking account. Nobody would use banks if the sums in their accounts mysteriously changed for no reason time and time again. That isn't to say banks do not make mistakes, because they do of course. But almost invariably, the mistake is caused by a programming error made by a human.

 

Paul

 

P.S. "data authentication" refers to ensuring that the file in question really came from where you think it came from. PGP signatures are an example of this.

 

Also, if you are really interested in exactly how checksums and other data integrity algorithms work, message me and I will send you some references. They do all boil down to how to calculate a value that represents the data in a file, but they can get quite complex.

 


Link to comment
Is there a who cares choice?

 

+1

 

Or, how many DSD files can you balance on the head of a pin?

"Relax, it's only hi-fi. There's never been a hi-fi emergency." - Roy Hall

"Not everything that can be counted counts, and not everything that counts can be counted." - William Bruce Cameron

 

Link to comment

I answered no, as the probabilities are way too low given that the files actually contain very similar data. Both MD5 and SHA1 will very likely generate widely different checksums for similar data.

 

And for all practical purposes, 1/10^50 is 0 anyway :)
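For a sense of scale, here is a back-of-the-envelope sketch in Python. It rests on two assumptions that are not from the post: that the hash behaves like a uniformly random function, and that any change to the file is accidental rather than deliberate (both MD5 and SHA-1 have known attacks for crafting collisions on purpose). Under those assumptions, the chance that one specific altered file matches a given digest is 2 to the minus digest length.

```python
md5_bits = 128   # MD5 digest length in bits
sha1_bits = 160  # SHA-1 digest length in bits

# Chance that one accidentally altered file still matches a given digest,
# under the idealised uniform-hash assumption stated above.
p_md5 = 2.0 ** -md5_bits
p_sha1 = 2.0 ** -sha1_bits

print(f"{p_md5:.1e}", f"{p_sha1:.1e}")  # prints: 2.9e-39 6.8e-49
```

The SHA-1 figure is in the same neighbourhood as the 1/10^50 quoted above.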

 

I just don't understand this obsession with checksums. Why not compare files bit-by-bit?

Home: Apple Macbook Pro 17" --Mini-Toslink--> Cambridge Audio DacMagic --XLR--> 2x Genelec 8020B

Work: Apple Macbook Pro 15" --USB--> Focusrite Scarlett 2i2 --1/4\"--> Superlux HD668B / 2x Genelec 6010A

Link to comment
I just don't understand this obsession with checksums. Why not compare files bit-by-bit?

Well if you download a file over a network, by definition, the original is not there for you to compare bitwise with the copy. On the other hand, the checksum of the original can be sent as part of the original and can be compared with the computed checksum of the copy.

Oops, I just noticed you are a computer scientist, so I guess you already knew that. Of course, you are right - if the two files that allegedly sound different are both to hand, they could be compared bit for bit. Maybe it's time for a new thread, so we can really explore this fascinating subject in depth:

 

Can two audio files which are identical bit for bit, played from the same devices, sound different?
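The download case described above, where only the original's checksum travels with the file, might look like this in Python. It is a sketch under assumptions: the function names are hypothetical, SHA-256 is an arbitrary choice of algorithm, and the published value is assumed to arrive as a hex string.

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large audio files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def download_is_intact(path: str, published_hex: str) -> bool:
    """Compare the local copy's digest with the one published for the original."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

When both files are on hand, a direct comparison makes the checksum unnecessary. When only one is, the checksum is what you have.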

Not everything that can be counted counts, and not everything that counts can be counted.

- Einstein

Link to comment

I don't get this discussion. No one seriously argues that the 2 files are different. That's just not a plausible argument.

 

The question then becomes, can they sound different, and if yes, how?

 

Well, John Swenson has demonstrated (and measured) that different playback software installed in the same HW setup can result in different amounts of ground plane noise being generated when playing back the same file. Ground plane noise can interfere with transmission of audio and thereby affect the resulting sound in playback. So this could be an explanation for some of the differences heard by users when playing back files, even when the data is identical, or for perceived differences between lossless compressed and uncompressed.

 

But I believe the original question here assumes that everything in the playback chain is the same - HW and Software - if that is so, then no I don't believe they can sound different, because they are identical.

Main listening (small home office):

Main setup: Surge protector +_iFi  AC iPurifiers >Isol-8 Mini sub Axis Power Conditioning+Isolation>QuietPC Low Noise Server>Roon (Audiolense DRC)>Stack Audio Link II>Kii Control>Kii Three >GIK Room Treatments.

Secondary Listening: Server with Audiolense RC>RPi4 or analog>Matrix Element i Streamer/DAC (XLR)+Schiit Freya>Kii Three .

Bedroom: SBTouch to Cambridge Soundworks Desktop Setup.
Living Room/Kitchen: Ropieee (RPi3b+ with touchscreen) + Schiit Modi3E to a pair of Morel Hogtalare. 

All absolute statements about audio are false :)

Link to comment

Come to think of it, how about:

Can the same file, played more than once on the same device, sound different each time?

 

I would have to answer that with a resounding yes. Every time I listen to a piece of music it sounds a bit different, I guess depending on my psychological/physiological/emotional/pharmacological state at the time.

 

That makes the question of bit-identical files sounding different somewhat moot, at least for me, since I can't really listen to them both at the same time.


Link to comment

+1

 


Link to comment
I don't get this discussion. No one seriously argues that the 2 files are different. That's just not a plausible argument.

 

The question should have said something like:

 

If I have a WAV file, which I convert to AIFF, then convert to FLAC, then convert to ALAC, then email as a ZIP file to my friend in Singapore, who uploads it to his Google Drive, where it is downloaded by a teenager in Russia, who uploads it to a Bittorrent site, from which I download it and convert it back to a WAV file with an identical md5sum, can the two files sound different?

Link to comment

There is another aspect of the question that I think could have its importance...

 

At extraction time, is it important to take care of the conditions?

 

If I extract an album with a PC that is not optimized, connected to the same AC line as an appliance that is known to be electrically noisy, with the extraction landing on a noisy hard drive - as opposed to - the same extraction on an isolated and "clean" electrical circuit, isolated from any vibration, extracted to an SD card that has its own independent circuit and clean power...

 

And of course, with both resulting in the same checksum: could this matter?

Alain

Link to comment

You begin to touch on all the mechanisms that have been posited and explored to explain why people hear a difference, when they do. If the extraction is to the same media (i.e. a hard disk) and the checksums match, then no - the electrical characteristics in force during the extraction will make no difference. The reason of course being, it is just data, not music, that is on that hard disk.

 

If to different media, then yes, there could be some audible differences, but the cause is not the data or anything inherent about the data. (i.e., there is not any "noise" in the digital data that makes a 1 less than a one, or a zero more than a zero.) The difference heard must be because of the media itself. CDs, for example, use a completely different physical method to retrieve the data than does a hard drive, or an SSD. And so on.

 

Even then, if the data files are copied to the same media, there is again no inherent difference in the data or in the data storage.

 

Yes, identical files can sound different based upon external considerations, like the media, the listener's mood, health, or even their state of mind. None of that has anything to do, however, with whether the media was extracted using an average power supply or a very clean power supply. The *playback* chain's power supplies can make a difference, but two bit identical files played back on that chain from the same media won't sound different because the data in one copy has "degraded" or is inherently more noisy.

 

Attempts to investigate and explain what some well respected people are hearing have met with little success. If a reliable and repeatable difference can be heard, which I have been led to believe is the case, then some other operator must be in effect. Unlike analog data, when a computer reads digital data, it either gets it right or wrong. No in-between. When it transmits that data as an audio stream (really, a digital stream carrying digital representations of audio data; there is a difference) then other operators, like timing, come into play.

 

Somewhere, something *is* happening. But the current explanations, and especially the assertions that such differences will survive copying to different media sources, are as unacceptable as "bias" being the sole reason people hear differences.

 

IMNSHO, YMMV, IDCSBYOTA, etc.

-Paul


Link to comment

Hi Paul,

 

Yes, I believe that because digital is a representation of ones and zeros, there can't be anything other than a one and a zero initialized on the magnetic (or memory) of a medium. Thanks :)

 

I once wondered if anyone would have been curious about using a hard drive to imprint analog signals, like any other magnetic medium (reel to reel tape, cassette, data tape, etc...). I never heard of that, but this brought me to another question... Since magnetic particles are polarized from electrical impulses... Could there be something that leaves traces that are not recognized as zeros nor ones, but are there (like "dust", without being "analyzed" nor recognized as anything usable, but that could affect in a very low level fashion) ? Because when we talk about differences, they seem to be so vanishingly small that the cause is hard to grasp...

 

Sorry for the poor imaging of what I am trying to convey... Still at a "I wonder", "Could it be that..." level...

 

Regards,

 

Alain

Alain

Link to comment

John Swenson, whose day job I believe involves data storage hardware design, mentioned that he thinks it's possible for jitter to be in effect preserved in a stored file, because stored binary 1s do not all have precisely the same levels (but they're all high enough to be 1s), and neither do all stored 0s have precisely the same levels (but they're low enough to be 0s).

 

However, John could not think of a mechanism (other than fragmentation) resulting in a difference between two bit for bit identical files that would be affected by history prior to the time of the immediately preceding storage on disc/SSD.

 

So I suppose if practical for you, it would be a nice idea to rip with the best quality, lowest noise hardware you have available for the purpose. Couldn't hurt.

One never knows, do one? - Fats Waller

The fairest thing we can experience is the mysterious. It is the fundamental emotion which stands at the cradle of true art and true science. - Einstein

Computer, Audirvana -> optical to EtherREGEN -> microRendu -> ISO Regen -> Pro-Ject Pre Box S2 DAC -> Spectral DMC-12 & DMA-150 -> Vandersteen 3A Signature.

Link to comment

Hi Alain - your understanding is very good indeed, and yes, any media has the equivalent of digital "dust" - making it more or less difficult to read the data from the media. But that is totally an effect of the media, not something that is inherent in the data. Change the media, and you change the "dust" - but not the data.

 

It's important to make that distinction - as if you take the noisiest most nasty media possible, whatever that may be, and copy the data to the cleanest most perfect media possible (again, whatever that may be), the "dust" from the first copy does not come along.

 

The exception to that is, of course, when you are playing back the music over a digital connection. Then that "dust" may affect the timing, and thus affect the sound. Or some other mechanism like that. So playing back from media with less of that "digital dust" you are thinking of makes a lot of sense, and probably will sound better.

 

-Paul

 

P.S. "Dust" is really not at all an accurate description, but it does work as an explanation. Next time someone goes off on electrical fuzz or noise on a media related to audio playback, think of it as dust in the cracks of an LP groove, and you will have a good understanding. The grooves are perfect, but the dust can throw off the playback, even though the dust is not part of the music on the LP. ;)

 

P.P.S. - What Jud said makes a whole lot of sense. Just remember when you copy the data, you are not copying the noise along with it, unless and until you convert the data to analog music signals before you copy it.

 


Link to comment

Hi Jud,

 

I recall reading this some time ago - thanks. It could explain things (level of a "1" or a "0") and it would also explain the checksum being the same. But at playback, I suppose that these slight differences from the imprint should not affect the fact that they are interpreted as being the same ? Or the opposite, in the course of the signal drawn, something occurs that switches a zero into a one and/or vice versa ?

 

And yes, maybe extracting music under the best conditions is not a bad thing...

 

Edit: of course, this does not explain why the same will not happen with let's say a word document or else... And about jitter (but how is it "imprinted" ?)... Finally, a question that calls for more questions... maybe.

Alain

Link to comment

Oops, I just noticed you are a computer scientist, so I guess you already knew that. Of course, you are right - if the two files that allegedly sound different are both to hand, they could be compared bit for bit.

 

Yes. This is pretty much implied by "played from the same device" ;)


Link to comment

Hi Paul,

 

I believe that the way you word it is a lot better than my very limited vocabulary :) This has been a question that I had for some years now, but I fought against it since my way of extracting music was not the best (using more than one CD/DVD reader to extract many CDs at the same time, with a regular PC that carried a hardware raid controller, with 5 hard drives, etc...)... I finally voted against it, but that question has always bugged me...

 

And the other problem is to now add this in our perception about the importance of everything... As we like to perceive things, digital should not be affected, but after all that I read about the importance of a good clean power and isolation, there seems to be more to it, even at the extraction level...

 

Many things lie in the way to phrase them so they can "sound" legitimate... :)

 

Regards,

Alain

Link to comment
