l i n u x - u s e r s - g r o u p - o f - d a v i s
Next Meeting:
July 7: Social gathering
Next Installfest:
Latest News:
Jun. 14: June LUGOD meeting cancelled
Page last updated:
2002 Aug 08 14:01

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Report this post as spam:

(Enter your email address)
Re: [vox-tech] shell script challenge - Now MD5sum erratia
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [vox-tech] shell script challenge - Now MD5sum erratia

begin Charles Polisher <cpolish@attbi.com> 
> Micah Cowan wrote:
> > GNU Linux writes:
> >  > Found a very interesting page on md5sum. It's:
> >  > 
> >  > http://hills.ccsf.org/~jharri01/project.html
> >  > 
> >  > "So why does MD5 seem so secure? Because 128 bits allows you to have
> >  > 2128=340,282,366,920,938,463,463,374,607,431,768,211,456 different
> >  > possible MD5 codes"
> >  > 
> >  > Lots of good reading for insomniacs.
> > 
> > It still shouldn't be relied upon, however, that two identical MD5
> > checksums are sufficient evidence that the corresponding files are
> > identical; I've heard more than one person claim to have encountered
> > identical MD5 sums for different files, and its certainly not
> > impossible, just improbable.
>   I'm dubious ;^)
>   <H. Lector voice>
>     The voices tell you they've seen MD5's collide; do they
>     tell you other things, Micah? 
>   </H. Lector voice>
i'd have to agree with chuck.

   2^128 = 340,282,366,920,938,463,374,607,431,768,211,456

is an awfully big number.  according to ORA's PGP book, this number is 
is billions of times larger than the total number of computer files that
the human race has ever created, and will ever create, for the next
thousand years.

another way to put this:  you are 10^25 times more likely to win
the california lottery jackpot than you are to find 2 files with the
same md5sum.

i think this goes over the line of being "improbable" and enters the
realm of "effectively impossible".

if someone WERE to find 2 files with the same md5sum, i think they owe
it to the world to announce their find.  it's kind of like finding
amelia earhardt's plane, or noah's ark.  you just don't find 2 files
with equal md5sums and say "oh, that's interesting".  you report it
scientific american, new york times and slashdot.   :-)

> > But it's a heluva lot better than running diff from one file to
> > every other file - a factorial-time operation! :)
>   And now, a quibble: Actually, a comparison of the entire file would
>   be no different than a comparison of the md5sum. From a Big O
>   standpoint, it's just a constant factor. If comparing md5's isn't
>   factorial, neither should a full diff. If there were a ton of files
>   and the lengths were spread out, the comparisons could be further
>   reduced by sorting the list by file size, then comparing only among
>   groups of the same size.
that may suffer from having to be done using a HUGE temp file, rather
than small atomic operations done in memory.

you'd have to generate the list (ugh), sort the list (double ugh) and
THEN do the comparisons on the candidates.  i think this would be
significantly more work for the computer, but would take a few seconds
to write for the programmer.


GPG Fingerprint: B9F1 6CF3 47C4 7CD8 D33E  70A9 A3B9 1945 67EA 951D
vox-tech mailing list

LUGOD Group on LinkedIn
Sign up for LUGOD event announcements
Your email address:
LUGOD Group on Facebook
'Like' LUGOD on Facebook:

Hosting provided by:
Sunset Systems
Sunset Systems offers preconfigured Linux systems, remote system administration and custom software development.

LUGOD: Linux Users' Group of Davis
PO Box 2082, Davis, CA 95617
Contact Us

LUGOD is a 501(c)7 non-profit organization
based in Davis, California
and serving the Sacramento area.
"Linux" is a trademark of Linus Torvalds.

Sponsored in part by:
Appahost Applications
For a significant contribution towards our projector, and a generous donation to allow us to continue meeting at the Davis Library.