When is it appropriate to use CRC for error detection versus more modern hashing functions such as MD5 or SHA1? Is the former easier to implement on embedded hardware?
It all depends on your requirements and expectation.
Here are quick brief differences between these hash function algorithms:
is a cryptographic hash algorithm,
produces a 160-bit (20-byte) hash value known as a message digest
it is a cryptographic hash and since 2005 it's no longer considered secure,
can be used for encryption purposes,
an example of a sha1 collision has been found
first published in 1993 (as SHA-0), then 1995 as SHA-1,
series: SHA-0, SHA-1, SHA-2, SHA-3,
In summary, using SHA-1 is no longer considered secure against well-funded opponents, because in 2005, cryptanalysts found attacks on SHA-1 which suggests it may be not secure enough for ongoing useschneier. U.S. NIST advise that federal agencies should stop using SHA1-1 for application which require collision resistance and must use SHA-2 after 2010NIST.
Therefore, if you're looking for simple and quick solution for checking the integrity of a files (against the corruption), or for some simple caching purposes in terms of performance, you can consider CRC-32, for hashing you may consider to use MD5, however if you're developing professional application (which should be secure and consistent), to avoid any collision probabilities - use SHA-2 and above (such as SHA-3).
Some simple benchmark test in PHP:
# Testing static text.
$ time php -r 'for ($i=0;$i<1000000;$i++) crc32("foo");'
real 0m0.845s
user 0m0.830s
sys 0m0.008s
$ time php -r 'for ($i=0;$i<1000000;$i++) md5("foo");'
real 0m1.103s
user 0m1.089s
sys 0m0.009s
$ time php -r 'for ($i=0;$i<1000000;$i++) sha1("foo");'
real 0m1.132s
user 0m1.116s
sys 0m0.010s
# Testing random number.
$ time php -r 'for ($i=0;$i<1000000;$i++) crc32(rand(0,$i));'
real 0m1.754s
user 0m1.735s
sys 0m0.012s\
$ time php -r 'for ($i=0;$i<1000000;$i++) md5(rand(0,$i));'
real 0m2.065s
user 0m2.042s
sys 0m0.015s
$ time php -r 'for ($i=0;$i<1000000;$i++) sha1(rand(0,$i));'
real 0m2.050s
user 0m2.021s
sys 0m0.015s
Related:
CRC code is simpler and faster.
For what do you need any?
For CRC information on implementation, speed and reliability see A painless guide to CRC error detection algorithms. It has everything on CRCs.
Unless somebody is going to try and modify your data maliciously and hide the change CRC is sufficient. Just use a "Good" (standard) polinomial.
I ran every line of this PHP code in 1.000.000 loop. Results are in comments (#).
hash('crc32', 'The quick brown fox jumped over the lazy dog.');# 750ms 8 chars
hash('crc32b','The quick brown fox jumped over the lazy dog.');# 700ms 8 chars
hash('md5', 'The quick brown fox jumped over the lazy dog.');# 770ms 32 chars
hash('sha1', 'The quick brown fox jumped over the lazy dog.');# 880ms 40 chars
hash('sha256','The quick brown fox jumped over the lazy dog.');# 1490ms 64 chars
hash('sha384','The quick brown fox jumped over the lazy dog.');# 1830ms 96 chars
hash('sha512','The quick brown fox jumped over the lazy dog.');# 1870ms 128 chars
My conclusion:
Use "sha256" (or higher) when you need added security layer.
Do not use "md5" or "sha1" because they have:
CRC works fine for detecting random errors in data that might occur, for example, from network interference, line noise, distortion, etc.
CRC is computationally much less complex than MD5 or SHA1. Using a hash function like MD5 is probably overkill for random error detection. However, using CRC for any kind of security check would be much less secure than a more complex hashing function such as MD5.
And yes, CRC is much easier to implement on embedded hardware, you can even get different packaged solutions for this on IC.
I found a study that shows how inappropriate CRC hashes are for hash tables. It also explains the actual characteristics of the algorithm. The study also includes evaluation of other hash algorithms and is a good reference to keep.
The relevant conclusion on CRC for hashes:
CRC32 was never intended for hash table use. There is really no good reason to use it for this purpose, and I recommend that you avoid doing so. If you decide to use CRC32, it's critical that you use the hash bits from the end opposite to that in which the key octets are fed in. Which end this is depends on the specific CRC32 implementation. Do not treat CRC32 as a "black box" hash function, and do not use it as a general purpose hash. Be sure to test each application of it for suitability.
UPDATE
It seems the site is down. The internet archive has a copy though.