I\'ve read several stackoverflow posts about this topic, particularly this one:
Secure hash and salt for PHP passwords
but I still have a few questions, I ne
You question is about using passwords as an authentication mechanism and how to securely store these passwords in a database using a hash. As you probably already know the goal is to be able to verify passwords without storing these passwords i clear text in the database. In this context let me try to answer each of your questions:
If someone has access to your database/data, then they would still have to figure out your hashing algorithm and your data would still be somewhat secure, depending on your algorithm? All they would have is the hash and the salt.
The basic idea of hashing passwords is that the attacker has knowledge of the hashing algorithm and has access to both the hash and the salt. By selecting a cryptographic strong hash function and a suitable salt value that is different for each password the computational effort required to guess the password is so high that the cost exceeds the possible gain the attacker can get from guessing the password. So to answer your question, hiding the hash function does not improve the security.
If someone has access to your database/data and your source code, then it seems like no matter what your do, your hashing algorithm can be reversed engineered, the only thing you would have on your side would be how complex and time consuming your algorithm is?
You should always use a well-known (and suitably strong) hashing algorithm, and reverse engineering this algorithm is not meaningful as there is nothing hidden in your code. If you didn't mean reverse engineer but actually reverse then, yes, the passwords are protected by the complexity of reversing the hash function (or guessing a password that matches a hash value). Good hash functions makes this very hard.
It seems like the weakest link is: how secure your own systems are and who has access to it?
In general this is true, but when it comes to securing passwords by storing them as hashes you should still assume that the attacker has full access to the hashes and design your system accordingly by choosing an appropriate hash function and using salts.
What types of attacks are these hashes trying to protect against? I've read about rainbow table and dictionary attacks (brute force), but how are these attacks administered?
The basic attack that password hashing protects against is when the attacker gets access to your database. The clear text password cannot be read from the database and the password is protected.
A more sophisticated attacker can generate a list of possible passwords and compute the hash using the same algorithm as you. He can then compare the computed hash to the stored hash and if he finds a match he has a valid password. This is a brute force attack and it is generally assumed that the attacker has "offline" access to your database. By requiring the users to use long and complex passwords the effort required to "brute force" a password is significantly increased.
When the attacker wants to attack not one password, but all the passwords in the database a large table of passwords and hash value pairs can be precomputed and further improved by using what is called hash chains. Rainbow tables is an application of this idea and can be used to brute force many passwords simultaneously without increasing the effort significantly. However, if a unique salt is used to compute the hash for each password a precomputed table becomes useless as it is different for each salt and cannot be reused.
To sum it up: Security by obscurity is not a good strategy for protecting sensitive information and modern cryptography allows you to secure information without having to resort to obscurity.
If you have just a hash, no salt, then once they know your data (and algorithm) they can get your password via a rainbow table lookup. If you have a hash and a salt, they can get your password by burning a lot of CPU cycles and building a rainbow table.
If your salt is the same for all your data, they only need to burn a lot of CPU cycles once to build the table and then they have all the passwords. If your salt is not always the same, they need to burn through the CPU cycles to make a unique rainbow table for each record.
If the salt is long enough, the CPU cycles they need become very cost-prohibitive.
If you know your data security is breached, of course, you need to reset all the passwords immediately anyway, because as far as you know the attacker is willing to spend that time.
The security of cryptographic algorithms is always in their secret input. Reasonable cryptanalysis is based on an assumption that any attacker knows what algorithm you use. Good cryptographic hashes are non-invertible and collision resistant. This means that there's still a lot of work to do going from a hash to the value that generated it, regardless of whether you know the algorithm applied.
As I noted in a comment, there are attacks that these measures defend against. First, knowing the password may lead to authorization to do things beyond what the contents of the database suggest. Second, those passwords may be used elsewhere, and you expose your users to risk by revealing their passwords as a result of a break-in. Third, with hashing, an insider can't exploit read-only access to the database (subject to less auditing, etc.) to impersonate a user.
Dictionaries and rainbow tables are techniques for accelerating hash inversion.
( OP )
brings up a good point, if your data is compromised then game over ... my follow up question is: what types of attacks are these hashes trying to protect against? I've read about rainbow table and dictionary attacks (brute force), but how are these attacks administered
( discussion )
It's not a game, except to the attacker. Research these terms:
Then tell us ( as thought provocation for you ) how do we protect against whom? For one thing, it is the efforts of innocents vis-a-vis intruders - and for another it is data-recovery if part of the system fails.
It is an interesting experiment that the original intent of tcp/ip and so on is advertised as being a weapon of war, survivability under attacks. Okay, so passwords are hashed - no one can recover them ...
Which, duh, includes the owner-operator of the system.
So you build a robust record locking tool that implements key controls, then political pressures force the use of brand-x tools.
You can read Federal Information Security Management Act (FISMA) and by the time you have read it some governmental entity somewhere will have had an entire disk either stolen or compromised.
How would you protect that disk if it was your personal identity information on that disk.
I can tell you from the caliber of Martin Liversage and jadeters they will be paying attention.
If someone has access to your database/data, then they would still have to figure out your hashing algorithm and your data would still be somewhat secure, depending on your algorithm? All they would have is the hash and the salt.
This might be all a really dedicated opponent would need. Much of this answer depends on how valuable the data is, which would tell you how motivated the opponent is. Credit card numbers are going to be extremely valuable, and criminal attackers seem to have plenty of time and accomplices to do their dirty work. Some bad guys have been known to farm out key decryption tasks to botnets!
If someone has access to your database/data and your source code, then it seems like no matter what your do, your hashing algorithm can be reversed engineered, the only thing you would have on your side would be how complex and time consuming your algorithm is?
If they have access to your source and all the data, the question is going to be "how did you load your key into the memory of the server in the first place?" If it's embedded in the data or in the program code, it's game over and you've lost. If it was hand-keyed by an operator at the machine's boot time, it should be as secure as your trust in your operator. If it is stored in an HSM*, it should still be secure.
And if they have root-level authority access to your running machine, then they can probably trigger and recover a memory dump that will reveal the secret key.
It seems like the weakest link is: how secure your own systems are and who has access to it?
This is true. But there are alternatives that help improve security.
For bank-like protection, the kind that passes security and industry audits, it's recommended that you use a *Hardware Security Module (HSM) to perform key storage and encryption/decryption functions. The commercial strength HSMs we're looking at cost 10s of thousands of dollars or more each, depending on capacity. But I have seen hardware encryption cards that plug into a PCI slot that cost substantially less.
The idea behind an HSM is that the encryption happens on a secure, hardened platform that nobody has access to without the secret keys. Most of them have cabinets with intrusion detection switches, trip wires, epoxied chips, and memory that will self-destruct if tampered with. Not even the legitimate owner or the factory should be able to recover the database key from an HSM without the set of authorized crypto keys (usually carried on smart cards.)
For a very small installation, an HSM can be as simple as a smart card. Smart cards aren't high performance encryption devices, though, so you can't pump more than about one decryption transaction per second through them. Systems using smart cards usually just store the root key, then decrypt the working database key on the smart card and send it back to the database accessing system. These will still yield the working database key if the attacker can access running memory, or if the attacker can sniff the USB traffic to and from the smart card.
And I have no experience getting TPM chips to work (yet), but theoretically they can be used to securely store keys on a machine. Again, it is still no defense against an attacker taking a memory dump while the key is loaded in memory, but they would prevent a stolen hard drive containing code and data from revealing its secrets.
When hashing a password, it is one way. So it is very difficult to get the password even if you have the salt, source and alot of cpu cyles to burn.