My Question:
What is the Best Approach to Ensure Data Security of Small Data? Below I present a concern around symmetric and asymmetric encryption. I'm curious if there is a way to do asymmetric encryption on small data with an equivalent of some sort of "salting" to actually make it secure? If so, how do you pick a "salt" and implement it properly? Or is there a better way to handle this?
Explanation of My Concern:
When encrypting something that has "bulk", it seems to me that asymmetric encryption approaches are pretty secure. My concern is about small fields of data, say a credit card number, password, or social security number in a database. There, the data being encrypted has a fixed length and format. That being said, a hacker could attempt to encrypt every possible social security number (10^9 possibilities) with the public key and compare the results to the values stored in the db. Once they find a match, they know the real number. Similar attacks can be done for the other data types. Because of this, I decided to avoid symmetric methods like MySQL's built-in AES_ENCRYPT() function; however, now I'm questioning asymmetric encryption as well.
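The attack described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses unpadded "textbook" RSA, a toy public key built from two well-known Mersenne primes, and a candidate space of 100,000 values instead of 10^9 so that it runs quickly. The names `textbook_rsa_encrypt` and `build_lookup` are made up for this example.

```python
# Toy public key from two well-known Mersenne primes. Far too small and
# structured for real use; it only needs to be deterministic.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q
E = 65537


def textbook_rsa_encrypt(value: int) -> int:
    """Unpadded RSA: the same input always gives the same output."""
    return pow(value, E, N)


def build_lookup(candidates):
    """Attacker side: encrypt every candidate with the public key."""
    return {textbook_rsa_encrypt(c): c for c in candidates}


# Hypothetical secret drawn from a small, known space, and its stored ciphertext.
secret = 42_424
stored = textbook_rsa_encrypt(secret)

# The attacker only needs the public key (N, E) and the stored value.
table = build_lookup(range(100_000))
recovered = table[stored]
print(recovered == secret)
```

The lookup works only because encryption here is deterministic. Any scheme that mixes fresh randomness into each encryption makes the precomputed table useless.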
How do we properly protect small data?
Salting is normally used for hash algorithms, but I need to be able to get the data back afterwards. I thought about having some "base bulk text", appending the sensitive data to the end, and encrypting that concatenation. Decryption would reverse the process: decrypt, then strip off the "base bulk text". But if the hacker can figure out the base bulk text, then I don't see how this would add any additional security.
Picking other data to include as part of the encryption, to act like a salt value derived from other fields in the database (or hash values of those fields, or some combination thereof, which yields the same issue), also seems vulnerable. Hackers could run through combinations, similar to the attack mentioned above, to perform a more intelligent form of "brute force". That being said, I'm unsure of how to properly secure the small data, and my Google searches have not helped me.
What is the best approach to ensure data security of small data?
When I encrypt short messages, I add a relatively long random salt to them before encryption. Edit: others suggest prepending the salt to the payload.
So, for example, if I encrypt the fake credit card number 4242 4242 4242 4242, what I actually encrypt is that number together with one random salt the first time, the same number together with a different random salt the second time, and so forth.
This random salting significantly discourages the lookup table approach you describe. Many operating systems furnish sources of high-quality random numbers, like /dev/random on *nix and, on Windows, .NET's RNGCryptoServiceProvider.
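As a rough sketch of the salting step only (the encryption itself is left out), the following uses Python's `secrets` module, which draws from the operating system's random source. The 16-byte salt length and the helper names `add_salt` and `strip_salt` are assumptions made for this example, not part of any standard.

```python
import secrets

SALT_BYTES = 16  # assumed length; fixed so it can be stripped after decryption


def add_salt(payload: bytes) -> bytes:
    """Prepend a fresh random salt; the result is what gets encrypted."""
    return secrets.token_bytes(SALT_BYTES) + payload


def strip_salt(salted: bytes) -> bytes:
    """After decryption, drop the salt to get the original payload back."""
    return salted[SALT_BYTES:]


card = b"4242 4242 4242 4242"
first = add_salt(card)   # what is encrypted the first time
second = add_salt(card)  # what is encrypted the second time

print(first != second, strip_salt(first) == card)
```

Because the salt has a fixed length, it never needs to be stored separately or kept secret; it only has to be unpredictable and different each time.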
It's still not OK to hold payment card data in that way without defense in depth and PCI data security certification.
Edit: Some encryption schemes handle this salting as part of their normal functioning.
If you are encrypting with an RSA public key, there is no need to salt the small data. Use OAEP padding. The padding introduces the equivalent of random salt. Try it: encrypt the credit card number twice with the same RSA public key, using OAEP padding, and look at the result. You will see two different values, indistinguishable from random data.
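To make the "try it" experiment concrete without any third-party package, here is a hand-written sketch of RSA with OAEP padding, following the EME-OAEP layout from RFC 8017 with SHA-256 and MGF1. It is for illustration only: the key is a toy built from two well-known Mersenne primes, the decoding is not constant-time, and the function names are made up for this example. Real code should call the OAEP implementation of a vetted crypto library instead.

```python
import hashlib
import secrets

# Toy key from two well-known Mersenne primes: NOT secure, illustration only.
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)
K = (N.bit_length() + 7) // 8      # modulus length in bytes
H_LEN = hashlib.sha256().digest_size
L_HASH = hashlib.sha256(b"").digest()  # hash of the (empty) label


def mgf1(seed: bytes, length: int) -> bytes:
    """MGF1 mask generation function with SHA-256."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def oaep_encrypt(message: bytes) -> bytes:
    if len(message) > K - 2 * H_LEN - 2:
        raise ValueError("message too long")
    zeros = b"\x00" * (K - len(message) - 2 * H_LEN - 2)
    data_block = L_HASH + zeros + b"\x01" + message
    seed = secrets.token_bytes(H_LEN)  # fresh randomness on every call
    masked_db = xor(data_block, mgf1(seed, K - H_LEN - 1))
    masked_seed = xor(seed, mgf1(masked_db, H_LEN))
    encoded = b"\x00" + masked_seed + masked_db
    return pow(int.from_bytes(encoded, "big"), E, N).to_bytes(K, "big")


def oaep_decrypt(ciphertext: bytes) -> bytes:
    encoded = pow(int.from_bytes(ciphertext, "big"), D, N).to_bytes(K, "big")
    masked_seed, masked_db = encoded[1:1 + H_LEN], encoded[1 + H_LEN:]
    seed = xor(masked_seed, mgf1(masked_db, H_LEN))
    data_block = xor(masked_db, mgf1(seed, K - H_LEN - 1))
    rest = data_block[H_LEN:].lstrip(b"\x00")
    if encoded[0] != 0 or data_block[:H_LEN] != L_HASH or rest[:1] != b"\x01":
        raise ValueError("decryption error")
    return rest[1:]


card = b"4242 4242 4242 4242"
c1, c2 = oaep_encrypt(card), oaep_encrypt(card)
print(c1 != c2, oaep_decrypt(c1) == oaep_decrypt(c2) == card)
```

The random `seed` plays the role of the salt: it is drawn fresh for each encryption and masked into the padded block, so the same card number produces a different ciphertext every time.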
If you are encrypting with an AES symmetric key, then you can use a random IV per value, and store the IV in the clear, publicly, next to the ciphertext. Try encrypting the credit card number twice with AES in CBC mode, for example, with a unique, 16-byte (cryptographically strong) IV each time. You will see two different ciphertexts. Now, assuming a 16-byte AES key, try to brute force those two outputs, without using any knowledge of the key. Use just the ciphertext, and the 16-byte IVs, and try to discover the credit card number.
EDIT: It's beyond the scope of the question, but since I mention it in the comment, if a client can send you arbitrary ciphertext to decrypt ("decrypt this credit card info"), you must not let the client see any difference between a padding error on decryption, vs. any other error on decryption. Look up "padding oracle".
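One way to read that advice, sketched under assumptions: wrap whatever decryption routine you use so that every failure surfaces to the client as the same error. `DecryptionFailed` and `decrypt_for_client` are hypothetical names, and the two failing functions below are stand-ins that only simulate different internal errors.

```python
class DecryptionFailed(Exception):
    """The single error a client is ever allowed to see."""


def decrypt_for_client(decrypt, ciphertext: bytes) -> bytes:
    """Run `decrypt` (any decryption callable) and hide why it failed."""
    try:
        return decrypt(ciphertext)
    except Exception:
        # Bad padding, bad length, anything else: all collapse into one error.
        raise DecryptionFailed("decryption failed") from None


# Stand-ins that simulate two different internal failures.
def bad_padding(_ciphertext: bytes) -> bytes:
    raise ValueError("invalid padding bytes")


def bad_length(_ciphertext: bytes) -> bytes:
    raise IndexError("ciphertext too short")


messages = set()
for failing in (bad_padding, bad_length):
    try:
        decrypt_for_client(failing, b"\x00" * 32)
    except DecryptionFailed as exc:
        messages.add(str(exc))

print(messages)
```

Identical error messages are only part of the picture: differences in response time can leak the same information, so this wrapper alone should not be taken as a complete defense.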
If you need to encrypt data, use a symmetric key algorithm; AES is a good choice. Use a mode such as CBC with a random IV; this ensures that encrypting the same data will produce different output.
Add PKCS#7 née PKCS#5 for padding.
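For reference, PKCS#7 padding itself is simple enough to sketch by hand. The helpers below are illustrative (the names are made up here); a real crypto library normally applies and removes this padding for you.

```python
BLOCK = 16  # AES block size in bytes


def pkcs7_pad(data: bytes, block: int = BLOCK) -> bytes:
    """Append n bytes, each of value n, to reach a multiple of the block size."""
    n = block - len(data) % block  # always 1..block, never 0
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes, block: int = BLOCK) -> bytes:
    """Check and remove the padding added by pkcs7_pad."""
    if not padded or len(padded) % block:
        raise ValueError("invalid padded length")
    n = padded[-1]
    if not 1 <= n <= block or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]


card = b"4242 4242 4242 4242"  # 19 bytes
padded = pkcs7_pad(card)       # padded up to 32 bytes
print(len(padded), pkcs7_unpad(padded) == card)
```

Input that is already a multiple of the block size still gets a full extra block of padding, which is what lets the unpadding step work unambiguously.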
If there is real value in the data, hire a cryptographic domain expert to help with the design and later validation.
Asymmetric encryption is most useful for communicating encrypted data between two parties. For example, you have a mobile application that accepts credit card numbers and needs to transmit them to the server for processing. You want the public application (which is inherently insecure) to be able to encrypt the data and only you should be able to decrypt it in your secure environment.
Storage is a completely different matter. You're not communicating anything to or from an insecure party; you are the only one dealing with the data. You don't want to give everyone a way to decrypt things if they breach your storage; you want to make things as difficult as possible. Use a symmetric algorithm for storage and include a unique Initialization Vector with each encrypted value as a hurdle to decryption if the storage is compromised.
PCI-DSS requires that you use Strong Cryptography, which they define as follows.
At the time of publication, examples of industry-tested and accepted standards and algorithms for minimum encryption strength include AES (128 bits and higher), TDES (minimum triple-length keys), RSA (2048 bits and higher), ECC (160 bits and higher), and ElGamal (2048 bits and higher). See NIST Special Publication 800-57 Part 1 (http://csrc.nist.gov/publications/) for more guidance on cryptographic key strengths and algorithms.
Beyond that, they are primarily concerned with key management, and with good reason. Breaching your storage won't help as much as actually having the means to decrypt your data, so ensure that your symmetric key is managed correctly and in accordance with their requirements.