Let's say I run a medical facility and want a website where my users/patients can look up their private records. What would be the best solution?
Your question is essentially: what are the best practices for such an architecture?
I like the article Security Best Practices to Protect Internet Facing Web Servers from the Microsoft TechNet wiki, which has gone through 11 revisions. Granted, some of it is Microsoft-platform specific, but many of the concepts apply to a platform-independent solution.
Reference: http://social.technet.microsoft.com/wiki/contents/articles/13974.security-best-practices-to-protect-internet-facing-web-servers.aspx
First you need to identify the attacks that you want to protect against, and then address each of them individually. Since you mention "most common attacks", we will start there; here is a quick list for a common three-tiered service (client, web server, data store):

1. Bad/malformed inputs
2. SQL injection
3. Cross-site scripting (XSS)
4. Guessing (brute force against passwords, keys, or hashes)
5. Data leaks
6. Social engineering
7. Man-in-the-middle
8. Cross-site request forgery (CSRF)
9. Replay
Once a leak or breach happens, a number of weaknesses can make the attackers' job easier, and these should also be addressed.
Now we look at common mitigations:
1-3 (Inputs, SQL Injection, XSS) deal largely with bad inputs. All input from a client needs to be sanitized or validated, and attack-focused testing needs to be performed to ensure the code handles hostile input correctly.
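To make 1-3 concrete, here is a minimal Python sketch (the table and function names are mine, purely illustrative) of the two standard defenses: a parameterized query, so client input can never rewrite the SQL, and HTML escaping on output, so stored input can never run as script in another user's browser.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (patient_id TEXT, note TEXT)")

def fetch_records(patient_id: str) -> list:
    # Parameterized query: the driver binds the value, so input like
    # "x' OR '1'='1" is treated as data, never as SQL (stops injection).
    cur = conn.execute(
        "SELECT note FROM records WHERE patient_id = ?", (patient_id,))
    return cur.fetchall()

def render_note(note: str) -> str:
    # Escape on output so stored input cannot execute as script in
    # another user's browser (stops stored XSS).
    return "<p>" + html.escape(note) + "</p>"
```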
4 (Guessing) Automated tools will be used to guess a user's password, or, if attackers already have the data, to brute-force the key or hash. Mitigations involve choosing the correct algorithm for the encryption or hash, increasing the key size, enforcing policies on password/key complexity, using salts, limiting the number of attempts per second, and so on.
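A hedged sketch of the salting and key-stretching part in Python (the function names and iteration count are illustrative choices; rate limiting would sit in front of this, not inside it):

```python
import hashlib
import hmac
import secrets

def hash_password(password: str) -> tuple[bytes, bytes]:
    # A unique random salt per user defeats precomputed (rainbow) tables.
    salt = secrets.token_bytes(16)
    # PBKDF2 with a high iteration count makes each guess expensive.
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(candidate, digest)
```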
5 (Leaks) If the data is encrypted at rest, and the admins/employees/janitors do not have the keys to decrypt it, then leaked data is of limited value (especially if #4 is handled correctly). You can also place limitations on who can access the data and how (the NSA just learned a valuable lesson here and is enacting policies requiring two people to be present to access private data). Proper journaling and logging of access attempts is also important.
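A minimal sketch of the journaling point, using Python's standard logging module (the function name is mine); in practice these entries should go to an append-only store that the web server itself cannot rewrite:

```python
import logging

audit = logging.getLogger("audit")
logging.basicConfig(
    filename="access.log",
    format="%(asctime)s %(message)s",
    level=logging.INFO,
)

def log_access(user_id: str, record_id: str, granted: bool) -> None:
    # Record every attempt, successful or not, so a breach can be
    # reconstructed and unusual access patterns can be flagged.
    audit.info("user=%s record=%s granted=%s", user_id, record_id, granted)
```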
6 (Social Engineering) Attackers will attempt to call your support desk, impersonate a client, and either request access to privileged information or have the support desk change information (password, personal details, etc.). They will often chain together multiple support calls until they have all the information needed to take control of an account. Support staff need to be trained and limited in what information they will give out, as well as what data they can edit.
7 (Man-in-the-middle) This is where an attacker attempts to inject himself into the flow of communication, most commonly through rootkits running on clients' machines or fake access points (Wi-Fi, for example). Wire/protocol-based encryption (such as SSL/TLS) is obviously the first level of protection, but variants (such as man-in-the-browser) won't be mitigated by it, as they see the data after the SSL packets have been decrypted. In general, clients cannot be trusted, because the platforms themselves are insecure. Encouraging users to use dedicated/isolated machines is good practice. Limit the amount of time that keys and decrypted data are stored in memory or other accessible locations.
8-9 (CSRF and Replay) Similar to man-in-the-middle, these attacks attempt to capture a user's credentials and/or transactions and reuse them. Authenticating against the client origin, limiting the window during which credentials are valid, and requiring validation of the transaction via a separate channel (email, phone, SMS, etc.) all help to reduce these attacks.
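As one illustration of limiting the validity window, here is a sketch of an HMAC-signed, session-bound, expiring token in Python (the names and the 10-minute window are my own choices, not a standard):

```python
import hashlib
import hmac
import secrets
import time

SECRET = secrets.token_bytes(32)  # per-deployment secret, kept server-side

def issue_token(session_id: str) -> str:
    # Bind the token to the session and a timestamp; the MAC stops an
    # attacker from forging either, so a token stolen from one session
    # cannot be replayed in another.
    ts = str(int(time.time()))
    mac = hmac.new(SECRET, f"{session_id}:{ts}".encode(), hashlib.sha256)
    return f"{ts}:{mac.hexdigest()}"

def check_token(session_id: str, token: str, max_age: int = 600) -> bool:
    try:
        ts, mac_hex = token.split(":")
    except ValueError:
        return False
    expected = hmac.new(SECRET, f"{session_id}:{ts}".encode(), hashlib.sha256)
    if not hmac.compare_digest(expected.hexdigest(), mac_hex):
        return False
    # A short validity window limits replay: a captured token soon expires.
    return int(time.time()) - int(ts) <= max_age
```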
Proper encryption/hashing/salting is probably the first thing that companies screw up. Assuming all your other defenses fail (and, as you said, they probably will), this is your last hope. Invest here and ensure it is done properly. Ensure that individual user records are encrypted with different keys (not one master key). Having the client do the encryption/decryption can solve a lot of security issues, as the server never knows the keys (so nobody can steal them). On the other hand, if the client loses the keys, they lose their data as well, so a trade-off has to be made there.
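A sketch of the per-user-key idea, assuming the third-party Python cryptography package (Fernet is its authenticated symmetric encryption recipe; the function names are mine):

```python
from cryptography.fernet import Fernet  # pip install cryptography

def new_user_key() -> bytes:
    # One key per user: leaking a single key exposes one user's records,
    # not the whole database, because there is no master key.
    return Fernet.generate_key()

def encrypt_record(user_key: bytes, record: bytes) -> bytes:
    return Fernet(user_key).encrypt(record)

def decrypt_record(user_key: bytes, ciphertext: bytes) -> bytes:
    # Raises InvalidToken if the ciphertext was tampered with.
    return Fernet(user_key).decrypt(ciphertext)
```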
Invest in testing/attacking your solution. The engineer that implements a website/database is often not equipped to think about all the possible attack scenarios.
While josh poley's and Bala Subramanyam's answers are good, I would add that, if security is a core value of your business, hackers and developers should be your main asset, and they should know that. Indeed, we can list the most common security practices here, but by applying our suggestions alone you won't make your system truly secure, just fun to hack.

When security matters, great talent, passion, and competence are your only protection.
This is what I'm thinking:
All records are stored on my home computer (offline), encrypted with my personal key. On this computer are the patient records plus a private and a public key for each user. This computer uploads new data, already encrypted, to the webserver.
The webserver only contains encrypted data.
I supply the private key to my users, be it via email sent from somewhere else, or even by snail mail.

The webserver decrypts data on every request. Because the user's password is their private key, decryption on the server can only happen while there is an active session.

Because asymmetric keys are in play, I can even insert new encrypted data on the webserver (user input, encrypted with the user's public key) and later fetch it to my offline computer (sketched below).

Downside: issuing a new password requires the offline computer to upload re-encrypted data and to deliver a new key somehow.

Upside: it makes the webserver's security concerns less relevant.
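For illustration, a rough Python sketch of that flow, assuming the third-party cryptography package. Note that RSA-OAEP can only encrypt short messages, so a real system would encrypt each record with a symmetric key and wrap that key with RSA (hybrid encryption):

```python
# pip install cryptography
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)

# Offline machine: generate one key pair per patient.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=3072)
public_key = private_key.public_key()

# Offline machine encrypts a record with the patient's PUBLIC key,
# then uploads only the ciphertext to the webserver.
ciphertext = public_key.encrypt(b"patient record ...", OAEP)

# Only the holder of the PRIVATE key (the patient, during a session)
# can decrypt, so a compromised webserver yields nothing but ciphertext.
plaintext = private_key.decrypt(ciphertext, OAEP)
```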
Is this the best solution?
OK, I will just try to build a little on what you already proposed. First, you might want to research the technology behind the Mega website; it presumably uses exactly what you'd be interested in. On-the-fly JS-based encryption does still have some weaknesses, however. That being said, implementing on-the-fly decryption of the records with JS and HTML would not be easy, though not impossible. So yes, I would say you are generally thinking in the right direction.
Regardless, you would have to consider all the common attack techniques and defenses (website attacks, server attacks, etc.), but this topic is far too broad to be covered fully in a single answer. And needless to say, those are already very well covered in the other answers.
As for 'architecture': if you are really paranoid, you could also put the database on a separate server, run it on an obscure port, and allow incoming connections only from the webserver.