Question
I am working on a web application that uses a MySQL database for the back end, and I need to know which approach is better for my situation before I continue any further.
Simply put, in this application users will be able to construct their own forms with any number of fields (they decide), and right now I have it all stored in a couple of tables linked by foreign keys. A friend of mine suggests that, to keep things "easy/fast", I should convert each user's form to a flat table so that querying data from them stays fast (in case of large growth).
Should I keep the database normalized, with everything pooled into relational tables with foreign keys (indexes, etc.), or should I construct a flat table for every new form that a user creates?
Obviously, some positives of creating flat tables are data separation (security) and shorter query times. But seriously, how much would I gain from this? I really don't want 10000 tables, and I don't want to be dropping, altering, and adding tables all of the time, but if it will be better then I will do it... I just need some input.
Thank you
Answer 1:
Rule of thumb: it's easier to go from normalized to denormalized than the other way around.
Start with a reasonable level of database normalization (by reasonable I mean readable, maintainable, and efficient but not prematurely optimized), then if you hit performance issues as you grow, you have the option of looking into ways in which denormalization may increase performance.
Answer 2:
Keep your data normalized. If you index properly, you will not encounter performance issues for a very long time.
Regarding security: the flat approach will require you to write lots of CREATE TABLE, DROP TABLE, ALTER TABLE, etc. statements, i.e. a lot more code and a lot more points of failure.
The only reason to have flat tables would be if your users could connect to the DB directly (and even then you could still go for row-level security). But in that case, you are really reimplementing a variant of phpMyAdmin.
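To illustrate the "index properly" point, here is a minimal sketch. It uses Python's built-in sqlite3 module in place of MySQL so that it runs anywhere, and the table and index names (form, form_value, idx_form_value_form) are hypothetical, not taken from the question. It only shows that the query plan switches from a full scan to an index lookup once the foreign-key column is indexed; MySQL's EXPLAIN output will look different.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table names are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE form (
        id    INTEGER PRIMARY KEY,
        owner TEXT NOT NULL
    );
    CREATE TABLE form_value (
        id         INTEGER PRIMARY KEY,
        form_id    INTEGER NOT NULL REFERENCES form(id),
        field_name TEXT NOT NULL,
        value      TEXT
    );
""")

QUERY = "SELECT field_name, value FROM form_value WHERE form_id = ?"


def plan(sql):
    """Return the human-readable part of SQLite's query plan for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (1,)).fetchall()
    # The last column of each plan row holds the description text.
    return " ".join(row[3] for row in rows)


plan_before = plan(QUERY)  # no usable index yet: full table scan
conn.execute(
    "CREATE INDEX idx_form_value_form ON form_value (form_id, field_name)"
)
plan_after = plan(QUERY)   # the planner can now search via the index

print(plan_before)
print(plan_after)
```

All users' forms live in the same pair of tables here; the index on form_id is what keeps one form's lookup from touching everyone else's rows.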
Answer 3:
...in this application users will be able to construct their own forms with any number of fields...
Yikes! Then how could you possibly do any sort of normalization when the users are, in essence, making the database decisions for you?
I think you either need to manage it step by step, or let your freak flag fly and just keep buying hardware to keep up with the thrashing you're going to get when the users really start to get into it. Case in point: look at what happens when users start to understand how to make new forms and views in SharePoint... CRIKEY! Talk about scope creep!
Answer 4:
Altering the schema during runtime is rarely a good idea. What you want to consider is the EAV (Entity-Attribute-Value) model.
Wikipedia has some very good info on the pros and cons, as well as implementation details. EAV is to be avoided when possible, but for situations like yours, with an unknown number of columns for each form, EAV is worth considering.
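As a rough sketch of what an EAV layout could look like for user-defined forms: the example below again uses Python's sqlite3 module rather than MySQL, and every table and column name in it (entity, attribute, eav_value) is an assumption made for illustration. An entity is one form submission, an attribute is one user-defined field, and adding a field is an ordinary INSERT rather than an ALTER TABLE.

```python
import sqlite3

# Hypothetical EAV schema; SQLite is used only so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity (            -- one row per form submission
        id      INTEGER PRIMARY KEY,
        form_id INTEGER NOT NULL
    );
    CREATE TABLE attribute (         -- one row per user-defined field
        id      INTEGER PRIMARY KEY,
        form_id INTEGER NOT NULL,
        name    TEXT NOT NULL
    );
    CREATE TABLE eav_value (         -- one row per filled-in field
        entity_id    INTEGER NOT NULL REFERENCES entity(id),
        attribute_id INTEGER NOT NULL REFERENCES attribute(id),
        value        TEXT,
        PRIMARY KEY (entity_id, attribute_id)
    );
""")

# Form 1 starts with two fields and receives one submission.
conn.executemany(
    "INSERT INTO attribute (id, form_id, name) VALUES (?, ?, ?)",
    [(1, 1, "name"), (2, 1, "email")],
)
conn.execute("INSERT INTO entity (id, form_id) VALUES (1, 1)")
conn.executemany(
    "INSERT INTO eav_value (entity_id, attribute_id, value) VALUES (?, ?, ?)",
    [(1, 1, "Ada"), (1, 2, "ada@example.com")],
)

# The user adds a field at runtime: a plain INSERT, no schema change.
conn.execute("INSERT INTO attribute (id, form_id, name) VALUES (3, 1, 'phone')")


def load_submission(entity_id):
    """Reassemble one submission into a {field name: value} dict."""
    rows = conn.execute(
        """
        SELECT a.name, v.value
        FROM eav_value AS v
        JOIN attribute AS a ON a.id = v.attribute_id
        WHERE v.entity_id = ?
        ORDER BY a.id
        """,
        (entity_id,),
    ).fetchall()
    return dict(rows)


submission = load_submission(1)
print(submission)
```

The cost shows up in load_submission: every read has to join and pivot rows back into a record, which is part of why EAV is usually treated as a last resort.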
Answer 5:
Keep your data normalized. The system should stay fast provided you have proper indexing.
If you really want to go fast, then switch to one of the key-value databases such as bigDB or CouchDB. Those are totally denormalized and very, very fast.
Answer 6:
The way I would handle this is to use a normalized, extensible "Property" table, such as below:
Table: FormProperty
id: pk
form_id: fk(Form)
key: varchar(128)
value: varchar(2048)
The above is just an example, but I've used this pattern in many cases, and it tends to work out pretty well. The only real "gotcha" is that you need to serialize the value as a string/varchar and then deserialize it to whatever it needs to be, so there is a little added responsibility on the client.
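The serialize/deserialize step mentioned above could be handled along these lines. This is only a sketch under assumptions: it uses Python's sqlite3 module instead of MySQL, picks JSON as the string encoding (the answer does not name one), and the helper names set_property and get_property are invented for the example. Note that KEY is a reserved word in MySQL, so a column with that name would need backtick quoting there; the sketch uses standard double quotes, which SQLite accepts.

```python
import json
import sqlite3

# SQLite stands in for MySQL; the column layout mirrors the FormProperty table above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE FormProperty (
        id      INTEGER PRIMARY KEY,
        form_id INTEGER NOT NULL,
        "key"   VARCHAR(128) NOT NULL,
        value   VARCHAR(2048),
        UNIQUE (form_id, "key")
    )
""")


def set_property(form_id, key, value):
    """Store value as a JSON string, replacing any existing row for this key."""
    conn.execute(
        'INSERT OR REPLACE INTO FormProperty (form_id, "key", value) '
        "VALUES (?, ?, ?)",
        (form_id, key, json.dumps(value)),
    )


def get_property(form_id, key):
    """Return the decoded value, or None if the property is not set."""
    row = conn.execute(
        'SELECT value FROM FormProperty WHERE form_id = ? AND "key" = ?',
        (form_id, key),
    ).fetchone()
    return None if row is None else json.loads(row[0])


set_property(1, "max_length", 40)
set_property(1, "choices", ["red", "green"])
set_property(1, "max_length", 80)  # overwrites the earlier value

print(get_property(1, "max_length"))
print(get_property(1, "choices"))
```

Because everything is stored as text, the database cannot type-check or range-query the values natively; that responsibility moves to the client, as the answer notes.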
Answer 7:
Normalized == fast searches, easier-to-maintain indexes, slower insert transactions (on multiple rows)
Denormalized == fast inserts; usually this is used when there are a lot of inserts (e.g. data warehouses that collect and record chronological data)
Source: https://stackoverflow.com/questions/4328022/should-i-use-flat-tables-or-a-normalized-database