Question
I am using Ruby on Rails 3.0.7 and MySQL 5. My application has two database tables, say TABLE1 and TABLE2. For performance reasons I have denormalized some data, so TABLE2 holds copies of values from TABLE1. I now need to update some of those values in TABLE1 and, of course, the denormalized copies in TABLE2 must be updated as well.
How can I update those values efficiently? That is, if TABLE2 contains a lot of rows (1,000,000 or more), what is the best way to keep both tables up to date (techniques, practices, ...)?
What can happen while the tables are being updated? For example, could a user run into problems when accessing pages that use those denormalized values? If so, what are those problems and how can I handle them?
Answer 1:
There are a few ways to handle this situation:
- You can use a database trigger. This is not a database-agnostic option, and Rails support for it is non-existent as far as I know. If your situation requires absolutely no data inconsistency, this would probably be the most performant way to achieve your goal, but I'm not a DB expert.
- You can use a batch operation to sync the two tables periodically. This method allows your two tables to drift apart and then re-synchronizes the data every so often. If your situation allows this drift to occur, this can be a good option, as it allows the DB to be updated during off hours. If you need to do the sync every 5 minutes, you will probably want to look into other options. This can be handled by your Ruby code, but will require a background job runner of some sort (cron, delayed_job, a Redis-backed queue such as Resque, etc.).
- You can use a callback from inside your Rails model, for example "after_update :sync_denormalized_data". This callback will be wrapped in a database-level transaction (assuming your database supports transactions; with MySQL that means InnoDB tables rather than MyISAM). You get Rails-level code, consistent data, and no need for a background process, at the expense of making two writes every time.
- Some mechanism I haven't thought of....
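The callback option could look roughly like the sketch below. The column names (name, email, table1_name, table1_email, table1_id), the DENORMALIZED_COLUMNS mapping, and the denormalized_updates helper are all made up for illustration; only after_update, changes, and update_all are actual ActiveRecord features. The helper is plain Ruby so it can be tried without Rails, and the assumed model wiring is shown in comments.

```ruby
# Hypothetical mapping: TABLE1 column => TABLE2 column holding its copy.
DENORMALIZED_COLUMNS = {
  "name"  => "table1_name",
  "email" => "table1_email"
}

# Takes a hash shaped like ActiveRecord's `changes`
# ({ "attr" => [old_value, new_value] }) and returns the attributes
# that would have to be written to TABLE2. Columns that are not
# denormalized are ignored.
def denormalized_updates(changes, mapping = DENORMALIZED_COLUMNS)
  updates = {}
  changes.each do |column, (_old_value, new_value)|
    target = mapping[column]
    updates[target] = new_value if target
  end
  updates
end

# Assumed wiring inside the model (not runnable without Rails):
#
#   class Table1 < ActiveRecord::Base
#     after_update :sync_denormalized_data
#
#     private
#
#     def sync_denormalized_data
#       updates = denormalized_updates(changes)
#       return if updates.empty?
#       # One UPDATE statement for all matching TABLE2 rows.
#       Table2.where(:table1_id => id).update_all(updates)
#     end
#   end
```

update_all issues a single SQL UPDATE and skips validations and callbacks on the TABLE2 rows, which keeps the second write cheap even when many rows reference the same TABLE1 record.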
These types of issues are very application specific. Even within the same application you may use more than one of the methods depending on the flexibility and performance requirements involved.
Answer 2:
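Alternatively, you can maintain a normalized set of data alongside your two denormalized tables and sync them periodically. Another way is to keep a normalized table structure for maintaining the data (insert/update/delete) and create a materialized view for reporting, which is what you are trying to achieve with the denormalized copy. You can set the refresh parameters of the materialized view according to your requirements. Note that MySQL has no native materialized views, so there the view would have to be emulated with an ordinary table that is refreshed periodically.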
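With a million or more rows, a periodic sync is often run in chunks so that no single statement holds locks for long. The following is a minimal sketch of that idea and is not taken from the answers: the table and column names (table1, table2, table1_id, name, table1_name), the batch size, and both helper methods are assumptions.

```ruby
# Splits the inclusive id range first_id..last_id into consecutive
# [from, to] pairs covering at most batch_size ids each.
def id_batches(first_id, last_id, batch_size)
  raise ArgumentError, "batch_size must be positive" unless batch_size > 0
  batches = []
  from = first_id
  while from <= last_id
    to = [from + batch_size - 1, last_id].min
    batches << [from, to]
    from = to + 1
  end
  batches
end

# Builds one MySQL multi-table UPDATE per batch. Only rows whose copy
# differs from the source value are rewritten; <=> is MySQL's
# NULL-safe equality operator.
def sync_statements(first_id, last_id, batch_size)
  id_batches(first_id, last_id, batch_size).map do |from, to|
    "UPDATE table2 " \
    "JOIN table1 ON table1.id = table2.table1_id " \
    "SET table2.table1_name = table1.name " \
    "WHERE table2.id BETWEEN #{from.to_i} AND #{to.to_i} " \
    "AND NOT (table2.table1_name <=> table1.name)"
  end
end
```

Each statement could then be run from a cron or delayed_job task, for example with ActiveRecord::Base.connection.execute(sql). Readers may briefly see old values in batches that have not been processed yet, which is the drift the periodic approach accepts.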
Source: https://stackoverflow.com/questions/6517609/updating-denormalized-database-tables