问题
I'm new to MySQL, and just learned about the importance of data normalization. My database has a simple structure:
I have 1 table called users
with fields:
userName (string)
userEmail (string)
password (string)
requests (an array of dictionaries in JSON string format)
data (another array of dictionaries in JSON string format)
deviceID (string)
Right now, this is my structure. Being very new to MySQL, I'm really not seeing why my above structure is a bad idea? Why would I need to normalize this and make separate tables? That's the first question-why? (Some have also said not to put JSON in my table. Why or why not?)
The second question is how? With the above structure, how many tables should I have, and what would be in each table?
Edit:
So maybe normalization is not absolutely necessary here, but maybe there's a better way to implement my data field? The data
field is an array of dictionaries: each dictionary is just a note item with a few keys (title, author, date, body). So what I do now is, which I think might be inefficient, every time a user composes a new note, I send that note from my app to PHP to handle. I get the JSON array of dictionaries already part of that user's data, I convert it to a PHP array, I then add to the end of this array the new note, convert the whole thing back to JSON, and put it back in the table as an array of dictionaries. And this process is repeated every time a new note is composed. Is there a better way to do this? Maybe a user's data should be a table, with each row being a note-but I'm not really sure how this would work?
回答1:
The answer to all your questions really depends on what the JSON data is for, and whether you'll ever need to use some property of that data to determine which rows are returned.
If your data truly has no schema, and you're really just using it to store data that will be used by an application that knows how to retrieve the correct row by some other criteria (such as one of the other fields) every time, there's no reason to store it as anything other than exactly as that application expects it (in this case, JSON).
If the JSON data DOES contain some structure that is the same for all entries, and if it's useful to query this data directly from the database, you would want to create one or more tables (or maybe just some more fields) to hold this data.
As a practical example of this, if the data fields contains JSON enumerating services for that user in an array, and each service has a unique id, type, and price, you might want a separate table with the following fields (using your own naming conventions):
serviceId (integer)
userName (string)
serviceType (string)
servicePrice (float)
And each service for that user would get it's own entry. You could then query for users than have a particular service, which depending on your needs, could be very useful. In addition to easy querying, indexing certain fields of the separate tables can also make for very QUICK queries.
Update: Based on your explanation of the data stored, and the way you use it, you probably do want it normalized. Something like the following:
# user table
userId (integer, auto-incrementing)
userName (string)
userEmail (string)
password (string)
deviceID (string)
# note table
noteId (integer, auto-incrementing)
userId (integer, matches user.userId)
noteTime (datetime)
noteData (string, possibly split into separate fields depending on content, such as subject, etC)
# request table
requestId (integer, auto-incrementing)
userId (integer, matches user.userId)
requestTime (datetime)
requestData (string, again split as needed)
You could then query like so:
# Get a user
SELECT * FROM user WHERE userId = '123';
SELECT * FROM user WHERE userNAme = 'foo';
# Get all requests for a user
SELECT * FROM request WHERE userId = '123';
# Get a single request
SELECT * FROM request WHERE requestId = '325325';
# Get all notes for a user
SELECT * FROM note WHERE userId = '123';
# Get all notes from last week
SELECT * FROM note WHERE userId = '123' AND noteTime > CURDATE() - INTERVAL 1 WEEK;
# Add a note to user 123
INSERT INTO note (noteId, userId, noteData) VALUES (null, 123, 'This is a note');
Notice how much more you can do with normalized data, and how easy it is? It's trivial to locate, update, append, or delete any specific component.
回答2:
Normalization is a philosophy. Some people think it fits their database approach, some don't. Many modern database solutions even focus on denormalization to improve speeds.
Normalization often doesn't improve speed. However, it greatly improves the simplicity of accessing and writing data. For example, if you wanted to add a request, you would have to write a completely new JSON field. If it was normalized, you could simply add a row to a table.
In normalization, "array of dictionaries in JSON string format" is always bad. Array of dictionaries can be translated as list of rows, which is a table.
If you're new to databases: NORMALIZE. Denormalization is something for professionals.
回答3:
A main benefit of normalization is to eliminate redundant data, but since each user's data is unique to that user, there is no benefit to splitting this table and normalizing. Furthermore, since the front-end will employ the dictionaries as JSON objects anyway, undue complication and a decrease in performance would result from trying to decompose this data.
Okay, here is a normalized mySQL data-model. Note: you can separate authors and titles into two tables to further reduce data redundancy. You can probably use similar techniques for the "requests dictionaries":
CREATE TABLE USERS(
UID int NOT NULL AUTO_INCREMENT PRIMARY KEY,
userName varchar(255) UNIQUE,
password varchar(30),
userEmail varchar(255) UNIQUE,
deviceID varchar(255)
) ENGINE=InnoDB;
CREATE TABLE BOOKS(
BKID int NOT NULL AUTO_INCREMENT PRIMARY KEY,
FKUSERS int,
Title varchar(255),
Author varchar(50)
) ENGINE=InnoDB;
ALTER TABLE BOOKS
ADD FOREIGN KEY (FKUSERS)
REFERENCES USERS(UID);
CREATE TABLE NOTES(
ID int NOT NULL AUTO_INCREMENT PRIMARY KEY,
FKUSERS int,
FKBOOKS int,
Date date,
Notes text
) ENGINE=InnoDB;
ALTER TABLE NOTES
ADD FOREIGN KEY BKNO (FKUSERS)
REFERENCES USERS(UID);
ALTER TABLE NOTES
ADD FOREIGN KEY (FKBOOKS)
REFERENCES BOOKS(BKID);
回答4:
In your case, I will abstract out the class that handles this table. Then keep the data normalized. if in future, the data access patterns changes and i need to normalized the data, i css just do so with less impact on the program. I just need to change the class that handles this set of data to query the normalized tables , but return the data as if the database structure never changed.
来源:https://stackoverflow.com/questions/9321399/normalizing-mysql-data