问题
I am writing a addressbook module for my software right now. I have the database set up so far that it supports a very flexible address-book configuration.
I can create n-entries for every type I want. Type means here data like 'email', 'address', 'telephone' etc.
I have a table named 'contact_profiles'.
This only has two columns:
id Primary key
date_created DATETIME
And then there is a table called contact_attributes. This one is a little more complex:
id PK
#profile (Foreign key to contact_profiles.id)
type VARCHAR describing the type of the entry (name, email, phone, fax, website, ...) I should probably change this to a SET later.
value Text (containing the value for the attribute).
I can now link to these profiles, for example from my user's table. But from here I run into problems.
At the moment I would have to create a JOIN for each value that I want to retrieve. Is there a possibility to somehow create a View, that gives me a result with the type's as columns?
So right now I would get something like
#profile type value
1 email name@domain.tld
1 name Sebastian Hoitz
1 website domain.tld
But it would be nice to get a result like this:
#profile email name website
1 name@domain.tld Sebastian Hoitz domain.tld
The reason I do not want to create the table layout like this initially is, that there might always be things to add and I want to be able to have multiple attributes of the same type.
So do you know if there is any possibility to convert this dynamically?
If you need a better description please let me know.
回答1:
You have reinvented a database design called Entity-Attribute-Value. This design has a lot of weaknesses, including the weakness you've discovered: it's very hard to reproduce a query result in a conventional format, with one column per attribute.
Here's an example of what you must do:
SELECT c.id, c.date_created,
c1.value AS name,
c2.value AS email,
c3.value AS phone,
c4.value AS fax,
c5.value AS website
FROM contact_profiles c
LEFT OUTER JOIN contact_attributes c1
ON (c.id = c1.profile AND c1.type = 'name')
LEFT OUTER JOIN contact_attributes c1
ON (c.id = c1.profile AND c1.type = 'email')
LEFT OUTER JOIN contact_attributes c1
ON (c.id = c1.profile AND c1.type = 'phone')
LEFT OUTER JOIN contact_attributes c1
ON (c.id = c1.profile AND c1.type = 'fax')
LEFT OUTER JOIN contact_attributes c1
ON (c.id = c1.profile AND c1.type = 'website');
You must add another LEFT OUTER JOIN
for every attribute. You must know the attributes at the time you write the query. You must use LEFT OUTER JOIN
and not INNER JOIN
because there's no way to make an attribute mandatory (the equivalent of simply declaring a column NOT NULL
).
It's far more efficient to retrieve the attributes as they are stored, and then write application code to loop through the result set, building an object or associative array with an entry for each attribute. You don't need to know all the attributes this way, and you don't have to execute an n
-way join.
SELECT * FROM contact_profiles c
LEFT OUTER JOIN contact_attributes ca ON (c.id = ca.profile);
You asked in a comment what to do if you need this level of flexibility, if not use the EAV design? SQL is not the correct solution if you truly need unlimited metadata flexibility. Here are some alternatives:
- Store a
TEXT
BLOB, containing all the attributes structured in XML or YAML format. - Use a semantic data modeling solution like Sesame, in which any entity can have dynamic attributes.
- Abandon databases and use flat files.
EAV and any of these alternative solutions is a lot of work. You should consider very carefully if you truly need this degree of flexibility in your data model, because it's hugely more simple if you can treat the metadata structure as relatively unchanging.
回答2:
If you are limiting yourself to displaying a single email, name, website, etc. for each person in this query, I'd use subqueries:
SELECT cp.ID profile
,cp.Name
,(SELECT value FROM contact_attributes WHERE type = 'email' and profile = cp.id) email
,(SELECT value FROM contact_attributes WHERE type = 'website' and profile = cp.id) website
,(SELECT value FROM contact_attributes WHERE type = 'phone' and profile = cp.id) phone
FROM contact_profiles cp
If you're using SQL Server, you could also look at PIVOT.
If you want to show multiple emails, phones, etc., then consider that each profile must have the same number of them or you'll have blanks.
I'd also factor out the type column. Create a table called contact_attribute_types
which would hold "email", "website", etc. Then you'd store the contact_attribute_types.id
integer value in the contact_attributes
table.
回答3:
You will need to generate a query like:
select #profile,
max(case when type='email' then value end) as email,
max(case when type='name' then value end) as name,
max(case when type='website' then value end) as website
from mytable
group by #profile
However, that will only show one value for each type per #profile. Your DBMS may have a function you can use instead of MAX to concatenate all the values as a comma-separated string, or you may be able to write one.
This kind of data model is generally best avoided for the reasons you have already mentioned!
回答4:
You create a view for each contact type
When you want all the information you pull from the entire table, when you want a subset of a specific contact type, you pull from the view.
I'd create a stored procedure that takes the intent {all, phone, email, address} as one of the parameters and then derive the data. All my app code would call this stored procedure to get the data. Also, when a new type is added (which should be very infrequently, you create another view and modify only this sproc).
I've implemented a similar design for multiple small/med size systems and have had no issues.
Am I missing something? This seems trivial?
EDIT:
I see what I was missing... You are trying to be normalized and denormalized at the same time. I'm not sure what the rest of your business rules are for pulling records. You could have profiles with multiple or null values for phone/email/addresses etc. I would keep your data format the same and again use a sproc to create the specific view you want. As your business needs change, you leave your data alone and just create another sproc to access it.
回答5:
There is no one right answer for this question, as one would need to know, for your specific organization or application, how many of those contact methods the business wants to collect, how current they want the information to be, and how much flexibility they are willing to invest in.
Of course, many of here could make some good guesses as to what the average business would want to do, but the real answer is to find out what your project, what your users, are interested in.
BTW, all architecture questions about "best"-ness require this sort of cost, benefit, and risk analysis.
回答6:
Now that the approach of document-oriented databases is getting more and more popular, one could use one of them to store all this information in one entry - and therefor deleting all those extra joins and queries.
来源:https://stackoverflow.com/questions/369213/sql-database-problems-with-addressbook-table-design