Approach to generic database design

泄露秘密 submitted on 2019-12-02 03:22:22

Question


An application I have encountered at a customer's site looks like this:

  • It allows end users to enter "materials".
  • To those materials, they can append any number of "properties".
  • A property value can be of any of these types: decimal, int, dateTime, or varchar (with lengths varying from 5 characters to large chunks of text).

Essentially, the Schema looks like this:

Materials
MaterialID int not null PK
MaterialName varchar(100) not null

Properties
PropertyID
PropertyName varchar(100)

MaterialsProperties
MaterialID
PropertyID
PropertyValue varchar(3000)

An essential feature of the application is the search functionality: end users can search materials by entering queries like:

  • [property] inspectionDate > [DateTimeValue]
  • [property] serialNr = 35465488

Guess how this performs over the MaterialsProperties-table with nearly 2 million records in it.
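To make the problem concrete, here is a minimal sketch of what the two example searches amount to against this schema. It uses Python's built-in sqlite3 module rather than SQL Server so that it is self-contained; the sample rows ("pump", "valve") and the function names are hypothetical. Because every value lives in one varchar column, the date search has to convert PropertyValue on every candidate row, and a conversion applied to the column generally prevents an ordinary index on PropertyValue from helping with the comparison.

```python
import sqlite3

# In-memory stand-in for the schema in the question (SQLite, not SQL Server).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Materials (
    MaterialID INTEGER PRIMARY KEY,
    MaterialName VARCHAR(100) NOT NULL);
CREATE TABLE Properties (
    PropertyID INTEGER PRIMARY KEY,
    PropertyName VARCHAR(100));
CREATE TABLE MaterialsProperties (
    MaterialID INTEGER,
    PropertyID INTEGER,
    PropertyValue VARCHAR(3000));
""")

# Hypothetical sample data: every value is stored as text.
conn.executemany("INSERT INTO Materials VALUES (?, ?)",
                 [(1, "pump"), (2, "valve")])
conn.executemany("INSERT INTO Properties VALUES (?, ?)",
                 [(1, "serialNr"), (2, "inspectionDate")])
conn.executemany("INSERT INTO MaterialsProperties VALUES (?, ?, ?)", [
    (1, 1, "35465488"), (1, 2, "2009-03-01"),
    (2, 1, "9"),        (2, 2, "2008-01-15"),
])

BASE_QUERY = """
SELECT m.MaterialName
FROM Materials m
JOIN MaterialsProperties mp ON mp.MaterialID = m.MaterialID
JOIN Properties p ON p.PropertyID = mp.PropertyID
WHERE p.PropertyName = ? AND {condition}
ORDER BY m.MaterialName
"""

def find_by_serial(conn, serial):
    # Equality search: the number is compared as text.
    sql = BASE_QUERY.format(condition="mp.PropertyValue = ?")
    return [r[0] for r in conn.execute(sql, ("serialNr", str(serial)))]

def find_inspected_after(conn, iso_date):
    # Range search: the text value is converted to a date row by row.
    sql = BASE_QUERY.format(condition="date(mp.PropertyValue) > date(?)")
    return [r[0] for r in conn.execute(sql, ("inspectionDate", iso_date))]
```

Calling find_by_serial(conn, 35465488) or find_inspected_after(conn, "2008-06-01") runs the same three-table join in both cases; only the comparison on PropertyValue changes.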

The database was initially created under SQL Server 2000 and later migrated to SQL Server 2005.

How can this be done better?


Answer 1:


  1. Since users can enter their own property names, I guess every query is going to involve a scan of the Properties table (in your example, I need to find the PropertyID of [inspectionDate]). If the Properties table is large, your join will also take a long time. You could try to optimize by denormalizing: store the property name alongside the PropertyID as a denormalized column in the MaterialsProperties table.
  2. You could try adding a property type (int, char, etc.) to the MaterialsProperties table and partitioning the table on the type.
  3. Look at Object Relational Mapping/Entity Attribute Value Model techniques for query optimization.
  4. Since you already have a lot of data (2 million records), do some data mining and see if there are repeating groups of properties for many materials. You can then put those in one schema and keep the rest in the EAV table. Look here for details: http://portal.acm.org/citation.cfm?id=509015&dl=GUIDE&coll=GUIDE&CFID=49465839&CFTOKEN=33971901
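As a rough illustration of the data-mining step in point 4, the following sketch groups materials by the exact set of properties they carry and reports the sets shared by several materials. It is only one simple way to look for repeating groups, not the method from the linked paper; the function name, the min_materials threshold, and the input shape (pairs of material ID and property name, as a join of MaterialsProperties and Properties might return) are assumptions.

```python
from collections import Counter, defaultdict

def common_property_groups(rows, min_materials=2):
    """Return property sets shared by at least `min_materials` materials.

    rows: iterable of (material_id, property_name) pairs (hypothetical
    shape, e.g. the result of joining MaterialsProperties to Properties).
    """
    # Collect the set of property names attached to each material.
    props_by_material = defaultdict(set)
    for material_id, property_name in rows:
        props_by_material[material_id].add(property_name)

    # Count how many materials share exactly the same property set.
    group_counts = Counter(frozenset(p) for p in props_by_material.values())

    # Most frequent groups first; each group is a sorted list of names.
    return [(sorted(group), count)
            for group, count in group_counts.most_common()
            if count >= min_materials]
```

A property set that turns up for a large share of the materials is a candidate for its own conventional table with typed columns, leaving the rarely used properties in the EAV table.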



Answer 2:


You could consider separating your MaterialsProperties table by type, e.g. into IntMaterialProperties, CharMaterialProperties, etc. This would:

  • Partition your data.
  • Allow for potentially faster look-ups on integer (or other numeric) values.
  • Potentially reduce storage costs.

You could also introduce a Type column to Properties, which you could use to determine which MaterialProperties table to query. The column could also be used to validate that the user's input is of the correct type, eliminating the need to run a query at all for "bad" input.
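A minimal sketch of this idea, again using Python's sqlite3 module as a stand-in for SQL Server: the PropertyType column, the index, the sample rows, and the function names are assumptions for illustration, while the IntMaterialProperties and CharMaterialProperties table names come from the suggestion above. The search first looks up the property's type, rejects input that does not convert to that type, and only then queries the matching table.

```python
import sqlite3

# Maps the (assumed) Properties.PropertyType value to a table and a converter.
TABLE_FOR_TYPE = {
    "int": ("IntMaterialProperties", int),
    "char": ("CharMaterialProperties", str),
}
ALLOWED_OPERATORS = {"=", "<", ">", "<=", ">="}

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE Materials (
        MaterialID INTEGER PRIMARY KEY, MaterialName VARCHAR(100) NOT NULL);
    CREATE TABLE Properties (
        PropertyID INTEGER PRIMARY KEY, PropertyName VARCHAR(100) UNIQUE,
        PropertyType VARCHAR(10) NOT NULL);
    CREATE TABLE IntMaterialProperties (
        MaterialID INTEGER, PropertyID INTEGER, PropertyValue INTEGER);
    CREATE TABLE CharMaterialProperties (
        MaterialID INTEGER, PropertyID INTEGER, PropertyValue VARCHAR(3000));
    CREATE INDEX IX_IntProps
        ON IntMaterialProperties (PropertyID, PropertyValue);
    """)
    # Hypothetical sample data.
    conn.executemany("INSERT INTO Materials VALUES (?, ?)",
                     [(1, "pump"), (2, "valve")])
    conn.executemany("INSERT INTO Properties VALUES (?, ?, ?)",
                     [(1, "serialNr", "int"), (2, "remark", "char")])
    conn.executemany("INSERT INTO IntMaterialProperties VALUES (?, ?, ?)",
                     [(1, 1, 35465488), (2, 1, 9)])
    conn.execute("INSERT INTO CharMaterialProperties VALUES (1, 2, 'ok')")
    return conn

def search(conn, property_name, operator, raw_value):
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    row = conn.execute(
        "SELECT PropertyID, PropertyType FROM Properties WHERE PropertyName = ?",
        (property_name,)).fetchone()
    if row is None:
        return []  # unknown property: nothing to search
    property_id, property_type = row
    table, convert = TABLE_FOR_TYPE[property_type]
    try:
        value = convert(raw_value)
    except ValueError:
        return []  # input has the wrong type: skip the query entirely
    # Table name and operator come from fixed whitelists, values are bound.
    sql = (f"SELECT m.MaterialName FROM Materials m "
           f"JOIN {table} t ON t.MaterialID = m.MaterialID "
           f"WHERE t.PropertyID = ? AND t.PropertyValue {operator} ? "
           f"ORDER BY m.MaterialName")
    return [r[0] for r in conn.execute(sql, (property_id, value))]
```

With the sample data, search(make_db(), "serialNr", ">", "10") compares integers against integers in IntMaterialProperties, and a call such as search(make_db(), "serialNr", "=", "abc") returns an empty list without touching either value table.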



Source: https://stackoverflow.com/questions/1310630/approach-to-generic-database-design
