i am doing a database design using EAV. I am facing an issue when i try to model an entity with attribute having multiple values?
For example
Entity
<
Yes! it is quite possible to have multi-value attributes attributes in EAV.
In fact it is easier than with the traditional relational model where one would need to either create an extra column, or store the multiple values with a delimited format of sort; both such approach suffer added complexity when querying the database for a given value of the underlying field (attribute).
The simplest way to have multiple values is simply to have an extra value record! (as shown in the question)
In addition, the EAV store structure can be changed to explicitly accommodate multi-values, with:
As said, these changes to the physical schema of the EAV store are not necessary, but they can be used to ensure that the data conforms to the (logical) schema as well as to maybe display the several values of a multivalue attributes in a particular order etc.
Edit: (details on implementing multi-value and/or composite ("object-like") attributes)
If you are absolutely positive that the multiple ["sub"-]values which constitute an attribute (or similarly the multiple parts which constitute an "object type" attribute), are completely atomic, i.e. will never be searched or displayed (or ...) individually, you can store such an attribute' value "sets", as a single record in the value table, by encoding the multiple values into a single string; For this purpose, JSON or XML-at-large comes to mind and seems particularly interesting for very expandable/generic, but any other format which you can parse in and out in a reliable fashion would also work (say delimited format).
A more "natural" way (EAV-wise) to store such "attribute value parts", is by storing them individually (in multiple records of the value table, possibly with a sequence field as hinted earlier). This approach allows handling the "sub parts", in some contexts, as if they were attributes.
In both cases, you need to alter the attribute table to add the necessary properties and types codes, to describe such multi-part attributes. Similarly to the approach of storing the data (in the value table), you can either make the attribute record such that all the info for a given [multi-part] attribute is stored in a single attribute record, or, [and this is typically easier and more flexible] you can create one attribute per part, plus one attribute to "tie them together" (for example with a property which contains a string delimted with each of the attribute ID values of the subparts.
For example:
A composite attribute for metallic pipe items, could be the diameter, made of two parts: a numeric value and a unit code (Millimeter, vs. Inches).
With the first approach:
- there would be one record in the attribute table, with a type indicating that this is a multi-value, and an extension property to contain the [ordered] list of types of the individual sub parts.
- there would be a single record in the value table, containing a coded value such as as say "0.75|Inch" (or <diam>0.75</diam><unit>Inch</unit>
).
With the second approach:
- there would be 3 records in the attribute table: a record or type numeric and named say "diamvalue", a record of type string, named "unit" and a record of type composite name "Diameter"; this last record would somehow have a reference to the ID of the two other attributes (a simple comma delimited string comes to mind)
- there would be two records in the value table, one each of the diamvalue and the unit attributes (such records would have an additional field, called say "parent" containing the AttributeID of the "Diameter" attribute. Optionally there could be also an value record for the "Diameter" attribute [I personally find this redundant with the "parent" property.
As indicated earlier, the main advantage of the second solution is that [when appropriate] one can query the catalog for a particular set of items based on the value of an attribute part, for example searching for all pipes which have a metric unit. Such queries are resolved at the level of SQL, whereby with the first approach, SQL would have to scan all attribute values for the attribute "Diameter" and parse the value to search for the unit code.
A picture's worth a thousand words ;-)
This diagram shows a possible layout with sample data for the "second approach".
Entity
id | name | description
-- | ---- | ------------
1 | configuration1 | configuration1
Attribute
id | name | type | Required | Repeats | SubAttribIdList
-- | ---- | ---- | -------- | ------- | ---------------
1 | att1 | string | N | N | null (only applicable to composite types)
2 | att2 | int | Y | N | null
3 | att3 | string | Y | Y | null
4 | DiamValue | numeric | Y | N | null
5 | Unit | string | Y | N | null
6 | Diameter | composite| N | N | 4,5
Value
id | entityId| attributeId | ParentAttribId |SeqNr | value
-- | --------| ----------- | -------------- |----- | -----
1 | 1 | 1 | null | 1 | a
2 | 1 | 2 | null | 1 | 1
3 | 1 | 3 | null | 1 | b (this value and next show show a repeating attribute)
4 | 1 | 3 | null | 2 | c
5 | 1 | 3 | null | 3 | d
6 | 1 | 4 | 6 | 1 | 0.75 (this value and next one shows a composite attribute
7 | 1 | 5 | 6 | 1 | Inches
A few notes:
- The SeqNr for the values ids 6 and 7 is 1, for both. Their order is implicit to the SubAttribIdList. If Attribute id 6 had been made a multi-value ("Repeats") attribute, an entity could have additional couplets of two values, sequenced, in pair, 2, 3 etc.
- The sequence Number for non repeatable Attributes is set to 1, systematically, could as well be NULL this this doesn't apply.
- The "Required" property of attribute doesn't figure in the multi-value or composite question; I just added it as it is commonly used to help the application (or the entity access layer) enforce various integrity rules.
- A few of the design choices in this layout imply a maximum of 1 level of inclusion for composite attributes (a composite cannot be a included in a composite), as well as prevent a composite to include a multi-value attribute. These limitations can be avoided with the proper structure (and with a bit of added complexity in the access layer), but the simpler schema is typically acceptable (attributes that would require such fancy structure are often a indicative of a flaw in the logical schema).