I want to extract results from EAV (entity-attribute-value) tables, or more specifically entity-metadata tables (think like wordpress wp_posts
and wp_postmeta
) as a "nicely formatted relational table", in order to do some sorting and/or filtering.
I've found some examples of how to format the results within the query (as opposed to writing 2 queries and joining the results in code), but I would like to know the "most efficient" method for doing so, especially for larger result sets.
And when I say "most efficient", I mean for something like the following scenarios:
Get all Entities with last name like XYZ
Return a list of Entities sorted by birthday
e.g. turn this:
** ENTITY ** ----------------------- ID | NAME | whatever ----------------------- 1 | bob | etc 2 | jane | etc 3 | tom | etc ** META ** ------------------------------------ ID | EntityID | KEY | VALUE ------------------------------------ 1 | 1 | first name | Bob 2 | 1 | last name | Bobson 3 | 1 | birthday | 1983-10-10 . | 2 | first name | Jane . | 2 | last name | Janesdotter . | 2 | birthday | 1983-08-10 . | 3 | first name | Tom . | 3 | last name | Tomson . | 3 | birthday | 1980-08-10
into this:
** RESULTS ** ----------------------------------------------- EID | NAME | first name | last name | birthday ----------------------------------------------- 1 | bob | Bob | Bobson | 1983-10-10 2 | jane | Jane | Janesdotter | 1983-08-10 3 | tom | Tom | Tomson | 1980-08-10
so I can sort or filter by any of the meta fields.
I found some suggestions here, but I can't find any discussion of which performs better.
Options:
- GROUP_CONCAT:
SELECT e.*, GROUP_CONCAT( CONCAT_WS('||', m.KEY, m.VALUE) ORDER BY m.KEY SEPARATOR ';;' ) FROM `ENTITY` e JOIN `META` m ON e.ID = m.EntityID
- Multi-Join:
SELECT e.*, m1.VALUE as 'first name', m2.VALUE as 'last name', m3.VALUE as 'birthday' FROM `ENTITY` e LEFT JOIN `META` m1 ON e.ID = m1.EntityID AND m1.meta_key = 'first name' LEFT JOIN `META` m2 ON e.ID = m2.EntityID AND m2.meta_key = 'last name' LEFT JOIN `META` m3 ON e.ID = m3.EntityID AND m3.meta_key = 'birthday'
- Coalescing:
SELECT e.* , MAX( IF(m.KEY= 'first name', m.VALUE, NULL) ) as 'first name' , MAX( IF(m.KEY= 'last name', m.VALUE, NULL) ) as 'last name' , MAX( IF(m.KEY= 'birthday', m.VALUE, NULL) ) as 'birthday' FROM `ENTITY` e JOIN `META` m ON e.ID = m.EntityID
- Code:
SELECT e.* FROM `ENTITY` e WHERE e.ID = {whatever};
in PHP, create a placeholder object from resultSELECT m.* FROM `META` m WHERE m.EntityID = {whatever};
in PHP, loop through results and attach to entity object like:$e->{$result->key} = $result->VALUE
Which is better in general, and for filtering/sorting?
Related questions:
Anything using pivot or aggregates will probably be faster, as they don't require the table to be self-joined. The join based approaches will require the optimiser to perform several sub-query operations and then join the results together. For a small data set this might not matter so much, but this could significantly degrade performance if you're doing an analytic query on a larger data set,
The best way to find out would be to test, off course. The answer may be different depending on the size of the dataset, the number of different meta-keys, their distribution (do all entities have values for all meta-keys? or only for a few of them?), the settings of your database server and possibly many other factors.
If I were to guess, I'd say that the cost of the JOIN
operations in option 2 would be smaller than the cost of GROUP BY
and aggregate functions needed in options 1 and 3.
So, I would expect to find Option 2 faster than 1 and 3.
To measure Option 4, you'll have to consider more factors as the application may be in another server so the loads of the two (db and application) servers and the number of clients that will be requesting these results have to be taken into account.
Sidenote: you need GROUP BY e.ID
in options 1 and 3.
来源:https://stackoverflow.com/questions/8764290/what-is-best-performance-for-retrieving-mysql-eav-results-as-relational-table