What is best performance for Retrieving MySQL EAV results as Relational Table

∥☆過路亽.° 提交于 2019-11-27 01:38:11

问题


I want to extract results from EAV (entity-attribute-value) tables, or more specifically entity-metadata tables (think like wordpress wp_posts and wp_postmeta) as a "nicely formatted relational table", in order to do some sorting and/or filtering.

I've found some examples of how to format the results within the query (as opposed to writing 2 queries and joining the results in code), but I would like to know the "most efficient" method for doing so, especially for larger result sets.

And when I say "most efficient", I mean for something like the following scenarios:

Get all Entities with last name like XYZ

Return a list of Entities sorted by birthday


e.g. turn this:

** ENTITY **
-----------------------
ID  | NAME | whatever
-----------------------
 1  | bob  | etc
 2  | jane | etc
 3  | tom  | etc

** META **
------------------------------------
ID | EntityID | KEY         | VALUE
------------------------------------
 1 |   1      | first name  | Bob
 2 |   1      | last name   | Bobson
 3 |   1      | birthday    | 1983-10-10
 . |   2      | first name  | Jane
 . |   2      | last name   | Janesdotter
 . |   2      | birthday    | 1983-08-10
 . |   3      | first name  | Tom
 . |   3      | last name   | Tomson
 . |   3      | birthday    | 1980-08-10

into this:

** RESULTS **
-----------------------------------------------
EID | NAME | first name | last name    | birthday
-----------------------------------------------
 1  | bob  | Bob        | Bobson       | 1983-10-10
 2  | jane | Jane       | Janesdotter  | 1983-08-10
 3  | tom  | Tom        | Tomson       | 1980-08-10

so I can sort or filter by any of the meta fields.


I found some suggestions here, but I can't find any discussion of which performs better.

Options:

  1. GROUP_CONCAT:
    SELECT e.*, GROUP_CONCAT( CONCAT_WS('||', m.KEY, m.VALUE) ORDER BY m.KEY SEPARATOR ';;' )
    FROM `ENTITY` e JOIN `META` m ON e.ID = m.EntityID
    
  2. Multi-Join:
    SELECT e.*, m1.VALUE as 'first name', m2.VALUE as 'last name', m3.VALUE as 'birthday'
    FROM `ENTITY` e
    LEFT JOIN `META` m1
        ON e.ID = m1.EntityID AND m1.meta_key = 'first name'
    LEFT JOIN `META` m2
        ON e.ID = m2.EntityID AND m2.meta_key = 'last name'
    LEFT JOIN `META` m3
        ON e.ID = m3.EntityID AND m3.meta_key = 'birthday'
    
  3. Coalescing:
    SELECT e.*
       , MAX( IF(m.KEY= 'first name', m.VALUE, NULL) ) as 'first name'
       , MAX( IF(m.KEY= 'last name', m.VALUE, NULL) ) as 'last name'
       , MAX( IF(m.KEY= 'birthday', m.VALUE, NULL) ) as 'birthday'
    FROM `ENTITY` e
    JOIN `META` m
        ON e.ID = m.EntityID
    
  4. Code:
    SELECT e.* FROM `ENTITY` e WHERE e.ID = {whatever};
    
    in PHP, create a placeholder object from result
    SELECT m.* FROM `META` m WHERE m.EntityID = {whatever};
    
    in PHP, loop through results and attach to entity object like: $e->{$result->key} = $result->VALUE

Which is better in general, and for filtering/sorting?

Related questions:

  1. Binding EAV results
  2. How to Pivot a MySQL entity

回答1:


Anything using pivot or aggregates will probably be faster, as they don't require the table to be self-joined. The join based approaches will require the optimiser to perform several sub-query operations and then join the results together. For a small data set this might not matter so much, but this could significantly degrade performance if you're doing an analytic query on a larger data set,




回答2:


The best way to find out would be to test, off course. The answer may be different depending on the size of the dataset, the number of different meta-keys, their distribution (do all entities have values for all meta-keys? or only for a few of them?), the settings of your database server and possibly many other factors.

If I were to guess, I'd say that the cost of the JOIN operations in option 2 would be smaller than the cost of GROUP BY and aggregate functions needed in options 1 and 3.

So, I would expect to find Option 2 faster than 1 and 3.

To measure Option 4, you'll have to consider more factors as the application may be in another server so the loads of the two (db and application) servers and the number of clients that will be requesting these results have to be taken into account.


Sidenote: you need GROUP BY e.ID in options 1 and 3.



来源:https://stackoverflow.com/questions/8764290/what-is-best-performance-for-retrieving-mysql-eav-results-as-relational-table

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!