Entity Attribute Value Database vs. strict Relational Model Ecommerce

谎友^ 2020-11-22 05:23

It is safe to say that the EAV/CR database model is bad. That said,

Question: What database model, technique, or pattern should be used to deal with "clas

10 Answers
  • 2020-11-22 05:43

    It is safe to say that the EAV/CR database model is bad.

    No, it's not. It's just an inefficient use of relational databases. A pure key/value store works great with this model.

    Now, to your real question: How to store various attributes and keep them searchable?

    Just use EAV. In your case it would be a single extra table. Index it on both attribute name and value; many RDBMSs can apply prefix compression to the repeated attribute names in the index, which keeps it fast and compact.
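
    As a minimal sketch of that single extra table, assuming SQLite via Python's standard `sqlite3` module as a stand-in for whatever RDBMS is actually used: the table and column names (`product`, `product_attribute`, `attr_name`, `attr_value`) and the helper `find_products` are hypothetical, chosen only for illustration. Prefix compression is an engine-level feature and is not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- The single extra EAV table: one row per (product, attribute) pair.
    CREATE TABLE product_attribute (
        product_id INTEGER NOT NULL REFERENCES product(id),
        attr_name  TEXT NOT NULL,
        attr_value TEXT NOT NULL,
        PRIMARY KEY (product_id, attr_name)
    );
    -- Composite index on attribute name and value, as the answer suggests.
    CREATE INDEX idx_attr_name_value
        ON product_attribute (attr_name, attr_value);
""")

conn.executemany("INSERT INTO product VALUES (?, ?)",
                 [(1, "Laptop A"), (2, "Laptop B"), (3, "T-shirt")])
conn.executemany("INSERT INTO product_attribute VALUES (?, ?, ?)", [
    (1, "ram_gb", "16"), (1, "color", "black"),
    (2, "ram_gb", "8"),  (2, "color", "black"),
    (3, "size", "M"),    (3, "color", "black"),
])

def find_products(conn, attr_name, attr_value):
    """Return the names of products that have the given attribute value."""
    rows = conn.execute(
        """SELECT p.name
             FROM product p
             JOIN product_attribute a ON a.product_id = p.id
            WHERE a.attr_name = ? AND a.attr_value = ?
            ORDER BY p.id""",
        (attr_name, attr_value),
    )
    return [name for (name,) in rows]

print(find_products(conn, "ram_gb", "16"))
```

    Note that every value is stored as text here, so range searches on numeric attributes would need a cast or a separate typed value column.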

    EAV/CR gets ugly when you use it to replace 'real' fields. As with every tool, overusing it is 'bad', and gives it a bad image.

  • 2020-11-22 05:47

    I'm struggling with the same issue. It may be interesting for you to check out the following discussion on two existing ecommerce solutions: Magento (EAV) and Joomla (regular relational structure): https://forum.virtuemart.net/index.php?topic=58686.0

    It seems that Magento's EAV performance is a real showstopper.

    That's why I'm leaning towards a normalized structure. To overcome the lack of flexibility, I'm thinking about adding a separate, editable data dictionary in the future (XML or separate DB tables). Based on it, the application code for displaying and comparing product categories with a new attribute set would be generated, together with the SQL scripts.

    Such an architecture seems to be the sweet spot in this case: flexible and performant at the same time.

    The problem could be frequent use of ALTER TABLE in a live environment. I'm using Postgres, so its MVCC and transactional DDL will hopefully ease the pain.
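
    A rough sketch of the "generate SQL scripts from a data dictionary" idea, under several assumptions: the dictionary is a plain Python dict rather than XML, SQLite (via the standard `sqlite3` module) stands in for Postgres, and the names `DATA_DICTIONARY`, `existing_columns`, and `migration_statements` are hypothetical. It only handles added columns, and it does not demonstrate Postgres's transactional DDL.

```python
import sqlite3

# Hypothetical data dictionary: table name -> {column name: SQL type}.
# Names are interpolated into SQL, so they must come from a trusted source.
DATA_DICTIONARY = {
    "laptop": {"ram_gb": "INTEGER", "screen_inches": "REAL"},
    "shirt": {"size": "TEXT"},
}

def existing_columns(conn, table):
    """Return the set of column names currently defined on the table."""
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}

def migration_statements(conn, dictionary):
    """Generate ALTER TABLE statements for columns missing from the schema."""
    statements = []
    for table, columns in sorted(dictionary.items()):
        present = existing_columns(conn, table)
        for column, sql_type in sorted(columns.items()):
            if column not in present:
                statements.append(
                    f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}"
                )
    return statements

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE laptop (id INTEGER PRIMARY KEY, name TEXT, ram_gb INTEGER);
    CREATE TABLE shirt  (id INTEGER PRIMARY KEY, name TEXT);
""")

# Diff the dictionary against the live schema, then apply the result.
statements = migration_statements(conn, DATA_DICTIONARY)
for statement in statements:
    conn.execute(statement)
```

    Running the generator a second time yields no statements, which makes the script safe to re-run after a partial deployment.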

  • 2020-11-22 05:51

    I'm surprised nobody mentioned NoSQL databases.

    I've never used NoSQL in a production context (I only tested MongoDB and was impressed), but a central point of document-oriented NoSQL stores is being able to save items with varying attributes side by side, each as its own "document".
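
    To illustrate the idea without a database server, here is a small stand-in that uses plain Python dicts and JSON. It is not MongoDB's API; the `products` list and the `find` helper are hypothetical and only mimic how a document collection holds items with different attribute sets.

```python
import json

# Stand-in for a document collection: each product carries its own attributes.
products = [
    {"name": "Laptop A", "ram_gb": 16, "screen_inches": 15.6},
    {"name": "T-shirt", "size": "M", "color": "black"},
    {"name": "Laptop B", "ram_gb": 8},
]

def find(collection, **criteria):
    """Return documents that contain every given field with an equal value."""
    return [
        doc for doc in collection
        if all(field in doc and doc[field] == value
               for field, value in criteria.items())
    ]

# Documents round-trip through JSON without any fixed schema.
stored = json.dumps(products)
loaded = json.loads(stored)

print(find(loaded, ram_gb=16))
```

    A real document store would add indexes and a query language on top, but the storage model is the same: no ALTER TABLE is needed when a new attribute appears.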

  • 2020-11-22 05:55

    There are a few general pros and cons I can think of; there are situations where one is better than the other:

    Option 1, EAV Model:

    • Pro: less time to design and develop a simple application
    • Pro: new entities easy to add (might even be added by users?)
    • Pro: "generic" interface components
    • Con: complex code required to validate simple data types
    • Con: much more complex SQL for simple reports
    • Con: complex reports can become almost impossible
    • Con: poor performance for large data sets
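
    The "much more complex SQL for simple reports" point can be sketched as follows, assuming SQLite through Python's `sqlite3` module and hypothetical tables `laptop` and `laptop_eav` that hold the same facts in both shapes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conventional table: one typed column per attribute.
    CREATE TABLE laptop (id INTEGER PRIMARY KEY, ram_gb INTEGER, color TEXT);
    INSERT INTO laptop VALUES (1, 16, 'black'), (2, 8, 'silver');

    -- EAV table holding the same facts as untyped name/value rows.
    CREATE TABLE laptop_eav (entity_id INTEGER, attr_name TEXT, attr_value TEXT);
    INSERT INTO laptop_eav VALUES
        (1, 'ram_gb', '16'), (1, 'color', 'black'),
        (2, 'ram_gb', '8'),  (2, 'color', 'silver');
""")

# Report on the conventional table: a plain SELECT.
plain_report = conn.execute(
    "SELECT id, ram_gb, color FROM laptop ORDER BY id"
).fetchall()

# The same report on EAV: rows must be pivoted back into columns, and the
# numeric value must be cast because everything is stored as text.
eav_report = conn.execute("""
    SELECT entity_id,
           CAST(MAX(CASE WHEN attr_name = 'ram_gb' THEN attr_value END) AS INTEGER),
           MAX(CASE WHEN attr_name = 'color' THEN attr_value END)
      FROM laptop_eav
     GROUP BY entity_id
     ORDER BY entity_id
""").fetchall()
```

    Both queries return the same rows, but the EAV version needs one CASE expression per reported attribute, which is why larger reports get unwieldy.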

    Option 2, Modelling each entity separately:

    • Con: more time required to gather requirements and design
    • Con: new entities must be modelled and designed by a professional
    • Con: custom interface components for each entity
    • Pro: data type constraints and validation simple to implement
    • Pro: SQL is easy to write, easy to understand and debug
    • Pro: even the most complex reports are relatively simple
    • Pro: best performance for large data sets

    Option 3, Combination (model entities "properly", but add "extensions" for custom attributes for some/all entities)

    • Pro/Con: more time required to gather requirements and design than option 1 but perhaps not as much as option 2 *
    • Con: new entities must be modelled and designed by a professional
    • Pro: new attributes might be easily added later on
    • Con: complex code required to validate simple data types (for the custom attributes)
    • Con: custom interface components still required, but generic interface components may be possible for the custom attributes
    • Con: SQL becomes complex as soon as any custom attribute is included in a report
    • Con: good performance generally, unless you start needing to search or report by the custom attributes

    * I'm not sure if Option 3 would necessarily save any time in the design phase.
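
    A minimal sketch of Option 3, assuming SQLite via Python's `sqlite3` module: core attributes are ordinary constrained columns, and only custom attributes go into an extension table. The names `product`, `price_cents`, and `product_custom_attribute` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Core attributes are real, typed, constrained columns.
    CREATE TABLE product (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        price_cents INTEGER NOT NULL CHECK (price_cents >= 0)
    );
    -- Extension table used for custom attributes only.
    CREATE TABLE product_custom_attribute (
        product_id INTEGER NOT NULL REFERENCES product(id),
        attr_name  TEXT NOT NULL,
        attr_value TEXT NOT NULL,
        PRIMARY KEY (product_id, attr_name)
    );
    INSERT INTO product VALUES (1, 'Laptop A', 99900), (2, 'T-shirt', 1500);
    INSERT INTO product_custom_attribute VALUES
        (1, 'warranty_years', '3'), (2, 'fabric', 'cotton');
""")

# Core-only query: no join needed, and the CHECK constraint guards the data.
cheap = conn.execute(
    "SELECT name FROM product WHERE price_cents < 5000 ORDER BY id"
).fetchall()

# As soon as a custom attribute is involved, a join per attribute appears.
with_warranty = conn.execute("""
    SELECT p.name, a.attr_value
      FROM product p
      JOIN product_custom_attribute a
        ON a.product_id = p.id AND a.attr_name = 'warranty_years'
     ORDER BY p.id
""").fetchall()
```

    This keeps the common queries simple and confines the EAV costs listed above to the custom attributes.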

    Personally I would lean toward option 2, and avoid EAV wherever possible. However, for some scenarios the users need the flexibility that comes with EAV; but this comes with a great cost.

  • 2020-11-22 05:58

    Where performance is not a major requirement, as in an ETL type of application, EAV has another distinct advantage: differential saves.

    I've implemented a number of applications where an overarching requirement was the ability to see the history of a domain object from its first "version" to its current state. If that domain object has a large number of attributes, each change requires a new row to be inserted into its corresponding table (an insert, not an update, because an update would lose the history). Let's say this domain object is a Person, and I have 500k Persons to track, with an average of 100+ changes to various attributes over a Person's life cycle. Couple that with the fact that rare is the application that has only one major domain object, and you'll quickly surmise that the size of the database would grow out of control.

    An easy solution is to save only the differential changes to the major domain objects rather than repeatedly saving redundant information.
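
    One way such differential saves might look, as a sketch under stated assumptions: SQLite through Python's `sqlite3` module, a hypothetical `person_change` table keyed by version, and hypothetical helpers `save` and `state_at`. Attribute removals are not handled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person_change (
        person_id  INTEGER NOT NULL,
        version    INTEGER NOT NULL,
        attr_name  TEXT NOT NULL,
        attr_value TEXT,
        PRIMARY KEY (person_id, version, attr_name)
    )
""")

def save(conn, person_id, version, new_state, old_state):
    """Insert rows only for attributes whose value differs from old_state."""
    changed = {k: v for k, v in new_state.items() if old_state.get(k) != v}
    conn.executemany(
        "INSERT INTO person_change VALUES (?, ?, ?, ?)",
        [(person_id, version, k, v) for k, v in changed.items()],
    )
    return len(changed)

def state_at(conn, person_id, version):
    """Rebuild the state as of a version by replaying changes in order."""
    state = {}
    rows = conn.execute(
        """SELECT attr_name, attr_value
             FROM person_change
            WHERE person_id = ? AND version <= ?
            ORDER BY version""",
        (person_id, version),
    )
    for name, value in rows:
        state[name] = value  # later versions overwrite earlier values
    return state

v1 = {"name": "Ann", "city": "Oslo", "email": "ann@example.com"}
v2 = {**v1, "city": "Bergen"}
rows_v1 = save(conn, 1, 1, v1, {})   # first version stores every attribute
rows_v2 = save(conn, 1, 2, v2, v1)   # second version stores only the change
```

    With a wide table, the second save would have copied every column again; here it writes a single row, and any historical version can still be reconstructed.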

    All models change over time to reflect new business needs. Period. Using EAV is but one of the tools in our box to use; but it should never be automatically classified as "bad".

  • 2020-11-22 06:00

    EAV has many drawbacks:

    1. Performance degradation over time. Once the amount of data in the application grows beyond a certain size, retrieving and manipulating that data is likely to become less and less efficient.
    2. The SQL queries are very complex and difficult to write.
    3. Data Integrity problems. You can't define foreign keys for all the fields needed.
    4. You have to define and maintain your own metadata.
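
    Points 3 and 4 mean that checks a typed column would give for free move into application code. A small sketch of what that self-maintained metadata might look like, assuming EAV values arrive as text; `ATTRIBUTE_METADATA` and `validate` are hypothetical names.

```python
# Hypothetical metadata: attribute name -> (Python type, required flag).
ATTRIBUTE_METADATA = {
    "ram_gb": (int, True),
    "color": (str, False),
    "screen_inches": (float, False),
}

def validate(attributes, metadata=ATTRIBUTE_METADATA):
    """Return a list of error messages; an empty list means valid."""
    errors = []
    for name, raw in attributes.items():
        if name not in metadata:
            errors.append(f"unknown attribute: {name}")
            continue
        expected_type, _required = metadata[name]
        try:
            expected_type(raw)  # text values must parse as the declared type
        except ValueError:
            errors.append(
                f"{name}: {raw!r} is not a valid {expected_type.__name__}"
            )
    for name, (_type, required) in metadata.items():
        if required and name not in attributes:
            errors.append(f"missing required attribute: {name}")
    return errors

print(validate({"ram_gb": "16", "color": "black"}))
print(validate({"ram_gb": "lots"}))
```

    In a conventional schema, the column type and a NOT NULL constraint would express the same rules declaratively.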