Entity-Attribute-Value Table Design

Solution 1

I'm going to offer a contrary opinion to most of the comments on this question. While EAV is EVIL for all of the reasons that you can find thoroughly explained many times here on SO and DBA.SE and elsewhere, there is one really common application for which most of the things that are wrong with EAV are largely irrelevant and the (few) advantages of EAV are very much germane. That application is online product catalogs.

The main problem with EAV is that it doesn't let the database do what it is really good at doing, which is helping to give proper context to different attributes of information about different entities by arranging them in a schema. Having a schema brings many, many advantages around accessing, interpreting and enforcing integrity of your data.

The fact about product catalogs is that the attributes of a product are almost entirely irrelevant to the catalog system itself. Product catalog systems do (at most) three things with product attributes.

  1. Display the product attributes in a list to end users in the form: {attribute name}: {attribute value}.

  2. Display the attributes of multiple products in a comparison grid where attributes of different products line up against each other (products are usually columns, attributes are usually rows).

  3. Drive rules for something (e.g. pricing) based on particular attribute/value combinations.
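
As a rough illustration of those three uses, here is a minimal sketch using Python's built-in sqlite3 module. The table names (`product`, `product_attribute`), the column names, the sample data and the three helper functions are all assumptions made up for this example, not part of any particular catalog system. Note that none of the helpers needs to know what "RAM" or "Weight" actually means.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Hypothetical EAV table: one row per (product, attribute) pair.
    CREATE TABLE product_attribute (
        product_id INTEGER NOT NULL REFERENCES product(id),
        attr_name  TEXT NOT NULL,
        attr_value TEXT NOT NULL,
        PRIMARY KEY (product_id, attr_name)
    );
    INSERT INTO product VALUES (1, 'Laptop A'), (2, 'Laptop B');
    INSERT INTO product_attribute VALUES
        (1, 'RAM', '8 GB'), (1, 'Screen Size', '13 in'),
        (2, 'RAM', '16 GB'), (2, 'Weight', '1.4 kg');
""")

def attribute_list(product_id):
    """Use 1: '{attribute name}: {attribute value}' lines for one product."""
    rows = conn.execute(
        "SELECT attr_name, attr_value FROM product_attribute "
        "WHERE product_id = ? ORDER BY attr_name", (product_id,))
    return [f"{name}: {value}" for name, value in rows]

def comparison_grid(product_ids):
    """Use 2: attributes as rows, products as columns, '-' where missing."""
    placeholders = ",".join("?" * len(product_ids))
    rows = conn.execute(
        "SELECT product_id, attr_name, attr_value FROM product_attribute "
        f"WHERE product_id IN ({placeholders})", product_ids).fetchall()
    values = {(pid, name): value for pid, name, value in rows}
    attr_names = sorted({name for _, name, _ in rows})
    return {name: [values.get((pid, name), "-") for pid in product_ids]
            for name in attr_names}

def price_surcharge(product_id, rules):
    """Use 3: sum the surcharges (in cents) of every matching rule.

    `rules` maps (attribute name, attribute value) to a surcharge.
    """
    rows = conn.execute(
        "SELECT attr_name, attr_value FROM product_attribute "
        "WHERE product_id = ?", (product_id,))
    return sum(rules.get((name, value), 0) for name, value in rows)

print(attribute_list(1))
print(comparison_grid([1, 2]))
print(price_surcharge(2, {("RAM", "16 GB"): 5000}))
```

Adding a new kind of product, or a new attribute, only means inserting rows; the schema and the helpers stay as they are.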

If all your system does is regurgitate information that is semantically irrelevant (to the system) then the schema for this information is basically unhelpful. In fact the schema gets in the way in an online product catalog, especially if your catalog has many diverse types of products, because you're always having to go back into the schema to tinker with it to allow for new product categories or attribute types.

Because of how it's used, even the data type of an attribute value in a product catalog is not necessarily (vitally) important. For some attributes you may want to impose constraints, like "must be a number" or "must come from this list {...}". That depends on how important attribute consistency is to your catalog and how elaborate you want your implementation to be. Looking at the product catalogs of several online retailers I'd say most are prepared to trade away some consistency for simplicity.
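
If you do want such constraints, one possible shape for them is a per-attribute rule checked in application code before a value is saved. This is only a sketch: the `ATTRIBUTE_RULES` structure, the attribute names and the `validate` function are invented for illustration, and a real catalog might keep these rules in a table instead.

```python
# Hypothetical rule definitions: attributes without an entry are unconstrained.
ATTRIBUTE_RULES = {
    "Weight (kg)": {"kind": "number"},
    "Colour": {"kind": "choice", "choices": {"Black", "Silver"}},
}

def validate(attr_name, raw_value):
    """Return True if raw_value (a string) satisfies the attribute's rule."""
    rule = ATTRIBUTE_RULES.get(attr_name)
    if rule is None:
        return True  # no rule registered, so any text is accepted
    if rule["kind"] == "number":
        try:
            float(raw_value)  # note: float() also accepts "nan" and "inf"
        except ValueError:
            return False
        return True
    if rule["kind"] == "choice":
        return raw_value in rule["choices"]
    raise ValueError(f"unknown rule kind: {rule['kind']!r}")

print(validate("Weight (kg)", "1.4"))
print(validate("Weight (kg)", "light"))
print(validate("Colour", "Pink"))
print(validate("Publisher", "anything goes"))
```

How many attributes get a rule at all is exactly the simplicity-versus-consistency dial described above.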

Yes, EAV is evil, except when it isn't.

Solution 2

I do not know if this should be a comment or answer. Nevertheless here I go.

I do not know exactly what you are building. But have you taken a look at Magento's EAV database structure? Yes, it can be slow and the queries can be huge, but for us the pluses outweigh the minuses. And on the other hand, Magento takes care of the queries.

We are in the middle of a migration of our online store (medium-big size store) to use Magento and for now we are very happy with the EAV approach.

Solution 3

Yes, there is typically a large penalty in assembling the queries for an EAV model. There are bigger performance penalties for checking the self-consistency of the data, because the DBMS is not going to be able to do it for you. If something goes wrong, the DBMS cannot tell you.
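
To make both penalties concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `laptop` and `eav` tables, their columns, the sample rows and the `eav_filter_sql` helper are all hypothetical names chosen for this illustration. It shows a conventional table rejecting a bad value on its own, an EAV table accepting nonsense, and a filter query that needs one extra join for every attribute it tests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conventional design: the DBMS enforces the rule itself.
    CREATE TABLE laptop (
        id     INTEGER PRIMARY KEY,
        ram_gb INTEGER NOT NULL CHECK (ram_gb > 0)
    );
    -- EAV design: every value is opaque text to the DBMS.
    CREATE TABLE eav (
        entity_id INTEGER NOT NULL,
        attr      TEXT NOT NULL,
        value     TEXT NOT NULL
    );
    INSERT INTO eav VALUES
        (1, 'ram_gb', '8'), (1, 'colour', 'Black'),
        (2, 'ram_gb', '8'), (2, 'colour', 'Silver'),
        (3, 'ram_gb', 'lots');  -- nonsense, but nothing stops it
""")

# The conventional table refuses an impossible value.
try:
    conn.execute("INSERT INTO laptop VALUES (1, -8)")
    conventional_rejected = False
except sqlite3.IntegrityError:
    conventional_rejected = True

def eav_filter_sql(filters):
    """Build a query for entities matching every attr/value pair.

    Each filter adds another self-join on the eav table.
    """
    joins, params = [], []
    for i, (attr, value) in enumerate(filters.items()):
        joins.append(
            f"JOIN eav AS a{i} ON a{i}.entity_id = e.entity_id "
            f"AND a{i}.attr = ? AND a{i}.value = ?"
        )
        params += [attr, value]
    sql = "SELECT DISTINCT e.entity_id FROM eav AS e " + " ".join(joins)
    return sql + " ORDER BY e.entity_id", params

sql, params = eav_filter_sql({"ram_gb": "8", "colour": "Black"})
matches = [row[0] for row in conn.execute(sql, params)]

print(conventional_rejected)
print(sql)
print(matches)
```

In the conventional design the same filter would be a single `WHERE ram_gb = 8 AND colour = 'Black'` on one table, assuming a `colour` column existed there.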

With a more orthodox database design, as recommended by Oded in comments, the DBMS ensures that the data in the database is more nearly consistent. I would strongly counsel the use of a regular (non-EAV) design.

Author: James Arnold

PHP Developer.

Updated on June 01, 2022

Comments

  • James Arnold
    James Arnold almost 2 years

    I am currently designing a database structure for the products section of an ecommerce platform. It needs to be designed in such a way that makes it possible to sell an infinite number of different types of products with an infinite number of different attributes.

    E.g. The attributes of a laptop would be RAM, Screen Size, Weight, etc. The attributes of a book would be Author, ISBN, Publisher, etc.

    It seems like an EAV structure would be most suitable.

    • Select a product
    • Product belongs to attribute set
    • Attribute set contains attributes x and y
      • Attribute x is data type datetime (values stored in attribute_values_datetime)
      • Attribute y is data type int (values stored in attribute_values_int)
    • Each attribute definition denotes the type (i.e., x has column type -> datetime)

    Assuming the above, could I join the selection to the attribute_values_datetime table to get the right data without first fetching the result set and then building a second query once the table is known? Would there be a large performance hit in constructing a query of this type, or would the below be more suitable (although less functional)?

    • Select a product
    • Product belongs to attribute set
    • Attribute set contains attributes x and y
      • Attribute x is data type datetime but stored as TEXT in attribute_values
      • Attribute y is data type int but stored as TEXT in attribute_values
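
    For reference, the first layout can be read in a single statement by LEFT JOINing every typed value table. The sketch below uses Python's built-in sqlite3 module and is only an illustration under assumptions: the `attribute` table and all column names are made up, the attribute-set level is left out for brevity, and the datetime is stored as ISO text because SQLite has no dedicated datetime type. Only `attribute_values_datetime` and `attribute_values_int` are names taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical attribute definitions, each naming its data type.
    CREATE TABLE attribute (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        data_type TEXT NOT NULL
    );
    CREATE TABLE attribute_values_datetime (
        product_id INTEGER, attribute_id INTEGER, value TEXT
    );
    CREATE TABLE attribute_values_int (
        product_id INTEGER, attribute_id INTEGER, value INTEGER
    );
    INSERT INTO attribute VALUES (1, 'x', 'datetime'), (2, 'y', 'int');
    INSERT INTO attribute_values_datetime VALUES (10, 1, '2012-06-01 09:00:00');
    INSERT INTO attribute_values_int VALUES (10, 2, 42);
""")

# One LEFT JOIN per typed table; the column that does not apply stays NULL.
QUERY = """
    SELECT a.name, a.data_type,
           d.value AS datetime_value,
           i.value AS int_value
    FROM attribute AS a
    LEFT JOIN attribute_values_datetime AS d
           ON d.attribute_id = a.id
          AND d.product_id = :pid
          AND a.data_type = 'datetime'
    LEFT JOIN attribute_values_int AS i
           ON i.attribute_id = a.id
          AND i.product_id = :pid
          AND a.data_type = 'int'
    ORDER BY a.id
"""

rows = conn.execute(QUERY, {"pid": 10}).fetchall()
for row in rows:
    print(row)
```

    Every additional data type means another table and another join, whereas the single TEXT table needs neither but leaves casting and validation to the application.
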
    • Oded
      Oded almost 12 years
      Don't go with EAV. Never mind the performance issues (massive table that will only ever grow), consider how you would query against it. EAV is normalization gone overboard in most cases.
    • Jodrell
      Jodrell almost 12 years
      What are you going to do with the attributes, will you want to use them for filtering?
    • Jodrell
      Jodrell almost 12 years
      I am inclined to agree with @Oded, you end up building a database within a database. I'm left wondering what approach large online retailers take (the good ones.)
    • James Arnold
      James Arnold almost 12 years
      Some may be used for filtering, yes. Others may just be dropdowns for colours, packaging options, etc. Some attributes will have an impact on the product price.
    • James Arnold
      James Arnold almost 12 years
      @oded - Do you have any suggestions for an alternative?
    • Oded
      Oded almost 12 years
      Use the database as a database... Create tables for the actual product types you do end up having. I would push back against unreasonable requirements - and "an infinite number of different types of products with an infinite number of different attributes" certainly sounds unreasonable to me. Get some estimated limits from your business.
    • Jodrell
      Jodrell almost 12 years
      Just thinking out loud, books are a rather extreme example. A store would only sell around 1000 types of laptop (just guessing), but for books the problem is increased by several orders of magnitude, which would probably crush an EAV model.
    • James Arnold
      James Arnold almost 12 years
      I only used those product types as an example. The platform will be used for more than one e-commerce site; the intention is to re-theme it for a multitude of different customers without having to change the database structure or underlying logic.
    • Mike Sherrill 'Cat Recall'
      Mike Sherrill 'Cat Recall' almost 12 years
      @Oded: EAV has nothing to do with normalization. There isn't any rule of decomposition that says, "Store the name of an attribute as data in a row in a table, and store its value, no matter the data type, as varchar(n) in the same row." It might be abstraction gone overboard, though.
    • Oded
      Oded almost 12 years
      @Catcall - I know what you mean. My point was that some developers who first learn about normalization may very well take it overboard and arrive at EAV to model the whole DB (oh look at how flexible a schema I have now!).
    • Bill Karwin
      Bill Karwin almost 10 years
      @Oded, there is no way someone could follow rules of normalization, overboard or not, and arrive at EAV. They can arrive at EAV only if they do not understand what normalization means at all. Both the physical table that stores EAV data and the virtual table it is trying to model fail to be a relation. And you can't put a table in any normal form if it isn't a relation. That's a prerequisite, as if there's a "0th normal form."
  • fresher
    fresher over 7 years
    1) What measures can we take to prevent performance problems after using EAV? If we use EAV, performance problems will surely happen once we have thousands of products.
  • Joel Brown
    Joel Brown over 7 years
    @PhpBeginner Why do you say that performance problems are inevitable using EAV for a product catalog? I don't think that is a fair comment. Please be specific about what will perform worse. This kind of generalization is precisely what I'm talking about in this answer. EAV is evil for most applications. Online product catalogs are not one of them. In this specific scenario you cannot say "EAV is slow", or "EAV makes your queries complicated", or "EAV removes meaning from the data" or any of the other things that are usually valid criticisms of EAV.