How can you represent inheritance in a database?

132,972

Solution 1

@Bill Karwin describes three inheritance models in his SQL Antipatterns book, when proposing solutions to the SQL Entity-Attribute-Value antipattern. This is a brief overview:

Single Table Inheritance (aka Table Per Hierarchy Inheritance):

Using a single table as in your first option is probably the simplest design. As you mentioned, many attributes that are subtype-specific will have to be given a NULL value on rows where these attributes do not apply. With this model, you would have one policies table, which would look something like this:

+------+---------------------+----------+----------------+------------------+
| id   | date_issued         | type     | vehicle_reg_no | property_address |
+------+---------------------+----------+----------------+------------------+
|    1 | 2010-08-20 12:00:00 | MOTOR    | 01-A-04004     | NULL             |
|    2 | 2010-08-20 13:00:00 | MOTOR    | 02-B-01010     | NULL             |
|    3 | 2010-08-20 14:00:00 | PROPERTY | NULL           | Oxford Street    |
|    4 | 2010-08-20 15:00:00 | MOTOR    | 03-C-02020     | NULL             |
+------+---------------------+----------+----------------+------------------+

\------ COMMON FIELDS -------/          \----- SUBTYPE SPECIFIC FIELDS -----/

Keeping the design simple is a plus, but the main problems with this approach are the following:

  • When it comes to adding new subtypes, you would have to alter the table to accommodate the attributes that describe these new objects. This can quickly become problematic when you have many subtypes, or if you plan to add subtypes on a regular basis.

  • The database will not be able to enforce which attributes apply and which don't, since there is no metadata to define which attributes belong to which subtypes.

  • You also cannot enforce NOT NULL on attributes of a subtype that should be mandatory. You would have to handle this in your application, which in general is not ideal.

Concrete Table Inheritance:

Another approach to tackle inheritance is to create a new table for each subtype, repeating all the common attributes in each table. For example:

--// Table: policies_motor
+------+---------------------+----------------+
| id   | date_issued         | vehicle_reg_no |
+------+---------------------+----------------+
|    1 | 2010-08-20 12:00:00 | 01-A-04004     |
|    2 | 2010-08-20 13:00:00 | 02-B-01010     |
|    3 | 2010-08-20 15:00:00 | 03-C-02020     |
+------+---------------------+----------------+
                          
--// Table: policies_property    
+------+---------------------+------------------+
| id   | date_issued         | property_address |
+------+---------------------+------------------+
|    1 | 2010-08-20 14:00:00 | Oxford Street    |   
+------+---------------------+------------------+

This design will basically solve the problems identified for the single table method:

  • Mandatory attributes can now be enforced with NOT NULL.

  • Adding a new subtype requires adding a new table instead of adding columns to an existing one.

  • There is also no risk that an inappropriate attribute is set for a particular subtype, such as the vehicle_reg_no field for a property policy.

  • There is no need for the type attribute as in the single table method. The type is now defined by the metadata: the table name.

However this model also comes with a few disadvantages:

  • The common attributes are mixed with the subtype specific attributes, and there is no easy way to identify them. The database will not know either.

  • When defining the tables, you would have to repeat the common attributes for each subtype table. That's definitely not DRY.

  • Searching for all the policies regardless of the subtype becomes difficult, and would require a bunch of UNIONs.

This is how you would have to query all the policies regardless of the type:

SELECT     date_issued, other_common_fields, 'MOTOR' AS type
FROM       policies_motor
UNION ALL
SELECT     date_issued, other_common_fields, 'PROPERTY' AS type
FROM       policies_property;

Note how adding new subtypes would require the above query to be modified with an additional UNION ALL for each subtype. This can easily lead to bugs in your application if this operation is forgotten.

Class Table Inheritance (aka Table Per Type Inheritance):

This is the solution that @David mentions in the other answer. You create a single table for your base class, which includes all the common attributes. Then you would create specific tables for each subtype, whose primary key also serves as a foreign key to the base table. Example:

CREATE TABLE policies (
   policy_id          int,
   date_issued        datetime,

   -- // other common attributes ...
);

CREATE TABLE policy_motor (
    policy_id         int,
    vehicle_reg_no    varchar(20),

   -- // other attributes specific to motor insurance ...

   FOREIGN KEY (policy_id) REFERENCES policies (policy_id)
);

CREATE TABLE policy_property (
    policy_id         int,
    property_address  varchar(20),

   -- // other attributes specific to property insurance ...

   FOREIGN KEY (policy_id) REFERENCES policies (policy_id)
);

This solution solves the problems identified in the other two designs:

  • Mandatory attributes can be enforced with NOT NULL.

  • Adding a new subtype requires adding a new table instead of adding columns to an existing one.

  • No risk that an inappropriate attribute is set for a particular subtype.

  • No need for the type attribute.

  • Now the common attributes are not mixed with the subtype specific attributes anymore.

  • We can stay DRY, finally. There is no need to repeat the common attributes for each subtype table when creating the tables.

  • Managing an auto incrementing id for the policies becomes easier, because this can be handled by the base table, instead of each subtype table generating them independently.

  • Searching for all the policies regardless of the subtype now becomes very easy: No UNIONs needed - just a SELECT * FROM policies.

I consider the class table approach as the most suitable in most situations.


The names of these three models come from Martin Fowler's book Patterns of Enterprise Application Architecture.

Solution 2

The 3rd option is to create a "Policy" table, then a "SectionsMain" table that stores all of the fields that are in common across the types of sections. Then create other tables for each type of section that only contain the fields that are not in common.

Deciding which is best depends mostly on how many fields you have and how you want to write your SQL. They would all work. If you have just a few fields then I would probably go with #1. With "lots" of fields I would lean towards #2 or #3.

Solution 3

In addition at the Daniel Vassallo solution, if you use SQL Server 2016+, there is another solution that I used in some cases without considerable lost of performances.

You can create just a table with only the common field and add a single column with the JSON string that contains all the subtype specific fields.

I have tested this design for manage inheritance and I am very happy for the flexibility that I can use in the relative application.

Solution 4

With the information provided, I'd model the database to have the following:

POLICIES

  • POLICY_ID (primary key)

LIABILITIES

  • LIABILITY_ID (primary key)
  • POLICY_ID (foreign key)

PROPERTIES

  • PROPERTY_ID (primary key)
  • POLICY_ID (foreign key)

...and so on, because I'd expect there to be different attributes associated with each section of the policy. Otherwise, there could be a single SECTIONS table and in addition to the policy_id, there'd be a section_type_code...

Either way, this would allow you to support optional sections per policy...

I don't understand what you find unsatisfactory about this approach - this is how you store data while maintaining referential integrity and not duplicating data. The term is "normalized"...

Because SQL is SET based, it's rather alien to procedural/OO programming concepts & requires code to transition from one realm to the other. ORMs are often considered, but they don't work well in high volume, complex systems.

Solution 5

The another way to do it, is using the INHERITS component. For example:

CREATE TABLE person (
    id int ,
    name varchar(20),
    CONSTRAINT pessoa_pkey PRIMARY KEY (id)
);

CREATE TABLE natural_person (
    social_security_number varchar(11),
    CONSTRAINT pessoaf_pkey PRIMARY KEY (id)
) INHERITS (person);


CREATE TABLE juridical_person (
    tin_number varchar(14),
    CONSTRAINT pessoaj_pkey PRIMARY KEY (id)
) INHERITS (person);

Thus it's possible to define a inheritance between tables.

Share:
132,972

Related videos on Youtube

Steve Jones
Author by

Steve Jones

Serial Entrepreneur based in London. Working on my latest startup.

Updated on February 18, 2020

Comments

  • Steve Jones
    Steve Jones over 4 years

    I'm thinking about how to represent a complex structure in a SQL Server database.

    Consider an application that needs to store details of a family of objects, which share some attributes, but have many others not common. For example, a commercial insurance package may include liability, motor, property and indemnity cover within the same policy record.

    It is trivial to implement this in C#, etc, as you can create a Policy with a collection of Sections, where Section is inherited as required for the various types of cover. However, relational databases don't seem to allow this easily.

    I can see that there are two main choices:

    1. Create a Policy table, then a Sections table, with all the fields required, for all possible variations, most of which would be null.

    2. Create a Policy table and numerous Section tables, one for each kind of cover.

    Both of these alternatives seem unsatisfactory, especially as it is necessary to write queries across all Sections, which would involve numerous joins, or numerous null-checks.

    What is the best practice for this scenario?

  • D'Arcy Rittich
    D'Arcy Rittich almost 14 years
    +1: 3rd option is the closest to the inheritance model, and most normalized IMO
  • Steve Jones
    Steve Jones almost 14 years
    Your option #3 is really just what I meant by option #2. There are many fields and some Section would have child entities too.
  • Steve Jones
    Steve Jones almost 14 years
    There will be way too much information to present the whole Policy in one go, so it'd never be necessary to retrieve the whole record. I think it is 2005, although I have used 2008's sparse in other projects.
  • Steve Jones
    Steve Jones almost 14 years
    Yeah, I get the normalisation thing ;-) For such a complex structure, with some sections being simple and some having their own complex sub-structure, it seems unlikely that an ORM would work, although it would be nice.
  • Steve Jones
    Steve Jones almost 14 years
    Yes, your third option, "Class Table Inheritance" is what I mentioned as my second option and it likely to be best in this case, imho. It is the only option that has a chance of sensible chance of modelling a non-trivial structure (e.g. some sections have a huge child-entity structure, which some do not).
  • CeejeeB
    CeejeeB almost 13 years
    I agree a very good answer. I am interested though by the fact that option 3 would then have 1 to 1 relationships between the policies table and the policy types tables. What are your thoughts on this with regards to database normalization?
  • Admin
    Admin over 11 years
    I am using this design, too, but you don't mention the drawbacks. Specifically: 1) you say you don't need the type; true but you cannot identify the actual type of a row unless you look at all subtypes tables to find a match. 2) It's hard to keep the master table and the subtype tables in sync (one can e.g. remove the row in the subtype table and not in the master table). 3) You can have more than one subtype for each master row. I use triggers to work around 1, but 2 and 3 are very hard problems. Actually 3 is not a problem if you model composition, but is for strict inheritance.
  • Jo So
    Jo So over 10 years
    +1 for @Tibo's comment, that's a grave problem. Class Table inheritance actually yields an unnormalized schema. Where as Concrete Table inheritance doesn't, and I don't agree with the argument that Concrete Table Inheritance hinders DRY. SQL hinders DRY, because it has no metaprogramming facilities. The solution is to use a Database Toolkit (or write your own) to do the heavy lifting, instead of writing SQL directly (remember, it is actually only a DB interface language). After all, you also don't write your enterprise application in assembly.
  • Andrew
    Andrew over 9 years
    @Tibo, about point 3, you can use the approach explained here: sqlteam.com/article/…, Check the Modeling One-to-Either Constraints section.
  • ThomasBecker
    ThomasBecker over 9 years
    @DanielVassallo Firstly thanks for stunning answer, 1 doubt if a person has a policyId how to know whether its policy_motor or policy_property? One way is to search policyId in all sub Tables but I guess this is the bad way isn't it, What should be the correct approach?
  • Admin
    Admin about 9 years
    @DanielVassallo how do you insert into Class Table Inheritance INSERT INTO policy_property (date_issued) VALUES ('somedate'); That would't work?
  • Admin
    Admin about 9 years
    @JoSo can anybody tell me how to insert into Class Table Inheritance model? For example, INSERT INTO policy_property (date_issued) VALUES (2010-08-20 12:00:00); wouldn't work because date_issued is not a field attribute of policy_property even though it is a field of the superclass policies
  • Adam
    Adam over 8 years
    I really like your third option. However, I'm confused how SELECT will work. If you SELECT * FROM policies, you'll get back policy ids but you still won't know which subtype table the policy belongs to. Won't you still have to do a JOIN with all of the subtypes in order to get all of the policy details?
  • giannis christofakis
    giannis christofakis over 8 years
    Does other DBs support INHERITS besides PostgreSQL ? MySQL for example ?
  • Nathan Williams
    Nathan Williams over 7 years
    @Tibo I presume your suggestion to solve point 1) is to include a type field in the super table and update it with triggers on each of the subtables whenever a subtable id is inserted(/changed/deleted)? This would be similar to Andrew 's solution for point 3) which doesn't appear to need the triggers. Note however neither would not solve point 1) when using one to many inheritance.
  • Admin
    Admin over 7 years
    @NathanWilliams Yes, yes and yes :) I think there is no silver bullet here as we're somehow working against the relational model and each solution has its own pitfalls.
  • a_horse_with_no_name
    a_horse_with_no_name over 7 years
    @giannischristofakis: MySQL is only a relational database, whereas Postgres is an object-relational database. So, no MySQL does not support this. In fact, I think that Postgres is the only current DBMS that supports this type of inheritance.
  • Steve Jones
    Steve Jones almost 7 years
    That's an interesting idea. I haven't used JSON in SQL Server yet, but use it a lot elsewhere. Thanks for the heads up.
  • egemen
    egemen over 6 years
    And you should add primary keys for every tables. stackoverflow.com/questions/840162/…
  • Dener
    Dener over 6 years
    @DanielVassallo May I use the 3º Solution you gave with EntityTypeConfiguration as this solution approach? Inheritance of EntityTypeConfiguration
  • kirodge
    kirodge about 6 years
    As a side note: Entity Framework Core only supports Single Table Inheritance (TPH). docs.microsoft.com/sv-se/ef/core/modeling/relational/… "TPT support is something we plan to implement, but it is a big cross-cutting feature. I can't say that it will be in the next release following 2.1 because there are many other competing features that are also high priority". - Arthur Vickers github.com/aspnet/EntityFrameworkCore/issues/2266
  • Shamal Perera
    Shamal Perera almost 6 years
    @DanielVassallo In single table inheritance, is there any advantage if types are maintained in a separate table with type Id and type name. Then having a foreign key constraint for it.
  • David Ferreira
    David Ferreira over 5 years
    @Tibo A workarround for the point 2 could be to use triggers to control parent and child sync, right?
  • Stephan-v
    Stephan-v about 5 years
    Where is the term "unified section table" coming from? Google shows almost no results for it and there are enough confusing terms here already.
  • mapto
    mapto about 5 years
    @marco-paulo-ollivier, the OP's question is about SQL Server, so I don't understand why you provide a solution that only works with Postgres. Obviously, not addressing the problem.
  • Caius Jard
    Caius Jard over 4 years
    @mapto this question has become something of a "how does one do OO style inheritance in a database" dupe target; that it was originally about sql server is likely now irrelevant
  • formixian
    formixian over 4 years
    I want to mention as well that on the concrete table inheritance scenario, there is the possibility to add a differentiator column to indentify which specialized table the parent table entry relates to. Usually, I use the specialization table name as the differentiator value.
  • Vadim Shvetsov
    Vadim Shvetsov about 4 years
    I agreed that last scenario is the most suitable. But the main drawback I've experienced with it is to work with shared fields along the subtypes that can be mandatory for the first subtype and optional for the second.
  • defraggled
    defraggled almost 3 years
    It seems that INHERITS pg functionality allows much more elegant querying, however the table/columns themselves are still set up exactly in concrete inheritance fashion: repeating all common attributes. I don't think this removes any of the flaws already raised with concrete inheritance
  • C Deuter
    C Deuter almost 3 years
    DRY doesn't apply to db migrations, which the best practice is to make purely additive. The denormalizing tables is a 100% valid strategy, it can greatly improve the performance of full table reads, and it often times greatly simplifies designs.
  • MatBailie
    MatBailie almost 3 years
    That's great for data you don't intend to index... If you intend to use the columns in WHERE clauses, etc, you'll want to index them, and the JSON pattern inhibits you there.