RESTful design: when to use sub-resources?


Solution 1

A year later, I ended up with the following compromise (for database rows that contain a unique identifier):

  1. Assign all resources a canonical URI at the root (e.g. /companies/{id} and /employees/{id}).
  2. If a resource cannot exist without another, represent it as a sub-resource of that resource, but treat any operation on the sub-resource URI like a search-engine query: instead of carrying out the operation immediately, return HTTP 307 ("Temporary Redirect") pointing at the canonical URI. Clients will then repeat the operation against the canonical URI.
  3. Your specification document should only expose root resources that match your conceptual model (not ones that depend on implementation details). Implementation details might change (your rows might no longer be uniquely identifiable) but your conceptual model will remain intact. In the above example, you'd tell clients about /companies but not /employees.
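
The redirect in step 2 can be sketched roughly as follows. This is a minimal illustration and not the author's code: the `route` function, the `EMPLOYEE_COMPANY` table and the sample ids are all assumptions made up for the example, with a dictionary standing in for the database.

```python
import re
from typing import Optional, Tuple

# Hypothetical stand-in for a database table: employee id -> owning company.
EMPLOYEE_COMPANY = {"42": "acme"}

# Sub-resource form of the URI; the canonical form is /employees/{id}.
SUB_RESOURCE = re.compile(
    r"^/companies/(?P<company>[^/]+)/employees/(?P<employee>[^/]+)$"
)


def route(path: str) -> Tuple[int, Optional[str]]:
    """Return (status code, Location header) for a sub-resource request."""
    match = SUB_RESOURCE.match(path)
    if match is None:
        return 404, None
    # The single sanity check: does this employee belong to this company?
    if EMPLOYEE_COMPANY.get(match["employee"]) != match["company"]:
        return 404, None
    # Do not perform the operation here; point the client at the canonical URI.
    return 307, f"/employees/{match['employee']}"
```

Under these assumptions, `route("/companies/acme/employees/42")` yields a 307 pointing at `/employees/42`, while the same employee id under a company that does not own it yields a 404 instead of a misleading 200.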

This approach has the following benefits:

  1. It eliminates the need to do unnecessary database look-ups.
  2. It reduces the number of sanity-checks to one per request. At most, I have to check whether an employee belongs to a company, but I no longer have to do two validation checks for /companies/{companyId}/employees/{employeeId}/computers/{computerId}.
  3. It has a mixed impact on database scalability. On the one hand, you reduce lock contention by locking fewer tables for a shorter period of time. On the other hand, you increase the possibility of deadlocks because each root resource must use a different locking order. I have no idea whether this is a net gain or loss, but I take comfort in the fact that database deadlocks cannot be prevented anyway and the resulting locking rules are simpler to understand and implement. When in doubt, opt for simplicity.
  4. Our conceptual model remains intact. By ensuring that the specification document only exposes our conceptual model, we are free to drop URIs containing implementation details in the future without breaking existing clients. Remember, nothing prevents you from exposing implementation details in intermediate URIs so long as your specification declares their structure as undefined.

Solution 2

This is problematic because it's no longer obvious that a user belongs to a particular company.

Sometimes this may highlight a problem with your domain model. Why does a user belong to a company? If I change companies, am I a whole new person? What if I work for two companies? Am I two different people?

If the answer is yes, then why not take some company-unique identifier to access a user?

e.g. username:

company/foo/user/bar

(where bar is my username that is unique within that specific company namespace)

If the answer is no, then why am I not a user (person) by myself, and the company/users collection merely points to me: <link rel="user" uri="/user/1" /> (note: employee seems to be more appropriate)

Now outside of your specific example, I think that resource-subresource relationships are more appropriate when it comes to use rather than ownership (and that's why you're struggling with the redundancy of identifying a company for a user that implicitly identifies a company).

What I mean by this is that users is actually a sub-resource of a company resource, because the use is to define the relationship between a company and its employees - another way of saying that is: you have to define a company before you can start hiring employees. Likewise, a user (person) has to be defined (born) before you can recruit them.
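
To make the "collection merely points to me" idea concrete, here is a small sketch that builds a company's users collection out of links to top-level user resources. The `<link rel="user" uri="..."/>` shape comes from the answer above; the `users` wrapper element, the `company` attribute and the function name are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from typing import List


def company_users_document(company: str, user_ids: List[int]) -> str:
    """Serialize a company's users collection as links to /user/{id}."""
    # Hypothetical wrapper element; only the <link> shape is from the answer.
    root = ET.Element("users", {"company": company})
    for user_id in user_ids:
        # The collection does not embed the user; it only points to it.
        ET.SubElement(root, "link", {"rel": "user", "uri": f"/user/{user_id}"})
    return ET.tostring(root, encoding="unicode")
```

The company resource owns the relationship (the collection), while each person remains addressable on their own, so changing employers changes the links and not the user's identity.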

Solution 3

Your rule for deciding whether a resource should be modeled as a sub-resource is valid. Your problem does not arise from a wrong conceptual model but from letting your database model leak into your REST model.

From a conceptual view, an employee that can only exist within a company relationship is modeled as a composition. The employee can thus only be identified via the company. Now databases come into play and all employee rows get a unique identifier.

My advice is: don't let the database model leak into your conceptual model, because doing so exposes infrastructure concerns through your API. For example, what happens when you decide to switch to a document-oriented database like MongoDB, where you could model your employees as part of the company document and they no longer have this artificial unique id? Would you want to change your API?

To answer your extra questions

How should I represent the fact that a resource belongs to another?

Composition via sub resources, other associations via URL links.

How should I represent the fact that a resource cannot be identified without another?

Use both id values in your resource URL and make sure not to let your database leak into your API by checking if the "combination" exists.
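
One way to check that the "combination" exists is to put both identifiers into the lookup itself, so a valid employee id under the wrong company simply finds nothing. The sketch below uses an in-memory SQLite database; the table layout, column names and sample row are assumptions for illustration and not part of the original answer.

```python
import sqlite3
from typing import Optional, Tuple


def open_demo_db() -> sqlite3.Connection:
    """Create a throwaway database with one hypothetical employee row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees ("
        "id INTEGER PRIMARY KEY, company_id TEXT NOT NULL, name TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO employees (id, company_id, name) VALUES (42, 'acme', 'Alice')"
    )
    return conn


def get_employee(
    conn: sqlite3.Connection, company_id: str, employee_id: int
) -> Tuple[int, Optional[dict]]:
    """Return (status, body) for /companies/{company_id}/employees/{employee_id}."""
    # Both ids take part in the query, so only the exact pair matches.
    row = conn.execute(
        "SELECT name FROM employees WHERE company_id = ? AND id = ?",
        (company_id, employee_id),
    ).fetchone()
    if row is None:
        return 404, None
    return 200, {"name": row[0]}
```

With this shape, the URL /companies/{nonExistingName}/employee/{existingId} from the question would produce a 404 without a separate company lookup.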

What relationships are sub-resources meant and not meant to model?

Sub-resources are well suited for compositions. More generally, they model a resource that cannot exist without its parent resource and always belongs to exactly one parent. Your rule that "when a resource could not exist without another, it should be represented as its sub-resource" is good guidance for this decision.

Solution 4

If a subresource is uniquely identifiable without its owning entity, it is not a subresource and should have its own namespace (i.e. /users/{user} rather than /companies/{*}/users/{user}).

Most importantly: never, ever use your entity's database primary key as the resource identifier. That is the most common way implementation details leak to the outside world. You should always have a natural business key (like username or company-number, rather than user-id or company-id). The uniqueness of such a key can be enforced by a unique constraint, if you wish, but the primary key of an entity should never leave the persistence layer of your application, or at least it should never be an argument to any service method.

If you go by this rule, you shouldn't have any trouble distinguishing between compositions (/companies/{company}/users/{user}) and associations (/users/{user}): if your subresource doesn't have a natural business key that identifies it in a global context, you can be very certain it really is a dependent subresource (or you must first create a business key to make it globally identifiable).
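
As a rough sketch of this rule, the repository below keeps the surrogate primary key entirely inside the persistence layer and exposes only the business key (a username) to callers and URIs. The class, its method names and the in-memory storage are all hypothetical and only illustrate the idea.

```python
from typing import Dict, Optional


class UserRepository:
    """Toy persistence layer: the primary key never leaves this class."""

    def __init__(self) -> None:
        self._rows: Dict[int, dict] = {}          # primary key -> row
        self._pk_by_username: Dict[str, int] = {}  # acts as a unique constraint
        self._next_pk = 1

    def add(self, username: str, full_name: str) -> None:
        if username in self._pk_by_username:
            raise ValueError(f"username already taken: {username}")
        pk = self._next_pk
        self._next_pk += 1
        self._rows[pk] = {"username": username, "full_name": full_name}
        self._pk_by_username[username] = pk

    def find_by_username(self, username: str) -> Optional[dict]:
        pk = self._pk_by_username.get(username)
        # Return a copy that contains no primary key at all.
        return None if pk is None else dict(self._rows[pk])


def user_uri(username: str) -> str:
    """Build the resource URI from the business key only."""
    return f"/users/{username}"
```

Service methods and URIs here take the username as their argument; nothing outside `UserRepository` can even observe the numeric key.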

Solution 5

This is one way you can resolve this situation:

/companies/{companyName}/employee/{employeeId} -> returns data about an employee, should also include the person's data

/person/{peopleId} -> returns data about the person

Talking about an employee makes no sense without also talking about the company, but talking about the person does make sense even without a company, and even if that person is hired by multiple companies. A person's existence is independent of whether any company has hired them, but an employment's existence does depend on the company.
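
A small sketch of the two resources, assuming in-memory dictionaries in place of real storage. The field names (`title`, `personId`, `personUri`) and the sample data are invented for illustration; only the two URI shapes come from the answer.

```python
from typing import Optional

# Hypothetical data: people exist on their own, employments exist per company.
PEOPLE = {"7": {"name": "Alice"}}
EMPLOYMENTS = {("acme", "e1"): {"personId": "7", "title": "Engineer"}}


def get_person(person_id: str) -> Optional[dict]:
    """Representation for /person/{peopleId}."""
    return PEOPLE.get(person_id)


def get_employee(company_name: str, employee_id: str) -> Optional[dict]:
    """Representation for /companies/{companyName}/employee/{employeeId}."""
    employment = EMPLOYMENTS.get((company_name, employee_id))
    if employment is None:
        return None
    person_id = employment["personId"]
    return {
        "title": employment["title"],
        # The employee representation also includes the person's data...
        "person": PEOPLE[person_id],
        # ...and points at the independent person resource.
        "personUri": f"/person/{person_id}",
    }
```

The same person could appear under several companies with different employment records, while `/person/7` stays the single home for data that is not tied to any one employer.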

Author by

Gili

Email: cowwoc2020 at gmail dot com.

Updated on July 08, 2022

Comments

  • Gili
    Gili almost 2 years

    When designing resource hierarchies, when should one use sub-resources?

    I used to believe that when a resource could not exist without another, it should be represented as its sub-resource. I recently ran across this counter-example:

    • An employee is uniquely identifiable across all companies.
    • An employee's access control and life-cycle depend on the company.

    I modeled this as: /companies/{companyName}/employee/{employeeId}

    Notice, I don't need to look up the company in order to locate the employee, so should I? If I do, I'm paying a price to look up information I don't need. If I don't, this URL mistakenly returns HTTP 200:

    /companies/{nonExistingName}/employee/{existingId}

    1. How should I represent the fact that a resource belongs to another?
    2. How should I represent the fact that a resource cannot be identified without another?
    3. What relationships are sub-resources meant and not meant to model?
  • Gili
    Gili over 11 years
    I wish your answer was more concise, but in any case you nailed it. The key is to define: /companies/{companyName}/users and /users/{id} because looking up users associated with a company requires {companyName} but looking up individual users does not, hence the user is a top-level resource. Thank you!
  • Gili
    Gili over 11 years
    I tried playing this game a few years ago but I no longer attempt to design database-agnostic software. The cost/benefit simply isn't worth it, seeing how rarely we change databases and some implementation detail always leaks through (e.g. not all databases use integer IDs). You might as well do the best job possible using the current database. As a bonus you end up with simpler code. Final point: I could always retain backwards-compatibility after a migration by storing the original id in the new database schema.
  • Gili
    Gili over 11 years
    I'm sorry if my last comment came off as negative. I agree with most of your answer. Trading database portability for performance is a subjective decision that doesn't really affect the answer. I like the bit you wrote about composition via sub-resources, associations via URL links.
  • saintedlama
    saintedlama over 11 years
    With "not to let your database leak into your API" I meant: don't let the database model leak into your API model. In the question's context that means: don't worry if you have to load the company to load an employee, because the schema company->employee reflects your conceptual model best.
  • Gili
    Gili over 11 years
    True. I was worried about unnecessary coupling between company and employee but in this case it seems quite appropriate.
  • Daniel
    Daniel almost 11 years
    @Gili: I was having a similar problem and your approach makes sense to me. /companies/{companyName}/users shows all the users belonging to a particular company and /users shows all the users in the system. However, should I be able to identify a single user as both /companies/{companyName}/users/{id} and /users/{id}? Wouldn't that be redundant?
  • Gili
    Gili almost 11 years
    @Daniel, it's not redundant. When someone requests HTTP GET /companies/{companyName}/users/{id} you should return HTTP 303 ("See Other") pointing to /users/{id}. The former is an alias. The latter is the canonical URI.
  • Doug Moscrop
    Doug Moscrop almost 11 years
    I don't even know how necessary that redirect is. The server controls the URIs and the client should have no idea. If the server wants to serve up a direct representation at that point, it can. It cannot make any difference to the client except for potential cache misses.
  • Gili
    Gili over 10 years
    Interesting idea, but it doesn't actually answer the question. It's just that a person is not strictly the same as the employee (which is meant to be implied by the question), hence you're not answering whether you should be able to refer to an employee without the company name.
  • Lie Ryan
    Lie Ryan over 10 years
    @Gili: for the answer to that, no you shouldn't; the point of this distinction is that it makes no sense talking about the employee without also talking about the company. Data that are not highly tied to the company should live in the person resource, the person resource isn't a subresource of company nor employee, it's an independent, freestanding entity.
  • Kai
    Kai about 10 years
    As a general rule: always consider the primary key an implementation detail of your persistence layer.
  • Gili
    Gili about 10 years
    I disagree with you for the following reasons: 1. The reason the community moved away from natural business keys in the first place is because they change over time. 2. If you use a non-business key, there isn't a reason it would ever change (even if you change databases). 3. Natural keys make it impossible to implement idempotent operations. Someone can invoke PUT /companies/Nintendo/ at the same time that someone deletes and creates a new Nintendo. If the first client retries a PUT operation (idempotency) he has no way of detecting that the underlying instance has changed.
  • Dave
    Dave over 8 years
    Also - if you have a recursive resource (think hierarchy of tags), you cannot use the business key as the key in the URL because the real business key would contain the entire path and could exceed URL length limitations.
  • Hossein Shahdoost
    Hossein Shahdoost almost 8 years
    A simpler way to decide is to keep entities with composite keys as sub resources
  • Gili
    Gili almost 8 years
    @Sub-Zero How is this relevant to the question/answer? I don't see any composite keys here, do you?
  • Hossein Shahdoost
    Hossein Shahdoost almost 8 years
    :D, The title of the question is "when to use sub-resources?", and since resources are mostly a presentation of our entities, I think an easy way to decide is to use sub-resources only for entities with a composite key, since they need two keys to be accessed and one key is always the PK of another entity.
  • Gili
    Gili almost 8 years
    @Sub-Zero I think there are downsides to your proposal but in any case please post this as a separate answer instead of commenting on this one. Thank you.
  • Ryall
    Ryall almost 7 years
    @Gili, shouldn't the canonical redirect be a 301 permanent? It's always going to be a redirect and this allows it to be cached.
  • Gili
    Gili almost 7 years
    @Ryall not necessarily. Today /companies/ComputerCentral might map to /companies/1 but 10 years from now maybe the company went bankrupt and an unrelated company took its name so /companies/ComputerCentral now maps to /companies/2.
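
The point in the last comment, that a name-based alias can map to a different canonical resource over time, can be sketched like this. The function name and the `names` mapping are assumptions for illustration; the 303 status follows the earlier comment about answering a GET on an alias with "See Other".

```python
from typing import Dict, Optional, Tuple


def redirect_for_alias(
    names: Dict[str, int], company_name: str
) -> Tuple[int, Optional[str]]:
    """Resolve /companies/{name} to its canonical /companies/{id} for a GET."""
    # The lookup happens on every request, so the answer follows the data.
    company_id = names.get(company_name)
    if company_id is None:
        return 404, None
    # A non-permanent redirect: the alias may map elsewhere in the future.
    return 303, f"/companies/{company_id}"
```

If the mapping for ComputerCentral changes from 1 to 2, the next request is redirected to /companies/2; a client that had cached a permanent 301 to /companies/1 could keep following the stale target.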