What's the best way of structuring data on Firebase?

Solution 1

UPDATE: There is now a doc on structuring data. Also, see this excellent post on NoSQL data structures.

The main issue with hierarchical data, as opposed to RDBMS, is that it's tempting to nest data because we can. Generally, you want to normalize data to some extent (just as you would do with SQL) despite the lack of join statements and queries.

You also want to denormalize in places where read efficiency is a concern. This is a technique used by all the large-scale apps (e.g. Twitter and Facebook), and although it goes against our DRY principles, it's generally a necessary feature of scalable apps.

The gist here is that you want to work hard on writes to make reads easy. Keep logical components that are read separately in separate paths (e.g. for chat rooms, don't put the messages, the meta info about the rooms, and the lists of members all in the same place if you want to be able to iterate the groups later).
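
To make the chat-room advice concrete, here is a minimal sketch of a write that keeps room metadata, member lists, and messages under separate top-level paths. The path names (`room_meta`, `room_members`, `room_messages`) and the helper `buildNewRoomUpdate` are hypothetical, chosen only for illustration. The function just builds a plain object, on the assumption that it would then be written in one call with the SDK's multi-path `update()`.

```javascript
// Hypothetical helper: builds a multi-path update that creates a chat room
// with its metadata, member list, and first message stored in separate paths.
function buildNewRoomUpdate(roomId, roomName, creatorUid, firstMessage) {
  const messageId = 'msg1'; // assumption: a real app would use a push() key here
  return {
    [`room_meta/${roomId}/name`]: roomName,
    [`room_members/${roomId}/${creatorUid}`]: true,
    [`room_messages/${roomId}/${messageId}`]: {
      sender: creatorUid,
      text: firstMessage,
    },
  };
}

const roomUpdate = buildNewRoomUpdate('room1', 'General', 'uid1', 'Hello!');
console.log(Object.keys(roomUpdate));

// With an initialized SDK, the whole object could be written in one call:
// firebase.database().ref().update(roomUpdate);
```

The write touches three places, which is the "work hard on writes" part. In exchange, a client can list rooms from `room_meta` without downloading a single message.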

The primary difference between Firebase's real-time data and a SQL environment is querying data. There's no simple way to say "SELECT USERS WHERE X = Y" because of the real-time nature of the data (it's constantly changing, sharding, reconciling, etc., which requires a simpler internal model to keep the synchronized clients in check). Later SDK releases added basic queries such as orderByChild() and equalTo(), but nothing approaching the flexibility of SQL.

A simple example will probably set you in the right state of mind, so here goes:

/users/uid
/users/uid/email
/users/uid/messages
/users/uid/widgets

Now, since we're in a hierarchical structure, if I want to iterate users' email addresses, I do something like this:

// Assumes the Firebase SDK is already initialized; firebaseRef is the database root
const firebaseRef = firebase.database().ref();

// I could also use on('child_added') here to great success
// but this is simpler for an example
firebaseRef.child('users').once('value')
   .then(userPathSnapshot => {
      userPathSnapshot.forEach(userSnap => {
         console.log('email', userSnap.val().email);
      });
   })
   .catch(e => console.error(e));

The problem with this approach is that I have just forced the client to download all of the users' messages and widgets too. No biggie if none of those things number in the thousands, but a big deal for 10k users with upwards of 5k messages each.

So now the optimal strategy for a hierarchical, real-time structure becomes more obvious:

/user_meta/uid/email
/messages/uid/...
/widgets/uid/...
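
A rough, SDK-free way to see the difference is to compare how much JSON a client would have to read in each layout just to list email addresses. The sample data below is made up for illustration, and the size comparison is only a stand-in for real download cost.

```javascript
// Made-up sample data: two users, each with a few messages nested under them.
const nested = {
  users: {
    uid1: { email: 'a@example.com', messages: { m1: 'hi', m2: 'hello', m3: 'hey' } },
    uid2: { email: 'b@example.com', messages: { m4: 'yo', m5: 'sup' } },
  },
};

// The same data split into separate top-level paths.
const flattened = { user_meta: {}, messages: {} };
for (const [uid, user] of Object.entries(nested.users)) {
  flattened.user_meta[uid] = { email: user.email };
  flattened.messages[uid] = user.messages;
}

// Listing emails from the nested layout means reading all of /users;
// with the flattened layout only /user_meta is read.
const nestedSize = JSON.stringify(nested.users).length;
const flattenedSize = JSON.stringify(flattened.user_meta).length;
console.log({ nestedSize, flattenedSize });

const emails = Object.values(flattened.user_meta).map(meta => meta.email);
console.log(emails);
```

The gap grows with every message a user receives, while the size of `user_meta` depends only on the number of users.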

Indices are another extremely useful tool in this environment. By creating an index of users with certain attributes, I can quickly simulate a SQL query by simply iterating the index:

/users_with_gmail_accounts/uid/email

Now if I want to, say, get messages for gmail users, I can do something like this:

const ref = firebase.database().ref('users_with_gmail_accounts');
ref.once('value').then(idx_snap => {
   idx_snap.forEach(idx_entry => {
      // snapshot.key replaces the name() method of older SDKs
      const uid = idx_entry.key;
      const msg = uid + ' has a new message!';
      firebase.database().ref('messages').child(uid)
         .on('child_added', ss => console.log(msg, ss.key));
   });
})
.catch(e => console.error(e));
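
An index like this only helps if it is kept in sync with the user records it mirrors. The following is a minimal sketch of one way to do that at write time. The helper `buildEmailUpdate` is hypothetical; the paths are the ones used above. It relies on the behavior that a `null` value in an update removes that path.

```javascript
// Hypothetical helper: returns a multi-path update that stores a user's email
// and adds or removes the user from the Gmail index in the same write.
function buildEmailUpdate(uid, email) {
  const isGmail = email.toLowerCase().endsWith('@gmail.com');
  return {
    [`user_meta/${uid}/email`]: email,
    // A null value removes the path, so non-Gmail users drop out of the index.
    [`users_with_gmail_accounts/${uid}/email`]: isGmail ? email : null,
  };
}

const gmailUpdate = buildEmailUpdate('uid1', 'alice@gmail.com');
const otherUpdate = buildEmailUpdate('uid2', 'bob@example.com');
console.log(gmailUpdate, otherUpdate);

// With an initialized SDK: firebase.database().ref().update(gmailUpdate);
```

Writing both paths in a single update avoids a window in which the record and the index disagree.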

I offered some details in another SO post about denormalizing data, so check those out as well. I see that Frank already posted Anant's article, so I won't reiterate that here, but it's also a great read.

Solution 2

Firebase is very much not like a relational database. If you want to compare it to anything, I'd compare it to a hierarchical database.

Anant recently wrote a great post over on the Firebase blog about denormalizing your data: https://www.firebase.com/blog/2013-04-12-denormalizing-is-normal.html

I'd indeed suggest keeping the "ID" of each application as a child of each applicant.

Solution 3

Your scenario looks like a one-to-many relationship in the relational world: as per your example, an applicant has many applications. This is why we need denormalization. Done the Firebase NoSQL way, it looks like the structure below, which should scale without performance issues.

applicants: {
    applicant1: {
        .
        .
        applications: {
            application1: true,
            application3: true
        }
    },
    applicant2: {
        .
        .
        applications: {
            application2: true,
            application4: true
        }
    }
}

applications: {
    application1: {
        .
        .
    },
    application2: {
        .
        .
    },
    application3: {
        .
        .
    },
    application4: {
        .
        .
    }
}
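
As a rough sketch of how this structure might be used, the code below works on a plain in-memory object shaped like the tree above. The field values, the reverse index under each application (`applicants`), and the helpers `applicationsFor` and `buildDeleteApplicationUpdate` are all assumptions added for illustration. They are not part of the original answer or of the Firebase SDK.

```javascript
// Made-up sample tree in the shape shown above, plus a reverse index
// (applications/<id>/applicants) so deletes know which applicants to update.
const db = {
  applicants: {
    applicant1: { name: 'Ann', applications: { application1: true, application3: true } },
    applicant2: { name: 'Bob', applications: { application2: true } },
  },
  applications: {
    application1: { title: 'First', applicants: { applicant1: true } },
    application2: { title: 'Second', applicants: { applicant2: true } },
    application3: { title: 'Third', applicants: { applicant1: true } },
  },
};

// "Join": follow the keys stored under the applicant to the application records.
function applicationsFor(tree, applicantId) {
  const index = tree.applicants[applicantId]?.applications ?? {};
  return Object.keys(index)
    .filter(id => id in tree.applications) // skip dangling keys
    .map(id => ({ id, ...tree.applications[id] }));
}

// Hypothetical helper: multi-path update that deletes an application and
// removes its key from every applicant that references it.
function buildDeleteApplicationUpdate(tree, applicationId) {
  const update = { [`applications/${applicationId}`]: null };
  const owners = tree.applications[applicationId]?.applicants ?? {};
  for (const applicantId of Object.keys(owners)) {
    update[`applicants/${applicantId}/applications/${applicationId}`] = null;
  }
  return update;
}

console.log(applicationsFor(db, 'applicant1').map(app => app.title));
console.log(buildDeleteApplicationUpdate(db, 'application1'));
```

Against a real database, the object returned by `buildDeleteApplicationUpdate` could be passed to `update()` on the root reference, so the record and the keys pointing at it are removed together.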

Author: hopper

Updated on April 29, 2020

Comments

  • hopper about 4 years

    I am new to Firebase and I want to know the best way of structuring data on it.

    I have a simple example:

    There are Applicants and Applications in my project. One applicant can have several applications. How can I relate these two objects in Firebase? Does it work like a relational database, or does the approach need to be completely different in terms of data design?

  • hopper about 11 years
    Thanks Frank! This is really helpful. Exactly what I was looking for!
  • hopper about 11 years
    Thanks for this insight, Kato!
  • Kato over 10 years
    For the time being. The views in the v2 release of Firebase will contain some great capabilities for automating that process.
  • Tommie C. over 7 years
    Good, but I have a follow-on: how do we create this structure from Swift or anywhere else using the Firebase SDK? Also, how can we validate, using the Firebase validation rules, that new data added to the applications node actually exists in the applications list?
  • Satish Sojitra over 7 years
    @prateep, good example. But there is an issue when I delete the path applications/application1 while application1 is still a child of some applicants: I would then try to access an application that is no longer there. So you need to update the indexes in both places, like application1:{ applicants:{applicant1: true} ...}, so that when I delete application1 I can check its child applicants and update each applicant's child node for that application. :)
  • owiewio about 4 years
    Aware that I'm resurrecting an old comment thread here, but I'm struggling to find a more up-to-date solution. Is this still the best approach, i.e. getting all users_with_gmail_accounts and then running a forEach?