Leaderboard ranking with Firebase

15,084

Solution 1

Finding an arbitrary player's rank in leaderboard, in a manner that scales is a common hard problem with databases.

There are a few factors that will drive the solution you'll need to pick, such as:

  • Total Number players
  • Rate that individual players add scores
  • Rate that new scores are added (concurrent players * above)
  • Score range: Bounded or Unbounded
  • Score distribution (uniform, or are their 'hot scores')

Simplistic approach

The typical simplistic approach is to count all players with a higher score, eg SELECT count(id) FROM players WHERE score > {playerScore}.

This method works at low scale, but as your player base grows, it quickly becomes both slow and resource expensive (both in MongoDB and Cloud Firestore).

Cloud Firestore doesn't natively support count as it's a non-scalable operation. You'll need to implement it on the client-side by simply counting the returned documents. Alternatively, you could use Cloud Functions for Firebase to do the aggregation on the server-side to avoid the extra bandwidth of returning documents.

Periodic Update

Rather than giving them a live ranking, change it to only updating every so often, such as every hour. For example, if you look at Stack Overflow's rankings, they are only updated daily.

For this approach, you could schedule a function, or schedule App Engine if it takes longer than 540 seconds to run. The function would write out the player list as in a ladder collection with a new rank field populated with the players rank. When a player views the ladder now, you can easily get the top X + the players own rank in O(X) time.

Better yet, you could further optimize and explicitly write out the top X as a single document as well, so to retrieve the ladder you only need to read 2 documents, top-X & player, saving on money and making it faster.

This approach would really work for any number of players and any write rate since it's done out of band. You might need to adjust the frequency though as you grow depending on your willingness to pay. 30K players each hour would be $0.072 per hour($1.73 per day) unless you did optimizations (e.g, ignore all 0 score players since you know they are tied last).

Inverted Index

In this method, we'll create somewhat of an inverted index. This method works if there is a bounded score range that is significantly smaller want the number of players (e.g, 0-999 scores vs 30K players). It could also work for an unbounded score range where the number of unique scores was still significantly smaller than the number of players.

Using a separate collection called 'scores', you have a document for each individual score (non-existent if no-one has that score) with a field called player_count.

When a player gets a new total score, you'll do 1-2 writes in the scores collection. One write is to +1 to player_count for their new score and if it isn't their first time -1 to their old score. This approach works for both "Your latest score is your current score" and "Your highest score is your current score" style ladders.

Finding out a player's exact rank is as easy as something like SELECT sum(player_count)+1 FROM scores WHERE score > {playerScore}.

Since Cloud Firestore doesn't support sum(), you'd do the above but sum on the client side. The +1 is because the sum is the number of players above you, so adding 1 gives you that player's rank.

Using this approach, you'll need to read a maximum of 999 documents, averaging 500ish to get a players rank, although in practice this will be less if you delete scores that have zero players.

Write rate of new scores is important to understand as you'll only be able to update an individual score once every 2 seconds* on average, which for a perfectly distributed score range from 0-999 would mean 500 new scores/second**. You can increase this by using distributed counters for each score.

* Only 1 new score per 2 seconds since each score generates 2 writes
** Assuming average game time of 2 minute, 500 new scores/second could support 60000 concurrent players without distributed counters. If you're using a "Highest score is your current score" this will be much higher in practice.

Sharded N-ary Tree

This is by far the hardest approach, but could allow you to have both faster and real-time ranking positions for all players. It can be thought of as a read-optimized version of of the Inverted Index approach above, whereas the Inverted Index approach above is a write optimized version of this.

You can follow this related article for 'Fast and Reliable Ranking in Datastore' on a general approach that is applicable. For this approach, you'll want to have a bounded score (it's possible with unbounded, but will require changes from the below).

I wouldn't recommend this approach as you'll need to do distributed counters for the top level nodes for any ladder with semi-frequent updates, which would likely negate the read-time benefits.

Ternary tree example

Final thoughts

Depending on how often you display the leaderboard for players, you could combine approaches to optimize this a lot more.

Combining 'Inverted Index' with 'Periodic Update' at a shorter time frame can give you O(1) ranking access for all players.

As long as over all players the leaderboard is viewed > 4 times over the duration of the 'Periodic Update' you'll save money and have a faster leaderboard.

Essentially each period, say 5-15 minutes you read all documents from scores in descending order. Using this, keep a running total of players_count. Re-write each score into a new collection called scores_ranking with a new field players_above. This new field contains the running total excluding the current scores player_count.

To get a player's rank, all you need to do now is read the document of the player's score from score_ranking -> Their rank is players_above + 1.

Solution 2

One solution not mentioned here which I'm about to implement in my online game and may be usable in your use case, is to estimate the user's rank if they're not in any visible leaderboard because frankly the user isn't going to know (or care?) whether they're ranked 22,882nd or 22,838th.

If 20th place has a score of 250 points and there are 32,000 players total, then each point below 250 is worth on average 127 places, though you may want to use some sort of curve so as they move up a point toward bottom of the visible leaderboard they don't jump exactly 127 places each time - most of the jumps in rank should be closer to zero points.

It's up to you whether you want to identify this estimated ranking as an estimation or not, and you could add some a random salt to the number so it looks authentic:

// Real rank: 22,838

// Display to user:
player rank: ~22.8k    // rounded
player rank: 22,882nd  // rounded with random salt of 44

I'll be doing the latter.

Solution 3

Alternative perspective - NoSQL and document stores make this type of task overly complex. If you used Postgres this is pretty simple using a count function. Firebase is tempting because it's easy to get going with but use cases like this are when relational databases shine. Supabase is worth a look https://supabase.io/ similar to firebase so you can get going quickly with a backend but its opensource and built on Postgres so you get a relational database.

Share:
15,084
haim
Author by

haim

Updated on June 12, 2022

Comments

  • haim
    haim about 2 years

    I have project that I need to display a leaderboard of the top 20, and if the user not in the leaderboard they will appear in the 21st place with their current ranking.

    Is there efficient way to this?

    I am using Cloud Firestore as a database. I believe it was mistake to choose it instead of MongoDB but I am in the middle of the project so I must do it with Cloud Firestore.

    The app will be use by 30K users. Is there any way to do it without getting all the 30k users?

     this.authProvider.afs.collection('profiles', ref => ref.where('status', '==', 1)
            .where('point', '>', 0)
            .orderBy('point', 'desc').limit(20))
    

    This is code I did to get the top 20 but what will be the best practice for getting current logged in user rank if they are not in the top 20?

  • haim
    haim over 6 years
    wow. best answer i ever saw on stackoverflow. i definitely will need to read your answer another few times to understand how to implement it. thanks for the time to answer.
  • Dan McGrath
    Dan McGrath over 6 years
    Thanks Shimon! No problem, hope it's helpful :)
  • Dan McGrath
    Dan McGrath over 6 years
    Definitely a good thought on the Top 20 to reduce load, although doesn't help with the hard part requested (arbitrary player's rank). One caveat is Cloud Functions guarantee in order execution, which creates a race problem here you'd need to deal with in the Cloud Function.
  • Thaina Yu
    Thaina Yu over 6 years
    Great answer but I would like to add one more solution. It not support in firebase but couchdb can always give offset when query. NoSQL Database that have indexing should keep the shard count and sum to offset when query like couchdb. Sadly not many database support this feature
  • ishandutta2007
    ishandutta2007 about 5 years
    Fantastic analysis. But like most google job interviews your focus has been on optimizing time, but I want optimise for my firebase bills as well. slightly delayed ranking is fine at the promise of saving firebase bills.
  • ishandutta2007
    ishandutta2007 about 5 years
    @Thaina didn't get you. can you elaborate, if possible post it as a separate detailed answer.
  • Thaina Yu
    Thaina Yu about 5 years
    @ishandutta2007 In short, the feature I mention is a feature of couchdb that not exist in firebase. You should google and learn about couchdb
  • ishandutta2007
    ishandutta2007 about 5 years
    @Thaina I thought you were talking about how to implement that feature in firestore.
  • Ralph Yozzo
    Ralph Yozzo over 4 years
    Why would users need write access to highScores? Then, they could write any value. I believe highScores must be written only server side otherwise it could be hacked.
  • Rahul Reddy Vemireddy
    Rahul Reddy Vemireddy about 3 years
    This answer is timeless :). Explained funamentals with analysis so well