Database design for a chat system

13,625

100 Billions rows in one table will never work online. Not only all possible partitioning ways are applied to reduce the sizes, but also separation of active/passive data strategies. But nevertheless all the high maters, the answer:

Postgres is indeed effective working with big data itself.

and yet:

Postgres has not effective enough strategy to fight poor design

Look at your example: table chat_room lists two users in separate columns - what for? You have user_id in messages referencing users.id. And you have chat_room.id in it, so you have data which users were in that chat_room. Now if your idea was to pre-aggregate which users participated in chat_room over time or at all, make it one array column, like (chat_room.id int, users_id bigint[]) or if you want join time and leave time, add corresponding attributes. active/passive data can be implemented using archived chat_rooms in different relation then active ones. Btw aggregation on who participated in that chatroom can be performed on such archiving...

Above is not instructions for action, just expression. There is no best practice for database schema. First make a clear plan what your chat will do, then make db schema, try it, improve, try, improve, try, improve and so on, until everything works. If you have concerns on how it will work with 100 billions of rows - fill it up and check...

Share:
13,625
Vipin Chaudhary
Author by

Vipin Chaudhary

I am a 3rd year Btech Student in IIIT Delhi, I am a competitive programmer, IOS, Android, Web Developer. Did 6 internships so far in 3 years of my Btech Currently hired as intern in Hindustan Media Limited and Learnix edutech and Hypothizer as well. I am Computer vision and deep learning enthusiast :) I have working knowledge experience in more than 15 programming langauges and many Frameworks like Django, Android, CodeIgnitor etc

Updated on June 04, 2022

Comments

  • Vipin Chaudhary
    Vipin Chaudhary almost 2 years

    I know there is a lot of posts out there discussing Db design for a chat system, but they didn't explain anything about the scalability of that design, so here my question.

    I want to design a Db of a real-time chat between 2 or more users, let's take 2 users first, here what I came up with.

    Table 1:

    name: User

    fields: id, name

    Table 2

    name: Chat Room

    fields: id, user1, user2

    Table 3:

    name: Message

    fields: Chat_room_id, user_id, message

    Now considering Facebook in mind, it has around 2 billion active users per month and let say 1 billion of them indulge in chatting and each user sends 100 messages.

    which make 100 Billion entries in table: Message, so the question is,

    "Will Mysql or Postgres be able to handle this much of entries and show particular chat room messages in real-time ?" if not then what should be the best practice to follow that, I know that it also depends on the server on which RDBMS is installed but still want to know the optimum architecture.

    PS: I am using Django as backend and AngularJs for asynchronous behavior