{"id":1859,"date":"2015-12-12T15:05:37","date_gmt":"2015-12-12T15:05:37","guid":{"rendered":"http:\/\/www.doanduyhai.com\/blog\/?p=1859"},"modified":"2017-08-10T16:45:55","modified_gmt":"2017-08-10T16:45:55","slug":"killrchat-data-model-design","status":"publish","type":"post","link":"https:\/\/www.doanduyhai.com\/blog\/?p=1859","title":{"rendered":"KillrChat Data Model Design"},"content":{"rendered":"<p>In this post, we&#8217;re digging into <strong>KillrChat<\/strong> Cassandra data model<\/p>\n<p><strong>KillrChat<\/strong> data model focus on 3 main components:<\/p>\n<ul>\n<li>users<\/li>\n<li>chat rooms<\/li>\n<li>chat messages<\/li>\n<\/ul>\n<h1>I Users management<\/h1>\n<p>For a successful chat application, the number of users will grow fast. To be able to scale out with this growth, we need to choose a partition key that will scale out nicely. A good candidate is the user <strong>login<\/strong>.<\/p>\n<div id=\"attachment_1860\" style=\"width: 906px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/UserManagement.png\"><img aria-describedby=\"caption-attachment-1860\" loading=\"lazy\" class=\"size-full wp-image-1860\" src=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/UserManagement.png\" alt=\"User Distribution\" width=\"896\" height=\"420\" srcset=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/UserManagement.png 896w, https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/UserManagement-300x141.png 300w\" sizes=\"(max-width: 896px) 100vw, 896px\" \/><\/a><p id=\"caption-attachment-1860\" class=\"wp-caption-text\">User Distribution<\/p><\/div>\n<p>The <strong>users<\/strong> table is quite simple and straightforward:<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\nCREATE TABLE killrchat.users(\r\n   login text,\r\n   pass text, \/\/password is not allowed as column name because reserved word\r\n   lastname text,\r\n   firstname text,\r\n   bio text,\r\n   email text,\r\n   chat_rooms set&lt;text&gt;,\r\n   PRIMARY KEY(login));\r\n<\/pre>\n<p>The corresponding Java mapping for this table:<\/p>\n<pre class=\"brush: java; light: true; title: ; notranslate\" title=\"\">\r\n@Table(keyspace = KEYSPACE, name = USERS)\r\npublic class UserEntity {\r\n\r\n    @PartitionKey\r\n    private String login;\r\n\r\n    @NotEmpty\r\n    @Column\r\n    private String pass;\r\n\r\n    @Column\r\n    private String firstname;\r\n\r\n    @Column\r\n    private String lastname;\r\n\r\n    @Column\r\n    private String email;\r\n\r\n    @Column\r\n    private String bio;\r\n\r\n    @Column(name = &quot;chat_rooms&quot;)\r\n    private Set&lt;String&gt; chatRooms = new HashSet&lt;&gt;();\r\n\r\n    \/\/Omitted getters &amp; setters\r\n<\/pre>\n<p>Please notice the columns <em>chatRooms<\/em> in the table <strong>users<\/strong>. This columns will store a list of chat rooms to which the current user has registered. <strong>We use a set to store all the chat room ids instead of using a separated table<\/strong>. Clearly, collections in CQL are suited for this use case where the cardinality of the set is pretty reduce (it does not make sense for an user to register to 10 <sup>6<\/sup> chat rooms). Furthermore, since the firstname and lastname of an user is immutable (unless your application decides to allow people changing their name), we do not have to take care of update scenarios.<\/p>\n<p>To avoid multiple users creating an account with the same login (partition key), we rely on Cassandra <strong>LightWeight Transaction<\/strong>. <strong>The operation is expensive but the trade-off is still acceptable with regards to the chat usage lifecycle. After all you only create account once<\/strong>.<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\nINSERT INTO killrchat.users(login,...) VALUES(\u2018jdoe\u2019,...)IF NOT EXISTS;\r\n<\/pre>\n<hr\/>\n<h1>II Chatroom data model<\/h1>\n<p>To scale the chat rooms number nicely, we apply the same approach as with the users. Let\u2019s use <strong>room name<\/strong> as partition key.<\/p>\n<div id=\"attachment_1862\" style=\"width: 899px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ChatRoomDistribution.png\"><img aria-describedby=\"caption-attachment-1862\" loading=\"lazy\" src=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ChatRoomDistribution.png\" alt=\"ChatRoom Distribution\" width=\"889\" height=\"421\" class=\"size-full wp-image-1862\" srcset=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ChatRoomDistribution.png 889w, https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ChatRoomDistribution-300x142.png 300w\" sizes=\"(max-width: 889px) 100vw, 889px\" \/><\/a><p id=\"caption-attachment-1862\" class=\"wp-caption-text\">ChatRoom Distribution<\/p><\/div>\n<p>The <strong>chat_rooms<\/strong> table skeleton is defined below:<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\nCREATE TABLE killrchat.chat_rooms( room_name text,\r\n    room_name text,\r\n    creation_date timestamp,\r\n    banner text,\r\n    creator ???,\r\n    creator_login text, \r\n    participants ???,\r\n    PRIMARY KEY(room_name));\ufffc\r\n<\/pre>\n<p>Whenever an user enters a chat room, we should load:<\/p>\n<ul>\n<li>the room details<\/li>\n<li>the room creator details <\/li>\n<li>the list of all current participants in the room<\/li>\n<\/ul>\n<p>Room details are pretty easy to store. For room creator, we can either persist only the user login and issue an additional select to fetch creator details or de-normalize. The same logic applies to the list of participants, either persisting only participants login and for each of them, issue a SELECT to load his\/her details or de-normalize.<\/p>\n<p>Clearly, persisting only user and participants login will raise the N+1 SELECT issue, so the choice of de-normalizing is obvious. For this, we create an user-defined type user and re-use it in the <strong>chat_rooms<\/strong> table<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\nCREATE TYPE killrchat.user(\r\n   login text,\r\n   firstname text, \r\n   lastname text);\r\n\r\nCREATE TABLE killrchat.chat_rooms(\r\n   ...\r\n   creator frozen&lt;user&gt;,\r\n   ...\r\n   participants set&lt;frozen&lt;user&gt;&gt;,\r\n   PRIMARY KEY(room_name));\r\n<\/pre>\n<p>The corresponding Java mapping for this table:<\/p>\n<pre class=\"brush: java; light: true; title: ; notranslate\" title=\"\">\r\n\r\n@Table(keyspace = KEYSPACE, name = CHATROOMS)\r\npublic class ChatRoomEntity {\r\n    @PartitionKey\r\n    @Column(name = &quot;room_name&quot;)\r\n    private String roomName;\r\n\r\n    @Column\r\n    @Frozen\r\n    private LightUserModel creator;\r\n\r\n    @Column(name = &quot;creator_login&quot;)\r\n    private String creatorLogin;\r\n\r\n    @Column(name = &quot;creation_date&quot;)\r\n    private Date creationDate;\r\n\r\n    @Column\r\n    private String banner;\r\n\r\n    @Column\r\n    @Frozen(&quot;set&lt;frozen&lt;user&gt;&gt;&quot;)\r\n    private Set&lt;LightUserModel&gt; participants = new HashSet&lt;&gt;();\r\n<\/pre>\n<p>Again, as for users, chat room creation will use <strong>LightWeight Transaction<\/strong> to guarantee unicity constraint.<\/p>\n<hr\/>\n<h1>III Chat rooms participants management<\/h1>\n<p>There are some tricky scenarios with chat rooms participants to handle, especially in a fully distributed architecture not using global lock.<\/p>\n<p>Indeed, this architecture raises some interesting questions about concurrency issues:<\/p>\n<ol>\n<li>what happens if a participant just joins a room when it is deleted ? How can we remove this room name from this participant chat rooms list ?<\/li>\n<li>what happens if a participant just leaves a room when it is deleted ?<\/li>\n<li>Same question as above how can we guarantee that only the room creator can delete his own rooms and not someone else ?<\/li>\n<\/ol>\n<h3>A. Participant joining room<\/h3>\n<p>Let\u2019s focus first on the participant joining a \u201c<em>just deleted<\/em>\u201d room scenario. Naively, to add a participant to a room, we can issue this CQL statement:<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\n    UPDATE killrchat.chat_rooms \r\n    SET participants = participants + {...} \r\n    WHERE room_name = \u2018games\u2019;\r\n<\/pre>\n<p>In that case, with a concurrent room deletion by the creator, depending on the ordering of both statements, we can have data corruption as shown below:<\/p>\n<div id=\"attachment_1870\" style=\"width: 1081px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ParticipantJoiningRoomIssue.png\"><img aria-describedby=\"caption-attachment-1870\" loading=\"lazy\" src=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ParticipantJoiningRoomIssue.png\" alt=\"Participant Joining Room Issue\" width=\"1071\" height=\"495\" class=\"size-full wp-image-1870\" srcset=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ParticipantJoiningRoomIssue.png 1071w, https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ParticipantJoiningRoomIssue-300x139.png 300w, https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ParticipantJoiningRoomIssue-1024x473.png 1024w\" sizes=\"(max-width: 1071px) 100vw, 1071px\" \/><\/a><p id=\"caption-attachment-1870\" class=\"wp-caption-text\">Participant Joining Room Issue<\/p><\/div>\n<p>The creator just deletes the chat room \u201c<em>games<\/em>\u201d (e.g. <strong>creating tombstones on all columns<\/strong>) and right after that, we update the participant list, thus just adding an extra column with the new participant details. The final result is that the chat room \u201c<em>games<\/em>\u201d still exists but with only 1 participant and no other information (<strong>creator = null, banner = null, creation_date = null, \u2026<\/strong>).<\/p>\n<p>To prevent this data corruption, we use again <strong>LightWeight Transaction<\/strong>.<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\n    UPDATE killrchat.chat_rooms \r\n    SET participants = participants + {...} \r\n    WHERE room_name = \u2018games\u2019 IF EXISTS;\r\n<\/pre>\n<p>The <strong>IF EXISTS<\/strong> clauses will reject the update if the room has been already removed.<\/p>\n<h3>B. Participant leaving room<\/h3>\n<p>What\u2019s about a participant leaving a room right after it is deleted ? Can we again rely on <strong>LightWeight Transaction<\/strong> to handle it properly ? The answer is <strong>Yes<\/strong>, but it\u2019s not necessary. Indeed, removing in CQL means creating tombstone markers. So no matter in which order we remove data (removing a room first and then removing a participant or removing a participant first and then the chat room), the result is identical, we\u2019ll have tombstone columns. This particular scenario does not require <strong>LightWeight Transaction<\/strong> because no data corruption is possible.<\/p>\n<div id=\"attachment_1872\" style=\"width: 1081px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ParticipantLeavingRoom.png\"><img aria-describedby=\"caption-attachment-1872\" loading=\"lazy\" src=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ParticipantLeavingRoom.png\" alt=\"Participant Leaving Room\" width=\"1071\" height=\"483\" class=\"size-full wp-image-1872\" srcset=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ParticipantLeavingRoom.png 1071w, https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ParticipantLeavingRoom-300x135.png 300w, https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ParticipantLeavingRoom-1024x462.png 1024w\" sizes=\"(max-width: 1071px) 100vw, 1071px\" \/><\/a><p id=\"caption-attachment-1872\" class=\"wp-caption-text\">Participant Leaving Room<\/p><\/div>\n<h3>C. Removing a chat room<\/h3>\n<p>Last but not least, removing the entire chat room. For this, there are 2 strong requirements:<\/p>\n<ol>\n<li>only the creator can remove his own rooms<\/li>\n<li>we must also, upon chat room deletion, remove this chat room name from the chat room list of all its <strong>current<\/strong> participants<\/li>\n<\/ol>\n<p>The appropriate CQL statement for a safe chat rooms removal is<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\n    DELETE killrchat.chat_rooms\r\n    WHERE room_name = \u2018games\u2019\r\n    IF creator_login = &lt;current_user_login&gt; \r\n    AND participants = {...};\r\n<\/pre>\n<p>Again, we use <strong>LightWeight Transaction<\/strong> to enforce some invariants. <\/p>\n<p>The <em>IF creator_login = \u2026<\/em> condition ensures that only a creator can delete his own room. How can we inject the <em>creator_login<\/em> value into this query ? By fetching it from the <strong>Spring Security context<\/strong>! Indeed the <strong>Spring Security<\/strong> authentication process gives us a strong guarantee about an user identity, no cheating\/hacking is possible and we can safely use this security context login to compare with the actual <em>creator_login<\/em> stored in <strong>Cassandra<\/strong>.<\/p>\n<p>The second condition about participants will ensure that when deleting a room, we do not miss any new entering participants. Indeed, upon room deletion we must update each participant room list to remove the current room. Any participant that concurrently joins the room when it is being deleted may not be taken into account, unless we rely on <strong>LightWeight Transaction<\/strong>.<\/p>\n<div id=\"attachment_1873\" style=\"width: 1101px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/RemovingAChatRoom.png\"><img aria-describedby=\"caption-attachment-1873\" loading=\"lazy\" src=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/RemovingAChatRoom.png\" alt=\"Removing A Chat Room\" width=\"1091\" height=\"505\" class=\"size-full wp-image-1873\" srcset=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/RemovingAChatRoom.png 1091w, https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/RemovingAChatRoom-300x139.png 300w, https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/RemovingAChatRoom-1024x474.png 1024w\" sizes=\"(max-width: 1091px) 100vw, 1091px\" \/><\/a><p id=\"caption-attachment-1873\" class=\"wp-caption-text\">Removing A Chat Room<\/p><\/div>\n<p>Please note that the room deletion and the current participants\u2019 room list update is not atomic, e.g. there might be a tiny time frame where the room is already deleted but one participant room list not updated yet.<\/p>\n<p>Using <strong>LightWeight Transaction<\/strong> has an impact on performance but considering the frequency at which people create and join rooms compared to the time spent on chatting, it is worthwhile.<\/p>\n<h1>IV Chat messages management<\/h1>\n<p>The last data model is about chat messages. A simple and naive approach could be:<\/p>\n<div id=\"attachment_1874\" style=\"width: 1047px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ChatMessagesDistribution.png\"><img aria-describedby=\"caption-attachment-1874\" loading=\"lazy\" src=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ChatMessagesDistribution.png\" alt=\"Chat Messages Distribution\" width=\"1037\" height=\"427\" class=\"size-full wp-image-1874\" srcset=\"https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ChatMessagesDistribution.png 1037w, https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ChatMessagesDistribution-300x124.png 300w, https:\/\/www.doanduyhai.com\/blog\/wp-content\/uploads\/2015\/12\/ChatMessagesDistribution-1024x422.png 1024w\" sizes=\"(max-width: 1037px) 100vw, 1037px\" \/><\/a><p id=\"caption-attachment-1874\" class=\"wp-caption-text\">Chat Messages Distribution<\/p><\/div>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\nCREATE TABLE killrchat.chat_room_messages(\r\n    room_name text,\r\n    message_id timeuuid,\r\n    content text,\r\n    author frozen&lt;user&gt;, \/\/ denormalization system_message boolean,\r\n    PRIMARY KEY((room_name), message_id)) \r\nWITH CLUSTERING ORDER BY (message_id DESC);\r\n<\/pre>\n<p>The table has <em>room_name<\/em> as partition key and <em>message_id<\/em> as clustering column. The clustering column is sorted in reverse order to fetch the latest chat messages first. We also use user-defined type to persist the message author detail. <\/p>\n<p>The above data model is very convenient for chat message display. To retrieve the last 50 messages:<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\n    SELECT * FROM killrchat.chat_room_messages\r\n    WHERE room_name = \u2018games\u2019\r\n    LIMIT 50;\r\n<\/pre>\n<p>To retrieve the previous page of 50 message, starting from the last fetched <em>message_id<\/em>:<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\n    SELECT * FROM killrchat.chat_room_messages\r\n    WHERE room_name = \u2018games\u2019\r\n    AND message_id &lt; last_message_id\r\n    LIMIT 50;\r\n<\/pre>\n<p>There is still a caveat with this design, the chat room messages can not scale to billions, unless you set a fixed TTL on each message.<\/p>\n<p>Indeed, with a very popular chat room, having thousands of participants, the message count can grow very fast over time and reach the technical limit of 2.10 <sup>9<\/sup> physical columns for the partition.<\/p>\n<p>A remedy is to split the same partition between many bucket (base on date). Depending on the traffic, you may want to create a bucket by day\/week\/month\/year\u2026<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\nCREATE TABLE killrchat.chat_room_messages(\r\n    room_name text,\r\n    time_bucket int, \/\/ format yyyyMMdd\r\n    message_id timeuuid,\r\n    content text,\r\n    author frozen&lt;user&gt;, \/\/ denormalization system_message boolean,\r\n    PRIMARY KEY((room_name,time_bucket), message_id)) \r\nWITH CLUSTERING ORDER BY (message_id DESC);\r\n<\/pre>\n<p>This design allows to scale with the message count but makes querying the table more difficult. While it was trivial to fetch the previous page of message with a simple SELECT in the previous data model, you\u2019ll have to take into account the <em>time_bucket<\/em> to switch partition when necessary. <\/p>\n<p>We can devise a helper table to store <em>time_bucket<\/em> information for each chat room:<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\nCREATE TABLE killrchat.messages_bucket(\r\n    room_name text,\r\n    time_bucket int, \/\/ format yyyyMMdd\r\n    PRIMARY KEY((room_name), time_bucket)) \r\nWITH CLUSTERING ORDER BY (time_bucket DESC);\r\n<\/pre>\n<p>The content of this table, very small, can be put on cache for fast access. Whenever a participant creates a message, we update an application-managed weak <em>HashMap&lt;room_name,List&lt;last_time_buckets&gt;&gt;<\/em>. We flush this map regularly to <strong>Cassandra<\/strong> to update the <em>message_bucket<\/em> table. This map acts as a buffer to avoid hammering <strong>Cassandra<\/strong> with frequent mutations.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this post, we&#8217;re digging into KillrChat Cassandra data model KillrChat data model focus on 3 main components: users chat rooms chat messages I Users management For a successful chat application, the number of users will grow fast. To be&#8230;<br \/><a class=\"read-more-button\" href=\"https:\/\/www.doanduyhai.com\/blog\/?p=1859\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[57,55,10],"tags":[],"_links":{"self":[{"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1859"}],"collection":[{"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1859"}],"version-history":[{"count":10,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1859\/revisions"}],"predecessor-version":[{"id":1875,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1859\/revisions\/1875"}],"wp:attachment":[{"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1859"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1859"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1859"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}