How to Massively Scale a Single Room?
I am working on a project whose theoretical requirements allow for massive numbers of users in a single room. While there may be huge numbers of users in a single room, only the first 100-200 are permitted to generate chat messages. Everyone after that is allowed to passively listen but cannot submit messages.
We have taken a couple of steps toward this goal. First, we disabled presence broadcasting using the PresenceModuleNoBroadcast code we found in your repo. This should eliminate the problem of clients being hammered with presence stanzas from every other client in the room. We don't use presence for anything, so this is an acceptable modification for us.
Second, we implemented your ACS MUC component. At present, we are using the default strategy. However, this clearly won't work for our use case since we need to be able to virtualize rooms across many instances in order to scale to our theoretical maximum. When we try to enable the ClusteredRoomStrategy in our config, we are having issues for reasons that we don't yet understand. Perhaps this setup is not compatible with the PresenceModuleNoBroadcast code?
In any event, before investing more energy in this quest, I would like to know what you believe the theoretical maximum number of passive users in a single room is (assuming we allow for significant customization to achieve the goal). How have others tackled this type of problem in the past? Is there some other approach we should be taking to achieve massive scale in a single room given the requirements outlined above?
Thanks in advance.
Added by Artur Hefczyc almost 4 years ago
With a large or huge number of users in a single room (by large I mean above 1k, and by huge above 10k), the main issue is not the number of users but the traffic. If you have already run some tests and experiments, you most likely know what I am talking about.
Let's say we have a room with 25k users, and each of the 200 permitted senders sends a message once every 10 seconds. That is 20 messages per second sent to the room, and each message must be delivered to all 25k occupants, which gives us 500k messages per second for the server to process and deliver. This is quite significant traffic. If each message is about 1 kB (body plus XMPP envelope), that is 500 MB of data per second, or roughly 4 Gbit/s of raw network traffic, and even more with protocol overhead. This all assumes either a standard XMPP connection or a WebSocket connection. With BOSH, for example, it gets much worse.
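The arithmetic above can be sketched as a quick back-of-envelope calculation. The figures (25k occupants, 200 senders, one message per sender per 10 seconds, 1 kB per message) are the illustrative numbers from the scenario, not measured values:

```python
# Back-of-envelope fan-out estimate for the 25k-user room scenario.
occupants = 25_000
senders = 200
send_interval_s = 10
msg_size_bytes = 1_000  # ~1 kB body + XMPP envelope, an assumed average

msgs_per_sec = senders / send_interval_s           # messages entering the room
deliveries_per_sec = msgs_per_sec * occupants      # stanzas the server must send out
bytes_per_sec = deliveries_per_sec * msg_size_bytes
gbits_per_sec = bytes_per_sec * 8 / 1e9            # raw payload, before overhead

print(f"{msgs_per_sec:.0f} msg/s in, {deliveries_per_sec:,.0f} stanzas/s out, "
      f"{gbits_per_sec:.1f} Gbit/s")
# 20 msg/s in, 500,000 stanzas/s out, 4.0 Gbit/s
```

The fan-out factor (one inbound message becomes 25k outbound stanzas) is what dominates here, which is why reducing room size or delivery frequency pays off far more than optimizing the inbound path.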
We have successfully tested Tigase (with ACS for MUC) in a deployment scenario in which 10% of room users were active, each sending a message every 10 seconds.
For 25k room users, this gave us 2.5k active users, each sending a message every 10 seconds, which is 250 messages per second into the room. Fanned out to all 25k occupants, that equals 25k * 250/sec = 6,250,000 stanzas per second. This is massive traffic to process.
Tigase can typically handle about 10k packets per second per CPU core. So theoretically there is no hard limit; you just need an appropriate number of CPUs to process the traffic, and enough network bandwidth to transmit all the data. However, I would guess that at some point the cost of handling such traffic becomes unreasonably high.
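Putting the two figures together (6.25M stanzas per second, ~10k packets per second per core) gives a rough sense of the hardware required. This is only an illustration built on the ballpark numbers above, not a capacity-planning formula:

```python
# Rough CPU sizing for the 10%-active scenario described above.
occupants = 25_000
active_fraction = 0.10
send_interval_s = 10

active_senders = occupants * active_fraction          # 2,500 senders
msgs_per_sec = active_senders / send_interval_s       # 250 msg/s into the room
deliveries_per_sec = msgs_per_sec * occupants         # 6,250,000 stanzas/s

packets_per_core = 10_000  # ballpark throughput per core, per the text
cores_needed = deliveries_per_sec / packets_per_core

print(f"{deliveries_per_sec:,.0f} stanzas/s -> ~{cores_needed:.0f} cores")
# 6,250,000 stanzas/s -> ~625 cores
```

Around 625 cores for a single 25k-user room illustrates the point: the traffic is processable in principle, but the cost grows linearly with both room size and sender activity.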
I cannot comment on an alternative approach, as I do not know any details of the problem you are working on. It is possible that the project is not really about a massive number of people chatting in a single room but about something else entirely, and MUC is simply the option you are considering now. I think our ACS for MUC works quite well, and I cannot think of a better approach to handling large MUC rooms. However, specific requirements often allow for custom optimizations that lower the traffic.