Deployment of Tigase in High Performance Configurations

Bruce Chung
Added over 2 years ago

We have a machine with 256 GB of memory.

The Tigase server will run on this machine, but not alone.

Other software such as Jetty, MongoDB, and Hadoop will also run on the machine.

About 100 GB of memory will be allocated to the Tigase server.

The Tigase server will run with the MUC, PubSub, HTTP, and Message Archiving components.

The Tigase server needs to log group chat messages and private messages.

As shown on the chart, the key data is below:

Total users: 50K,

Max online users: 50K,

Average online users: 20K,

Max rooms: 1,000,

Average rooms: 100,

Max users in a room: 10K,

Average users in a room: 1K,

Max messages per second: 10K/second,

Average messages per second: 2K/second,

Max user logins per minute: ?

Average user logins per minute: ?

Max room joins per minute: ?

Average room joins per minute: ?

My questions are:

What JVM configuration should we use?

How can we limit the memory and CPU allocated to the Tigase server?

How much bandwidth is needed?

And how can we run performance tests?


Replies (10)

Added by Artur Hefczyc TigaseTeam over 2 years ago

Some of the specifics of your use case are quite demanding. If your system has to handle the maximum load, you should concentrate on the worst-case scenario, that is, the case with the highest load. What we really need to know is the expected maximum load on the server, so that we can calculate the requirements.

See some questions inline below:

Bruce Chung wrote:

> We have a machine with 256 GB of memory.

> The Tigase server will run on this machine, but not alone.

> Other software such as Jetty, MongoDB, and Hadoop will also run on the machine.

> About 100 GB of memory will be allocated to the Tigase server.

Normally Tigase does not need that much memory, at least not for up to 50k users. That much memory is only useful if you expect short-term spikes in load or traffic that the server cannot handle immediately, so that the excess traffic has to be kept in memory until it is processed and sent to users over the network. More about this below.

> The Tigase server will run with the MUC, PubSub, HTTP, and Message Archiving components.

OK, below you describe the expected MUC usage but say nothing about PubSub. PubSub could generate as much traffic as MUC, so you need estimates for PubSub as well for accurate calculations.

> The Tigase server needs to log group chat messages and private messages.

From your estimates below, you would have up to 10k messages per second. If you want them all recorded in the DB, you need to make sure your database can handle at least 10k write queries per second, because there will obviously be other DB operations as well. With a single database this might be quite tricky. I am not a DB expert, so I suggest you consult somebody who can help you set up and optimize the DB for such a load.

> As shown on the chart, the key data is below:

> Total users: 50K,

> Max online users: 50K,

> Average online users: 20K,

> Max rooms: 1,000,

> Average rooms: 100,

> Max users in a room: 10K,

> Average users in a room: 1K,

OK, just to confirm that your numbers are accurate: if you have 100 rooms on average and 1k users in a room on average, this gives you 100k users online, unless you allow a single user to be in multiple rooms or allow users from other servers to connect to MUC on your system. And these are the average numbers.

For the maximum, you have 1k rooms with up to 10k users each, which results in 10M users in total.

Of course, I might be mistaken and my understanding of your numbers might be wrong. I am trying to figure out the worst-case scenario in order to calculate what you need to handle the maximum load that you may have.
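The consistency check above can be reproduced with a short Python sketch. The function names are made up for illustration; the inputs are the figures quoted from the original post, and the result is only as meaningful as those figures.

```python
def room_seats(rooms: int, users_per_room: int) -> int:
    """Total room occupancies: one 'seat' per user per room joined."""
    return rooms * users_per_room


def rooms_per_user(rooms: int, users_per_room: int, online_users: int) -> float:
    """How many rooms each online user must be in for the figures to agree."""
    return room_seats(rooms, users_per_room) / online_users


# Average figures: 100 rooms x 1K users, with 20K users online on average.
print(room_seats(100, 1_000), rooms_per_user(100, 1_000, 20_000))

# Maximum figures: 1,000 rooms x 10K users, with at most 50K users online.
print(room_seats(1_000, 10_000), rooms_per_user(1_000, 10_000, 50_000))
```

A ratio well above 1 means the figures only hold if each user sits in several rooms at once, or if users from other servers join the rooms.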

Let's assume that you have at most 50k online users and that these users are in multiple rooms, which results in, say, some 1,000 rooms of 10k users each. The critical factor here is how often a new message is posted to a MUC room. This is important because every new message is multiplied 10k times and sent to each of the 10k users in the room. So even if there is a new message only once a minute, that results in roughly 166 messages/second from this one room. If you have 1,000 rooms like this, you would have traffic of about 166k messages/second. And I guess new messages would be posted to a MUC room more often than once a minute.
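The fan-out estimate in the paragraph above is a simple product, sketched here in Python. The function name is hypothetical, and the inputs are the assumed worst case from that paragraph (10k occupants per room, one post per minute, 1,000 rooms).

```python
def deliveries_per_second(occupants: int, posts_per_second: float, rooms: int = 1) -> float:
    """Outgoing messages per second: every post is copied to every occupant."""
    return occupants * posts_per_second * rooms


# One post per minute in a single room with 10k occupants.
one_room = deliveries_per_second(10_000, 1 / 60)

# The same posting rate in 1,000 such rooms.
all_rooms = deliveries_per_second(10_000, 1 / 60, rooms=1_000)

print(f"one room: {one_room:.0f}/s, 1,000 rooms: {all_rooms:.0f}/s")
```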

> Max messages per second: 10K/second,

> Average messages per second: 2K/second,

Well, what do you mean here? Do you mean 10k messages/second to a MUC room, or in 1-1 chats between two users?

> Max user logins per minute: ?

> Average user logins per minute: ?

> Max room joins per minute: ?

> Average room joins per minute: ?

Given the load generated by MUC (and possibly PubSub), you can probably ignore the load from this activity, unless you expect users to connect and disconnect very frequently, entering and leaving a MUC room every time they log in or log out.

> My questions are:

> What JVM configuration should we use?

> How can we limit the memory and CPU allocated to the Tigase server?

Tigase can usually process about 10k messages per second per CPU core, assuming messages are not stored in the DB, which is not true in your case. I expect the DB will be the bottleneck for you.

> How much bandwidth is needed?

To calculate this you need the average message size in addition to the number of messages.
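As a rough illustration of that calculation, here is a minimal Python sketch. The 500-byte average stanza size is purely an assumed placeholder, not a measured value, and the estimate ignores TLS and TCP overhead as well as any stream compression.

```python
def bandwidth_mbit_per_s(messages_per_second: float, avg_message_bytes: float) -> float:
    """Payload bandwidth in Mbit/s: rate x size x 8 bits, without protocol overhead."""
    return messages_per_second * avg_message_bytes * 8 / 1_000_000


# 10K messages/second from the original post; 500 bytes is an assumption.
print(bandwidth_mbit_per_s(10_000, 500), "Mbit/s")
```

Replacing the assumed size with an average measured from real traffic gives a more useful number.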

> And how can we run performance tests?

There are tools available on the internet for load and performance testing. Alternatively, our team can run load tests for you to verify whether your system can handle the expected load.

Added by Bruce Chung about 2 years ago

Thank you very much for your careful reply.

The simplified model is below.

The main stories of our project are:

As a host, Amy opens a MUC room so that she can communicate with the viewers.

As a viewer, Tom joins a MUC room so that he can communicate with the host and the other viewers.

A viewer can be in only one room at a time.

The max number of concurrent users is 5K.

The number of concurrent rooms is between 1 and 200.

The average number of concurrent rooms is 100.

  1. The average scenario:

The average number of concurrent users in a MUC room: 50.

The average number of new messages posted to a MUC room: 10/second.

So, the messages delivered in one room: 10/second * 50 = 500/second.

The total messages delivered: 500/second * 100 = 50K/second.

  2. The worst-case scenario:

The max number of concurrent users in a MUC room: 5K. (All 5K active users are in one room.)

The max number of new messages posted to the MUC room: 20/second.

So, the messages delivered in the room: 20/second * 5K = 100K/second.

The total messages delivered: 100K/second.
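The two scenarios above can be checked with a small Python sketch. The function names are illustrative only. The sketch also separates posted messages (what a chat log would have to record) from delivered messages (what the server has to send out), which differ by the room size.

```python
def posted_per_second(posts_per_room: int, rooms: int) -> int:
    """New messages entering the server per second (one log entry each)."""
    return posts_per_room * rooms


def delivered_per_second(posts_per_room: int, occupants: int, rooms: int) -> int:
    """Messages leaving the server per second after MUC fan-out."""
    return posts_per_room * occupants * rooms


# Average scenario: 100 rooms, 50 occupants each, 10 posts/second per room.
print(posted_per_second(10, 100), delivered_per_second(10, 50, 100))

# Worst-case scenario: one room, 5K occupants, 20 posts/second.
print(posted_per_second(20, 1), delivered_per_second(20, 5_000, 1))
```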

> Tigase can usually process about 10k messages per second per CPU core, assuming messages are not stored in the DB, which is not true in your case. I expect the DB will be the bottleneck for you.

How can I support the worst-case scenario?

  3. About message logging:

We currently use the tigase.muc.logger.RoomChatLogger class to write messages to a file in text format.

Is this better than storing the messages in the DB?

Added by Artur Hefczyc TigaseTeam about 2 years ago

Bruce Chung wrote:

> Thank you very much for your careful reply.

> The simplified model is below.

> The main stories of our project are:

> As a host, Amy opens a MUC room so that she can communicate with the viewers.

> As a viewer, Tom joins a MUC room so that he can communicate with the host and the other viewers.

> A viewer can be in only one room at a time.

> The max number of concurrent users is 5K.

> The number of concurrent rooms is between 1 and 200.

> The average number of concurrent rooms is 100.

>   1. The average scenario:

> The average number of concurrent users in a MUC room: 50.

> The average number of new messages posted to a MUC room: 10/second.

> So, the messages delivered in one room: 10/second * 50 = 500/second.

> The total messages delivered: 500/second * 100 = 50K/second.

>   2. The worst-case scenario:

> The max number of concurrent users in a MUC room: 5K. (All 5K active users are in one room.)

> The max number of new messages posted to the MUC room: 20/second.

> So, the messages delivered in the room: 20/second * 5K = 100K/second.

> The total messages delivered: 100K/second.

OK, now, how many rooms like this do you expect to have at the same time? Or do you perhaps expect to have one room like this and 100 other rooms with 50 users each, as described above?

> > Tigase can usually process about 10k messages per second per CPU core, assuming messages are not stored in the DB, which is not true in your case. I expect the DB will be the bottleneck for you.

> How can I support the worst-case scenario?

Assuming you have mitigated the DB bottleneck and are asking only about CPU power, you probably need one CPU core for each 10k messages per second. Maybe less; it really varies from deployment to deployment. Our figure is only an estimate, and you should really run load tests on your system with the expected load to confirm it. Some of our customers report that they can handle as much as 20k messages per second per CPU core, and some report much less.

If you have a machine with 256 GB of RAM, then I assume it also has plenty of CPU power, so having 10 or even 16 CPU cores dedicated to Tigase should not be a problem, especially as you will not need that much power all the time, only at peak times when the large rooms are in use. The rest of the time the CPUs would be idle, so they can be shared with other software on the server.
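As a back-of-the-envelope illustration of the per-core rule of thumb, here is a minimal Python sketch. The function name is made up. The per-core rates are the rough figures mentioned above (about 10k messages/second, with some deployments reporting 20k), not guarantees, and DB writes are left out entirely.

```python
import math


def cores_needed(messages_per_second: float, per_core_rate: float) -> int:
    """Whole CPU cores needed at a given per-core throughput (DB cost excluded)."""
    return math.ceil(messages_per_second / per_core_rate)


worst_case = 100_000  # delivered messages/second in the 5K-user room scenario

print(cores_needed(worst_case, 10_000))  # conservative rule of thumb
print(cores_needed(worst_case, 20_000))  # optimistic figure some deployments report
```

Only a load test on the real system can confirm which per-core rate applies.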

However, our suggestion is always to deploy Tigase on multiple smaller servers in cluster mode, to make sure your system can survive a hardware failure.

>   3. About message logging:

> We currently use the tigase.muc.logger.RoomChatLogger class to write messages to a file in text format.

> Is this better than storing the messages in the DB?

From your estimates it looks like you will have maybe about 1,000 messages per second to write to the DB/HDD. I think this volume is manageable with an SQL database if it is properly configured.

Added by Bruce Chung about 2 years ago

> OK, now, how many rooms like this do you expect to have at the same time? Or do you perhaps expect to have one room like this and 100 other rooms with 50 users each, as described above?

In the first phase of our project, we estimate that the total number of active concurrent users will be 5,000, so the case of all 5,000 users joining the same room can happen in only one room at a time.

Most of the time, there will be about 100 rooms with about 50 users each.

In the worst-case scenario, there will be 200 rooms (there are 200 hosts), but nearly all 5,000 viewers will join one room.

Added by Bruce Chung about 2 years ago

> However, our suggestion is always to deploy Tigase on multiple smaller servers in cluster mode, to make sure your system can survive a hardware failure.

Yes, we deploy Tigase on 2 servers in cluster mode using DNS SRV.

Added by Bruce Chung about 2 years ago

I have the following ideas about testing our project; please correct me if I am wrong.

For the case of 5,000 users in one room:

  1. Register 2 accounts, for example host@im.com and viewer@im.com.

  2. As host@im.com, create a room: room5000@muc.im.com.

  3. As viewer@im.com/r_1 through viewer@im.com/r_100, connect 100 users and join room5000@muc.im.com.

  4. As host@im.com, send 1 message per second to room5000@muc.im.com.

  5. Watch the Tigase statistics report:

5.1 How many healthy connections are in the room?

5.2 How many messages were sent to the room, and were any messages lost?

5.3 Did all the users receive the messages successfully?

5.4 How many rooms are there now?

  6. Watch CPU and memory usage and the number of open files.

The steps above make up one round.

We can then vary the conditions to observe more cases:

Increase the number of users in room5000@muc.im.com.

Increase the message sending rate to 2 messages/second ... 10 messages/second.

We also need to modify the Tigase configuration and the JVM settings and watch the statistics.

We also need to improve the database TPS...
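One way to organize those rounds is to list every combination of room size and sending rate up front, together with the delivery rate each round should produce. The Python sketch below only builds that plan and does not drive any load tool. The step values are hypothetical, apart from the 100-user and 5,000-user end points and the 1 to 10 messages/second range mentioned above.

```python
from itertools import product

# Hypothetical ramp-up steps between the values named in the plan above.
OCCUPANT_STEPS = [100, 500, 1_000, 5_000]
RATE_STEPS = [1, 2, 5, 10]  # messages/second sent by the host


def plan_rounds(occupant_steps, rate_steps):
    """Return one dict per test round with its expected delivery rate."""
    return [
        {
            "occupants": occupants,
            "posts_per_second": rate,
            "expected_deliveries_per_second": occupants * rate,
        }
        for occupants, rate in product(occupant_steps, rate_steps)
    ]


for test_round in plan_rounds(OCCUPANT_STEPS, RATE_STEPS):
    print(test_round)
```

Comparing the expected delivery rate of a round with the counters reported by the server would show whether messages are being lost.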

But first of all, how can I watch the Tigase statistics at run time?

Is there any tool or script that can help with that?

Added by Bruce Chung about 2 years ago

> If you have a machine with 256 GB of RAM, then I assume it also has plenty of CPU power, so having 10 or even 16 CPU cores dedicated to Tigase should not be a problem, especially as you will not need that much power all the time, only at peak times when the large rooms are in use. The rest of the time the CPUs would be idle, so they can be shared with other software on the server.

Our two servers have 48 CPU cores and 256 GB of RAM.

But not all of that is for the Tigase server.

Are there any settings to limit the resources (CPU/RAM/handlers) used by Tigase, so that the other software has enough resources?

Added by Artur Hefczyc TigaseTeam about 2 years ago

There are many tools and ways to load test an XMPP server and monitor its activity, especially when it comes to Tigase. We usually use Tsung for user simulation. As far as I know, it also supports MUC. You can simulate quite a lot of user connections and a relatively high load.

For monitoring Tigase performance (Tigase statistics) in real time, I suggest either using Tigase Monitor, which gives you a nice GUI application with charts, or using the StatsDumper.groovy script (available from the same place as Tigase Monitor). The StatsDumper.groovy script connects to the Tigase server and periodically dumps all the statistics to a flat file with a timestamp. You can later examine the files to analyze the load and performance. There are also configuration options for the Tigase XMPP Server itself to automatically dump statistics to a file or DB at regular intervals for further analysis.

Added by Bruce Chung about 2 years ago

Thank you very much.

On the other hand, are there any settings in Tigase to limit the resources (CPU/RAM/handlers) it uses, so that the other software has enough resources?

For example, I read http://docs.tigase.org/tigase-server/snapshot/Properties_Guide/html_chunk/maxQueueSize.html:

> Chapter 31. --max-queue-size

> Default value: 'depends on RAM size.'

> Example: --max-queue-size = 10000

> Possible values: 'integer number.'

> Description: The --max-queue-size property sets internal queues maximum size to a specified value. By default Tigase sets the queue size depending on the maximum available memory to the Tigase server process. It sets 1000 for each 100MB memory assigned for JVM. This is enough for most cases. If you have however, an extremely busy service with Pubsub or MUC component generating huge number of packets (presence or messages) this size should be equal or bigger to the maximum expected number of packets generated by the component in a single request. Otherwise Tigase may drop packets that it is unable to process.

If I want to limit the Tigase process to 100 GB of RAM, I just need to set the JVM option -Xmx102400m; there is no need to set the --max-queue-size property.

Is my understanding above correct?

Added by Wojciech Kapcia TigaseTeam about 2 years ago

Bruce Chung wrote:

> If I want to limit the Tigase process to 100 GB of RAM, I just need to set the JVM option -Xmx102400m; there is no need to set the --max-queue-size property.

> Is my understanding above correct?

Yes, the queue sizes are calculated from the maximum heap size.
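To illustrate the rule quoted from the Properties Guide (1000 queue slots for each 100 MB of heap), here is a minimal Python sketch. It is an approximation based only on that description: the function names are made up, and the exact rounding Tigase applies internally is an assumption.

```python
def default_max_queue_size(heap_mb: int) -> int:
    """Approximate default queue size: 1000 entries per full 100 MB of heap."""
    return heap_mb // 100 * 1000


def queue_fits_single_request(queue_size: int, packets_per_request: int) -> bool:
    """The guide asks for a queue at least as large as one request's fan-out."""
    return queue_size >= packets_per_request


queue = default_max_queue_size(102_400)  # heap set with -Xmx102400m
print(queue)

# One message posted to a 5K-user room generates about 5,000 outgoing packets.
print(queue_fits_single_request(queue, 5_000))
```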
