Project

General

Profile

Very low performance

Gureen Ryuu
Added over 4 years ago

Hi all.

We're currently trying to benchmark the performance of our server on Windows Azure. We're sending around 4000 messages per second. However, it seems the performance is quite horrible, and it barely manages to process 300 packets per second, leading to a delay of having messages received by a 10-20 second delay.

The server we have is a CentOS 6 64 bit, running Tigase 5.3.

The server specs are: 2x AMD Opteron(tm) Processor 4171 HE CPUs

3GB RAM

Here are the JVM options for Tigase:

${GC} ${EX} ${ENC} ${DRV} -server -Xms2000M -Xmx2000M -XX:PermSize=128m -XX:MaxPermSize=512m -XX:MaxDirectMemorySize=256m

Edit:

We just tried it on a server with the following specs:

4 cores CPU E5-2670 0 @ 2.60GHz

15GB RAM

The JVM Options are the following:

JAVA_OPTIONS="${GC} ${EX} ${ENC} ${DRV} -server -Xms9000M -Xmx9000M -XX:PermSize=2560m -XX:MaxPermSize=2560m -XX:MaxDirectMemorySize=2560m "

But this only increased it from 300 packets to 700 packets per second..


Replies (15)

Avatar?id=6023&size=32x32

Added by Artur Hefczyc TigaseTeam over 4 years ago

The most likely reason for such a poor performance is debug logging activated. Make sure, you have logging switched to minimum. If you comment out --debug line in the config file, it should be enough. Even then, have a look at the file to ensure there is no excessive logging. This may happen in case of some error generated during processing which is logged even with debug set to off. This usually happens if DB is misconfigured.

Another potential reason may be slow DB access which may impact performance of the server. Make also sure that you do not send messages to offline users, otherwise messages are recorded to DB which of course slows everything down or if you have to send messages to offline users, than turn off message storage for offline users.

Please also make sure you use Java from Oracle, the latest JDK7. We have encountered performance issues with OpenJDK so our suggestion is not to use it.

Also, a critical element during performance and load tests is to collect the Tigase server statistics. Tigase provides over 1,000 different performance and load metrics which allows you to precisely pinpoint a bottleneck and figure out where is the problem. If you share metrics dump with us we would be able to give you more specific suggestions where is the problem.

I also suggest to connect our Tigase Monitor during the test which gives you a good overview of what is going on inside the Tigase XMPP Server.

Please also, if possible, share more details of your test scenario:

  1. How many users do you have online

  2. What is user login/logout rate

  3. What is average user session duration time

  4. Average user contact list

  5. Average frequency of user states change

  6. And average frequency a user sends a message

  7. What kind of software do you use to simulate users load

  8. Any other details which you feel might be helpful

Added by Gureen Ryuu over 4 years ago

Thank you for your reply.

We've already had the debugging removed, as it was the first thing we suspected might be hampering performance. As for DB access and offline messages, the users are only sending messages to each other, so only online messages are being sent.

After reading your post, I installed the Oracle JDK7 instead of OpenJDK, however that didn't seem to have any effect on the problem.

Regarding your questions about the details of the test scenario:

  1. How many users do you have online: We have around 250 users online.

  2. What is user login/logout rate: There is no rate. We log in all the users, and they stay logged in for the duration of the test.

  3. What is average user session duration time : As long as the test takes.

  4. Average user contact list : The test users, so around 250 contacts.

  5. Average frequency of user states change : None, the users just remain online.

  6. And average frequency a user sends a message : The users all send 5 messages at the same time to all contacts, so around 1000 message in 1-5 seconds and then they stop.

  7. What kind of software do you use to simulate users load : We made our own tool, we log in all 250 users and then they send each other messages.

  8. Any other details which you feel might be helpful : Using PSI we saw that the queue in tigase had around 50k messages.

As for the metrics dump, we'd be glad to provide it, how can we obtain that?

Added by Wojciech Kapcia TigaseTeam over 4 years ago

Gureen Ryuu wrote:

And average frequency a user sends a message : The users all send 5 messages at the same time to all contacts, so around 1000 message in 1-5 seconds and then they stop.

So, 5 messages to each of 250 contacts means 1250 messages in 5 seconds from 1 user; given that this applies to all users this results in 312.500 messages in 5 seconds hence 62.500 messages per second - correct?

As for the metrics dump, we'd be glad to provide it, how can we obtain that?

You can use attached groovy script - StatsDumperPub.groovy (it requires tigase-server.jar in the classpath).

Added by Gureen Ryuu over 4 years ago

Wojciech Kapcia wrote:

Gureen Ryuu wrote:

And average frequency a user sends a message : The users all send 5 messages at the same time to all contacts, so around 1000 message in 1-5 seconds and then they stop.

So, 5 messages to each of 250 contacts means 1250 messages in 5 seconds from 1 user; given that this applies to all users this results in 312.500 messages in 5 seconds hence 62.500 messages per second - correct?

Pretty much. We did a much smaller benchmark, around 100k messages in 30 seconds, so around 3000 messages per second. Messages were taking around a maximum of 130 seconds to arrive to the other users and an average of 60 seconds.

As for the metrics dump, we'd be glad to provide it, how can we obtain that?

You can use attached groovy script - StatsDumperPub.groovy (it requires tigase-server.jar in the classpath).

Thank you. Though I can't find the "Add new script" in PSI's Service Discovery like in the tutorial image below. Or am I supposed to run it from somewhere else?

Added by Wojciech Kapcia TigaseTeam over 4 years ago

Thank you. Though I can't find the "Add new script" in PSI's Service Discovery like in the tutorial image below. Or am I supposed to run it from somewhere else?

This one is to be run from the system console:

$ groovy -cp tigase-server.jar StatsDumperPub.groovy [hostname] [username] [password]

where the parameters define JMX credentials.

In order to run it, you also need to enable JMX monitoring, include following in @init.properties@:

--monitoring=jmx:9050

Added by Gureen Ryuu over 4 years ago

Thank you for your quick replies.

Ok so I put the proper classpath, added the monitoring in the init.properties file.

When I ran the groovy script, I got the following error:

@ep 22, 2014 1:33:26 PM org.codehaus.groovy.runtime.m12n.MetaInfExtensionModule newModule

WARNING: Module [groovy-nio] - Unable to load extension class [org.codehaus.groovy.runtime.NioGroovyMethods]

Caught: java.lang.UnsupportedClassVersionError: tigase/stats/JMXProxyListener : Unsupported major.minor version 51.0

java.lang.UnsupportedClassVersionError: tigase/stats/JMXProxyListener : Unsupported major.minor version 51.0@

I have Tigase 5.3 installed, as well as JDK 7. Is tigase extras 1.1.0 ( http://maven.tigase.org/tigase/tigase-extras/1.1.0-tigase-server-5.2.0-rc2/tigase-extras-1.1.0-tigase-server-5.2.0-rc2.jar ) incompatible with @Tigase 5.3@? I also tried removing the extras to see if it it was installed by default, but I got the same error.

Avatar?id=6023&size=32x32

Added by Artur Hefczyc TigaseTeam over 4 years ago

Try to use JDK8, I think some code could be compiled using JDK8 and some JDK7, hence the error messages.

Regarding your performance tests. Even though you have only 250 users online, you generate pretty intensive load during a very short period of time:

  1. 62,500 presence packets during login of 250 users

  2. 312,500 messages in up to 5 seconds (everybody sends 5 messages to everybody), so again 62,500 messages per second

This is a lot but manageable for Tigase XMPP Server. You definitely need more RAM for such an intense traffic. 2GB might be definitely too little RAM as GC may slow things down. However, I do not thing it is really an issue here. Tigase can normally process about 10k XMPP packets per second per CPU core.

A few more questions here then:

  1. What is a size of each message?

  2. How do you connect clients to the Tigase XMPP Server, standard XMPP or Bosh?

  3. Are you sure that the slowing down factor is not on the client side?

  4. With such a high traffic per single client connection, I would suggest to increase network buffers size: http://www.tigase.org/content/net-buff-standard

Added by Gureen Ryuu over 4 years ago

Installing JDK8 fixed the monitoring error. Please find attached the metric dumps you requested.

The simulation parameters were: 100 online users, each sending 4 messages to each other in a duration of 20 seconds, so around 2000 messages per second.

Allocated memory is: JAVA_OPTIONS="${GC} ${EX} ${ENC} ${DRV} -server -Xms5000M -Xmx5000M -XX:PermSize=1024m -XX:MaxPermSize=1024m -XX:MaxDirectMemorySize=1024m "

CPU is 8x core Intel(R) Xeon(R) CPU E5-2670 0 2.60GHz@

Artur Hefczyc wrote:

Regarding your performance tests. Even though you have only 250 users online, you generate pretty intensive load during a very short period of time:

62,500 presence packets during login of 250 users

312,500 messages in up to 5 seconds (everybody sends 5 messages to everybody), so again 62,500 messages per second

After reading your message, we tried it with 10 users sending 40 messages to each other, but it wind up with the same issue, so it's probably not the presence packets that are causing the issue. But yes, you're right, it is quite the load either way.

This is a lot but manageable for Tigase XMPP Server. You definitely need more RAM for such an intense traffic. 2GB might be definitely too little RAM as GC may slow things down. However, I do not thing it is really an issue here. Tigase can normally process about 10k XMPP packets per second per CPU core.

We upgraded the server. We then increased the memory allocated to Tigase up to 5GB, and the CPU is now 8 cores.

A few more questions here then:

What is a size of each message?

Around 40-80 characters long.

How do you connect clients to the Tigase XMPP Server, standard XMPP or Bosh?

Standard XMPP.

Are you sure that the slowing down factor is not on the client side?

We thought that might be the reason too, but wouldn't the queue in Tigase be empty then?

With such a high traffic per single client connection, I would suggest to increase network buffers size: http://www.tigase.org/content/net-buff-standard

We're going to try that then.

Added by Gureen Ryuu over 4 years ago

Sorry double post.

Avatar?id=6023&size=32x32

Added by Artur Hefczyc TigaseTeam over 4 years ago

I looked at the attached stats: stats_2014-09-23_06-12-52.txt because this is stats pretty late to the test but still with traffic going on. So this is the best overview of the Tigase metrics.

There are 2 non-empty queues:

  1. sess-man/Processor: foo-message-all , Queue: 1335, AvTime: 3 I suppose this is your custom plugin processing messages. Apparently this is what is slowing you down. You need to optimize the plugin to lower down average processing time. It is now about 3ms which means a single thread can handle max of 1,000 / 3 = 300 messages per second. Your CPU is at about 10% utilization so, most likely if you increase number of threads for the plugin, it may handle more packets than it is handling now. I do not remember what is default but if you look at ligase-console.log file, it prints all the plugins when they are loaded along with a number of threads for each plugin. Here is description on how to specify number of threads for a plugin: http://www.tigase.org/content/sm-plugins

  2. sess-man/In queue wait: NORMAL 949 This is really unexpected. Average processing time for SM is between 9ms to 30ms. This is huge! Do you have any custom code in SM or custom preprocessor plugins? Processing of this queue inside SM is normally very fast - 0ms. If you have any custom code which is executed within SM thread pool you need to either optimize it or move it out from SM. This slows you down tremendously.

Added by Gureen Ryuu over 4 years ago

Artur Hefczyc wrote:

I looked at the attached stats: stats_2014-09-23_06-12-52.txt because this is stats pretty late to the test but still with traffic going on. So this is the best overview of the Tigase metrics.

There are 2 non-empty queues:

sess-man/In queue wait: NORMAL 949 This is really unexpected. Average processing time for SM is between 9ms to 30ms. This is huge! Do you have any custom code in SM or custom preprocessor plugins? Processing of this queue inside SM is normally very fast - 0ms. If you have any custom code which is executed within SM thread pool you need to either optimize it or move it out from SM. This slows you down tremendously.

We don't, the only custom code is in the plugins (foo-message-all). We downloaded the @tigase-server.jar@, and not compiled it from source.

sess-man/Processor: foo-message-all , Queue: 1335, AvTime: 3 I suppose this is your custom plugin processing messages. Apparently this is what is slowing you down. You need to optimize the plugin to lower down average processing time. It is now about 3ms which means a single thread can handle max of 1,000 / 3 = 300 messages per second. Your CPU is at about 10% utilization so, most likely if you increase number of threads for the plugin, it may handle more packets than it is handling now. I do not remember what is default but if you look at ligase-console.log file, it prints all the plugins when they are loaded along with a number of threads for each plugin. Here is description on how to specify number of threads for a plugin: http://www.tigase.org/content/sm-plugins

We thought it might be the custom plugin that's the problem, so we enabled the default AMP plugin and we got pretty much the same throughout. We assigned 100-1000 threads for the foo-message-all and AMP when we were testing, it increased the messages from 350 to around 750 per second, but it hit a ceiling of 750 and didn't increase more than that. The CPU usage remained at 15% (each core)..

We've attached the stats that were taken during the vanilla tigase using AMP if it's any help..

Also here's the init.properties for the plugins

--sm-plugins = amp=1000,-message,-msgoffline

I'm not sure if we're supposed to add something else other than --sm-plugins?

Avatar?id=6023&size=32x32

Added by Artur Hefczyc TigaseTeam over 4 years ago

I looked at the attached stats and again. There is long processing time for SM - 5ms and AMP plugin - 8ms. The only explanation which comes to my mind if heavy logging during the test which can slow things down. But you said before that logging was disabled. In such a case I do not have any more ideas what could slow processing down. In order to get to the bottom of this we would need to investigate it closer.

As you said you have 4 core CPU, increasing threads number to 100 or 1,000 does not make much sense. You usually want to have 2x or 4x as many threads as you have CPUs. As you said, even with 1000 threads CPU was used in 15% max. This suggest the processing was slowed down on I/O. Either DB was being access during processing time (which does not happen normally) or hard drive (logging).

Added by Gureen Ryuu over 4 years ago

Artur Hefczyc wrote:

I looked at the attached stats and again. There is long processing time for SM - 5ms and AMP plugin - 8ms. The only explanation which comes to my mind if heavy logging during the test which can slow things down. But you said before that logging was disabled. In such a case I do not have any more ideas what could slow processing down. In order to get to the bottom of this we would need to investigate it closer.

As you said you have 4 core CPU, increasing threads number to 100 or 1,000 does not make much sense. You usually want to have 2x or 4x as many threads as you have CPUs. As you said, even with 1000 threads CPU was used in 15% max. This suggest the processing was slowed down on I/O. Either DB was being access during processing time (which does not happen normally) or hard drive (logging).

We upgraded it to 8 cores. After reading your post, we went ahead and dug out a bit more.. Instead of using the vanilla configuration, we started disabling as many plugins as we could.

After disabling the -jabber:iq:privacy plugin, the throughput shot up to 10k messages (for the entire server, not per core), and the CPU usage shot up to 100% on all 8 cores! It seemed that plugin one of the reason for the bottleneck.. Why might this be? Something with the DB configuration? Or is it not supposed to access the DB at all?

Also any idea why Tigase is doing only 10k instead of 80k messages per second at 100% CPU usage?

Thank you again for your close support!

Avatar?id=6023&size=32x32

Added by Artur Hefczyc TigaseTeam over 4 years ago

We upgraded it to 8 cores. After reading your post, we went ahead and dug out a bit more.. Instead of using the vanilla configuration, we started disabling as many plugins as we could.

After disabling the -jabber:iq:privacy plugin, the throughput shot up to 10k messages (for the entire server, not per core), and the CPU usage shot up to 100% on all 8 cores! It seemed that plugin one of the reason for the bottleneck.. Why might this be? Something with the DB configuration? Or is it not supposed to access the DB at all?

I have created a ticket to investigate performance issues with the plugin: #2309. However, most likely the problem you experience happens because you send all the messages shortly after user login and all the messages are sent at the same time. So they hit Tigase when it loads privacy lists after user login, which means DB access, which means slowdown, low CPU usage and long processing time. I think, once the Tigase loads privacy-lists all subsequent messages would be processed much faster. At least there would be no slowdown in the SM thread. And I think this is what I observed in statistic you attached. Initially SM processing time was 30ms and at the end it was 9ms. I guess, if the test was run for a longer time it would go down to 0 or close.

Still, long processing time of message plugin (3ms) is something to look at as well. It should also be closer to 0. I added a new ticket to investigate the problem: #2310

You are on watchers list for both tickets so you will be informed on any update.

Also any idea why Tigase is doing only 10k instead of 80k messages per second at 100% CPU usage?

It would need some investigation to be sure but likely explanation is that you run your tests within a VM which always offers lower performance than a real HW. Our tests showing 10k XMPP packet/CPU core were run on a real HW.

Avatar?id=6023&size=32x32

Added by Artur Hefczyc TigaseTeam over 4 years ago

I found the why performance was so bad. Somehow, not long ago our privacy lists plugin code was modified in such a way that it introduced in certain cases a DB call on each packet it was processing. I fixed the code. If you check now the code from our source code repository (devel branch) the performance should be again very good.

    (1-15/15)