Tigase v7.1.1 exception: too many open files

flame fire
Added 9 months ago

As the attached file shows, this exception sometimes occurs in our production environment, and the Tigase server then dies.

After restarting Tigase, it recovers and works normally.

Have you encountered this situation?

What can I do about it?

Attachment: q1.png (5.71 KB), too many open files

Replies (12)

Added by flame fire 9 months ago

I did as you said.

But there are still many non-established TCP connections in our system.

Could this be a bug in Java NIO?

Added by Wojciech Kapcia TigaseTeam 9 months ago

What kind of connections are those? user connections (port 5222)? Could you share more details about your installation? Which Java version and Operating system do you use?
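One way to answer the question about the kind of connections is to tally the TCP rows of the lsof output by local port and state. The following is a minimal sketch, assuming the output was captured with `lsof -nP -p <pid>` so that addresses and ports are numeric. The function name `tally_tcp` and the sample rows in the usage note are illustrative and are not taken from the attached file.

```python
import re
from collections import Counter

# Matches the tail of an lsof TCP row, for example:
#   "... TCP 10.0.0.5:5222->10.0.0.9:51234 (ESTABLISHED)"
#   "... TCP *:5222 (LISTEN)"
TCP_ROW = re.compile(r"\bTCP\s+(\S+?)(?:->(\S+))?(?:\s+\((\w+)\))?\s*$")


def tally_tcp(lsof_text):
    """Count TCP sockets per (local port, state) in lsof output."""
    counts = Counter()
    for line in lsof_text.splitlines():
        match = TCP_ROW.search(line)
        if not match:
            continue  # not a TCP row (regular file, pipe, header, ...)
        local, _remote, state = match.groups()
        port = local.rsplit(":", 1)[-1]
        counts[(port, state or "NO_STATE")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: lsof -nP -p <pid> | python tally_tcp.py
    for (port, state), n in tally_tcp(sys.stdin.read()).most_common():
        print(f"{n:7d}  port {port:>6}  {state}")
```

A large count for a single port and state narrows things down: many CLOSE_WAIT rows, for instance, would mean the remote side has closed the connection but the process has not yet closed its own socket.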

Added by flame fire 7 months ago

I have updated Tigase to 7.1.2, and our OS is CentOS 7.3.1611.
As the attached file shows, I ran "lsof" against the Tigase PID;
Tigase keeps establishing TCP connections until it crashes.

Have you encountered this situation?

What can I do about it?

Added by Wojciech Kapcia TigaseTeam 7 months ago

flame fire wrote:

I did as you said.
But there are still many non-established TCP connections in our system.

For which user did you apply these changes?

flame fire wrote:

I have updated Tigase to 7.1.2, and our OS is CentOS 7.3.1611.
As the attached file shows, I ran "lsof" against the Tigase PID;
Tigase keeps establishing TCP connections until it crashes.

It looks like the settings from the guide were not applied correctly. You are also running Tigase as the root user, which is not recommended and can cause issues when the limits are configured for a different user. That is most likely what is happening in this case.

Have you encountered this situation?
What can I do about it?

Please share full details:

  • Tigase configuration;
  • Java version and flavour (e.g. OpenJDK or Oracle JDK);
  • output of the following command, run from both the root account and the dedicated Tigase account: ulimit -a
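As a cross-check of the `ulimit -a` output, the open-files limit can also be read from inside a process started by the account in question. This is a minimal sketch using Python's standard `resource` module; the helper names `nofile_limits` and `describe` are illustrative. It reports the limit of the Python process itself, which is inherited from the shell or service that launched it, not the limit of an already running Tigase process.

```python
import resource


def nofile_limits():
    """Return (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(value):
    """Render a limit value the way 'ulimit' does."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"open files: soft={describe(soft)} hard={describe(hard)}")
```

Running it once as root and once as the dedicated account shows whether the two accounts really receive different limits.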

Added by Artur Hefczyc TigaseTeam 7 months ago

Wojciech Kapcia wrote:

You are also running Tigase as the root user

On many Linux distributions, the file limit cannot be raised above 1024.
The ulimit settings have no effect on the root user.

Added by flame fire 7 months ago

1. Here is the Tigase configuration:
config-type=--gen-config-def
--virt-hosts=mline.9yiwu.com
--admins=admin@mline.9yiwu.com,pubsub@mline.9yiwu.com,http@mline-hk-core01
--cluster-mode=true

--cluster-nodes=mline-hk-core01,mline-hk-core03
--cluster-connect-all = true

--cm-ht-traffic-throttling=xmpp:25k:0:disc,bin:200m:0:disc
--cm-see-other-host=none
--debug=server,xmpp.impl,cluster
--user-db-uri=jdbc:mysql://mask/tigasedb?user=mask&password=mask&autoReconnect=true&useUnicode=true&characterEncoding=utf8

--sm-plugins=-amp,+resource_manager,+token_manager,-msgoffline,msgoffline_manager,+session_manager,+message

--comp-name-2=pubsub
--comp-class-2=tigase.pubsub.PubSubComponent

--comp-name-3=http
--comp-class-3=tigase.http.HttpMessageReceiver

http/setup/admin-credentials=mask:mask

c2s/processors[s]=urn:xmpp:sm:3

http/rest/api-keys[s]=open_access

--vhost-tls-required=true
--vhost-anonymous-enabled=false
--vhost-register-enabled=false

basic-conf/logging/java.util.logging.FileHandler.limit=200000000
basic-conf/logging/java.util.logging.FileHandler.count=50

c2s/urn\:xmpp\:sm\:3/max-resumption-timeout[I]=10
c2s/urn\:xmpp\:sm\:3/resumption-timeout[I]=10
2. The Java version is 1.8, and it is the Oracle JDK.

3. We always use the 'root' account on our OS.

The output of 'ulimit -a':
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 31217
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 31217
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Added by Wojciech Kapcia TigaseTeam 7 months ago

You have a limit of

open files                      (-n) 65535

And the Tigase process alone uses more than two dozen thousand (a third of the limit). Do you have other processes running on that machine? Please check the count of already opened files for that user on that machine…

Please provide the exact Java version; the output of $ java -version would be best.
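The limit printed by `ulimit -a` belongs to the shell it was run in, so it can help to compare it with the limit the running process actually has. On Linux this is exposed in /proc/<pid>/limits, and the descriptors currently open are listed in /proc/<pid>/fd. The following is a minimal sketch under those assumptions; the function names are illustrative, and reading another user's process normally requires root or the same account.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    Values are returned as strings because a limit may read 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            soft, hard = line.split()[3:5]
            return soft, hard
    raise ValueError("no 'Max open files' row found")


def fd_usage(pid):
    """Return (open descriptor count, soft limit, hard limit) for a PID."""
    with open(f"/proc/{pid}/limits") as fh:
        soft, hard = parse_max_open_files(fh.read())
    open_count = len(os.listdir(f"/proc/{pid}/fd"))
    return open_count, soft, hard


if __name__ == "__main__":
    import sys

    # Usage: python fd_usage.py <tigase-pid>
    count, soft, hard = fd_usage(int(sys.argv[1]))
    print(f"open descriptors: {count}  soft limit: {soft}  hard limit: {hard}")
```

If the soft limit reported here is lower than the 65535 shown by `ulimit -a`, the process was started with different settings than the interactive shell.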

Added by flame fire 7 months ago

There are only 310 registered users in Tigase.

1. Output of 'java -version':
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

2. "Do you have other processes running on that machine"
Not yet.

3. "Please check the count of already opened files for that user at that machine"
Could you show me the command?

By the way, Tigase has been working well so far.

Added by Wojciech Kapcia TigaseTeam 6 months ago

flame fire wrote:

3. We always use the 'root' account on our OS.

I would highly recommend using a dedicated user account for running Tigase.

flame fire wrote:

There are only 310 registered users in Tigase.

Actually, in terms of established XMPP connections there were only 4.
I've noticed that you also have some custom code ( +resource_manager,+token_manager,-msgoffline,msgoffline_manager,+session_manager ) - could you elaborate on it?

3. "Please check the count of already opened files for that user at that machine"
Could you show me the command?

You could run sysctl fs.file-nr for example.
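For reading the result: `fs.file-nr` prints three numbers, namely the allocated file handles, the allocated but unused handles, and the system-wide maximum (`fs.file-max`). Note that this is a system-wide figure, not a per-user or per-process count. The following is a minimal sketch for interpreting such a line; the function name `parse_file_nr` is illustrative.

```python
def parse_file_nr(line):
    """Parse 'fs.file-nr = A U M' (sysctl) or 'A U M' (/proc/sys/fs/file-nr)."""
    values = line.split("=", 1)[-1].split()
    allocated, unused, maximum = (int(v) for v in values)
    return {
        "allocated": allocated,  # file handles currently allocated
        "unused": unused,        # allocated but unused handles
        "max": maximum,          # system-wide ceiling (fs.file-max)
        "percent_of_max": round(100.0 * allocated / maximum, 2),
    }


if __name__ == "__main__":
    # Reads the live value on Linux.
    with open("/proc/sys/fs/file-nr") as fh:
        print(parse_file_nr(fh.read()))
```

A line such as `fs.file-nr = 1056 0 360000` would therefore mean that 1056 handles are allocated system-wide against a ceiling of 360000, far from exhausting the system limit.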

Added by flame fire 6 months ago

Thanks very much!
1. I ran "sysctl fs.file-nr", and the result is "fs.file-nr = 1056 0 360000".

2. We have some custom code ( +resource_manager,+token_manager,-msgoffline,msgoffline_manager,+session_manager ). Let me explain each plugin below:

a. The resource_manager plugin checks whether the client resource name equals "Mline".

b. The token_manager plugin is used for getting a token string. For example,
the client sends this packet:

<iq id='1282W-42' type='set'><query xmlns='com:9yiwu:mline:token'/></iq>

and the server responds:

<iq type="result"><query><token expire="1">51dd687db1cf4167aee4a31b059f0371</token></query></iq>

In this code, the plugin sends an HTTP request to another of our systems to get the token string and returns it to the client; this may take 1~3 ms.

c. The session_manager plugin

This plugin has been removed, so we can ignore it.

d. The msgoffline_manager plugin is similar to the msgoffline plugin, except that I added some code that runs after the offline messages are stored in the DB: it copies the offline messages and sends them to our other system via an HTTP request. This code may also take 1~3 ms.

Added by Wojciech Kapcia TigaseTeam 6 months ago

Please try running the service under a dedicated account instead of root.