Severe problems with Tigase

Stanislav Ruzani
Added over 5 years ago

Hi,

We are using Tigase version 5.1.5 and are experiencing serious stability issues. The server cannot run for more than a few hours without hanging completely. It stops responding to XMPP requests and the HTTP statistics stop working; it simply dies, although the process is still running.

We had to write a script that checks the HTTP statistics port and reboots the server when the port becomes unavailable.
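For readers who need a similar stopgap, here is a minimal sketch of such a check in Python. It is not the script used in this thread; the function name `http_alive`, the URL, and the timeout are assumptions for illustration. It issues a real HTTP request rather than only opening the TCP port, because a hung process can still have its connections accepted by the operating system.

```python
import http.client
import urllib.error
import urllib.request

# Bypass any system proxy settings: the check should hit the server directly.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_alive(url, timeout=5.0):
    """Return True if the HTTP endpoint answers within `timeout` seconds.

    Any HTTP response, even an error status, counts as alive, because it
    proves the process is still serving requests. Connection failures and
    timeouts count as not alive.
    """
    try:
        with _OPENER.open(url, timeout=timeout) as response:
            response.read(1)  # force at least one byte of the body to arrive
        return True
    except urllib.error.HTTPError:
        # The server replied with an error status, so it is responding.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused connection, timeout, or a broken/garbled response.
        return False
```

A cron job or loop could call `http_alive("http://localhost:9080/")` (the port here is taken from the `--monitoring` setting shown later in the thread; the path is a placeholder) and trigger a restart only after several consecutive failures, so that a single slow response does not cause a reboot.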

The issue is that we don't have many connections on the server: only about 4-5 thousand, on a powerful machine (with 64GB of RAM). We have tried analyzing thread stack dumps, but found nothing wrong in them.

Is there any way to find out what is wrong with the server once it "freezes"? Thank you!
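One way to make thread dumps easier to read is to summarize them instead of scanning them by eye. The sketch below is an illustration, not a Tigase tool: it assumes the dump is in the plain-text format printed by `jstack <pid>` or `kill -3 <pid>`, and the function name `summarize_thread_dump` is made up for this example. It counts threads per state and per top stack frame, so a pile-up of threads stuck at the same call stands out.

```python
import re
from collections import Counter

# A thread entry starts with the quoted thread name.
THREAD_HEADER = re.compile(r'^"[^"]+"')
# Example: '   java.lang.Thread.State: BLOCKED (on object monitor)'
THREAD_STATE = re.compile(r'^\s*java\.lang\.Thread\.State: (?P<state>[A-Z_]+)')
# Example: '\tat tigase.util.WorkerThread.run(WorkerThread.java:132)'
FRAME = re.compile(r'^\s*at (?P<frame>[^(\s]+)')


def summarize_thread_dump(text):
    """Return (states, top_frames) Counters for a jstack-style thread dump."""
    states = Counter()
    top_frames = Counter()
    expecting_top_frame = False
    for line in text.splitlines():
        if THREAD_HEADER.match(line):
            # New thread entry: forget any pending state from the previous one.
            expecting_top_frame = False
            continue
        match = THREAD_STATE.match(line)
        if match:
            states[match.group("state")] += 1
            expecting_top_frame = True
            continue
        match = FRAME.match(line)
        if match and expecting_top_frame:
            # Only the first frame after the state line is the top of the stack.
            top_frames[match.group("frame")] += 1
            expecting_top_frame = False
    return states, top_frames
```

Taking two or three dumps a few seconds apart and comparing the summaries can show whether the same threads stay blocked at the same frame, which is harder to notice in the raw text.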


Replies (3)

Added by Wojciech Kapcia TigaseTeam over 5 years ago

  • Could you provide the settings used by Tigase (i.e. etc/tigase.conf and etc/init.properties)?

  • Do you have debugging enabled?

  • How much memory does Tigase use?

  • Are there any exceptions in the logs?

Added by Stanislav Ruzani over 5 years ago

Hi,

so here is our hardware:

8 cores x Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 32GB RAM, dedicated machine with 1GB up/downlink.

We are using the integrated database, Derby.

Tigase configuration:

ENC="-Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8"
DRV="-Djdbc.drivers=com.mysql.jdbc.Driver:org.postgresql.Driver:org.apache.derby.jdbc.EmbeddedDriver"
#GC="-XX:+UseBiasedLocking -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:ParallelCMSThreads=2 -XX:-ReduceInitialCardMarks"
#EX="-XX:+OptimizeStringConcat -XX:+DoEscapeAnalysis -XX:+UseNUMA"
PF="-d64 -Xms512m -Xmx4g"
JAVA_HOME="/usr/lib/jvm/java-1.6.0-openjdk-amd64"
CLASSPATH=""
JAVA_OPTIONS="${GC} ${PF} ${EX} ${ENC} ${DRV} -server"
TIGASE_CONFIG="etc/tigase.xml"
TIGASE_OPTIONS=" --property-file etc/init.properties "

init.properties:

--user-db = derby
--admins = admin@s0.coremediacenter.com
--user-db-uri = jdbc:derby:/home/stanley/tigasedb
config-type = --gen-config-def
--virt-hosts = s0.coremediacenter.com
--debug = server
--monitoring = jmx:9050,http:9080,snmp:9060
--vhost-tls-required = true

When it happens, we usually get SQL exceptions like:

2013-09-16 11:40:41 Presence.process() WARNING: Error accessing database for presence data: {0}
tigase.db.TigaseDBException: Error getting user data for: 026066657@s0.coremediacenter.com/null/roster
    at tigase.db.jdbc.JDBCRepository.getData(JDBCRepository.java:348)
    at tigase.db.UserRepositoryMDImpl.getData(UserRepositoryMDImpl.java:138)
    at tigase.xmpp.RepositoryAccess.getData(RepositoryAccess.java:266)
    at tigase.xmpp.impl.roster.RosterFlat.loadUserRoster(RosterFlat.java:702)
    at tigase.xmpp.impl.roster.RosterFlat.getUserRoster(RosterFlat.java:664)
    at tigase.xmpp.impl.roster.RosterFlat.getBuddies(RosterFlat.java:266)
    at tigase.xmpp.impl.roster.RosterAbstract.getBuddies(RosterAbstract.java:706)
    at tigase.xmpp.impl.Presence.broadcastProbe(Presence.java:684)
    at tigase.xmpp.impl.Presence.processOutInitial(Presence.java:1523)
    at tigase.xmpp.impl.Presence.process(Presence.java:915)
    at tigase.server.xmppsession.SessionManager$ProcessorWorkerThread.process(SessionManager.java:2135)
    at tigase.util.WorkerThread.run(WorkerThread.java:132)
Caused by: java.sql.SQLException: ResultSet not open. Operation 'next' not permitted. Verify that autocommit is OFF.

The last time I checked, the log had grown to 150 gigabytes(!) in about one week.

We also see this quite often:

2013-09-16 13:00:05 TLSIO.write() WARNING: Infinite loop detected in write(buff) TLS code, tlsWrapper.getStatus(): NEED_READ
2013-09-16 13:00:05 TLSIO.writeBuff() WARNING: Infinite loop detected in writeBuff(buff) TLS code, tlsWrapper.getStatus(): NEED_READ
2013-09-16 13:03:04 TLSIO.write() WARNING: Infinite loop detected in write(buff) TLS code, tlsWrapper.getStatus(): NEED_READ
2013-09-16 13:03:40 TLSIO.writeBuff() WARNING: Infinite loop detected in writeBuff(buff) TLS code, tlsWrapper.getStatus(): NEED_READ
2013-09-16 13:04:58 TLSIO.write() WARNING: Infinite loop detected in write(buff) TLS code, tlsWrapper.getStatus(): NEED_READ
2013-09-16 13:07:39 TLSIO.write() WARNING: Infinite loop detected in write(buff) TLS code, tlsWrapper.getStatus(): NEED_READ
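With a log this large, counting which code paths emit the warnings is more practical than reading it. The following sketch is only an illustration: it assumes every log record starts with a line shaped like the excerpts above (`timestamp Source.method() LEVEL: message`), and the names `LOG_LINE` and `count_log_sources` are invented for this example. It reads line by line, so it never loads the whole file into memory.

```python
import re
from collections import Counter

# Matches record headers such as:
# 2013-09-16 13:00:05 TLSIO.write() WARNING: Infinite loop detected ...
LOG_LINE = re.compile(
    r'^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) '
    r'(?P<source>\S+) (?P<level>[A-Z]+): (?P<message>.*)$'
)


def count_log_sources(lines, level="WARNING"):
    """Count log records of the given level, grouped by emitting source.

    `lines` can be any iterable of strings, for example an open file.
    Lines that are not record headers (stack frames, 'Caused by:' lines)
    are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("level") == level:
            counts[match.group("source")] += 1
    return counts
```

Usage might look like `with open(path, errors="replace") as log: print(count_log_sources(log).most_common(10))`, where `path` is a placeholder for the actual log file location.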

Could this be caused by the Derby database? Is there a simple way to migrate to PostgreSQL, for example?

Thanks a lot!!

Added by Artur Hefczyc TigaseTeam over 5 years ago

The Derby database is not suitable for any kind of production system. We provide support for Derby for testing and development purposes only. Please use "a real" database such as MySQL or PostgreSQL for running Tigase on a production system. We have seen behavior and problems like this with Derby before, and I could not find a satisfactory solution for Derby. As a short-term measure until you migrate to a different database, you could try a more recent version of Derby. Maybe it will work better.

We also strongly advise against using OpenJDK; version 6 in particular had significant performance and stability issues. We suggest using the JDK (JVM) from Oracle/Sun.

We do not know of any simple way to migrate data between different databases. We could, however, offer you consulting services and help with the migration.
