Issue with connection between server and MUC external component

Serhii Administrator
Added about 4 years ago

Hi everyone

We use a multi-user chat as an external component.

We have 2 physical servers: one for Tigase-Server and the second one for the MUC component.

Tigase-Server has the following settings in init.properties:

--comp-name-1 = ext
--comp-class-1 = tigase.server.ext.ComponentProtocol
--external = muc.chatcluster.example.com:muc-secret:listen:5270:IP:ReceiverBareJidLB

The MUC component has the following settings in init.properties:

--comp-name-1 = muc
--comp-class-1 = tigase.muc.MUCComponent
--external = muc.chatcluster.example.com:muc-secret:connect:5270:chatcluster.example.com;cluster-test-1;cluster-test-2:accept

We noticed one issue: when we reboot Tigase-Server, the MUC component becomes unreachable for Tigase-Server. It looks like the connection is lost, and to fix this we have to reboot the MUC component.

Is there a way to automatically re-establish the connection from MUC to Tigase-server?


Replies (12)

Added by Wojciech Kapcia TigaseTeam about 4 years ago

Tigase should perform re-connection automatically (from the MUC instance, which is configured with @connect@). Are there any exceptions/errors in the logs?

Added by Serhii Administrator about 4 years ago

The muc-server logs (tigase-console.log and tigase.log.0) do not contain any errors or other relevant information.

In etc/init.properties I have this: --debug = muc,server,tigase,auth,db,xmpp.

Any ideas?

Added by Wojciech Kapcia TigaseTeam about 4 years ago

With such logging enabled on both instances, please look for entries from tigase.server.ConnectionManager.reconnectService() regarding reconnection attempts. Also focus on entries coming from @ComponentProtocol@.

Added by Serhii Administrator about 4 years ago

Still nothing in the MUC logs. We just switched off tigase-server.

Is there some kind of timeout or keep-alive setting? It looks like MUC doesn't know that tigase-server went down.

One more question:

http://docs.tigase.org/tigase-server/snapshot/Administration_Guide/html/#_external_component_and_cluster

--external = muc.xmpp-test.org:muc-secret:connect:5270:xmpp-test.org;blue.xmpp-test.org;green.xmpp-test.org,red.xmpp-test.org:accept

There are both ',' and ';' between the hosts. Which separator must be used?

Added by Wojciech Kapcia TigaseTeam about 4 years ago

Serhii Administrator wrote:

> Still nothing in the MUC logs. We just switched off tigase-server.
>
> Is there some kind of timeout or keep-alive setting? It looks like MUC doesn't know that tigase-server went down.

No, after the TCP connection is broken the re-connect should happen automatically.

> --external = muc.xmpp-test.org:muc-secret:connect:5270:xmpp-test.org;blue.xmpp-test.org;green.xmpp-test.org,red.xmpp-test.org:accept
>
> There are both ',' and ';' between the hosts. Which separator must be used?

Hostnames should be separated by @;@.
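To illustrate why the separator matters, here is a minimal Java sketch that splits an --external value the way the examples in this thread suggest. It is not Tigase's own parser: the class name, the method name and the assumed field order (domain, secret, connection type, port, remote hosts, protocol) are only inferred from the examples above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tigase.
public class ExternalSpecSketch {

    // Assumption: fields are ':'-separated and the 5th field holds the remote hosts.
    static List<String> remoteHosts(String externalValue) {
        String[] fields = externalValue.split(":");
        if (fields.length < 5) {
            throw new IllegalArgumentException("expected at least 5 ':'-separated fields");
        }
        // Only ';' is treated as a host separator, as stated in the reply above.
        return Arrays.asList(fields[4].split(";"));
    }

    public static void main(String[] args) {
        // The value quoted from the documentation, including the stray ','.
        String fromDocs = "muc.xmpp-test.org:muc-secret:connect:5270:"
                + "xmpp-test.org;blue.xmpp-test.org;green.xmpp-test.org,red.xmpp-test.org:accept";
        for (String host : remoteHosts(fromDocs)) {
            System.out.println(host);
        }
    }
}
```

With a ';'-only split, the last entry comes out as the single string green.xmpp-test.org,red.xmpp-test.org, so a parser of this kind would see three hosts instead of four.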

Added by Serhii Administrator about 4 years ago

> after TCP connection is broken

How does MUC know that the connection is broken?

Are there periodic checks, or does tigase-server close this connection when it goes down?

Added by Wojciech Kapcia TigaseTeam about 4 years ago

Tigase closes connections when it's being shut down.
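For reference, this is how an orderly close generally looks from the other end of a TCP connection in plain Java. The sketch below uses blocking java.net sockets on the loopback interface, not Tigase's NIO-based code, and the class and method names are made up for the illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo, unrelated to Tigase classes.
public class PeerCloseSketch {

    // Opens a loopback connection, closes the accepting side,
    // and returns what read() reports on the connecting side.
    static int readAfterPeerClose() throws IOException {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket connector = new Socket(listener.getInetAddress(), listener.getLocalPort());
             Socket accepted = listener.accept()) {
            connector.setSoTimeout(5000); // do not block forever if something goes wrong
            accepted.close();             // the "server" side shuts the connection down
            InputStream in = connector.getInputStream();
            return in.read();             // end-of-stream is reported as -1
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read() after peer close returned " + readAfterPeerClose());
    }
}
```

So when the peer closes the socket, the connecting side does not need a timeout to notice it: the next read reports end-of-stream. A host that disappears without closing the socket (power loss, network cut) is a different case and is not covered by this sketch.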

Added by Serhii Administrator about 4 years ago

When everything is working fine, I see that muc-server has an established connection to chat-server. I checked this with the command lsof -i -n -P.

When I stop the chat server, I can see that the connection is lost. After chat-server restarts, it doesn't reconnect.

Is there any other way to find out where the problem is?

Added by Serhii Administrator about 4 years ago

We did a short investigation and found that the ConnectionManager.Watchdog class is responsible for detecting the status of connections.

We see the following settings:

1) Watchdog runs every 10 mins (Thread.sleep(10 * MINUTE))

2) MaxInactiveTime for ext services is 1000*24*HOUR

So, according to this, the connection will never be closed, because the value of 1000*24*HOUR is too high.

We use Tigase 5.2.0.

Is this the right explanation?
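To put numbers on the two settings above, here is a small arithmetic sketch. It assumes MINUTE and HOUR are millisecond constants (which the Thread.sleep call implies); the class and constant names here are mine, not copied from the Tigase source.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical arithmetic check, not Tigase code.
public class WatchdogArithmeticSketch {

    // Assumption: time constants are expressed in milliseconds.
    static final long SECOND = 1000L;
    static final long MINUTE = 60 * SECOND;
    static final long HOUR = 60 * MINUTE;

    static final long WATCHDOG_PERIOD = 10 * MINUTE;  // Thread.sleep(10 * MINUTE)
    static final long MAX_INACTIVE = 1000 * 24 * HOUR; // MaxInactiveTime for ext services

    // Idle time allowed before the watchdog would act, in days.
    static long maxInactiveDays() {
        return TimeUnit.MILLISECONDS.toDays(MAX_INACTIVE);
    }

    // Number of watchdog passes that fit into that idle time.
    static long watchdogRunsBeforeTimeout() {
        return MAX_INACTIVE / WATCHDOG_PERIOD;
    }

    public static void main(String[] args) {
        System.out.println("max inactive time: " + maxInactiveDays() + " days");
        System.out.println("watchdog passes before timeout: " + watchdogRunsBeforeTimeout());
    }
}
```

That works out to 1000 days of allowed inactivity, or 144,000 watchdog passes, so an idle ext connection would effectively never be dropped by this check. Note that this only covers the inactivity threshold; it says nothing about how a socket that was actually closed by the peer is handled.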

Added by Serhii Administrator about 4 years ago

We have also enabled 'net' logs, and it looks like Tigase realises immediately that the connection was closed:

2015-01-26 13:26:24.481 [pool-6-thread-4] IOService.isConnected() FINEST: Socket: nullSocket[unconnected], Connected: false, id: null

but ConnectionManager and ComponentProtocol don't seem to know anything about it.

Any ideas why?

Added by Wojciech Kapcia TigaseTeam about 4 years ago

Can you share a complete excerpt from the log?

Added by Serhii Administrator about 4 years ago

Here is the full MUC log immediately after Tigase-Server went down:

2015-01-28 09:43:46.340 [socketReadThread-3]  SocketThread.run()              FINEST:   Selector AWAKE: sun.nio.ch.EPollSelectorImpl@507ca72d
2015-01-28 09:43:46.340 [socketReadThread-3]  SocketThread.run()              FINEST:   AWAKEN: 182.30.0.175_44301_182.30.0.125_5271, ready for READING, readyOps() = 1
2015-01-28 09:43:46.340 [socketReadThread-3]  SocketThread.addAllWaiting()    FINEST:   waiting.size(): 0
2015-01-28 09:43:46.340 [pool-6-thread-1]  IOService.isConnected()            FINEST:   Socket: nullSocket[addr=cluster-test-1/182.30.0.125,port=5271,localport=44301], Connected: true, id: null
2015-01-28 09:43:46.340 [pool-6-thread-1]  IOUtil$BufferCache.get()           FINEST:   allocating buffer with size = 65,536
2015-01-28 09:43:46.340 [pool-6-thread-1]  IOService.isConnected()            FINEST:   Socket: nullSocket[unconnected], Connected: false, id: null
2015-01-28 09:43:46.340 [ResultsListener-socketWriteThread-3]  IOService.isConnected()  FINEST: Socket: nullSocket[unconnected], Connected: false, id: null
2015-01-28 09:43:46.340 [ResultsListener-socketWriteThread-3]  SocketThread$ResultsListener.run()  FINEST: REMOVED: 182.30.0.175_44301_182.30.0.125_5271
