Problem when resuming sessions with Stream Management (XEP-0198)

Kenneth Chan
Added about 5 years ago

Hi,

We are developing a mobile client that implements XEP-0198 against Tigase. It works perfectly when we test in the lab, where the client resumes the stream within a short period of time (say, roughly 10 to 20 seconds).

However, we found a problem when the time before resuming is long, which is likely when we lose the network for more than 1 to 2 minutes in an area with a bad connection. When we then try to resume the stream, the lost messages cannot be recovered, and we got the following result:

After the resume failed, we enabled Stream Management again, and the offline messages did not contain the messages sent during the loss period.

It looks like this is related to the max attribute of the enable element, but we cannot increase it: Tigase returns max="60" no matter what we try.

May I know if there is any config that can increase this max attribute?

Or, why do the messages not go to offline storage after the maximum resumption time?

From XEP-0198, "A server SHOULD treat unacknowledged stanzas in the same way that it would treat a stanza sent to an unavailable resource, by either returning an error to the sender or committing the stanza to offline storage."

I'm using tigase-server-5.2.0-RC1.

Thanks.

--

Regards,

Kenneth


Replies (9)

Added by Andrzej Wójcik IoT 1 CloudTigaseTeam about 5 years ago

Hi,

Tigase XMPP Server uses a hardcoded default of 60 seconds as the limit for the session resumption timeout. This protects against clients requesting higher values, which could lead to higher memory usage. The default limit can be increased on the server side by adding the following line to the etc/init.properties file:

c2s/processors/urn\:xmpp\:sm\:3/resumption-timeout[I]=120

This increases the resumption timeout limit to 120 seconds.
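To make the effect of the limit concrete, here is a minimal sketch of how a requested max value could be capped. It is not Tigase's code: the function name `negotiated_max` and the fallback rule for a missing or non-positive request are assumptions made for illustration only.

```python
from typing import Optional

# Default limit quoted in this thread; the property above would raise it to 120.
DEFAULT_LIMIT_SECONDS = 60


def negotiated_max(requested: Optional[int],
                   server_limit: int = DEFAULT_LIMIT_SECONDS) -> int:
    """Return the value a server might advertise in <enabled max="..."/>.

    Assumption (not confirmed by the thread): the server never grants more
    than its configured limit and falls back to the limit when the client
    does not ask for a usable value.
    """
    if requested is None or requested <= 0:
        return server_limit
    return min(requested, server_limit)


print(negotiated_max(300))        # capped by the default limit
print(negotiated_max(300, 120))   # capped by the raised limit
```

Under this assumption, a client asking for 300 seconds still sees max="60" until the server-side limit is raised.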

As for the "missing" messages in the offline history when a broken connection was not resumed (resumption failed), Tigase XMPP Server returns error stanzas to the senders of messages that were not acknowledged by the client. This option is suggested by XEP-0198: "A server SHOULD treat unacknowledged stanzas in the same way that it would treat a stanza sent to an unavailable resource, by either returning an error to the sender or committing the stanza to offline storage." It is valid because Tigase XMPP Server returns error stanzas for any stanza sent to an unavailable resource.

Added by Kenneth Chan almost 5 years ago

Does Tigase support committing the stanza to offline storage in this case, instead of returning error stanzas to the senders?

Also, I cannot see any error stanzas being returned to the senders in this case. Is there any related configuration?

--

Regards,

Kenneth

Added by Andrzej Wójcik IoT 1 CloudTigaseTeam almost 5 years ago

Currently it is not possible to change the behavior of Tigase XMPP Server on stream resumption failure through any configuration option, as this feature is not implemented, but it may be implemented in one of the next versions. If you would like this feature to be implemented in Tigase XMPP Server, feel free to add a new issue to the Tigase XMPP Server project describing the feature you would like.

Stanzas are returned to senders as error stanzas when a stream for which Stream Management was enabled times out without being resumed. This is not configurable: unacknowledged stanzas are always returned as error stanzas. I suppose that in your case something may be blocking the error stanzas from being delivered to the original sender. As I remember, there was an issue with clustering enabled that prevented error stanzas from being delivered to other nodes; the fix for this issue will be part of Tigase XMPP Server RC2, which will be released soon.

Added by Natale Vinto over 4 years ago

Hi,

I've read in a previous post: https://projects.tigase.org/boards/4/topics/833?r=875#message-875

that stanzas sent to a bare JID with stream management can be saved to the DB for offline storage instead of being sent back to senders, because there wouldn't be the ambiguity there is with a full JID, where stanzas relate to just one resource of the XMPP user. Is this behaviour kept in the latest version? I also wonder where those stanzas are saved, I mean in which DB tables: trying with two clients (sender, receiver) that implement stream management, I see that previously unacked message stanzas are received by the receiver client, but I can't find out where they come from.

Server version is tigase-server-5.3.0-SNAPSHOT-b3609

Stream management is activated and the session output is:

<enable xmlns="urn:xmpp:sm:3" resume="true"/>
<enabled xmlns="urn:xmpp:sm:3" id="UUID" resume="true" max="60" location="localhost"/>

Are there any stream management settings I have to change in order to replace localhost as the server's preferred location for reconnecting? The server is on a remote IP.
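For readers following along on the client side, the sketch below shows one way to read the max and location attributes from the `<enabled/>` element quoted above, using Python's standard `xml.etree.ElementTree`. The helper name `parse_enabled` and the returned dictionary keys are illustrative choices, not part of any XMPP library.

```python
import xml.etree.ElementTree as ET

SM_NS = "urn:xmpp:sm:3"


def parse_enabled(xml_text: str) -> dict:
    """Extract resumption details from a stream management <enabled/> element."""
    element = ET.fromstring(xml_text)
    if element.tag != f"{{{SM_NS}}}enabled":
        raise ValueError(f"unexpected element: {element.tag}")
    max_attr = element.get("max")
    return {
        "id": element.get("id"),
        # XML booleans may be written as "true" or "1".
        "resume": element.get("resume") in ("true", "1"),
        "max_seconds": int(max_attr) if max_attr is not None else None,
        "location": element.get("location"),
    }


# The element from the session output above.
info = parse_enabled(
    '<enabled xmlns="urn:xmpp:sm:3" id="UUID" resume="true" '
    'max="60" location="localhost"/>'
)
print(info["max_seconds"], info["location"])
```

A client could log these values on every connect to see whether a server-side change to the timeout or hostname has taken effect.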

Added by Andrzej Wójcik IoT 1 CloudTigaseTeam over 4 years ago

If I remember correctly this behavior is kept in latest version, but there are two places where messages are kept by stream management:

  • in memory - if stream was broken and packet was not acked and stream resumption timeout has not ended

  • in offline storage - if stream was broken and stream resumption timeout ended

In the first case (in memory) we keep all unacked packets (including those addressed to a full JID), but when stream resumption fails (e.g. due to timeout) we send to offline storage (the database) only message packets addressed to a bare JID. All offline messages are stored by default in the msg_history table. Please keep in mind that the same table is also used by the AMP component to store other messages.
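The rule described above can be modelled in a few lines. This is only an illustrative sketch of the decision as stated in this reply, not Tigase's implementation; the `Stanza` class and the function names are hypothetical, and what happens to the stanzas that are not stored (error reply or drop) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Stanza:
    kind: str  # "message", "presence" or "iq"
    to: str    # destination JID, bare or full


def is_bare_jid(jid: str) -> bool:
    # A full JID carries a resource part after the first "/".
    return "/" not in jid


def on_resumption_timeout(unacked):
    """Split unacked stanzas into (stored_offline, not_stored)."""
    stored_offline, not_stored = [], []
    for stanza in unacked:
        # Only message stanzas addressed to a bare JID go to offline storage.
        if stanza.kind == "message" and is_bare_jid(stanza.to):
            stored_offline.append(stanza)
        else:
            not_stored.append(stanza)
    return stored_offline, not_stored


queue = [
    Stanza("message", "bob@example.com"),
    Stanza("message", "bob@example.com/phone"),
    Stanza("iq", "bob@example.com/phone"),
]
stored, rest = on_resumption_timeout(queue)
print(len(stored), len(rest))
```

In this model, a sender who wants the message to survive a failed resumption would address it to the bare JID rather than to a specific resource.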

It would be possible to change the value of the location attribute by setting the def-hostname parameter for the c2s component, but I would suggest checking your DNS settings first, as Tigase XMPP Server should detect the fully qualified domain name of the server on which it is running (using reverse DNS and the /etc/hosts file) and will try to use it. So most probably you should add the fully qualified domain name (the one under which the server is available, i.e. the name that resolves to the remote IP) to /etc/hosts, so that Tigase XMPP Server can detect and use it.

Added by Natale Vinto over 4 years ago

Hi Andrzej,

I've put the IP in /etc/hosts together with all the Tigase virtual hosts that I use:

IP domain.net sub.domain.net

then restarted Tigase, but when I try to connect with stream management I always get location=localhost. Is that normal?

Also, I couldn't reproduce the offline storage scenario. I sent a message stanza after simulating a network fault by blocking all INPUT and OUTPUT connections with iptables. After the 60 seconds of the session, I cannot see the message in the msg_history table, nor did I get any message back. Which component should I check in the logs in order to understand how the stanza is handled?

Added by Andrzej Wójcik IoT 1 CloudTigaseTeam over 4 years ago

Hi,

I think I might have missed something that needs to be set for the machine name to be set correctly (the fully qualified machine name needs to resolve to the external IP under which your server will be available). Generally, you need to configure the server so that running the following in a console returns the fully qualified name of the server, which resolves to your external IP:

hostname -f

About storing messages: 60 seconds is just the resumption timeout. Please keep in mind that, depending on configuration, the server's operating system needs time to detect the failure (sometimes it takes up to 2 hours); see the TCP_KEEPALIVE section of the Linux settings for high load systems.

Undelivered messages should be stored by SessionManager, as they are sent back to SessionManager and processed as offline messages.

Added by Natale Vinto over 4 years ago

Hi,

There isn't a public DNS yet to associate the IP with hostnames. Is it possible to work around this temporarily? The client connects with the triple (IP, port, ServiceName), which allows it to work without SRV lookups.

Also the server was already set up following your Linux high load systems guide, thus

fs.file-max=1000000
net.ipv4.ip_local_port_range=1024 65000
net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_probes=3
net.ipv4.tcp_keepalive_intvl=90

so any dead TCP connection should be recognized after 90 seconds. But maybe I missed something in understanding how XEP-0198 helps with store-and-forward behavior on faulty networks:

Imagine a mobile client goes into airplane mode for a couple of minutes. If I enable stream management and send a message stanza, the server considers the client online, because it did not announce its unavailability via presence, and so it sends the message to a broken socket. Then the resumption timeout ends while the client is still in airplane mode. When it comes back:

  • resumption is not possible and it starts a new session that invalidates the previous one, but the message is still in the broken socket because TCP_KEEPALIVE has not expired. Will it be lost? In this case, is the stanza "the responsibility of the sender"?

  • resumption is not possible and TCP_KEEPALIVE has expired. Does SessionManager receive the message to store offline?

When the mobile client comes back online outside the resumption window, any previous sessions would be invalidated. Where is the message? :)

Thanks for clarifications
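As a side note on the sysctl values quoted above, the time Linux takes to declare an idle connection dead is commonly estimated as the idle time plus one interval per unanswered probe. The sketch below works that arithmetic through for the posted values. It is an approximation only: it applies to idle connections, and when unacknowledged data is already in flight the TCP retransmission timers, not keepalive, decide when the socket fails. The function name is an illustrative choice.

```python
def keepalive_detection_seconds(keepalive_time: int, probes: int, interval: int) -> int:
    """Approximate seconds before Linux reports an idle TCP peer as dead.

    keepalive_time: idle seconds before the first probe (tcp_keepalive_time)
    probes:         unanswered probes tolerated (tcp_keepalive_probes)
    interval:       seconds between probes (tcp_keepalive_intvl)
    """
    return keepalive_time + probes * interval


# Values from the post: 60 s idle, 3 probes, 90 s apart.
posted = keepalive_detection_seconds(60, 3, 90)
print(posted)  # 330
```

With these settings the estimate is 330 seconds rather than 90, which is well beyond the 60-second resumption timeout discussed in this thread.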

Added by Andrzej Wójcik IoT 1 CloudTigaseTeam over 4 years ago

You would need to set the fully qualified domain name to a name known by the client, so it will not be possible to use this without SRV records, as location is always the name of a server.

When the client comes back after the timeout, the message should be stored in the offline store.

If it comes back before the timeout and does not use resume but uses the same resource, then the previous session will be closed and its messages sent to the offline message store, so the new session will retrieve them from the offline store.
