User hangs in memory

Julia Zashchitina
Added almost 4 years ago

Hello.

We use the Groovy scripts AddUser, ChangeUserPassword and DeleteUser to manage users through the REST services. Sometimes we get the following result:

2015-05-14 11:37:10.701 [in_2-message-router]  MessageRouter.processPacket()  FINEST:   Processing packet: from=sess-man@domain.com, to=http@ip-xxx-xx-xx-xxx.us-west-2.compute.internal/eb746266-4f11-4911-8b6f-2127a5040b14, 
DATA=
<iq type="result" from="sess-man@domain.com" to="admin@domain.com" id="e000ae0a-c4cb-4883-9305-f09840e2dc92">
  <command node="http://jabber.org/protocol/admin#add-user" status="completed" xmlns="http://jabber.org/protocol/commands">
    <x type="result" xmlns="jabber:x:data">
      <field type="fixed" var="Note"><value>Script execution error.</value></field>
      <field type="fixed" var="Error message"><value>java.lang.NullPointerException</value></field>
      <field type="text-multi" var="Debug info">
        <value>java.lang.NullPointerException</value>
        <value>javax.script.ScriptException: java.lang.NullPointerException</value>
        <value>org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:349)</value>
        <value>org.codehaus.groovy.jsr223.GroovyCompiledScript.eval(GroovyCompiledScript.java:41)</value>
        <value>tigase.server.script.Script.runCommand(Script.java:159)</value>
        <value>tigase.server.BasicComponent.processScriptCommand(BasicComponent.java:912)</value>
        <value>tigase.server.AbstractMessageReceiver$QueueListener.run( ... , SIZE=1101, XMLNS=null, PRIORITY=NORMAL, PERMISSION=NONE, TYPE=result

For other usernames the add-user operation succeeds or returns a duplicate response (which would be the correct behavior for this user). Changing the password or deleting this user does not work either (the result is "user not found"). Tigase also constantly sends presence to this user. It seems the user is stuck somewhere in server memory. Restarting the server fixes the issue, but that is not an option for us. Could you please clarify what is happening, and whether there is any way to prevent this or to clean up (remove this user from server memory) without restarting the server?


Replies (5)

Added by Artur Hefczyc TigaseTeam almost 4 years ago

This is certainly not a correct situation, neither the script error nor the fact that the user stays online and only the server restart can fix it. Unfortunately we have too little information to be able to tell anything. It requires a proper investigation. Some basic questions which come to my mind:

  1. Are there any other log entries around the script execution related to the user's ID?

  2. Does the user have a correct entry in the DB?

  3. Do you have any DB-related errors?

  4. What kind of DB do you use?

  5. What Tigase server version do you use?

  6. Does the Tigase server run in single or in cluster mode?

  7. What kind of client does the user use?

  8. And actually, why is it such a problem for you that the user stays online and the server sends some presences to the user? I guess it does not overload your server, and the impact would be next to none.

  9. Do you have any custom code?

You could make the user offline without restarting the server. You just need an ad-hoc admin command which logs out the user. I just looked at the admin commands included in the Tigase server distribution and there is no such command right now, but based on what is in there, you should be able to create a new command to do the trick.

Added by Julia Zashchitina almost 4 years ago

Here’s our use case scenario: this user was registered in Tigase before. After he re-installs the client application, our server first tries to register the user again (using the REST services with the add-user script). If there’s a DUPLICATE response from the Tigase server, we change the user’s password.

Because of the user hanging in memory (or some inconsistent user state) there’s a NullPointerException response from Tigase. None of the add-user, change-password or delete-user scripts work for this user.
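To make the flow above concrete, here is a minimal Java sketch of the register-then-change-password logic on our side. All names in it (`UserAdmin`, `AddUserResult`, `Provisioner`, `registerOrReset`) are hypothetical and only stand in for our REST calls to the add-user and change-password scripts; none of them is a Tigase API.

```java
/** How our server classifies the add-user response (hypothetical names). */
enum AddUserResult { CREATED, DUPLICATE, ERROR }

/** Hypothetical facade over the REST calls to the admin scripts; not a Tigase API. */
interface UserAdmin {
    AddUserResult addUser(String jid, String password);
    boolean changePassword(String jid, String password);
}

final class Provisioner {
    enum Outcome { REGISTERED, PASSWORD_CHANGED, FAILED }

    /** Try to register the user; on DUPLICATE fall back to a password change. */
    static Outcome registerOrReset(UserAdmin admin, String jid, String password) {
        switch (admin.addUser(jid, password)) {
            case CREATED:
                return Outcome.REGISTERED;
            case DUPLICATE:
                return admin.changePassword(jid, password)
                        ? Outcome.PASSWORD_CHANGED
                        : Outcome.FAILED;
            default:
                // Covers the "Script execution error" response we see for the stuck user.
                return Outcome.FAILED;
        }
    }
}
```

For the affected user the add-user call ends in the error branch, so the fallback to the password change is never reached.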

1) Please find log file attached for that period (user jid is 375293333333@domain.com)

2) DB entry for this user doesn’t seem to differ from other users (with correct behavior).

3) There was a quick database disconnect, so this probably is related to https://projects.tigase.org/boards/15/topics/4914?r=4987

4) Tigase is configured to run in cluster mode but at that time there was only one node running.

5) We do not have any custom code in Tigase server.

6) Database is MySQL 5.6 on Amazon RDS.

7) To properly handle emoji we upgraded Tigase mysql-connector to version 5.1.35 and changed encoding to ‘utf8mb4’.

8) Tigase server version is 7.0.0.

9) We used PSI, Miranda and other clients. The user is not able to connect even with the correct password.

Added by Wojciech Kapcia TigaseTeam almost 4 years ago

Julia Zashchitina wrote:

3) There was a quick database disconnect, so this probably is related to https://projects.tigase.org/boards/15/topics/4914?r=4987

Does Tigase re-establish connections to the DB afterwards (i.e. are there ESTABLISHED socket connections to the DB seen in the operating system and also shown in the DB statistics?)

9) We used PSI, Miranda and other clients. The user is not able to connect even with the correct password.

I assume that after the disconnect in (3), attempts to connect as this user yield the same exception (NPE in JDBCRepository)?

Added by Julia Zashchitina almost 4 years ago

1) As shown in the attached log file, other users were able to connect to the Tigase server, and the add-user and change-password scripts worked normally for them. The password for these users was changed in the database as well, so the DB connection was working.

2) There were no such NPEs in tigase-console.log.

Added by Wojciech Kapcia TigaseTeam almost 4 years ago

Julia Zashchitina wrote:

1) As shown in the attached log file, other users were able to connect to the Tigase server, and the add-user and change-password scripts worked normally for them. The password for these users was changed in the database as well, so the DB connection was working.

If this is still reproducible, can you wrap the add-user/change-user-password scripts in try/catch (after the imports) and share the resulting stack trace?
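As a rough illustration of what such a wrapper needs to capture, here is a minimal sketch in plain Java (Groovy accepts most of this syntax, so it should be adaptable to the script body). `TraceCapture` and `runAndCapture` are hypothetical names, not part of Tigase; the point is only to collect the complete stack trace, including any "Caused by" chain, instead of the shortened list in the ad-hoc response.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.concurrent.Callable;

final class TraceCapture {
    /**
     * Runs the given body. Returns null on success, or the full stack trace
     * text (with nested causes) if the body throws.
     */
    static String runAndCapture(Callable<?> body) {
        try {
            body.call();
            return null;
        } catch (Throwable t) {
            StringWriter buffer = new StringWriter();
            // printStackTrace writes the exception, its frames and all causes.
            t.printStackTrace(new PrintWriter(buffer, true));
            return buffer.toString();
        }
    }
}
```

In a script the captured text could be written to the server log or returned in the command result, whichever is easier to share.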
