PubSub with MongoDB cannot deliver messages with Chinese characters

martian kuo
Added about 3 years ago

After I switched from MySQL to MongoDB, any message published with Chinese characters cannot be delivered through the PubSub mechanism. If I stick to English only, it works again.

Any idea how to fix it?


Added by Andrzej Wójcik IoT 1 CloudTigaseTeam about 3 years ago

You say that messages cannot be delivered, but are they published?

As I understand it, you published an item containing Chinese characters using PubSub and it was not delivered as a message notification, whereas it works without Chinese characters.

Could you check whether the message is published? If you list the published items, this item should be visible if it was published correctly. You should also receive a publication confirmation: an iq stanza with type = result and an id matching the publication request.

At this point it would also be good to look into the logs and check whether any exception is being logged. This looks like an issue with storing data in MongoDB, so the problem may be inside Tigase or in the driver we use.

An error may appear after adding --debug=server,pubsub to the etc/ file to enable debugging.

Added by martian kuo about 3 years ago

Thanks for the reply!

The message was published all right but not delivered (if the message contains Chinese characters).

I received the publication confirmation iq stanza OK (see publisher.png).

The message was properly stored in MongoDB tig_pubsub_items (see mongodb.png).

The Tigase server did not show any relevant error (see tigase.png).

If I switch back to English-only messages, the message is delivered OK.

Thanks for the help.

(I worked around this temporarily by encoding the Chinese message with base64 first: it is published OK, delivered OK, and then decoded from base64.)
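For anyone who wants to reproduce the workaround, here is a minimal sketch in Python. It only illustrates the idea (UTF-8 encode, then base64, so the payload on the wire is pure ASCII); the function names are made up for this example and it is not the actual client code used here.

```python
import base64


def encode_payload(text: str) -> str:
    """Encode text as UTF-8 bytes, then base64, giving an ASCII-only string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_payload(payload: str) -> str:
    """Reverse encode_payload: base64-decode, then interpret bytes as UTF-8."""
    return base64.b64decode(payload).decode("utf-8")


original = "你好，世界"
wire = encode_payload(original)   # what would be published in the item
restored = decode_payload(wire)   # what the subscriber does on receipt
```

The drawback is that every subscriber has to know about the encoding, and English-only messages get encoded too.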

Added by Andrzej Wójcik IoT 1 CloudTigaseTeam about 3 years ago

I would suggest encoding only individual characters using XML entities, e.g. &amp; or &#1234;, which are decoded back to the proper characters when later processed as HTML or XML. This way there would be no need to encode messages without Chinese characters.

To get more information about what is going on in the PubSub component, I would suggest enabling debugging of this component by adding the line --debug=server,pubsub instead of the line --debug=server in the etc/ file. More information about the PubSub component will then be logged to logs/tigase.x.log, where x is the number of the log file part.

Could you also check whether you can use the PubSub protocol to retrieve this published item with Chinese characters from the PubSub component?
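As a rough illustration of the entity approach, the sketch below uses only the Python standard library. The helper name is hypothetical, and it assumes the escaped string is written into the stanza as raw XML text (an XML library that escapes text itself would escape the & a second time).

```python
import html


def escape_non_ascii(text: str) -> str:
    """Escape XML special characters, then replace every non-ASCII
    character with a numeric character reference such as &#20013;."""
    escaped = html.escape(text, quote=False)  # handles &, < and >
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


sample = "a & 中"
encoded = escape_non_ascii(sample)
# html.unescape stands in here for what an XML/HTML parser does on receipt.
decoded = html.unescape(encoded)
```

ASCII-only messages pass through unchanged apart from the usual XML escaping, so nothing special is needed for English text.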
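For reference, the retrieval request could be built roughly as below. This is only a sketch of an XEP-0060 "retrieve items" iq. The service JID, node name and request id are placeholders, and the two helper functions are invented for this example. Sending the stanza is left to whatever XMPP client library is in use.

```python
import xml.etree.ElementTree as ET

PUBSUB_NS = "http://jabber.org/protocol/pubsub"


def build_items_request(service_jid: str, node: str, request_id: str) -> str:
    """Build an iq stanza of type 'get' asking for all items of a node."""
    iq = ET.Element("iq", {"type": "get", "to": service_jid, "id": request_id})
    # Set xmlns by hand so only <pubsub/> and its children are in the
    # PubSub namespace, not the enclosing <iq/>.
    pubsub = ET.SubElement(iq, "pubsub", {"xmlns": PUBSUB_NS})
    ET.SubElement(pubsub, "items", {"node": node})
    return ET.tostring(iq, encoding="unicode")


def extract_item_ids(response_xml: str) -> list[str]:
    """Return the ids of all <item/> elements in a PubSub result stanza."""
    root = ET.fromstring(response_xml)
    return [item.get("id") for item in root.iter(f"{{{PUBSUB_NS}}}item")]


# Placeholder values, not taken from the reporter's setup.
request = build_items_request("pubsub.example.com", "news", "items1")
```

If the item with Chinese characters shows up in the result, storage and retrieval work and the problem is limited to the notification path.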