Clustering and HA recommendations
Hi, I'm setting up a Tigase cluster with three servers and would like to ask what the best way to do it is.
Right now clients connect through the server's public IP address, and we run a single Tigase node.
If we set up the three-server cluster but keep connecting the same way, then whenever we restart the main server to deploy new functionality the service will be down, because clients point to its public IP.
What do you recommend to avoid this and achieve HA? A proxy, DNS round-robin, or some better solution?
Thanks in advance.
Added by Juan Ignacio Paz about 4 years ago
Thanks for the answer.
I saw other related posts, but I haven't read one that achieves 100% availability.
If I understand correctly, with the Tigase LB the main node, the one that receives the initial connection (via public IP or DNS), is a full-functionality node. So if we need to deploy new functionality, let's say for AMP, we have to replace the jar on every server and restart them all, and while the main server is restarting the service is down.
On the other hand, if we use DNS round-robin, as a lot of people suggest, it will distribute connections across all the listed IPs without checking whether a server is down.
I apologize if this was already answered.
Thanks a lot.
Added by Artur Hefczyc about 4 years ago
There is no "main node" in a Tigase cluster. All nodes have full functionality and all nodes have the LB logic. So what you can do is set up DNS round-robin across all Tigase nodes; the initial connection is then made to a random node chosen by DNS round-robin. The LB on Tigase then decides whether the client connected to the correct node or whether a redirect is required. So there is no single point of failure and no problem if one of the nodes (any of them) is down.
Even if DNS round-robin points to a node which is down, the client simply tries to connect to that node and the attempt fails. The client can then retry, querying DNS first. DNS should give it another cluster node, and the connection should then succeed.
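To illustrate the retry behaviour described above, here is a minimal client-side sketch in Python. It is not Tigase code and not taken from any particular XMPP client library. The function name connect_any, the hostname xmpp.example.com, and the default port 5222 (the standard XMPP client port) are assumptions made for the example. It asks DNS for every address behind the name and tries them in order until one accepts a TCP connection.

```python
import socket


def connect_any(host, port=5222, timeout=5.0,
                resolve=socket.getaddrinfo,
                connect=socket.create_connection):
    """Try every address DNS returns for `host` until one accepts a connection.

    `resolve` and `connect` are injectable only so the logic can be tested
    without a network; by default they are the standard library functions.
    """
    last_error = None
    # With DNS round-robin, one lookup returns an entry per cluster node.
    for _family, _type, _proto, _name, sockaddr in resolve(
            host, port, type=socket.SOCK_STREAM):
        try:
            # sockaddr[:2] is (ip, port) for both IPv4 and IPv6 entries.
            return connect(sockaddr[:2], timeout=timeout)
        except OSError as exc:
            # This node is down or unreachable; fall through to the next one.
            last_error = exc
    raise ConnectionError(f"no reachable node for {host}") from last_error
```

A call such as connect_any("xmpp.example.com") returns a connected socket to the first node that answers. Whether a real XMPP client behaves this way depends on the client; some only try the first address and rely on a later reconnect to get a different one.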
Now, if you want to deploy new functionality and install a new jar, then indeed you have to restart all cluster nodes, but you do not need to restart them all at the same time. You can restart one node a day, so the whole service keeps working and there is no interruption.
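The one-node-at-a-time restart can be scripted. The sketch below is only an outline in Python under stated assumptions: rolling_restart, restart, and is_healthy are hypothetical names, and how a node is actually restarted (replacing the jar, running the service script) and how its health is checked are left to callables you supply. It restarts a node, waits until it reports healthy again, and only then moves on, so at most one node is down at any time.

```python
import time


def rolling_restart(nodes, restart, is_healthy,
                    wait_seconds=5.0, max_checks=60, sleep=time.sleep):
    """Restart `nodes` one at a time, waiting for each to come back.

    `restart(node)` and `is_healthy(node)` are supplied by the caller;
    they are placeholders for whatever your deployment actually does.
    """
    for node in nodes:
        restart(node)
        for _ in range(max_checks):
            if is_healthy(node):
                break  # node is back, continue with the next one
            sleep(wait_seconds)
        else:
            # Never take down a second node while the first is still out.
            raise RuntimeError(f"{node} did not come back; stopping rollout")
    return True
```

A simple is_healthy could just check that the node accepts TCP connections on its client port, though a check that also confirms the node has rejoined the cluster would be safer.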