High Availability
FioranoMQ provides High Availability (HA), which lets applications take advantage of the server's built-in fault tolerance. High availability is a system design approach, with an associated implementation, that ensures a certain degree of operational continuity during a given measurement period.
Today's real-time enterprise solutions often deploy a messaging middleware that enables communication between various sub-components. This middleware is entrusted with important data that should be delivered reliably and as fast as possible to the recipient application. The middleware server might also be required to store this data in its data store until it is picked up.
A failure of this middleware message bus might bring the entire system down within seconds. The messaging backbone must therefore provide a backup server that allows messaging operations to resume quickly if the running server fails. This backup server should restore the state that the original message server had before the failure. Any data previously stored in the server's data store should be accessible through the backup server. Most importantly, the switch from a server to its backup should be automatic and transparent to the client application.
This chapter discusses the salient features of FioranoMQ's HA solution. It explains the functions and underlying architecture of the entire solution.
Features
Shared and Replicated Databases
FioranoMQ gives administrators the option of either using a shared database (between the active and passive servers) or using database replication (from the active to the passive server). "Shared" HA typically provides much better performance than "Replicated" HA. If sharing a database is not possible, administrators can still use FioranoMQ's HA through its built-in replication support.
Application Failover
If the primary server becomes unavailable, all the client applications connected to it are automatically reconnected to the secondary server. The process of shifting from the primary server to the backup server, or vice versa, is transparent to the application. The application does not need to implement any reconnect logic in its code. Reconnection is achieved by connecting to the server through a Durable Connection. If a backup server is available, the Durable Connection connects to the backup server. Otherwise, it waits for the server to restart and, during the disconnected period, stores all data in a local repository. This data is transferred to the server as soon as a connection is re-established, making the system highly reliable and robust even in the event of network failures.
Note: Durable Connections implement 'client side persistence' and are a proprietary feature of FioranoMQ (although using them does not require any proprietary APIs). They should not be confused with Durable Subscribers.
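The failover behavior described above can be illustrated with a minimal sketch. It is not FioranoMQ code and does not use any FioranoMQ or JMS API: the names DurableConnectionSketch, Transport, FakeServer and BufferingSender are hypothetical, and an in-memory queue stands in for the local repository, which a real client would presumably keep on disk. The sketch only shows the idea of sending to the first reachable server and holding messages locally, in order, while no server is reachable.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; none of these types belong to FioranoMQ.
public class DurableConnectionSketch {

    /** Minimal stand-in for a connection to a message server. */
    interface Transport {
        boolean isUp();
        void deliver(String message);
    }

    /** In-memory fake server whose availability can be toggled. */
    static class FakeServer implements Transport {
        boolean up = true;
        final List<String> received = new ArrayList<>();
        public boolean isUp() { return up; }
        public void deliver(String message) { received.add(message); }
    }

    /** Sends to the first reachable server; otherwise keeps messages locally. */
    static class BufferingSender {
        private final List<Transport> servers;                      // primary first, then backup
        private final Deque<String> localStore = new ArrayDeque<>(); // assumed in-memory repository

        BufferingSender(List<Transport> servers) { this.servers = servers; }

        /** Queue the message locally, then try to push everything queued. */
        void send(String message) {
            localStore.addLast(message);
            flush();
        }

        /** Deliver all locally held messages, oldest first, if any server is up. */
        void flush() {
            Transport target = firstAvailable();
            if (target == null) {
                return; // no server reachable: messages stay in the local store
            }
            while (!localStore.isEmpty()) {
                target.deliver(localStore.pollFirst());
            }
        }

        int pending() { return localStore.size(); }

        private Transport firstAvailable() {
            for (Transport t : servers) {
                if (t.isUp()) {
                    return t;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        FakeServer primary = new FakeServer();
        FakeServer backup = new FakeServer();
        List<Transport> servers = new ArrayList<>();
        servers.add(primary);
        servers.add(backup);
        BufferingSender sender = new BufferingSender(servers);

        sender.send("order-1");   // primary is up
        primary.up = false;
        sender.send("order-2");   // fails over to the backup
        backup.up = false;
        sender.send("order-3");   // nothing reachable: held locally
        primary.up = true;
        sender.flush();           // resent once a server is back

        System.out.println("primary: " + primary.received);
        System.out.println("backup:  " + backup.received);
        System.out.println("pending: " + sender.pending());
    }
}
```

The application code only calls send(); the choice of server and the resend after an outage stay inside the sender, which is the sense in which failover is transparent to the application.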
Data Store Consistency (maintained between server switches)
When the primary server becomes unavailable, its backend database state is preserved. This state is picked up by the secondary server when it becomes available. This avoids loss of persistent information between server switches while, at the same time, providing access to information stored on the backup server. For example, all the messages published on various destinations residing on the primary server are available to valid consumers through the secondary backup server without loss.
Expensive HA Hardware Not Required
Fiorano's HA solution is implemented in software and does not depend on expensive hardware solutions. It can run on any Java-supported platform. With the shared database option, one might want to use RAID or SAN disks (if using HA over Fiorano's proprietary file-based data store) for enhanced speed and stability, but this hardware is not necessary for Fiorano's HA solution. Using either replication support or a central RDBMS server as the message store in the Enterprise Server avoids the need for additional hardware.
Implementing a Cluster
The Enterprise Server can be clustered with other Enterprise Servers or even stand-alone FioranoMQ servers. The cluster can share destinations (using a common naming store) and provide load-balancing facilities.
Known HA implementations / solutions
Any HA implementation needs to provide certain guarantees. If the active server goes down, a secondary server needs to take over all the existing connections and then deliver the undelivered messages. To achieve this, state information must be exchanged between the two servers, and the database used by the servers must be consistent.
There are two ways to make the database consistent:
- Replicated database: Both servers have their own copy of the database, but changes made to the active server's database are propagated to the passive server's database. In this way, the two databases are kept consistent.
- Shared database: Both servers use a common shared database. In this case, only the active server makes changes to the common database. Because this database is also available to the passive server, when the active server (say, the primary) fails and the passive server (say, the secondary) takes over, it has all the changes made by the primary server.
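The difference between the two approaches can be sketched in a few lines of Java. This is a simplified model under stated assumptions, not the way FioranoMQ implements either mode: HaStoreSketch, Store and Server are hypothetical names, a store is reduced to an in-memory list, and replication is modeled as a synchronous copy of each write.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these types belong to FioranoMQ.
public class HaStoreSketch {

    /** A message store reduced to an append-only list. */
    static class Store {
        final List<String> messages = new ArrayList<>();
    }

    /** A server that writes to its own store and optionally copies each write to a peer store. */
    static class Server {
        final Store store;
        final Store replicaTarget; // null when no replication is configured

        Server(Store store, Store replicaTarget) {
            this.store = store;
            this.replicaTarget = replicaTarget;
        }

        void persist(String message) {
            store.messages.add(message);
            if (replicaTarget != null) {
                // Replicated mode: propagate the change to the passive server's copy.
                replicaTarget.messages.add(message);
            }
        }

        /** Messages this server could deliver if it became active now. */
        List<String> undelivered() {
            return new ArrayList<>(store.messages);
        }
    }

    /** Replicated mode: two separate stores, active (index 0) copies to passive (index 1). */
    static Server[] replicatedPair() {
        Store activeStore = new Store();
        Store passiveStore = new Store();
        return new Server[] { new Server(activeStore, passiveStore), new Server(passiveStore, null) };
    }

    /** Shared mode: both servers point at the same store; only the active one writes. */
    static Server[] sharedPair() {
        Store common = new Store();
        return new Server[] { new Server(common, null), new Server(common, null) };
    }

    public static void main(String[] args) {
        Server[] replicated = replicatedPair();
        replicated[0].persist("msg-1");
        System.out.println("replicated, passive sees: " + replicated[1].undelivered());

        Server[] shared = sharedPair();
        shared[0].persist("msg-1");
        System.out.println("shared, passive sees:     " + shared[1].undelivered());
    }
}
```

In both cases the passive server can see the active server's writes when it takes over. The replicated pair performs every write twice, while the shared pair relies on a single store that both servers can reach.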
The following sections discuss the various aspects of FioranoMQ's HA solution, its underlying architecture, and how it works, and provide step-by-step instructions for enabling HA in FioranoMQ.