On 11/24/06, Steven Shaw wrote:
> I thought you guys at iMatix had already implemented clustering for HA
> in OpenAMQ?
We have implemented the clustering model described on wiki.amqp.org but we're doing transient messaging, so no persistence, no transactions.
We're using AMQP like IP, and the traffic-control layers are implemented at a higher level (in the API framework at the application side). This lets us do very rapid (and simple) transient messaging for market data, and also end-to-end transactions for order processing.
This is a key question about AMQP: does the broker (a) need to act as a guaranteed data store, or does it (b) need only to act as a message switch. I'm well aware that probably everyone on this list thinks the answer is obviously (a). But what this obliges us to do is define a full data-safe HA model, and as we've seen this has serious complexifying effects on the protocol.
I do not think we have considered this basic question sufficiently. We have assumed that the classic MQ Series model of a "mainframe broker" is the only plausible one. Note that even MQ Series still sometimes drops messages, and that HA clustering is probably one of the most complex problems for any database or middleware. Do we really want to solve this problem? Is this even a problem we should be solving?
I'd like to suggest an alternate, simpler vision for an AMQP network, based on our real experience with large-scale deployments.
First, the protocol would be stripped down to remove all persistence and transaction-oriented functionality. These could be moved to optional classes (content classes), but for my proposal, they are irrelevant.
Second, all brokers are seen as fully transient black boxes, where their only durable data is a security profile. All broker wiring and data are transient. This is largely our current philosophy, but it can be reinforced.
Third, brokers form HA pairs (using our proven design). An HA pair appears as a single broker (perhaps with two IP addresses) to the outside world. HA failover and recovery is done by a dialogue between the HA pair.
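To make the failover dialogue concrete, here is a minimal sketch of one plausible design: the two halves of the pair exchange heartbeats, and the passive peer promotes itself once the active peer falls silent. The class name, roles, and timings are illustrative assumptions, not part of any AMQP specification.

```python
# Hypothetical HA-pair failover sketch: active/passive peers exchange
# heartbeats; the passive peer takes over after a silence timeout.
import time

HEARTBEAT_INTERVAL = 1.0   # seconds between peer heartbeats (assumed)
FAILOVER_TIMEOUT = 3.0     # silence before the passive peer takes over

class HABroker:
    def __init__(self, role):
        self.role = role                      # "active" or "passive"
        self.last_peer_heartbeat = time.time()

    def on_peer_heartbeat(self):
        # Called whenever a heartbeat arrives from the other half of the pair.
        self.last_peer_heartbeat = time.time()

    def check_failover(self, now=None):
        # The passive peer promotes itself once the active peer goes quiet.
        now = time.time() if now is None else now
        if self.role == "passive" and now - self.last_peer_heartbeat > FAILOVER_TIMEOUT:
            self.role = "active"
        return self.role
```

Because the pair presents itself as a single broker, clients need only reconnect to the surviving address; no client-side persistence is involved.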
Fourth, HA pairs (or stand-alone brokers) can be internetworked into large architectures, to allow geographical distribution and (more importantly) high-volume fanout. Clients connect to a well-specified local broker/HA pair.
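The fanout idea above can be sketched as a distribution tree: each broker delivers to its local subscribers and forwards to its downstream peers. The broker names and method names here are hypothetical; the point is only that fanout needs no shared state between brokers.

```python
# Minimal sketch of high-volume fanout across internetworked brokers:
# a message published at the root reaches every broker in the tree.
class Broker:
    def __init__(self, name):
        self.name = name
        self.subscribers = []   # local client callbacks
        self.downstream = []    # brokers further from the source

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def link(self, broker):
        # Wire a downstream broker into the distribution tree.
        self.downstream.append(broker)

    def publish(self, message):
        # Deliver locally, then fan out across the tree.
        for callback in self.subscribers:
            callback(message)
        for broker in self.downstream:
            broker.publish(message)
```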
So far this gives us a very simple and scalable model, with brokers that can be cast into hardware, and where the reliability of individual pieces increases to the point where failure is extremely rare. Something like a modern IP network.
Next, application frameworks implement reliability on top of that architecture, using proven traffic-control mechanisms, namely acknowledgement and retry. The entire TC layer is point-to-point and 100% ignorant of the network architecture and HA configuration. TC is only used for those parts of the work that need it.
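A minimal sketch of that end-to-end TC layer, under stated assumptions: the sender keeps every unacknowledged message and re-sends it after a timeout, knowing nothing about the brokers or HA pairs in between. The transport function, sequence numbering, and timings are assumptions for illustration.

```python
# Sketch of a point-to-point acknowledgement-and-retry layer living
# entirely in the application endpoints, not in the brokers.
import time

class ReliableSender:
    def __init__(self, send_fn, retry_after=2.0):
        self.send_fn = send_fn      # transmits (seq, payload) end-to-end
        self.retry_after = retry_after
        self.pending = {}           # seq -> (payload, last_sent)
        self.next_seq = 0

    def send(self, payload, now=None):
        now = time.time() if now is None else now
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = (payload, now)
        self.send_fn(seq, payload)
        return seq

    def on_ack(self, seq):
        # The receiver acknowledged end-to-end; stop retrying this message.
        self.pending.pop(seq, None)

    def retry_due(self, now=None):
        # Re-send anything unacknowledged for longer than retry_after.
        now = time.time() if now is None else now
        for seq, (payload, last_sent) in list(self.pending.items()):
            if now - last_sent > self.retry_after:
                self.send_fn(seq, payload)
                self.pending[seq] = (payload, now)
```

Note the design choice: duplicates are possible (a retry can cross an ack in flight), so the receiver deduplicates by sequence number, exactly as TCP-style transports do.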
This "thin-AMQP" approach gives us some significant advantages:
1. We can simplify the protocol, and finish the basic WLP work rapidly.
2. We can simplify the HA question, which we've already solved & proven.
3. We can experiment with different TC mechanisms *without* affecting AMQP.
4. Brokers will be simpler, so more reliable.
5. High-performance scenarios can ignore the TC layer.
Given the significant advantages of such an approach, and given that it is very close to what we are doing in production at JPMC, and given that it mirrors the TCP-over-IP model (a simple, unreliable network with reliability at the endpoints), I'd like to ask why we're not going in that direction, instead of the direction of more complexity?
Note that vendors can still differentiate themselves, by making brokers that are faster, easier to administrate, and run on smaller boxes. They can define better TC layers, even proprietary ones. And because the protocol would be simpler, it would get adoption much faster.
I could go on with this… but perhaps someone can falsify my basic assumption, which is that TC can be done between the end-layers.
-Pieter