[openamq-dev] round-robin queues
Terry Jones
terry at jon.es
Tue Feb 12 13:05:59 CET 2008
Hi Martin
>>>>> "Martin" == Martin Sustrik <sustrik at imatix.com> writes:
Martin> Sorry about this. I've been messing with the stuff so long that
Martin> what seems obvious to me may be unobvious to people from outside.
Yes, I understand.
Some pieces of your reply:
Martin> Firstly, if the load balancing is done on the exchange level and
Martin> there is a queue for each separate service instance, failure of a
Martin> service instance will cause messages to still be dispatched to its
Martin> queue with nobody to process them. This will cause the response
Martin> time for a particular request that was dispatched to the failed
Martin> service instance to grow to several hours or however long it takes
Martin> to fix the problem.
Martin> Secondly, if you would like to avoid the problem using auto-delete
Martin> queues (i.e. queue disappears once the service instance fails)
Martin> you'll end up with no queue to dispatch the message to when there
Martin> is no instance of the service running. That'll result in the
Martin> request being lost.
Martin> If you want to change the topology of your services, you just have
Martin> to do something. It won't come for free. It doesn't matter whether
Martin> it's done on your app's level or by instructing the OpenAMQ
Martin> broker to do so, whether the task is automated or done by hand,
Martin> etc. You still have to do *something*.
Martin> Now, the question is how to make this *something* as minimal as
Martin> possible.
Martin> My solution would be...
[snip]
OK, I think we're completely on the same page, and we're all trying to
minimize the *something*.
Apart from issues with patching OpenAMQ, do you see a problem with the
following?
1. Use auto-delete queues.
2. Have the exchange tell us when a message is unroutable (this seems
possible, according to para 1 of p22 of the 0.9 AMQP spec).
3. Implement Esteve's suggestion to have OpenAMQ do round-robin.
This would seem to remove the need for us to maintain an additional db
table that indicates which queue (or queues) each resource is on. If we
tried to access a resource that's not available, we'd get a message to that
effect (i.e., saying that the original message was unroutable). We would
then arrange for a box to handle the resource and re-send the request. If a
box serving resources went away completely, we wouldn't (in theory) have to
do anything at all.
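
To make the three steps above concrete, here is a rough in-memory sketch in Python. It does not talk to OpenAMQ or use any AMQP library. The class and function names (RoundRobinExchange, Unroutable, consumer_gone, request, start_box) are all made up for illustration, and raising an exception stands in for the broker returning an unroutable message to the publisher.

```python
from collections import deque


class Unroutable(Exception):
    """Stand-in for the broker handing back an unroutable message."""


class RoundRobinExchange:
    """Toy model: round-robin over the queues bound to a routing key."""

    def __init__(self):
        self.bindings = {}  # routing key -> deque of bound queue names
        self.queues = {}    # queue name -> list of delivered messages

    def bind(self, routing_key, queue_name):
        self.queues.setdefault(queue_name, [])
        self.bindings.setdefault(routing_key, deque()).append(queue_name)

    def consumer_gone(self, queue_name):
        # Models auto-delete: the queue and all of its bindings disappear.
        self.queues.pop(queue_name, None)
        for key in list(self.bindings):
            ring = self.bindings[key]
            if queue_name in ring:
                ring.remove(queue_name)
            if not ring:
                del self.bindings[key]

    def publish(self, routing_key, message):
        ring = self.bindings.get(routing_key)
        if not ring:
            # Nothing is bound, so the publisher is told instead of
            # the message being silently dropped.
            raise Unroutable(routing_key)
        queue_name = ring[0]
        ring.rotate(-1)  # the next publish goes to the next queue
        self.queues[queue_name].append(message)
        return queue_name


def request(exchange, resource, message, start_box):
    """Publish; if unroutable, arrange for a box and re-send once."""
    try:
        return exchange.publish(resource, message)
    except Unroutable:
        start_box(resource)  # hypothetical hook that brings up a box
        return exchange.publish(resource, message)
```

In this model the publisher keeps no table of which queue serves which resource. The only state is the bindings that the boxes create themselves, and those go away with the box.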
The alternative is to do as you suggest, and as we've considered, and
maintain additional db information ourselves. That's quite a bit more work.
For example, if a box that's managing a collection of resources goes away,
something needs to be informed so we can update the table that indicates
the queue(s) where all the resources on that box can be found. Can AMQP
help in telling us when a queue is being destroyed due to the consumer
vanishing? I guess not. If not, we'd need to be monitoring the boxes
ourselves or perhaps to add heartbeat messages. As you described, there's
also other work in having the boxes update the db that knows where
resources are. And then the box with the auxiliary db might itself go
away. That's all painful in building a distributed system as it just
multiplies the number of ways in which things can go wrong. I'm sure you
know all about that, much better than I do. Hence the questions about
minimizing the *something* by making the small suggested change to OpenAMQ.
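
For comparison, the do-it-ourselves alternative might look roughly like the sketch below: a table mapping each resource to a box and queue, kept alive by heartbeats. This is only an illustration of the bookkeeping involved. ResourceTable and its methods are hypothetical names, time is passed in explicitly rather than read from a clock, and it ignores the harder problem of the table's own host going away.

```python
class ResourceTable:
    """Toy model of the auxiliary db: where each resource can be found."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence before a box is dead
        self.last_seen = {}     # box -> time of its last heartbeat
        self.location = {}      # resource -> (box, queue name)

    def register(self, box, resource, queue_name, now):
        self.location[resource] = (box, queue_name)
        self.last_seen[box] = now

    def heartbeat(self, box, now):
        self.last_seen[box] = now

    def expire(self, now):
        """Drop boxes that have gone quiet; return the orphaned resources."""
        dead = {b for b, t in self.last_seen.items() if now - t > self.timeout}
        for box in dead:
            del self.last_seen[box]
        orphaned = sorted(
            r for r, (box, _) in self.location.items() if box in dead
        )
        for resource in orphaned:
            del self.location[resource]
        return orphaned

    def queue_for(self, resource):
        entry = self.location.get(resource)
        return entry[1] if entry else None
```

Something still has to call expire() periodically and re-home whatever it returns, which is the extra moving part the first approach avoids.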
If you agree the first solution would work, and if we could get a patch
into OpenAMQ, the *something* would seem to be significantly smaller (we'd
then just have to consider not being able to move to qpid or rabbit etc).
If not, and we had to maintain the patch ourselves, it might be better for
us just to do the extra work with our own db.
Thanks again,
Terry