Tradeoff Analysis - Outbox Pattern

In my previous post on the outbox pattern, I described the different flows and the benefits the pattern provides, especially for reliable messaging: because the message is stored within the transaction boundary of the database, it is never lost.

If the outbox pattern is the ultimate solution for reliable messaging, why is it not available as an out-of-the-box solution, for example as part of a system framework?

Yes, it brings benefits, but there is another side to it. In software architecture, whenever a pattern or solution is selected, there is always a trade-off associated with it. In this post I will discuss some of the trade-offs of the outbox pattern.

I also advise reading my previous article before going through this one.

Delay

Without the outbox pattern, a message is published directly to the message broker, so every service listening to the queue or topic receives the message and processes it right away. The flow is reactive.

With the outbox pattern, a published message is first stored in a data store. A worker process then reads messages from this data store at a timer interval and processes them. Because a worker is involved, the interval between read cycles adds delay to message processing.
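To make the source of the delay concrete, here is a minimal sketch of the flow using Python's built-in sqlite3 module. The table names, the `place_order` and `run_read_cycle` functions, and the `publish` callback are all hypothetical names chosen for illustration, not part of any framework.

```python
import sqlite3

def create_schema(conn):
    # Hypothetical tables: one business table and one outbox table.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " processed INTEGER NOT NULL DEFAULT 0)"
    )

def place_order(conn, item):
    # The business row and the outbox row are written in ONE transaction,
    # so either both are stored or neither is.
    with conn:
        conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", ("order-placed:" + item,))

def run_read_cycle(conn, publish):
    # One read cycle of the worker: hand every unprocessed message to the
    # broker (the `publish` callback stands in for it), then mark it processed.
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE processed = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(payload)
        with conn:
            conn.execute("UPDATE outbox SET processed = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

If the worker calls `run_read_cycle` once every N seconds, a message stored just after a cycle waits roughly N seconds before it reaches the broker. That wait is the delay discussed here.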

Solution

  • Some messages need to be processed quickly. For those, configure the worker with a short delay between read cycles, and use a longer delay for the other messages. This also distributes the load across the system.
  • Messages that must be processed in order, or that depend on each other, have to be placed under the same read cycle; otherwise the result may not be what you expect.
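The two points above can be sketched as a small scheduling helper. The group names and intervals below are made-up values for illustration; the only idea taken from the text is that each group of messages has its own read interval and that dependent messages share a group.

```python
# Hypothetical read intervals, in seconds, per message group.
# Messages that depend on each other must live in the same group.
READ_INTERVALS = {"payments": 5, "notifications": 60}

def due_groups(now, last_run, intervals=READ_INTERVALS):
    """Return the groups whose interval has elapsed since their last read cycle.

    `now` and the values in `last_run` are timestamps in seconds.
    A group that has never run is always due.
    """
    return sorted(
        group
        for group, interval in intervals.items()
        if now - last_run.get(group, float("-inf")) >= interval
    )
```

The worker would call `due_groups` on every tick and run a read cycle only for the groups it returns.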

Responsiveness

This is related to the first point, Delay. If a UI is involved in the process and the delay between two read cycles is high, the experience suffers because the process appears to take a long time. This is one of the factors to consider when deciding the interval between read cycles.

That said, do not assume that responsiveness is always good without an outbox. A system without an outbox also slows down, especially when it is under load from too many requests. With an outbox, the delay introduced by the read cycle is always there, whether the system is under load or not.

Solution

There is no complete solution here, but making the user wait on a spinner, or simply wait for the server's response, is not a good option in this scenario.

  • If a short delay is possible and the process really requires it (for example, a financial transaction or another process that needs immediate processing), then use a short delay.
  • Another option is to tell the user that the request will take some time to process and ask them to check back after a certain period.
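The second option can be sketched as a "submit now, check later" interaction. Everything here is hypothetical: the function names, the in-memory `_requests` dictionary (a real system would keep this status in the database), and the wording of the message.

```python
import uuid

# Hypothetical in-memory status store, used only to keep the sketch short.
_requests = {}

def submit_request(payload):
    # Accept the request immediately and return a tracking id
    # instead of blocking the UI until the worker has processed it.
    request_id = str(uuid.uuid4())
    _requests[request_id] = {"payload": payload, "status": "pending"}
    return {
        "request_id": request_id,
        "status": "pending",
        "message": "Your request will take some time. Please check back later.",
    }

def mark_processed(request_id):
    # Called by the worker once the outbox message has been handled.
    _requests[request_id]["status"] = "processed"

def get_status(request_id):
    # Called by the UI when the user checks back.
    return _requests[request_id]["status"]
```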

Cost

There are two types of cost involved in this scenario.

  • Data Storage Cost
  • Reading Data Cost

Storage Cost

In a distributed system, and especially in the cloud, there is always a cost associated with stored data, and it needs to be considered. It is good to have a policy that clears messages after a certain period once the worker has processed them successfully. One could also delete a message immediately after the worker processes it; which is right depends on your requirements and use case. For example, if the worker deletes a message immediately after handing it to the broker, that works in the happy flow. But if the broker crashes before the message has been processed, the data is lost at both ends and cannot be recovered.
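A retention policy of this kind could look like the sketch below, again using sqlite3. The `processed_at` column (a timestamp in seconds, NULL while the message is unprocessed) and the function names are assumptions for illustration.

```python
import sqlite3

def create_outbox(conn):
    # Hypothetical outbox table; processed_at stays NULL until the worker is done.
    conn.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " processed_at REAL)"
    )

def purge_processed(conn, now, retention_seconds):
    """Delete messages that were processed more than `retention_seconds` ago.

    Unprocessed messages (processed_at IS NULL) are never deleted.
    Returns the number of rows removed.
    """
    cutoff = now - retention_seconds
    with conn:
        cur = conn.execute(
            "DELETE FROM outbox WHERE processed_at IS NOT NULL AND processed_at < ?",
            (cutoff,),
        )
    return cur.rowcount
```

Keeping processed rows for a retention window, instead of deleting them immediately, leaves a copy to recover from in the broker crash case described above.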

Reading Cost

The worker needs to read from the data store, which increases the overall number of read operations against it. Some cloud offerings also charge per read operation, so this is extra overhead.

Take Azure SQL Database as an example: it has a serverless offering. In this configuration the database goes idle after a certain period with no operations against it. This is a cost-effective option for applications that do not need the database continuously, or at least for some services in a microservice scenario. With the outbox pattern, however, the worker reads every 1, 2, or 10 minutes, and every read hits the database, so the database never goes idle and the serverless configuration is no longer cost effective.
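The arithmetic behind this is simple enough to sketch. The function names are hypothetical, and `idle_after_seconds` is a placeholder for whatever idle period your database needs to see before it pauses; check your own configuration for the real value.

```python
SECONDS_PER_DAY = 86_400

def polls_per_day(poll_interval_seconds):
    """Number of read cycles (database hits) the worker makes in one day."""
    return SECONDS_PER_DAY // poll_interval_seconds

def can_go_idle(poll_interval_seconds, idle_after_seconds):
    """The database can only go idle if the gap between two polls is longer
    than the period it must stay untouched before pausing."""
    return poll_interval_seconds > idle_after_seconds

print(polls_per_day(60))   # 1440 reads per day at a 1 minute interval
print(polls_per_day(120))  # 720 reads per day at a 2 minute interval
print(polls_per_day(600))  # 144 reads per day at a 10 minute interval
```

As long as `can_go_idle` is false for the chosen interval, the worker keeps the database awake around the clock.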

One possible solution is a hybrid approach. Start by asking the following question.

How frequently does your message broker go through disaster recovery or become unavailable?

  • If the answer is "very rarely", then depending on the business use case you can avoid the outbox pattern completely.
  • If you still have to use the outbox pattern, then as soon as a message is stored in the outbox store, publish a message to the message queue saying that there is something new in the outbox table to process. You then write a normal message queue handler that processes these messages.
  • Set the timer delay a bit higher so that you keep the advantage of the serverless configuration.
  • Add an extra check in the worker that records when messages were last read from storage for processing. When the timer fires, compare against that time using a threshold: if messages were read within, say, the last 10 minutes, there is no need to read again because the message queue handler is working fine. (This can also be combined with health metrics from the broker service.)
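The extra check in the last point can be sketched as a small decision function. The name `should_timer_read`, its parameters, and the 600 second default (the 10 minutes used as an example above) are assumptions for illustration; how you record the last read time and obtain the broker health signal depends on your system.

```python
def should_timer_read(now, last_read_at, threshold_seconds=600, broker_healthy=True):
    """Decide whether the timer-driven worker should hit the outbox store.

    now, last_read_at: timestamps in seconds; last_read_at is None if the
    outbox has never been read.
    """
    if not broker_healthy:
        # The queue handler cannot be trusted right now, so fall back to polling.
        return True
    if last_read_at is None:
        return True
    # Skip the read if the queue handler read the outbox recently enough.
    return now - last_read_at >= threshold_seconds
```

When the function returns False, the timer tick does nothing and the database is left untouched.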

That's it. I hope this helps.

One more thing: as I said earlier, there is never a perfect solution, and each one has drawbacks. In the hybrid approach above, the timer and the queue handler may fire at the same time, and without an extra check the same message may be put on the broker multiple times. I will discuss this in the next article.