Outbox Pattern For Reliable Messaging

Before jumping into the Outbox pattern, let's understand the scenario and the problem it solves.

In a microservices architecture, services need to communicate with each other. There are two main types of communication: synchronous and asynchronous. The Outbox pattern is mainly used with asynchronous communication. It also helps when an application needs to persist both its state and its events (application or domain events). In this way it provides data consistency across services along with reliable message production.

Normal Asynchronous Processing Within Microservices

image.png

Scenario-1 (Happy Flow)

  1. The client sends a profile update request, and Microservice1 receives it.
  2. It begins a database transaction.
  3. It saves the profile data.
  4. It commits (closes) the transaction.
  5. It raises a "profile updated" event so that other services in the application can act on it. In this step it sends a message to the message broker.
  6. The message broker replies with success. (More on this later.)
  7. Microservice1 responds to the client with success.

Scenario-2 (Database Operation Failed)

In this scenario, everything succeeds up to starting the database transaction. The "Save Profile" step fails, and the overall operation returns a failure.

The operation failed, but there is nothing wrong with the overall consistency of the data.

Scenario-3 (Message Broker Operation Failed)

In this scenario, everything succeeds up to the transaction commit, but sending the message (event) to the broker fails. Depending on the use case, the overall operation is treated as a failure or as a partial success.

Several things go wrong here.

  • The database operation succeeded, so the database is consistent and holds the updated state once the transaction finishes.
  • Communication with the broker failed, so other microservices that depend on this event never receive the notification.
  • As a result, other services keep stale profile data. One could argue that this is expected, since services in a microservices architecture rely on eventual consistency. That is true, but in this scenario the event is lost unless the entire request is initiated again.
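
To make Scenario-3 concrete, here is a minimal sketch of the "commit first, publish afterwards" flow. It assumes an in-memory SQLite database and a hypothetical `FakeBroker` class standing in for a real message broker client. The `profile` table, the `profile-updated` topic, and every name below are illustrative only.

```python
import sqlite3


class BrokerDown(Exception):
    """Raised by the stand-in broker to simulate an outage."""


class FakeBroker:
    """Hypothetical in-memory broker, not a real client library."""

    def __init__(self, fail=False):
        self.fail = fail
        self.published = []

    def publish(self, topic, payload):
        if self.fail:
            raise BrokerDown("broker unreachable")
        self.published.append((topic, payload))


def update_profile(conn, broker, user_id, name):
    # Begin, save, and commit the profile change (steps 2-4 of Scenario-1).
    with conn:  # commits on success, rolls back on an exception
        conn.execute("UPDATE profile SET name = ? WHERE id = ?", (name, user_id))
    # Publish only after the commit (step 5). A failure here cannot undo the commit.
    broker.publish("profile-updated", {"id": user_id, "name": name})


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profile (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO profile VALUES (1, 'old name')")
conn.commit()

broker = FakeBroker(fail=True)  # simulate the broker outage
try:
    update_profile(conn, broker, 1, "new name")
    outcome = "success"
except BrokerDown:
    outcome = "failed"

stored_name = conn.execute("SELECT name FROM profile WHERE id = 1").fetchone()[0]
print(outcome, stored_name, broker.published)
```

In this sketch the request reports a failure, yet the database already holds the new name and the broker never received the event.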

Note: One solution is to retry sending the event to the broker. The retry may succeed if the issue is transient, but there is still a chance of failure.

Let's explore a few more options.

Looking at the scenario above, one simple solution comes to mind: send the message to the broker before committing the database transaction.

image.png

Scenario-4 (Save to Broker Before Commit Transaction)

In this case, the service completes the operation with the message broker before committing the database transaction. In the happy flow there is no issue, as everything goes well.

The only issue with this approach is that the database transaction stays open longer than in the approach discussed in "Scenario-1". This reduces the number of transactions available to process other requests. It is not a big issue for a small application or one with low load, but it is a problem for a high-performance application.

Scenario-5 (Save to Broker Before Commit Transaction with Message Broker Failure)

In this case, the operation with the message broker fails, and so does the entire request. Because the transaction is never committed, the database stays in a consistent state. This is one solution to the issue faced in "Scenario-3".

Scenario-6 (Save to Broker Before Commit Transaction with Failure to Commit Transaction)

We have not considered this scenario so far, but in a distributed system there is always a chance of failure when an application communicates with another application over the network. We discussed that "Save to Message Broker" can fail. In the same way, "Commit Database Transaction" can also fail.

In this scenario, the message is published to the message broker successfully, but the operation fails when the application commits the transaction.

This is roughly the opposite of "Scenario-3", in which the database had the updated data but the message broker operation failed. Here the message broker has published the updated data, but the database transaction failed, so the database still has the old data. This has a huge impact, as other services may act on the published message.
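
Scenario-6 can be sketched in the same way. The example below assumes an in-memory SQLite database, a hypothetical `FakeBroker`, and a `commit_fails` flag that simulates the commit error. A real failure would come from the network or the database rather than from a flag.

```python
import sqlite3


class CommitFailed(Exception):
    """Stand-in for a network or database error raised during commit."""


class FakeBroker:
    """Hypothetical in-memory broker, not a real client library."""

    def __init__(self):
        self.published = []

    def publish(self, topic, payload):
        self.published.append((topic, payload))


def update_profile(conn, broker, user_id, name, commit_fails=False):
    try:
        conn.execute("UPDATE profile SET name = ? WHERE id = ?", (name, user_id))
        # Publish while the transaction is still open (the Scenario-4 ordering).
        broker.publish("profile-updated", {"id": user_id, "name": name})
        if commit_fails:
            raise CommitFailed("simulated commit failure")
        conn.commit()
    except Exception:
        conn.rollback()  # the database change is undone, the published message is not
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profile (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO profile VALUES (1, 'old name')")
conn.commit()

broker = FakeBroker()
try:
    update_profile(conn, broker, 1, "new name", commit_fails=True)
except CommitFailed:
    pass

stored_name = conn.execute("SELECT name FROM profile WHERE id = 1").fetchone()[0]
published_names = [payload["name"] for _, payload in broker.published]
print(stored_name, published_names)
```

Here the database keeps the old name while the broker already carries an event announcing the new one.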

What is a reliable solution?

In all the scenarios discussed above, the main problem is that the application tries to maintain data consistency across two different types of system: a database and a message broker. If both could participate in one transaction, the solution would be more reliable, as data consistency would be easy to maintain. Note that even with two systems of the same type, such as two different databases, a transaction that spans both is still a challenge. Distributed transactions are always a challenge.

The solution is to store the entire operation, including the message to send, in a single transaction, and then let another component take responsibility for making the other system consistent based on that data. This makes the solution more reliable, and it is the essence of the Outbox pattern. In a typical email client, the outbox holds messages that need to be sent. If the network is not available, a message stays in the outbox, and once the network is available again it is sent to its recipient. The Outbox pattern does the same in the microservices world: it holds the messages that need to be sent in a database table.

image.png

  • This pattern uses an outbox table that holds every message that needs to be sent to the message broker, together with its metadata.
  • Normally the outbox table lives in the same database as the other operational data, so the application can use a database transaction to get the best data consistency. A single local transaction like this is the norm in a monolithic application, which is why the Outbox pattern was not popular in that era.
  • The diagram above shows what happens when the database transaction succeeds and what happens when it fails.
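
The write side of the pattern could be sketched as follows. This assumes SQLite and a hypothetical outbox table layout (`id`, `topic`, `payload`, `created_at`, `processed_at`). The source does not prescribe specific columns, so treat them as one possible choice. The second call deliberately violates a NOT NULL constraint on the outbox insert to show that the profile update in the same transaction is rolled back with it.

```python
import json
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE profile (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE outbox (
    id           TEXT PRIMARY KEY,   -- unique message id
    topic        TEXT NOT NULL,      -- destination on the broker
    payload      TEXT NOT NULL,      -- event body as JSON
    created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    processed_at TEXT                -- NULL until the relay has sent it
);
"""


def update_profile(conn, user_id, name, topic="profile-updated"):
    payload = json.dumps({"id": user_id, "name": name})
    with conn:  # one transaction: both writes are committed, or neither is
        conn.execute("UPDATE profile SET name = ? WHERE id = ?", (name, user_id))
        conn.execute(
            "INSERT INTO outbox (id, topic, payload) VALUES (?, ?, ?)",
            (str(uuid.uuid4()), topic, payload),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO profile (id, name) VALUES (1, 'old name')")
conn.commit()

update_profile(conn, 1, "new name")  # transaction succeeds

try:
    # Force the outbox insert to fail (topic is NOT NULL) after the UPDATE ran.
    update_profile(conn, 1, "lost name", topic=None)
except sqlite3.IntegrityError:
    pass  # the whole transaction, including the UPDATE, was rolled back

stored_name = conn.execute("SELECT name FROM profile WHERE id = 1").fetchone()[0]
pending = conn.execute(
    "SELECT topic, payload FROM outbox WHERE processed_at IS NULL"
).fetchall()
print(stored_name, pending)
```

Because the state change and the message live in the same database, there is no moment at which one exists without the other.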

Now the message has reached the outbox table. To send it on to the broker, we need one more component, known as a "Message Relay" or "Outbox Relay".

image.png

The relay's job is simple:

  • Read pending outbox messages and convert them to the message broker's format.
  • Send each message to the message broker.
  • Mark the message as processed.
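
The three relay steps above might look like this as a polling function. It is a minimal sketch, assuming SQLite, a simplified outbox table, and a hypothetical `FakeBroker`. A production relay would also need scheduling, batching limits, and error reporting, none of which are shown.

```python
import json
import sqlite3


class FakeBroker:
    """Hypothetical in-memory broker; publishing fails while `down` is True."""

    def __init__(self):
        self.down = False
        self.published = []

    def publish(self, topic, message):
        if self.down:
            raise ConnectionError("broker unreachable")
        self.published.append((topic, message))


def relay_once(conn, broker, batch_size=100):
    """Send pending outbox rows to the broker and return how many were sent."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox "
        "WHERE processed_at IS NULL ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    sent = 0
    for msg_id, topic, payload in rows:
        # Convert the stored row into the shape this broker expects.
        message = {"message_id": msg_id, "body": json.loads(payload)}
        try:
            broker.publish(topic, message)
        except ConnectionError:
            break  # leave this row pending; the next poll retries it
        with conn:  # mark the row as processed only after a successful send
            conn.execute(
                "UPDATE outbox SET processed_at = CURRENT_TIMESTAMP WHERE id = ?",
                (msg_id,),
            )
        sent += 1
    return sent


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "topic TEXT NOT NULL, payload TEXT NOT NULL, processed_at TEXT)"
)
with conn:
    for name in ("Ann", "Bob"):
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("profile-updated", json.dumps({"name": name})),
        )

broker = FakeBroker()
broker.down = True
first_poll = relay_once(conn, broker)   # broker is down: rows stay pending
broker.down = False
second_poll = relay_once(conn, broker)  # both rows are sent and marked
third_poll = relay_once(conn, broker)   # nothing is left to send
print(first_poll, second_poll, third_poll)
```

Stopping at the first failure keeps messages in their original order, which is a design choice of this sketch rather than a requirement of the pattern.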

Every approach has a downside, and what to trade off is always a decision that depends on the situation and the use case.

In the outbox flow described above:

  • If sending to the message broker fails, the relay tries again because the message is not yet marked as processed, so we get reliability.
  • The same message may be sent multiple times. For example, the send to the message broker succeeds, but the ACK from the broker is lost due to a network issue, so the outbox relay sends that message again.
  • If the message is sent to the message broker successfully but marking it as processed in the outbox table fails, the relay processes that message again.
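
The duplicate-send case can be reproduced with a small simulation. The sketch below assumes a hypothetical `AckLostBroker` that stores the message but raises a timeout for the first acknowledgement, together with a simplified outbox table that uses a `processed` flag. Both are illustrative stand-ins only.

```python
import sqlite3


class AckLostBroker:
    """Hypothetical broker that stores the message but loses the first ACK."""

    def __init__(self):
        self.received = []
        self.lose_next_ack = True

    def publish(self, message_id, payload):
        self.received.append((message_id, payload))  # the broker did get it
        if self.lose_next_ack:
            self.lose_next_ack = False
            raise TimeoutError("ACK not received")


def relay_once(conn, broker):
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE processed = 0 ORDER BY id"
    ).fetchall()
    for msg_id, payload in rows:
        try:
            broker.publish(msg_id, payload)
        except TimeoutError:
            continue  # outcome unknown, so the row stays pending
        with conn:
            conn.execute("UPDATE outbox SET processed = 1 WHERE id = ?", (msg_id,))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, "
    "processed INTEGER DEFAULT 0)"
)
with conn:
    conn.execute("INSERT INTO outbox (id, payload) VALUES (1, 'profile updated')")

broker = AckLostBroker()
relay_once(conn, broker)  # message reaches the broker, but the ACK is lost
relay_once(conn, broker)  # relay sends the same message again

delivered_ids = [message_id for message_id, _ in broker.received]
print(delivered_ids)
```

The broker ends up holding the same message id twice. Carrying a stable message id with each message, as this sketch does, is one way to let a receiver notice the repeat.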

In all cases, however, the message is processed at least once. This makes it a good solution for applications where losing a message in communication is not acceptable given the business scenario.