The Outbox Pattern

Introduction

Sometimes, when processing a business operation, you need to communicate with an external component in fire-and-forget mode. That component can be, for example:
– an external service
– a message bus
– a mail server
– the same database, but in a different database transaction
– another database

Examples of this type of integration with external components:
– sending an e-mail message after an order is placed
– sending an event about a new client registration to the messaging system
– processing another DDD Aggregate in a different database transaction, for example decreasing the number of products in stock after an order is placed

The question that arises is whether we can guarantee the atomicity of our business operation from a technical point of view. Unfortunately, not always, and even when we can (using the 2PC protocol), it limits our system in terms of latency, throughput, scalability and availability. For details about these limitations, I invite you to read the article titled It’s Time to Move on from Two Phase Commit.

The problem I am writing about is presented below:

In the handler, the transaction is committed first, and only afterwards do we try to send an event to the event bus. Unfortunately, two bad things can happen:
– our system can crash just after the transaction commit and before sending the event
– the event bus can be unavailable at this moment, so the event cannot be sent
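The failure window described above can be reproduced with a small sketch. All names here (`IEventBus`, `FakeDatabase`, `RegisterCustomerHandler`, `UnavailableBus`) are hypothetical stand-ins for real infrastructure, not code from the original listing:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event bus abstraction; any messaging client could sit behind it.
public interface IEventBus
{
    void Publish(string eventName);
}

// Stand-in for a transactional store: rows become visible only on Commit().
public sealed class FakeDatabase
{
    private readonly List<string> _pending = new List<string>();
    public List<string> Committed { get; } = new List<string>();

    public void Add(string row) => _pending.Add(row);

    public void Commit()
    {
        Committed.AddRange(_pending);
        _pending.Clear();
    }
}

public sealed class RegisterCustomerHandler
{
    private readonly FakeDatabase _db;
    private readonly IEventBus _bus;

    public RegisterCustomerHandler(FakeDatabase db, IEventBus bus)
    {
        _db = db;
        _bus = bus;
    }

    public void Handle(string customerName)
    {
        _db.Add(customerName);
        _db.Commit();                        // step 1: the business data is durable

        // If the process crashes here, or Publish throws because the bus is
        // unavailable, the customer exists but the event is lost.
        _bus.Publish("CustomerRegistered");  // step 2: not covered by the transaction
    }
}

// A bus that is always down, to reproduce the second failure mode.
public sealed class UnavailableBus : IEventBus
{
    public void Publish(string eventName) =>
        throw new InvalidOperationException("Event bus is unavailable");
}
```

Calling `Handle` with `UnavailableBus` throws, yet `Committed` already contains the new customer: the data and the event have drifted apart.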

Outbox pattern

If we cannot provide atomicity or we don’t want to do that for the reasons mentioned above, what could we do to increase the reliability of our system? We should implement the Outbox Pattern.


The Outbox Pattern is based on the Guaranteed Delivery pattern and looks as follows:

[Figure: Outbox pattern]

When you save data in a transaction, you also save, as part of that same transaction, the messages that you want to process later. The list of messages to be processed is called an Outbox, just like in e-mail clients.

The second element of the puzzle is a separate process that periodically checks the contents of the Outbox and processes the messages. After each message is processed, it should be marked as processed to avoid resending. However, it is possible that we will not be able to mark the message as processed due to a communication error with the Outbox:

[Figure: Outbox messages processing]

In this case, when the connection with the Outbox is recovered, the same message will be sent again. What does all this mean for us? The Outbox Pattern gives At-Least-Once delivery. We can be sure that the message will be sent at least once, but it can also be sent multiple times! That’s why another name for this approach is Once-Or-More delivery. We should remember this and try to design the receivers of our messages to be idempotent, which means:

In Messaging this concepts translates into a message that has the same effect whether it is received once or multiple times. This means that a message can safely be resent without causing any problems even if the receiver receives duplicates of the same message.
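One common way to get this property is to have the receiver remember the ids of messages it has already handled. The sketch below assumes every message carries a unique id; the class name and the stock example are hypothetical:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical receiver that remembers which message ids it has already handled.
public sealed class IdempotentStockReceiver
{
    private readonly HashSet<Guid> _handledMessageIds = new HashSet<Guid>();

    public int ProductsInStock { get; private set; }

    public IdempotentStockReceiver(int productsInStock)
    {
        ProductsInStock = productsInStock;
    }

    // Returns true when the message was applied, false when it was a duplicate.
    public bool Receive(Guid messageId, int orderedQuantity)
    {
        // HashSet.Add returns false if the id is already present,
        // so a redelivered message changes nothing.
        if (!_handledMessageIds.Add(messageId))
        {
            return false;
        }

        ProductsInStock -= orderedQuantity;
        return true;
    }
}
```

In a real receiver the set of handled ids would have to be stored durably, ideally in the same transaction as the state change. The in-memory `HashSet` only illustrates the idea.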

OK, enough theory. Let’s see how we can implement this pattern in the .NET world.

Implementation

Outbox message

First, we need to define the structure of our OutboxMessage:

Importantly, the OutboxMessage class is part of the Infrastructure and not the Domain Model! Try talking with the business about an Outbox, and they will think of the Outlook application instead of the messaging pattern. 🙂 I didn’t include a ProcessedDate property because this class is only needed to save the message as part of the transaction, so this property would always be NULL in this context.
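A minimal sketch of what such a class might look like. The property names (`OccurredOn`, `Type`, `Data`) are assumptions made for illustration, not the original listing:

```csharp
using System;

// Infrastructure-level record of a message waiting in the Outbox.
// There is deliberately no ProcessedDate here: this class is only used for writing.
public sealed class OutboxMessage
{
    public Guid Id { get; }
    public DateTime OccurredOn { get; }
    public string Type { get; }   // type name, used later to pick the class to deserialize into
    public string Data { get; }   // serialized payload, for example JSON

    public OutboxMessage(DateTime occurredOn, string type, string data)
    {
        Id = Guid.NewGuid();
        OccurredOn = occurredOn;
        Type = type ?? throw new ArgumentNullException(nameof(type));
        Data = data ?? throw new ArgumentNullException(nameof(data));
    }
}
```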

Saving the message

I definitely do not want to write the code that saves messages to the Outbox in every Command Handler; that would be against the DRY principle. For this reason, the Notification Object described in the post about publishing Domain Events can be used. The following solution is based on the linked article with a small modification: instead of processing the notifications immediately, it serializes them and writes them to the database.

As a reminder, all Domain Events resulting from an action are processed as part of the same transaction. If the Domain Event should be processed outside of the ongoing transaction, you should define a Notification Object for it. This is the object which should be written to the Outbox. The code looks like:

Example of Domain Event:

And Notification Object:

The first thing to note is the use of the Json.NET library. The second thing to note is the two constructors of the CustomerRegisteredNotification class. The first one creates the notification from a Domain Event. The second one is used to deserialize the message from a JSON string, which is presented in the following section on processing.
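The two-constructor idea can be sketched as follows. To stay dependency-free, this sketch swaps Json.NET for the built-in `System.Text.Json` (which needs .NET 5 or later for constructor-based deserialization). The `CustomerId` property and the shape of `CustomerRegisteredEvent` are assumptions:

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical Domain Event raised inside the business transaction.
public sealed class CustomerRegisteredEvent
{
    public Guid CustomerId { get; }

    public CustomerRegisteredEvent(Guid customerId) => CustomerId = customerId;
}

public sealed class CustomerRegisteredNotification
{
    public Guid CustomerId { get; }

    // Constructor 1: build the notification from the Domain Event.
    public CustomerRegisteredNotification(CustomerRegisteredEvent domainEvent)
    {
        CustomerId = domainEvent.CustomerId;
    }

    // Constructor 2: used by the serializer when the Outbox message is read back.
    [JsonConstructor]
    public CustomerRegisteredNotification(Guid customerId)
    {
        CustomerId = customerId;
    }
}
```

`JsonSerializer.Serialize(notification)` produces the string stored in the Outbox, and `JsonSerializer.Deserialize<CustomerRegisteredNotification>(json)` rebuilds the object through the attributed constructor.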

Processing the message

The processing of Outbox messages should take place in a separate process. However, depending on the needs, we can also use the same process but a different thread. The solution presented below can be used in both cases.

First, we need a scheduler that will periodically run Outbox processing. I do not want to create the scheduler myself (it is a known and solved problem), so I will use one of the mature solutions in .NET – Quartz.NET. Configuring the Quartz scheduler is very simple:

Firstly, the scheduler is created using a factory. Then, a new instance of the IoC container for resolving dependencies is created. The last thing to do is to configure our job execution schedule. In the case above it will be executed every 15 seconds, but the right configuration really depends on how many messages you have in your system.
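A hedged sketch of such a configuration, assuming Quartz.NET 3.x. The IoC container wiring is left out, the job body is a placeholder, and `OutboxScheduler` is a hypothetical helper name:

```csharp
using System.Threading.Tasks;
using Quartz;
using Quartz.Impl;

// Placeholder job; the real one would read and dispatch Outbox messages.
[DisallowConcurrentExecution]
public class ProcessOutboxJob : IJob
{
    public Task Execute(IJobExecutionContext context) => Task.CompletedTask;
}

public static class OutboxScheduler
{
    public static async Task<IScheduler> StartAsync()
    {
        // Create and start the scheduler through the standard factory.
        ISchedulerFactory factory = new StdSchedulerFactory();
        IScheduler scheduler = await factory.GetScheduler();
        await scheduler.Start();

        IJobDetail job = JobBuilder.Create<ProcessOutboxJob>().Build();

        // Fire every 15 seconds.
        // Quartz cron fields: seconds, minutes, hours, day-of-month, month, day-of-week.
        ITrigger trigger = TriggerBuilder.Create()
            .StartNow()
            .WithCronSchedule("0/15 * * ? * *")
            .Build();

        await scheduler.ScheduleJob(job, trigger);
        return scheduler;
    }
}
```

With the default job factory, the job class needs a public parameterless constructor. Resolving jobs from a container requires plugging in a custom job factory, which this sketch does not show.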

This is what ProcessOutboxJob looks like:

The most important parts are:
Line 1 – the [DisallowConcurrentExecution] attribute means that the scheduler will not start a new instance of the job if another instance of that job is running. This is important because we don’t want to process the Outbox concurrently.
Line 25 – get all messages to process
Line 30 – deserialize the message to a Notification Object
Line 32 – process the Notification Object (for example, send an event to the bus)
Line 38 – set the message as processed

As I wrote earlier, if an error occurs between processing the message (line 32) and setting it as processed (line 38), the job will try to process it again in the next iteration.
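The redelivery behaviour can be shown with an in-memory sketch of the processing loop. It is not the original job (the line numbers quoted above refer to the author's listing, not to this code). `OutboxRow`, `OutboxProcessor` and the `FailBeforeMarking` switch are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class OutboxRow
{
    public Guid Id { get; } = Guid.NewGuid();
    public string Data { get; }
    public DateTime? ProcessedDate { get; set; }   // null means "not processed yet"

    public OutboxRow(string data) => Data = data;
}

public sealed class OutboxProcessor
{
    private readonly List<OutboxRow> _outbox;
    private readonly Action<string> _dispatch;     // for example: publish to the event bus

    // Hypothetical switch that simulates losing the Outbox connection before marking a row.
    public bool FailBeforeMarking { get; set; }

    public OutboxProcessor(List<OutboxRow> outbox, Action<string> dispatch)
    {
        _outbox = outbox;
        _dispatch = dispatch;
    }

    public void ProcessPending()
    {
        // Get all messages that have not been marked as processed.
        List<OutboxRow> pending = _outbox.Where(m => m.ProcessedDate == null).ToList();

        foreach (OutboxRow message in pending)
        {
            _dispatch(message.Data);               // the side effect happens first

            if (FailBeforeMarking)
            {
                throw new InvalidOperationException("Lost connection to the Outbox");
            }

            message.ProcessedDate = DateTime.UtcNow;   // only now is the row marked
        }
    }
}
```

If one run fails after dispatching and the next run succeeds, the same message is dispatched twice. That is exactly the at-least-once behaviour receivers have to tolerate.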

Notification handler template looks like this:

Finally, this is a view of our Outbox:

[Figure: Outbox view]

Summary

In this post I described the problems with ensuring the atomicity of a transaction during business operation processing. I raised the topic of the 2PC protocol and the motivation for not using it. I presented what the Outbox Pattern is and how we can implement it. Thanks to this, our system can be much more reliable.

Source code

If you would like to see a full, working example, check my GitHub repository.



REST API Data Validation

Introduction

This time I would like to describe how we can protect our REST API applications from requests containing invalid data (the data validation process). Unfortunately, validating our requests is not enough. In addition to validation, it is our responsibility to return the relevant messages and statuses to our API clients. I want to deal with these two things in this post.

Data Validation

Definition of Data Validation

What is data validation really? The best definition I found comes from the UNECE Data Editing Group:

An activity aimed at verifying whether the value of a data item comes from the given (finite or infinite) set of acceptable values.

According to this definition, we should verify data items that come into our application from external sources and check whether their values are acceptable. How do we know that a value is acceptable? We need to define data validation rules for every type of data item that is processed in our system.

Data vs Business Rules validation

I would like to emphasize that data validation is a totally different concept from validation of business rules. Data validation is focused on verifying an atomic data item. Business rules validation is a broader concept, closer to how the business works and behaves, so it is mainly focused on behavior. Of course, validating behavior depends on data too, but in a wider scope.

Examples of data validation:

– Product order quantity cannot be negative or zero
– Product order quantity should be a number
– Currency of the order should be a value from the list of currencies

Examples of business rules validation:

– A product can be ordered only when the customer’s age is greater than or equal to the product’s minimum age.
– A customer can place only two orders in one day.

Returning relevant information

If we determine during validation that the rules have been broken, we must stop processing and return an appropriate message to the client. We should follow these rules:

– we should return the message to the client as fast as possible (fail-fast principle)
– the reason for the validation error should be well explained and understandable to the client
– we should not return technical details, for security reasons

Problem Details for HTTP APIs standard

The issue of returned error messages is so common that a special standard was created describing how to handle such situations. It is called “Problem Details for HTTP APIs” and its official description is published as RFC 7807. This is the abstract of the standard:

This document defines a “problem detail” as a way to carry machine-readable details of errors in a HTTP response to avoid the need to define new error response formats for HTTP APIs.

The Problem Details standard introduces the Problem Details JSON object, which should be part of the response when a validation error occurs. This is a simple canonical model with 5 members:

– problem type (type)
– title (title)
– HTTP status code (status)
– details of the error (detail)
– instance (instance), a pointer to the specific occurrence

Of course we can (and sometimes we should) extend this object by adding new properties, but the base should be the same. Thanks to this, our API is easier to understand, learn and use. For more detailed information about the standard, I invite you to read its documentation, which is well written.
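To make the five members concrete, here is a small sketch that serializes such an object with `System.Text.Json`. The class name `ProblemDetailsDocument` is hypothetical and carries no behaviour beyond serialization:

```csharp
using System.Text.Json;

// The five standard members of a Problem Details object.
public sealed class ProblemDetailsDocument
{
    public string Type { get; set; }      // URI identifying the problem type
    public string Title { get; set; }     // short, human-readable summary
    public int Status { get; set; }       // HTTP status code
    public string Detail { get; set; }    // explanation specific to this occurrence
    public string Instance { get; set; }  // URI identifying this occurrence

    public string ToJson()
    {
        var options = new JsonSerializerOptions
        {
            // Emit "type", "title", "status", "detail", "instance" in camelCase.
            PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
            WriteIndented = true
        };
        return JsonSerializer.Serialize(this, options);
    }
}
```

A response built this way is normally sent with the `application/problem+json` content type. Extension members, such as a list of validation errors, can be added as further properties.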

Data validation localization

In a typical application we can put data validation logic in three places:

  • GUI – the entry point for user input. Data is validated on the client side, for example using JavaScript in web applications
  • Application logic/services layer – data is validated in a specific application service or command handler on the server side
  • Database – the exit point of request processing and the last moment to validate the data

[Figure: Data validation localization]

In this article I omit the GUI and Database components and focus on the server side of the application. Let’s see how we can implement data validation in the Application Services layer.

Implementing Data Validation

Suppose we have a command AddCustomerOrderCommand:

Suppose we want to validate 4 things:

1. CustomerId is not an empty GUID
2. Products list is not empty
3. Each product quantity is greater than 0
4. Each product currency is equal to USD or EUR

Let me show 3 solutions to this problem, from the simplest to the most sophisticated.

1. Simple validation on Application Service

The first thing that comes to mind is simple validation in the Command Handler itself. In this solution we need to implement a private method which validates our command and throws an exception if a validation error occurs. Enclosing this kind of logic in a separate method is better from the Clean Code perspective (see also Extract Method).
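A minimal sketch of this first approach. The shape of the command (`ProductDto` and its properties) and the error messages are assumptions made for illustration, not the original listing:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class ProductDto
{
    public Guid Id { get; set; }
    public int Quantity { get; set; }
    public string Currency { get; set; }
}

public sealed class AddCustomerOrderCommand
{
    public Guid CustomerId { get; set; }
    public List<ProductDto> Products { get; set; } = new List<ProductDto>();
}

public sealed class AddCustomerOrderCommandHandler
{
    private static readonly string[] AllowedCurrencies = { "USD", "EUR" };

    public void Handle(AddCustomerOrderCommand command)
    {
        Validate(command);   // fail fast, before any business logic runs
        // ... place the order ...
    }

    private static void Validate(AddCustomerOrderCommand command)
    {
        var errors = new List<string>();

        if (command.CustomerId == Guid.Empty)
        {
            errors.Add("CustomerId is empty");
        }

        if (command.Products == null || command.Products.Count == 0)
        {
            errors.Add("Products list is empty");
        }
        else
        {
            if (command.Products.Any(p => p.Quantity <= 0))
            {
                errors.Add("At least one product has an invalid quantity");
            }

            if (command.Products.Any(p => !AllowedCurrencies.Contains(p.Currency)))
            {
                errors.Add("At least one product has an invalid currency");
            }
        }

        if (errors.Count > 0)
        {
            throw new Exception("Invalid command, reasons: " + string.Join("; ", errors));
        }
    }
}
```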

The result of invalid command execution:

This is not a bad approach, but it has two disadvantages. Firstly, it requires us to write a lot of simple boilerplate code: comparisons to nulls, defaults, values from a list and so on. Secondly, we lose part of the separation of concerns here, because we are mixing validation logic with the orchestration of our use case flow. Let’s take care of the boilerplate code first.

2. Validation using FluentValidation library

We don’t want to reinvent the wheel, so the best solution is to use a library. Fortunately, there is a great library for validation in the .NET world: FluentValidation. It has a nice API and a lot of features. This is how we can use it to validate our command:

Now, the Validate method looks like:

The result of validation is the same as before, but now our validation logic is cleaner. The last thing to do is to decouple this logic from the Command Handler completely…
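A sketch of the four rules expressed with FluentValidation, assuming the FluentValidation NuGet package is referenced. The command shape and the messages are the same illustrative assumptions as before, not the original listing:

```csharp
using System;
using System.Collections.Generic;
using FluentValidation;

public sealed class ProductDto
{
    public Guid Id { get; set; }
    public int Quantity { get; set; }
    public string Currency { get; set; }
}

public sealed class AddCustomerOrderCommand
{
    public Guid CustomerId { get; set; }
    public List<ProductDto> Products { get; set; } = new List<ProductDto>();
}

// Rules for a single product: quantity and currency.
public sealed class ProductDtoValidator : AbstractValidator<ProductDto>
{
    public ProductDtoValidator()
    {
        RuleFor(p => p.Quantity)
            .GreaterThan(0)
            .WithMessage("At least one product has an invalid quantity");

        RuleFor(p => p.Currency)
            .Must(currency => currency == "USD" || currency == "EUR")
            .WithMessage("At least one product has an invalid currency");
    }
}

// Rules for the whole command; each product is delegated to ProductDtoValidator.
public sealed class AddCustomerOrderCommandValidator : AbstractValidator<AddCustomerOrderCommand>
{
    public AddCustomerOrderCommandValidator()
    {
        RuleFor(c => c.CustomerId).NotEmpty().WithMessage("CustomerId is empty");
        RuleFor(c => c.Products).NotEmpty().WithMessage("Products list is empty");
        RuleForEach(c => c.Products).SetValidator(new ProductDtoValidator());
    }
}
```

Calling `new AddCustomerOrderCommandValidator().Validate(command)` returns a result whose `IsValid` flag and `Errors` collection the handler can turn into an exception.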

3. Validation using Pipeline Pattern

To decouple the validation logic and execute it before the Command Handler runs, we arrange our command handling process in a Pipeline (see also NServiceBus Pipeline).

For the Pipeline implementation we can easily use MediatR Behaviors. The first thing to do is to implement the behavior:

The next thing to do is to register the behavior in the IoC container (Autofac example):

This way we achieve separation of concerns and implement the fail-fast principle in a nice and elegant way.
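The pipeline idea itself can be shown without any library. The sketch below is a hand-rolled stand-in, not MediatR's API: `IPipelineStep`, `ValidationStep` and `Pipeline` are hypothetical types that only illustrate how a validation step wraps the handler:

```csharp
using System;
using System.Collections.Generic;

// A step receives the command plus a delegate that invokes the rest of the pipeline.
public interface IPipelineStep<TCommand>
{
    void Handle(TCommand command, Action next);
}

public sealed class ValidationStep<TCommand> : IPipelineStep<TCommand>
{
    private readonly Func<TCommand, IReadOnlyList<string>> _validate;

    public ValidationStep(Func<TCommand, IReadOnlyList<string>> validate) => _validate = validate;

    public void Handle(TCommand command, Action next)
    {
        IReadOnlyList<string> errors = _validate(command);
        if (errors.Count > 0)
        {
            // Fail fast: the handler is never reached for an invalid command.
            throw new InvalidOperationException("Invalid command: " + string.Join("; ", errors));
        }

        next();
    }
}

public sealed class Pipeline<TCommand>
{
    private readonly IReadOnlyList<IPipelineStep<TCommand>> _steps;
    private readonly Action<TCommand> _handler;

    public Pipeline(IReadOnlyList<IPipelineStep<TCommand>> steps, Action<TCommand> handler)
    {
        _steps = steps;
        _handler = handler;
    }

    public void Send(TCommand command)
    {
        // Build the chain from the inside out so that the first step runs first.
        Action next = () => _handler(command);
        for (int i = _steps.Count - 1; i >= 0; i--)
        {
            IPipelineStep<TCommand> step = _steps[i];
            Action inner = next;
            next = () => step.Handle(command, inner);
        }

        next();
    }
}
```

The handler no longer knows that validation exists, and further steps such as logging or transactions could be added to the list without touching it.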

But this is not the end. Finally, we need to do something with the messages returned to clients.

Implementing Problem Details standard

Just as with the validation logic, we will use a dedicated library: ProblemDetails. The principle of the mechanism is simple. Firstly, we need to create a custom exception:

Secondly, we have to create our own Problem Details class:

The last thing to do is to add the Problem Details Middleware in startup, together with the definition of the mapping between InvalidCommandException and the InvalidCommandProblemDetails class:

After a change in CommandValidationBehavior (throwing InvalidCommandException instead of Exception), the returned content is compatible with the standard:
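The essence of that mapping, leaving the middleware aside, is a translation from the exception to the standard members. This library-free sketch assumes a `Details` property on the exception and a hypothetical `ProblemDetailsMapper`. The type URI is a placeholder:

```csharp
using System;
using System.Collections.Generic;

public sealed class InvalidCommandException : Exception
{
    public string Details { get; }

    public InvalidCommandException(string message, string details) : base(message)
    {
        Details = details;
    }
}

public static class ProblemDetailsMapper
{
    // Translates the exception into Problem Details members. No stack trace or
    // other technical detail is copied, so nothing internal leaks to the client.
    public static Dictionary<string, object> Map(InvalidCommandException exception)
    {
        return new Dictionary<string, object>
        {
            ["type"] = "https://example.com/problems/invalid-command", // placeholder URI
            ["title"] = exception.Message,
            ["status"] = 400,                                          // Bad Request
            ["detail"] = exception.Details
        };
    }
}
```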

[Figure: Problem details]

Summary

In this post I described:
– what data validation is and where it is located
– what Problem Details for HTTP APIs is and how it can be implemented
– 3 methods of implementing data validation in the Application Services layer: without any patterns and tools, with the FluentValidation library, and lastly using the Pipeline Pattern and MediatR Behaviors.

Source code

If you would like to see a full, working example, check my GitHub repository.


Using Database Project and DbUp for database management

Introduction

In the previous post I described two popular ways to manage database changes.

The first one was state versioning, where you keep the whole current design of your database in the repository, and when you need to change something you only change this state. Later, when you want to deploy the changes, your schema and the target database are compared and the migration script is generated.

The second way is to version the transitions to the desired state, which means creating a migration script for every change.

In this post I want to show an implementation of these two approaches combined in the .NET environment, which I think is the best way to manage database changes.

Step one – Database Project

The first thing to do is to create a Database Project. This type of project is available only when you have SQL Server Data Tools installed. It can be installed together with Visual Studio 2017 or separately – see this page for more information.

When you have SQL Server Data Tools, you can add a new Database Project the standard way:

Now we can add database objects to our project in the form of SQL scripts. Each script should define one database object: a table, view, procedure, function and so on. It is common to create root folders named after the schemas.

TIP: I do not recommend creating database objects in the “dbo” schema. I advise creating well-named schemas per module/purpose/functionality. Creating your own schemas also allows you to manage your object namespaces better.

The sample database project may look like this:

It is worth noticing that the Build Action setting of every script is set to Build. This setting is how Visual Studio distinguishes database objects from ordinary scripts and builds them together. If, for example, we remove the script defining the orders schema, VS will not be able to build our project:

This is great behavior, because we get a compile-time check and can avoid more runtime errors.

Once the database project is finished, we can compare it to another project or database and generate a migration script. But as I described in the previous post, this is not the optimal way to migrate databases. We will use the DbUp library instead.

Step two – DbUp

DbUp is an open source .NET library that provides a way to deploy changes to a database. Additionally, it tracks which SQL scripts have already been run, has many SQL script providers available, and offers other interesting features such as script pre-processing.

You may ask: why DbUp and not EF Migrations or Fluent Migrator? I have used all of them and I have to say that DbUp seems to me the purest solution. I don’t like C# “wrappers” that generate SQL for me. DDL is an easy language and I think we don’t need a special tool for generating it.

DbUp is a library, so we can reference it from any application we want. What we need is a simple console application which can be executed both in the developer environment and on the CI build server. Firstly, we need to reference the DbUp NuGet package. Then we can add some simple code to the Main method:

This console application accepts two parameters: the connection string to the target database and the file system path to the scripts directory. It assumes the following directory layout:
/PreDeployment
/Migrations
/PostDeployment

For “pre” and “post” deployment scripts we define a NullJournal, so these scripts will be run every time.
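A hedged sketch of what such a console application might look like, assuming the DbUp package for SQL Server is referenced. Argument handling is kept minimal, and the `Build` helper is an assumption rather than the original code:

```csharp
using System;
using System.IO;
using DbUp;
using DbUp.Engine;
using DbUp.Helpers;

public static class Program
{
    public static int Main(string[] args)
    {
        string connectionString = args[0];   // target database
        string scriptsPath = args[1];        // root of the scripts directory

        UpgradeEngine[] stages =
        {
            // NullJournal records nothing, so these scripts run on every deployment.
            Build(connectionString, Path.Combine(scriptsPath, "PreDeployment"), journaled: false),
            // Journaled: each migration script runs only once per database.
            Build(connectionString, Path.Combine(scriptsPath, "Migrations"), journaled: true),
            Build(connectionString, Path.Combine(scriptsPath, "PostDeployment"), journaled: false)
        };

        foreach (UpgradeEngine stage in stages)
        {
            DatabaseUpgradeResult result = stage.PerformUpgrade();
            if (!result.Successful)
            {
                Console.Error.WriteLine(result.Error);
                return -1;   // non-zero exit code lets the CI build fail
            }
        }

        return 0;
    }

    // Hypothetical helper that configures one stage of the deployment.
    private static UpgradeEngine Build(string connectionString, string path, bool journaled)
    {
        var builder = DeployChanges.To
            .SqlDatabase(connectionString)
            .WithScriptsFromFileSystem(path)
            .LogToConsole();

        builder = journaled
            ? builder.JournalToSqlTable("app", "MigrationsJournal")
            : builder.JournalTo(new NullJournal());

        return builder.Build();
    }
}
```

Running the three stages in sequence and stopping at the first failure keeps a broken pre-deployment script from being followed by migrations.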

We should keep the scripts directory in the Database Project created earlier. DbUp executes scripts in alphabetical order. It can look like this:

Finally, we run the migrations by running our console application:

Executed scripts are listed in the app.MigrationsJournal table:

And that’s all! We can now develop and change our database in an effective way. 🙂

Summary

In this post I described how to implement both state and transitions versioning using a Database Project and the DbUp library. What has been achieved is:
– Compile-time checks (Database project)
– Ease of development (Both)
– History of definition of all objects (Database project)
– Quick access to schema definition (Database project)
– Ease of resolving conflicts (Database project)
– IDE support (Database project)
– Full control of defining transitions (DbUp)
– Pre and post deployment scripts execution (DbUp)
– Deployment automation (DbUp)
– The possibility of manual deployment (DbUp)
– History of applied transitions (DbUp)

Using this machinery, the development of a database should definitely be easier and less error-prone.