REST API Data Validation

Introduction

This time I would like to describe how we can protect our REST API applications from requests containing invalid data (data validation process). However, validation of our requests is not enough, unfortunately. In addition to validation, it is our responsibility to return the relevant messages and statuses to our API clients. I wanted to deal with these two things in this post.

Data Validation

Definition of Data Validation

What is data validation really? The best definition I found is from UNECE Data Editing Group:

An activity aimed at verifying whether the value of a data item comes from the given (finite or infinite) set of acceptable values.

According to this definition we should verify data items which are coming to our application from external sources and check if theirs values are acceptable. How do we know that the value is acceptable? We need to define data validation rules for every type of data item which is processing in our system.

Data vs Business Rules validation

I would like to emphasize that data validation is totally different concept than validation of business rules. Data validation is focused on verifying an atomic data item. Business rules validation is a more broad concept and more close to how business works and behaves. So it is mainly focused on behavior. Of course validating behavior depends on data too, but in a more wide range.

Examples of data validation:

– Product order quantity cannot be negative or zero
– Product order quantity should be a number
– Currency of order should be a value from currencies list

Examples of business rules validation

– Product can be ordered only when Customer age is equal or greater than product minimal age.
– Customer can place only two orders in one day.

Returning relevant information

If we acknowledge that the rules have been broken during validation, we must stop processing and return the equivalent message to the client. We should follow the following rules:

– we should return message to the client as fast as possible (Fail-fast principle)
– the reason for the validation error should be well explained and understood for the client
– we should not return technical aspects for security reasons

Problem Details for HTTP APIs standard

The issue of returned error messages is so common that a special standard was created describing how to handle such situations. It is called “Problem Details for HTTP APIs standard” and his official description can be found here. This is abstract of this standard:

This document defines a “problem detail” as a way to carry machine-readable details of errors in a HTTP response to avoid the need to define new error response formats for HTTP APIs.

Problem Details standard introduces Problem Details JSON object, which should be part of the response when validation error occurs. This is simple canonical model with 5 members:

– problem type
– title
– HTTP status code
– details of error
– instance (pointer to specific occurrence)

Of course we can (and sometimes we should) extend this object by adding new properties, but the base should be the same. Thanks to this our API is easier to understand, learn and use. For more detailed information about standard I invite you to read documentation which is well described.

Data validation localization

For the standard application we can put data validation logic in three places:

  • GUI – it is entry point for users input. Data is validated on the client side, for example using Javascript for web applications
  • Application logic/services layer – data is validated in specific application service or command handler on the server side
  • Database – this is exit point of request processing and last moment to validate the data
Data validation localization
Data validation localization

In this article I am omitting GUI and Database components and I am focusing on the server side of the application. Let’s see how we can implement data validation on Application Services layer.

Implementing Data Validation

Suppose we have a command AddCustomerOrderCommand:

Suppose we want to validate 4 things:

1. CustomerId is not empty GUID.
2. Products list is not empty
3. Each product quantity is greater than 0
4. Each product currency is equal to USD or EUR

Let me show 3 solutions to this problem – from simple to the most sophisticated.

1. Simple validation on Application Service

The first thing that can come to mind is a simple validation in the Command Handler itself. In this solution we need to implement private method which validates our command and throws exception if validation error occurs. Closing this kind of logic in separate method is better from the Clean Code perspective (see Extract Method too).

The result of invalid command execution:

This is not so bad approach but has two disadvantages. Firstly, it involves from us writing a lot of easy and boilerplate code – comparing to nulls, defaults, values from list etc. Secondly, we are losing here part of separation of concerns because we are mixing validation logic with orchestrating our use case flow. Let’s take care of boilerplate code first.

2. Validation using FluentValidation library

We don’t want to reinvent the wheel so the best solution is to use library. Fortunately, there is a great library for validation in .NET world – Fluent Validation. It has nice API and a lot of features. This is how we can use it to validate our command:

Now, the Validate method looks like:

The result of validation is the same as earlier, but now our validation logic is more cleaner. The last thing to do is decouple this logic from Command Handler completely…

3. Validation using Pipeline Pattern

To decouple validation logic and execute it before Command Handler execution we arrange our command handling process in Pipeline (see NServiceBus Pipeline also).

For the Pipeline implementation we can use easily MediatR Behaviors. First thing to do is behavior implementation:

Next thing to do is to register behavior in IoC container (Autofac example):

This way we achieved separation of concerns and Fail-fast principle implementation in nice and elegant way.

But this is not the end. Finally, we need to do something with returned messages to clients.

Implementing Problem Details standard

Just as in the case of validation logic implementation, we will use a dedicated library – ProblemDetails. The principle of the mechanism is simple. Firstly, we need to create custom exception:

Secondly, we have to create own Problem Details class:

Last thing to do is to add Problem Details Middleware with definition of mapping between InvalidCommandException and InvalidCommandProblemDetails class in startup:

After change in CommandValidationBehavior (throwing InvalidCommandExecption instead Exception) we have returned content compatible with the standard:

Problem details

Summary

In this post I described:
– what Data validation is and where is located
– what Problem Details for HTTP APIs is and how could be implemented
– 3 methods to implement data validation in Application Services layer: without any patterns and tools, with FluentValidation library, and lastly – using Pipeline Pattern and MediatR Behaviors.

Source code

If you would like to see full, working example – check my GitHub repository

Related posts

Domain Model Encapsulation and PI with Entity Framework 2.2
Simple CQRS implementation with raw SQL and DDD
How to publish and handle Domain Events
10 common broken rules of clean code

Processing commands with Hangfire and MediatR

In previous post about processing multiple instance aggregates of the same type I suggested to consider using eventual consistency approach. In this post I would like to present one way to do this.

Setup

In the beginning let me introduce stack of technologies/patterns:
1. Command pattern – I am using commands but they do not look like theses described in GoF book. They just simple classes with data and they implement IRequest  marker interface of MediatR.

2. Mediator pattern. I am using this pattern because i want to decouple my clients classes (commands invokers) from commands handlers. Simple but great library created by Jimmy Bogard named MediatR implements this pattern very well. Here is simple usage.

and handler:

3. Hangfire. Great open-source library for processing and scheduling background jobs even with GUI monitoring interface. This is where my commands are scheduled, executed and retried if error occured.

Problem

For some of my uses cases, I would like to schedule processing my commands, execute them parallel with retry option and monitor them. Hangfire gives me all these kind of features but I have to have public method which I have to pass to Hangifre method (for example BackgroundJob.Enqueue). This is a problem – with mediator pattern I cannot (and I do not want) pass public method of handler because I have decoupled it from invoker. So I need special way to integrate MediatR with Hangfire without affecting basic assumptions.

Solution

My solution is to have three additional classes:
1. CommandsScheduler – serializes commands and sends them to Hangfire.

2. CommandsExecutor – responods to Hangfire jobs execution, deserializes commands and sends them to handlers using MediatR.

3. MediatorSerializedObject – wrapper class for serialized/deserialized commands with additional properties – command type and additional description.

Finally with this implementation we can change our client clasess to use CommandsScheduler:

and our commands are scheduled, invoked and monitored by Hangfire. I sketched sequence diagram which shows this interaction:

Processing commands with MediatR and Hanfire

Additionally, we can introduce interface for CommandsSchedulerICommandsScheduler. Second implementation will not use Hangfire at all and only will execute MediatR requests directly – for example in development process when we do not want start Hangfire Server.

Summary

I presented the way of processing commands asynchronously using MediatR and Hangfire. With this approach we have:
1. Decoupled invokers and handlers of commands.
2. Scheduling commands mechanism.
3. Invoker and handler of command may be other processes.
4. Commands execution monitoring.
5. Commands execution retries mechanism.

These benefits are very important during development using eventual consistency approach. We have more control over commands processing and we can react quickly if problem will appear.