REST API Data Validation

Introduction

This time I would like to describe how we can protect our REST API applications from requests containing invalid data (the data validation process). Validating requests is not enough, though. It is also our responsibility to return relevant messages and status codes to our API clients. I want to deal with both of these things in this post.

Data Validation

Definition of Data Validation

What is data validation, really? The best definition I found comes from the UNECE Data Editing Group:

An activity aimed at verifying whether the value of a data item comes from the given (finite or infinite) set of acceptable values.

According to this definition, we should verify data items coming into our application from external sources and check whether their values are acceptable. How do we know that a value is acceptable? We need to define data validation rules for every type of data item processed in our system.

Data vs Business Rules validation

I would like to emphasize that data validation is a totally different concept from business rules validation. Data validation focuses on verifying an atomic data item. Business rules validation is a broader concept, closer to how the business works and behaves, so it is mainly focused on behavior. Of course, validating behavior depends on data too, but in a wider context.

Examples of data validation:

– Product order quantity cannot be negative or zero
– Product order quantity should be a number
– Currency of order should be a value from currencies list

Examples of business rules validation:

– Product can be ordered only when Customer age is equal to or greater than the product's minimum age.
– Customer can place only two orders in one day.
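
The difference can be sketched in a few lines of C#. This is only an illustration: the class and method names below are hypothetical and do not come from the sample application. The data validation check looks at a single value, while the business rules need context such as the customer's age or the number of orders already placed today.

```csharp
using System;

public static class OrderRules
{
    // Data validation: checks one atomic value, no other context is needed.
    public static bool IsValidQuantity(int quantity) => quantity > 0;

    // Business rule: compares facts about two different concepts
    // (the customer and the product).
    public static bool CanOrder(int customerAge, int productMinimumAge) =>
        customerAge >= productMinimumAge;

    // Business rule: depends on state beyond the request itself
    // (how many orders the customer has already placed today).
    public static bool CanPlaceOrder(int ordersPlacedToday) => ordersPlacedToday < 2;
}
```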

Returning relevant information

If validation finds that a rule has been broken, we must stop processing and return an appropriate message to the client. We should follow these rules:

– we should return the message to the client as fast as possible (Fail-fast principle)
– the reason for the validation error should be clearly explained and understandable to the client
– we should not return technical details, for security reasons

Problem Details for HTTP APIs standard

The issue of returned error messages is so common that a dedicated standard was created to describe how to handle such situations. It is called “Problem Details for HTTP APIs” and its official description is published as RFC 7807. This is the abstract of the standard:

This document defines a “problem detail” as a way to carry machine-readable details of errors in a HTTP response to avoid the need to define new error response formats for HTTP APIs.

The Problem Details standard introduces the Problem Details JSON object, which should be part of the response when a validation error occurs. It is a simple canonical model with 5 members:

– problem type (type)
– title (title)
– HTTP status code (status)
– details of the error (detail)
– instance, a pointer to the specific occurrence (instance)

Of course we can (and sometimes we should) extend this object by adding new properties, but the base should stay the same. Thanks to this, our API is easier to understand, learn and use. For more detailed information I invite you to read the standard itself, which is well written.
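
To make the shape of the object concrete, here is a minimal sketch that builds a problem details document with the five standard members plus one extension member. It uses only System.Text.Json; the class name, the URIs, the texts and the "errors" extension member are made-up values for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ProblemDetailsExample
{
    // Builds a problem details JSON document. All values are illustrative.
    public static string Build()
    {
        var problem = new Dictionary<string, object>
        {
            ["type"] = "https://example.com/problems/invalid-command",
            ["title"] = "Command validation error",
            ["status"] = 400,
            ["detail"] = "Product quantity must be greater than 0.",
            ["instance"] = "/api/customers/123/orders",
            // Extension member: not part of the base model, added by our API.
            ["errors"] = new[] { "Quantity: must be greater than 0" }
        };

        return JsonSerializer.Serialize(problem);
    }
}
```

In a real API this document would be sent with the application/problem+json media type defined by the standard.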

Data validation location

In a typical application we can put data validation logic in three places:

  • GUI – the entry point for user input. Data is validated on the client side, for example using JavaScript in web applications
  • Application logic/services layer – data is validated in a specific application service or command handler on the server side
  • Database – the exit point of request processing and the last moment to validate the data

In this article I omit the GUI and Database components and focus on the server side of the application. Let’s see how we can implement data validation in the Application Services layer.

Implementing Data Validation

Suppose we have a command AddCustomerOrderCommand:
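
A minimal sketch of such a command could look like this. The property names are assumptions inferred from the validation rules listed below, and ProductDto is a hypothetical helper type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical DTO describing one ordered product.
public class ProductDto
{
    public Guid Id { get; set; }
    public int Quantity { get; set; }
    public string Currency { get; set; }
}

// The command only carries data; it contains no behavior.
public class AddCustomerOrderCommand
{
    public AddCustomerOrderCommand(Guid customerId, List<ProductDto> products)
    {
        CustomerId = customerId;
        Products = products;
    }

    public Guid CustomerId { get; }

    public List<ProductDto> Products { get; }
}
```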

Suppose we want to validate 4 things:

1. CustomerId is not an empty GUID
2. Products list is not empty
3. Each product quantity is greater than 0
4. Each product currency is equal to USD or EUR

Let me show 3 solutions to this problem – from simple to the most sophisticated.

1. Simple validation on Application Service

The first thing that comes to mind is simple validation in the Command Handler itself. In this solution we implement a private method which validates our command and throws an exception if a validation error occurs. Enclosing this kind of logic in a separate method is better from the Clean Code perspective (see also Extract Method).
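
A sketch of this approach, under the assumption that the command has CustomerId and Products properties and that each product has Quantity and Currency (all names and error texts here are illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ProductDto
{
    public int Quantity { get; set; }
    public string Currency { get; set; }
}

public class AddCustomerOrderCommand
{
    public Guid CustomerId { get; set; }
    public List<ProductDto> Products { get; set; }
}

public class AddCustomerOrderCommandHandler
{
    private static readonly string[] AllowedCurrencies = { "USD", "EUR" };

    public void Handle(AddCustomerOrderCommand command)
    {
        Validate(command); // fail fast, before any real work starts

        // ... load the customer, add the order, save changes ...
    }

    // Extracted method: collects all broken rules and throws once.
    private static void Validate(AddCustomerOrderCommand command)
    {
        var errors = new List<string>();

        if (command.CustomerId == Guid.Empty)
            errors.Add("CustomerId is empty");

        if (command.Products == null || !command.Products.Any())
        {
            errors.Add("Products list is empty");
        }
        else
        {
            if (command.Products.Any(p => p.Quantity <= 0))
                errors.Add("At least one product has invalid quantity");

            if (command.Products.Any(p => !AllowedCurrencies.Contains(p.Currency)))
                errors.Add("At least one product has invalid currency");
        }

        if (errors.Any())
            throw new Exception(string.Join("; ", errors));
    }
}
```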

The result of invalid command execution:

This is not a bad approach, but it has two disadvantages. Firstly, it requires us to write a lot of trivial boilerplate code: comparisons to nulls, defaults, values from a list and so on. Secondly, we lose part of the separation of concerns, because we mix validation logic with orchestration of the use case flow. Let’s take care of the boilerplate code first.

2. Validation using FluentValidation library

We don’t want to reinvent the wheel, so the best solution is to use a library. Fortunately, there is a great validation library in the .NET world: FluentValidation. It has a nice API and a lot of features. This is how we can use it to validate our command:
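
A minimal sketch of the four rules expressed with FluentValidation. It assumes the FluentValidation NuGet package is referenced. The command and DTO shapes, the validator class names, the error messages and the EnsureValid helper are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using FluentValidation;

public class ProductDto
{
    public int Quantity { get; set; }
    public string Currency { get; set; }
}

public class AddCustomerOrderCommand
{
    public Guid CustomerId { get; set; }
    public List<ProductDto> Products { get; set; }
}

// Rules for a single product in the order.
public class ProductDtoValidator : AbstractValidator<ProductDto>
{
    public ProductDtoValidator()
    {
        RuleFor(p => p.Quantity).GreaterThan(0)
            .WithMessage("At least one product has invalid quantity");
        RuleFor(p => p.Currency).Must(c => c == "USD" || c == "EUR")
            .WithMessage("At least one product has invalid currency");
    }
}

// Rules for the whole command; each product is checked by the validator above.
public class AddCustomerOrderCommandValidator : AbstractValidator<AddCustomerOrderCommand>
{
    public AddCustomerOrderCommandValidator()
    {
        RuleFor(c => c.CustomerId).NotEmpty().WithMessage("CustomerId is empty");
        RuleFor(c => c.Products).NotEmpty().WithMessage("Products list is empty");
        RuleForEach(c => c.Products).SetValidator(new ProductDtoValidator());
    }
}

public static class CommandValidation
{
    // Hypothetical helper: runs the validator and throws if any rule is broken.
    public static void EnsureValid(AddCustomerOrderCommand command)
    {
        var result = new AddCustomerOrderCommandValidator().Validate(command);
        if (!result.IsValid)
            throw new Exception(string.Join("; ", result.Errors.Select(e => e.ErrorMessage)));
    }
}
```

The rules now read almost like the requirements list, and the null, default and range comparisons are handled by the library.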

Now, the Validate method looks like:

The result of validation is the same as before, but now our validation logic is cleaner. The last thing to do is to decouple this logic from the Command Handler completely…

3. Validation using Pipeline Pattern

To decouple validation logic and execute it before the Command Handler runs, we arrange our command handling process as a Pipeline (see also NServiceBus Pipeline).
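
The idea itself does not depend on any library. Below is a small, library-free sketch of such a pipeline. IBehavior, ValidationBehavior and Pipeline are hypothetical names used only for this illustration; in MediatR the analogous extension point is the IPipelineBehavior<TRequest, TResponse> interface.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A behavior wraps the next step of the pipeline.
public interface IBehavior<TCommand>
{
    void Handle(TCommand command, Action next);
}

public class ValidationBehavior<TCommand> : IBehavior<TCommand>
{
    private readonly Func<TCommand, IEnumerable<string>> _validate;

    public ValidationBehavior(Func<TCommand, IEnumerable<string>> validate)
    {
        _validate = validate;
    }

    public void Handle(TCommand command, Action next)
    {
        var errors = _validate(command).ToList();
        if (errors.Any())
            throw new InvalidOperationException(string.Join("; ", errors)); // fail fast

        next(); // only valid commands reach the handler
    }
}

public static class Pipeline
{
    // Runs the behaviors in the given order and the handler as the last step.
    public static void Execute<TCommand>(
        TCommand command,
        IEnumerable<IBehavior<TCommand>> behaviors,
        Action<TCommand> handler)
    {
        Action step = () => handler(command);

        // Build the chain from the inside out, so the first behavior runs first.
        foreach (var behavior in behaviors.Reverse())
        {
            var next = step;
            step = () => behavior.Handle(command, next);
        }

        step();
    }
}
```

The handler no longer knows that validation exists: it is simply never called for an invalid command.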

For the Pipeline implementation we can simply use MediatR Behaviors. The first thing to do is to implement the behavior:

The next thing to do is to register the behavior in the IoC container (Autofac example):

This way we achieve separation of concerns and implement the Fail-fast principle in a nice and elegant way.

But this is not the end. Finally, we need to do something with the messages returned to clients.

Implementing Problem Details standard

Just as with the validation logic, we will use a dedicated library: ProblemDetails. The principle of the mechanism is simple. Firstly, we need to create a custom exception:

Secondly, we have to create our own Problem Details class:
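
A library-free sketch of both pieces. The members of InvalidCommandException, the 400 status code and the type URI are assumptions. In the real application the problem details class would derive from the base class supplied by the library, so the class below is only a stand-in that shows the mapping.

```csharp
using System;

// Thrown when a command does not pass data validation.
public class InvalidCommandException : Exception
{
    public InvalidCommandException(string message, string details) : base(message)
    {
        Details = details;
    }

    public string Details { get; }
}

// Stand-in for a problem details class: maps the exception to the
// members defined by the standard.
public class InvalidCommandProblemDetails
{
    public InvalidCommandProblemDetails(InvalidCommandException exception)
    {
        Type = "https://example.com/problems/validation-error"; // illustrative URI
        Title = exception.Message;
        Status = 400; // assumption: invalid input is reported as Bad Request
        Detail = exception.Details;
    }

    public string Type { get; }

    public string Title { get; }

    public int Status { get; }

    public string Detail { get; }
}
```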

The last thing to do is to add the Problem Details Middleware in startup, with a mapping defined between InvalidCommandException and the InvalidCommandProblemDetails class:

After changing CommandValidationBehavior to throw InvalidCommandException instead of Exception, the returned content is compatible with the standard:

Problem details

Summary

In this post I described:
– what data validation is and where it can be located
– what Problem Details for HTTP APIs is and how it can be implemented
– 3 methods of implementing data validation in the Application Services layer: without any patterns or tools, with the FluentValidation library and, lastly, using the Pipeline Pattern and MediatR Behaviors.

Source code

If you would like to see a full, working example, check my GitHub repository.


Simple CQRS implementation with raw SQL and DDD

Introduction

I often come across questions about the implementation of the CQRS pattern. Even more often I see discussions about access to the database in the context of what is better – ORM or plain SQL.

In this post I want to show you how to quickly implement a simple REST API application with CQRS using .NET Core. I should point out right away that this is CQRS in its simplest edition: an update through the Write Model immediately updates the Read Model, so there is no eventual consistency here. However, many applications do not need eventual consistency, while the logical division of writing and reading using two separate models is recommended and more effective in most solutions.

Especially for this article I prepared a sample, fully working application; see the full source on GitHub.

My goals

These are the goals I wanted to achieve by creating this solution:
1. Clear separation and isolation of Write Model and Read Model.
2. Retrieving data using Read Model should be as fast as possible.
3. Write Model should be implemented with DDD approach. The level of DDD implementation should depend on level of domain complexity.
4. Application logic should be decoupled from GUI.
5. Selected libraries should be mature, well-known and supported.

Design

The high-level flow between components looks like this:

As you can see, the process for reads is pretty straightforward because we should query data as fast as possible. We don’t need more layers of abstraction or sophisticated approaches here. Get arguments from the query object, execute raw SQL against the database and return data – that’s all.

It is different in the case of writes. Writing often requires more advanced techniques because we need to execute some logic, do some calculations or simply check some conditions (especially invariants). With an ORM tool with change tracking and the Repository Pattern, we can do it while leaving our Domain Model intact (ok, almost).

Solution

Read model

The diagram below presents the flow between components used to fulfill a read request:

The GUI is responsible for creating the Query object:

Then the query handler processes the query:

The first thing is to get an open database connection, which is achieved using the SqlConnectionFactory class. This class is resolved by the IoC Container with HTTP request lifetime scope, so we can be sure that we use only one database connection during request processing.
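
A minimal sketch of such a factory. To keep the example independent of any database provider, it receives a delegate that creates the connection; that delegate and the ISqlConnectionFactory interface are assumptions made for this illustration. Registered with a per-request lifetime, one instance hands out the same open connection for the whole request and closes it when the scope ends.

```csharp
using System;
using System.Data;

public interface ISqlConnectionFactory
{
    IDbConnection GetOpenConnection();
}

public class SqlConnectionFactory : ISqlConnectionFactory, IDisposable
{
    private readonly Func<IDbConnection> _createConnection;
    private IDbConnection _connection;

    public SqlConnectionFactory(Func<IDbConnection> createConnection)
    {
        _createConnection = createConnection;
    }

    // Opens the connection on first use and reuses it afterwards.
    public IDbConnection GetOpenConnection()
    {
        if (_connection == null || _connection.State != ConnectionState.Open)
        {
            _connection = _createConnection();
            _connection.Open();
        }

        return _connection;
    }

    // Called by the container when the request scope ends.
    public void Dispose()
    {
        if (_connection != null && _connection.State == ConnectionState.Open)
        {
            _connection.Dispose();
        }
    }
}
```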

The second thing is to prepare and execute raw SQL against the database. I try not to refer to tables directly and refer to database views instead. This is a nice way to create an abstraction and decouple our application from the database schema, because I want to hide database internals as much as possible.

For SQL execution I use the Dapper micro ORM library because it is almost as fast as native ADO.NET and does not have a boilerplate-heavy API. In short, it does what it has to do and it does it very well.
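
A sketch of a query object and its handler using Dapper. It assumes the Dapper NuGet package is referenced. The query and DTO classes, the view name orders.v_Orders and its columns are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Threading.Tasks;
using Dapper;

// Read Model DTO: a flat shape tailored to the client's needs.
public class OrderDto
{
    public Guid Id { get; set; }
    public decimal Value { get; set; }
}

public class GetCustomerOrdersQuery
{
    public GetCustomerOrdersQuery(Guid customerId)
    {
        CustomerId = customerId;
    }

    public Guid CustomerId { get; }
}

public class GetCustomerOrdersQueryHandler
{
    // The SQL targets a view (hypothetical name), not a table.
    public const string Sql =
        "SELECT [Order].[Id], [Order].[Value] " +
        "FROM orders.v_Orders AS [Order] " +
        "WHERE [Order].[CustomerId] = @CustomerId";

    private readonly IDbConnection _connection;

    public GetCustomerOrdersQueryHandler(IDbConnection connection)
    {
        _connection = connection;
    }

    public async Task<List<OrderDto>> Handle(GetCustomerOrdersQuery query)
    {
        // Dapper binds @CustomerId from the anonymous object and maps
        // the result columns to OrderDto properties by name.
        var orders = await _connection.QueryAsync<OrderDto>(Sql, new { query.CustomerId });
        return orders.AsList();
    }
}
```

There is no domain model, repository or change tracking on this path: arguments go in, rows come out.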

Write model

The diagram below presents the flow for a write request:

Write request processing starts similarly to a read, but we create a Command object instead of a Query object:

Then the CommandHandler is invoked:

The command handler looks different from the query handler. Here we use a higher level of abstraction, following the DDD approach with Aggregates and Entities. We need it because the problems to solve here are often more complex than ordinary reads. The command handler hydrates the aggregate, invokes an aggregate method and saves the changes to the database.

The Customer aggregate can be defined as follows:
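
A simplified sketch of an aggregate together with a handler that follows the hydrate, invoke, save sequence. The invariant shown, the ICustomerRepository interface and the shape of Handle are assumptions for illustration only. In the real application the repository would be backed by an ORM with change tracking.

```csharp
using System;
using System.Collections.Generic;

public class Order
{
    public Order(Guid id, decimal value)
    {
        Id = id;
        Value = value;
    }

    public Guid Id { get; }

    public decimal Value { get; }
}

// Aggregate root: the only entry point for changing customer orders.
public class Customer
{
    private readonly List<Order> _orders = new List<Order>();

    public Customer(Guid id)
    {
        Id = id;
    }

    public Guid Id { get; }

    public IReadOnlyCollection<Order> Orders => _orders.AsReadOnly();

    // The aggregate guards its own invariants (a made-up one here).
    public void AddOrder(Order order)
    {
        if (order.Value <= 0)
            throw new InvalidOperationException("Order value must be positive.");

        _orders.Add(order);
    }
}

public interface ICustomerRepository
{
    Customer GetById(Guid id);

    void Save(Customer customer);
}

public class AddCustomerOrderCommandHandler
{
    private readonly ICustomerRepository _customers;

    public AddCustomerOrderCommandHandler(ICustomerRepository customers)
    {
        _customers = customers;
    }

    public void Handle(Guid customerId, decimal orderValue)
    {
        var customer = _customers.GetById(customerId);            // hydrate the aggregate
        customer.AddOrder(new Order(Guid.NewGuid(), orderValue)); // invoke its method
        _customers.Save(customer);                                // save the changes
    }
}
```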

Architecture

The solution structure is based on the well-known Onion Architecture:

Only 3 projects are defined:
– API project with API endpoints and application logic (command and query handlers) using Feature Folders approach.
– Domain project with Domain Model
– Infrastructure project – integration with database.

Summary

In this post I tried to present the simplest way to implement the CQRS pattern, using raw SQL scripts for Read Model processing and the DDD approach for the Write Model implementation. This way we can achieve much better separation of concerns without losing speed of development. The cost of introducing this solution is very low and it pays off very quickly.

I didn’t describe the DDD implementation in detail, so I encourage you once again to check the repository of the example application. You can use it as a starter kit for your app, just as I do for my applications.
