REST API Data Validation

Introduction

This time I would like to describe how we can protect our REST API applications from requests containing invalid data (data validation process). However, validation of our requests is not enough, unfortunately. In addition to validation, it is our responsibility to return the relevant messages and statuses to our API clients. I wanted to deal with these two things in this post.

Data Validation

Definition of Data Validation

What is data validation really? The best definition I found is from UNECE Data Editing Group:

An activity aimed at verifying whether the value of a data item comes from the given (finite or infinite) set of acceptable values.

According to this definition we should verify data items which are coming to our application from external sources and check if theirs values are acceptable. How do we know that the value is acceptable? We need to define data validation rules for every type of data item which is processing in our system.

Data vs Business Rules validation

I would like to emphasize that data validation is totally different concept than validation of business rules. Data validation is focused on verifying an atomic data item. Business rules validation is a more broad concept and more close to how business works and behaves. So it is mainly focused on behavior. Of course validating behavior depends on data too, but in a more wide range.

Examples of data validation:

– Product order quantity cannot be negative or zero
– Product order quantity should be a number
– Currency of order should be a value from currencies list

Examples of business rules validation

– Product can be ordered only when Customer age is equal or greater than product minimal age.
– Customer can place only two orders in one day.

Returning relevant information

If we acknowledge that the rules have been broken during validation, we must stop processing and return the equivalent message to the client. We should follow the following rules:

– we should return message to the client as fast as possible (Fail-fast principle)
– the reason for the validation error should be well explained and understood for the client
– we should not return technical aspects for security reasons

Problem Details for HTTP APIs standard

The issue of returned error messages is so common that a special standard was created describing how to handle such situations. It is called “Problem Details for HTTP APIs standard” and his official description can be found here. This is abstract of this standard:

This document defines a “problem detail” as a way to carry machine-readable details of errors in a HTTP response to avoid the need to define new error response formats for HTTP APIs.

Problem Details standard introduces Problem Details JSON object, which should be part of the response when validation error occurs. This is simple canonical model with 5 members:

– problem type
– title
– HTTP status code
– details of error
– instance (pointer to specific occurrence)

Of course we can (and sometimes we should) extend this object by adding new properties, but the base should be the same. Thanks to this our API is easier to understand, learn and use. For more detailed information about standard I invite you to read documentation which is well described.

Data validation localization

For the standard application we can put data validation logic in three places:

  • GUI – it is entry point for users input. Data is validated on the client side, for example using Javascript for web applications
  • Application logic/services layer – data is validated in specific application service or command handler on the server side
  • Database – this is exit point of request processing and last moment to validate the data
Data validation localization
Data validation localization

In this article I am omitting GUI and Database components and I am focusing on the server side of the application. Let’s see how we can implement data validation on Application Services layer.

Implementing Data Validation

Suppose we have a command AddCustomerOrderCommand:

Suppose we want to validate 4 things:

1. CustomerId is not empty GUID.
2. Products list is not empty
3. Each product quantity is greater than 0
4. Each product currency is equal to USD or EUR

Let me show 3 solutions to this problem – from simple to the most sophisticated.

1. Simple validation on Application Service

The first thing that can come to mind is a simple validation in the Command Handler itself. In this solution we need to implement private method which validates our command and throws exception if validation error occurs. Closing this kind of logic in separate method is better from the Clean Code perspective (see Extract Method too).

The result of invalid command execution:

This is not so bad approach but has two disadvantages. Firstly, it involves from us writing a lot of easy and boilerplate code – comparing to nulls, defaults, values from list etc. Secondly, we are losing here part of separation of concerns because we are mixing validation logic with orchestrating our use case flow. Let’s take care of boilerplate code first.

2. Validation using FluentValidation library

We don’t want to reinvent the wheel so the best solution is to use library. Fortunately, there is a great library for validation in .NET world – Fluent Validation. It has nice API and a lot of features. This is how we can use it to validate our command:

Now, the Validate method looks like:

The result of validation is the same as earlier, but now our validation logic is more cleaner. The last thing to do is decouple this logic from Command Handler completely…

3. Validation using Pipeline Pattern

To decouple validation logic and execute it before Command Handler execution we arrange our command handling process in Pipeline (see NServiceBus Pipeline also).

For the Pipeline implementation we can use easily MediatR Behaviors. First thing to do is behavior implementation:

Next thing to do is to register behavior in IoC container (Autofac example):

This way we achieved separation of concerns and Fail-fast principle implementation in nice and elegant way.

But this is not the end. Finally, we need to do something with returned messages to clients.

Implementing Problem Details standard

Just as in the case of validation logic implementation, we will use a dedicated library – ProblemDetails. The principle of the mechanism is simple. Firstly, we need to create custom exception:

Secondly, we have to create own Problem Details class:

Last thing to do is to add Problem Details Middleware with definition of mapping between InvalidCommandException and InvalidCommandProblemDetails class in startup:

After change in CommandValidationBehavior (throwing InvalidCommandExecption instead Exception) we have returned content compatible with the standard:

Problem details

Summary

In this post I described:
– what Data validation is and where is located
– what Problem Details for HTTP APIs is and how could be implemented
– 3 methods to implement data validation in Application Services layer: without any patterns and tools, with FluentValidation library, and lastly – using Pipeline Pattern and MediatR Behaviors.

Source code

If you would like to see full, working example – check my GitHub repository

Related posts

Domain Model Encapsulation and PI with Entity Framework 2.2
Simple CQRS implementation with raw SQL and DDD
How to publish and handle Domain Events
10 common broken rules of clean code

Domain Model Encapsulation and PI with Entity Framework 2.2

Introduction

In previous post I presented how to implement simple CQRS pattern using raw SQL (Read Model) and Domain Driven Design (Write Model). I would like to continue presented example focusing mainly on DDD implementation. In this post I will describe how to get most out of the newest version Entity Framework v 2.2 to support pure domain modeling as much as possible.

I decided that I will constantly develop my sample on GitHub. I will try to gradually add new functionalities and technical solutions. I will also try to extend domain so that the application will become similar to the real ones. It is difficult to explain some DDD aspects on trivial domains. Nevertheless, I highly encourage you to follow my codebase.

Goals

When we create our Domain Model we have to take many things into account. At this point I would like to focus on 2 of them: Encapsulation and Persistence Ignorance.

Encapsulation

Encapsulation has two major definitions (source – Wikipedia):

A language mechanism for restricting direct access to some of the object’s components

and

A language construct that facilitates the bundling of data with the methods (or other functions) operating on that data

What does it mean to DDD Aggregates? It just simply mean that we should hide all internals of our Aggregate from the outside world. Ideally, we should expose only public methods which are required to fulfill our business requirements. This assumption is presented below:

Persistence Ignorance

Persistence Ignorance (PI) principle says that the Domain Model should be ignorant of how its data is saved or retrieved. It is very good and important advice to follow. However, we should follow it with caution. I agree with opinion presented in the Microsoft documentation:

Even when it is important to follow the Persistence Ignorance principle for your Domain model, you should not ignore persistence concerns. It is still very important to understand the physical data model and how it maps to your entity object model. Otherwise you can create impossible designs.

As described, we can’t forget about persistence, unfortunately. Nevertheless, we should aim at decoupling Domain Model from rest parts of our system as much as possible.

Example Domain

For a better understanding of the created Domain Model I prepared the following diagram:

It is simple e-commerce domain. Customer can place one or more Orders. Order is a set of Products with information of quantity ( OrderProduct). Each Product has defined many prices ( ProductPrice) depending on the Currency.

Ok, we know the problem, now we can go to the solution…

Solution

1. Create supporting architecture

First and most important thing to do is create application architecture which supports both Encapsulation and Persistence Ignorance of our Domain Model. The most common examples are:
Clean Architecture
Onion Architecture
Ports And Adapters / Hexagonal Architecture

All of these architectures are good and and used in production systems. For me Clean Architecture and Onion Architecture are almost the same. Ports And Adapters / Hexagonal Architecture is a little bit different when it comes to naming, but general principles are the same. The most important thing in context of domain modeling is that each architecture Business Logic/Business Layer/Entities/Domain Layer 1) is in the center and 2) has no dependency to other components/layers/modules. It is the same in my example:

What this means in practice for our code in Domain Model?
1. No data access code.
2. No data annotations for our entities.
3. No inheritance from any framework classes, entities should be Plain Old CLR Object

2. Use Entity Framework in Infrastructure Layer only

Any interaction with database should be implemented in Infrastructure Layer. It means you have to add there entity framework context, entity mappings and implementation of repositories. Only interfaces of repositories can be kept in Domain Model.

3. Use Shadow Properties

Shadow Properties are great way to decouple our entities from database schema. They are properties which are defined only in Entity Framework Model. Using them we often don’t need to include foreign keys in our Domain Model and it is great thing.

Let’s see the Order Entity and its mapping which is defined in CustomerEntityTypeConfiguration mapping:

As you can see on line 15 we are defining property which doesn’t exist in Order entity. It is defined only for relationship configuration between Customer and Order. The same is for Order and ProductOrder relationship (see lines 23, 24).

4. Use Owned Entity Types

Using Owned Entity Types we can create better encapsulation because we can map directly to private or internal fields:

Owned types are great solution for creating our Value Objects too. This is how MoneyValue looks like:

5. Map to private fields

We can map to private fields not only using EF owned types, we can map to built-in types too. All we have to do is give the name of the field and column:

6. Use Value Conversions

Value Conversions are the “bridge” between entity attributes and table column values. If we have incompatibility between types, we should use them. Entity Framework has a lot of value converters implemented out of the box. Additionally, we can implement custom converter if we need to.

This converter simply converts “StatusId” column byte type to private field _status of type OrderStatus.

Summary

In this post I described shortly what Encapsulation and Persistence Ignorance is (in context of domain modeling) and how we can achieve these approaches by:
– creating supporting architecture
– putting all data access code outside our domain model implementation
– using Entity Framework Core features: Shadow Properties, Owned Entity Types, private fields mapping, Value Conversions

Related posts

Simple CQRS implementation with raw SQL and DDD
How to publish and handle Domain Events
REST API Data Validation

Simple CQRS implementation with raw SQL and DDD

Introduction

I often come across questions about the implementation of the CQRS pattern. Even more often I see discussions about access to the database in the context of what is better – ORM or plain SQL.

In this post I wanted to show you how you can quickly implement simple REST API application with CQRS using the .NET Core. I immediately point out that this is the CQRS in the simplest edition – the update through the Write Model immediately updates the Read Model, therefore we do not have here the eventual consistency. However, many applications do not need eventual consistency, while the logical division of writing and reading using two separate models is recommended and more effective in most solutions.

Especially for this article I prepared sample, fully working application, see full source on Github.

My goals

These are my goals that I wanted to achieve by creating this solution:
1. Clear separation and isolation of Write Model and Read Model.
2. Retrieving data using Read Model should be as fast as possible.
3. Write Model should be implemented with DDD approach. The level of DDD implementation should depend on level of domain complexity.
4. Application logic should be decoupled from GUI.
5. Selected libraries should be mature, well-known and supported.

Design

High level flow between components looks like:

As you can see the process for reads is pretty straightforward because we should query data as fast as possible. We don’t need here more layers of abstractions and sophisticated approaches. Get arguments from query object, execute raw SQL against database and return data – that’s all.

It is different in the case of write support. Writing often requires more advanced techniques because we need execute some logic, do some calculations or simply check some conditions (especially invariants). With ORM tool with change tracking and using Repository Pattern we can do it leaving our Domain Model intact (ok, almost).

Solution

Read model

Diagram below presents flow between components used to fulfill read request operation:

The GUI is responsible for creating Query object:

Then, query handler process query:

The first thing is to get open database connection and it is achieved using SqlConnectionFactory class. This class is resolved by IoC Container with HTTP request lifetime scope so we are sure, that we use only one database connection during request processing.

Second thing is to prepare and execute raw SQL against database. I try not to refer to tables directly and instead refer to database views. This is a nice way to create abstraction and decouple our application from database schema because I want to hide database internals as much as possible.

For SQL execution I use micro ORM Dapper library because is almost as fast as native ADO.NET and does not have boilerplate API. In short, it does what it has to do and it does it very well.

Write model

Diagram below presents flow for write request operation:

Write request processing starts similar to read but we create the Command object instead of the query object:

Then, CommandHandler is invoked:

Command handler looks different than query handler. Here, we use higher level of abstraction using DDD approach with Aggregates and Entities. We need it because in this case problems to solve are often more complex than usual reads. Command handler hydrates aggregate, invokes aggregate method and saves changes to database.

Customer aggregate can be defined as follows:

Architecture

Solution structure is designed based on well-known Onion Architecture as follows:

Only 3 projects are defined:
– API project with API endpoints and application logic (command and query handlers) using Feature Folders approach.
– Domain project with Domain Model
– Infrastructure project – integration with database.

Summary

In this post I tried to present the simplest way to implement CQRS pattern using raw sql scripts as Read Model side processing and DDD approach as Write Model side implementation. Doing so we are able to achieve much more separation of concerns without losing the speed of development. Cost of introducing this solution is very low and and it returns very quickly.

I didn’t describe DDD implementation in detail so I encourage you once again to check the repository of the example application – can be used as a kit starter for your app the same as for my applications.

Related posts

Domain Model Encapsulation and PI with Entity Framework 2.2
How to publish and handle Domain Events
REST API Data Validation