Modular Monolith: Integration Styles

This post is part of a series of articles about the Modular Monolith architecture:

1. Modular Monolith: A Primer
2. Modular Monolith: Architectural Drivers
3. Modular Monolith: Architecture Enforcement
4. Modular Monolith: Integration Styles (this)

Introduction

No module or application in a larger system works in 100% isolation. In order to deliver business value, individual elements must somehow integrate with each other. Let me quote the book “Thinking in Systems: A Primer”, in which Donella H. Meadows defines the concept of a system in general:

A system is an interconnected set of elements that is coherently organized in a way that achieves something. If you look at that definition closely for a minute, you can see that a system must consist of three kinds of things: elements, interconnections, and a function or purpose.

The concept of systems integration is defined as follows (Wikipedia):

process of linking together different computing systems and software applications physically or functionally, to act as a coordinated whole.

As you can see from the definitions above, in order to provide a system that fulfills its purpose, we must integrate elements to form a whole. In previous articles in this series, we discussed the attributes of these elements which are, in our terminology, called modules.

In this post, I would just like to discuss the missing part – Integration Styles for modules in Modular Monolith architecture.

Enterprise Integration Patterns book

The title of this post is not accidental. It sounds exactly like Chapter 2 of Gregor Hohpe and Bobby Woolf’s great book Enterprise Integration Patterns. This book is considered the bible of systems integration and messaging. This article takes some knowledge from that chapter and relates it to monolithic, modular architecture.

In any case, if you are interested in the topic of integration, I invite you to read the book or the materials available online at https://www.enterpriseintegrationpatterns.com/.

Integration Styles

Integration Criteria

Like everything in nature, each Integration Style has its pros and cons. Therefore, we must define the criteria on the basis of which we will compare all styles. Then, based on those criteria, we will decide on the method of integration.

We can distinguish the following criteria: Coupling, Complexity, Data Timeliness.

1. Coupling

Coupling is a measure of the degree to which two modules are dependent on each other (Wikipedia):

coupling is the degree of interdependence between software modules; a measure of how closely connected two routines or modules are; the strength of the relationships between modules.

If you’ve read the previous posts in the series, you already know that one of the most important attributes of modular design is independence. Therefore, it will be easy to guess that coupling is one of the more important criteria in terms of integration style.

2. Complexity

The second criterion for evaluating an Integration Style is its level of complexity. Some integration methods are simple – they require little work and are easy to understand and use. Others are more complicated and require more commitment, knowledge, and discipline.

3. Data Timeliness

The last criterion is the length of time between the moment one module decides to share some data and the moment other modules have that data. In other words: how soon after a state change in a given module will the rest of the modules concerned take this change into account? Of course, the shorter this time, the better.

Now that we know the most important criteria, let’s move on to the ways of integrating our modules. Let’s discuss four Integration Styles: File Transfer, Shared Database Data, Direct Call and Messaging.

File Transfer

The first option is to integrate our modules using a regular file. Such a file must be exported from the source module and imported into the target module. This can happen in three ways:
– manual, where the user imports/exports the file by hand
– automatic, where files are imported and exported automatically by the systems
– hybrid, where the transfer is automatic on one side and manual on the other

Figure: Integration Styles – File Transfer

One of the main tasks in this type of integration is to determine the format of the file. Importantly, this format is the only dependency between two modules integrating this way. You can visualize the file as a really huge message carried over the filesystem. For this reason, it can be assumed that the coupling is very low in this case.

As for the level of complexity of this approach, it can be evaluated as average. On the one hand, generating a file in a specific format is not difficult these days. On the other hand, uploading it to a shared resource, managing files, handling duplicates, and so on is more complicated and time-consuming.

From the timeliness point of view, module integration via files is slow (not to mention manual export/import). Most often it is performed in large batches at fixed time intervals, often at night. For this reason, the delay can be a day, a week or more.

To be honest, I have seen file-based integration many, many times between systems, and probably never inside a monolith – which is rather understandable. I have described this Integration Style for the sake of completeness. The most popular integration method for monoliths is Shared Database Data.

Shared Database Data

In the EIP book this method of integration is called Shared Database, but I believe that it is not quite the right name. Sharing the database does not always have to mean sharing data, because modules can store their data in separate tables (most often separated by database schemas). Therefore, in my opinion, Shared Database Data is a better term.

Figure: Integration Styles – Shared Database Data

In Shared Database Data, modules share a certain set of data in the database. In this way, the data is always integrated and consistent because, generally speaking, it is the same data. If module A writes data to table X, module B can read it immediately after the database transaction is completed.

The level of complexity of such a solution is very low. Nowadays every application/module needs a database, so there is no need to add anything extra with this approach.

The solution looks perfect at first glance. However, its biggest disadvantage is very high coupling. By sharing data, modules share their state, which couples them together. High coupling means no autonomy for the module. In addition, one little change to the database structure, or even to the data itself, can break another module without notice. This implies that every change to the database must be consulted and coordinated. In this way, the database becomes the bottleneck of changes and the whole solution is no longer evolutionary.

The shared state has another significant disadvantage – it is very hard, or even impossible, to create one unified data model that meets the requirements of all modules. The attempt to unify most often ends with a very weak, ambiguous model that is difficult to understand, develop and maintain.

To reduce coupling while still maintaining the same level of data timeliness we can use Direct Call.

Direct Call

The third option is to directly call a method of the module we’re integrating with. In this case, we use the encapsulation mechanism: the module exposes only what is needed, and the whole behavior is hidden behind a method. In this way, the state of our module is not exposed to the outside as it is in the Shared Database Data approach. Thanks to this, the caller is not able to break anything from the outside.

Figure: Integration Styles – Direct Call

Not sharing data implies that each module has its own data set. It can be the same database broken down by schemas, or each module can even have a separate database created in a different technology. In the scenario with one database, it is important to keep the data truly isolated. This means no constraints between tables from separate modules and no transactions spanning them.

Both the caller and the callee should treat each other as external. Both modules will use a different language and have different concepts modeled. Therefore, the Anti-Corruption Layer (ACL) should be applied. On the caller side it could just be a gateway, on the callee side a facade. Thanks to this, module encapsulation is preserved.
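
To illustrate, a minimal sketch could look like this – the module names (Payments, Orders) and interfaces are purely illustrative: the callee exposes a facade, and the caller wraps it in its own gateway that translates the callee’s concepts into its own language.

using System;

// Contract exposed by the Payments module (callee) – a facade, illustrative names.
public interface IPaymentsModuleFacade
{
    PaymentStatusDto GetPaymentStatus(Guid orderId);
}

public class PaymentStatusDto
{
    public Guid OrderId { get; set; }
    public string Status { get; set; }
}

// Anti-Corruption Layer on the caller (Orders module) side – a gateway
// that translates the external contract into the caller's own concepts.
public interface IPaymentStatusGateway
{
    bool IsOrderPaid(Guid orderId);
}

internal class PaymentStatusGateway : IPaymentStatusGateway
{
    private readonly IPaymentsModuleFacade _payments;

    public PaymentStatusGateway(IPaymentsModuleFacade payments)
    {
        _payments = payments;
    }

    public bool IsOrderPaid(Guid orderId)
    {
        // Translation of the external concept ("Status") into the caller's concept ("IsPaid").
        return _payments.GetPaymentStatus(orderId).Status == "Paid";
    }
}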

In a distributed system, the Direct Call is known as Remote Procedure Invocation/Call (RPI/RPC). Unfortunately, this technique is very often used in Microservices architecture and can lead to the so-called Distributed Monolith anti-pattern. As the call is always synchronous, we are dealing with temporal coupling – both caller and callee must be available at the same time. In the case of a monolith it is not a problem, because this is its nature; in the case of microservices it is much worse – it reduces architecture quality attributes like autonomous development and deployment. Read the article in this series about architectural drivers for more details.

The Direct Call Integration Style seems to be a very good choice for integrating our modules, but it has some drawbacks too. First, the call is synchronous, so the caller has to wait for the result. Second, the calling module needs to know about the module it is calling – it has to have a direct dependency. Moreover, it has to know the intent of what it wants to do. Coupling is lower than in Shared Database Data, but it still exists. If we want to avoid these drawbacks, we can use the last Integration Style: Messaging.

Messaging

The File Transfer Integration Style has a great advantage – it does not create dependencies between modules. However, it has a big drawback – the data timeliness is in most cases unacceptable. Messaging does not have this disadvantage. The data timeliness is not as good as in the case of Direct Call, because the communication is asynchronous, but it can be safely said that it is very good and acceptable in most cases.

Figure: Integration Styles – Messaging (in memory)
Figure: Integration Styles – Messaging (separate process)

Appropriate use of Messaging to implement an Event-Driven Architecture results in no direct dependencies between modules. Modules integrate through events. However, these are not Domain Events, because a Domain Event is local and should be encapsulated within a given Bounded Context. Integration Events contain only as much as needed, to avoid so-called Fat Events. Integration Events should be as small as possible, as they are part of the contract exposed by the given module. As you know, the smaller the contract, the more stable it is, and the less frequently other modules have to change.
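
For illustration only, a minimal Integration Event and an event bus abstraction might look like this (the names and the IEventBus interface are assumptions, not a prescribed API):

using System;

// A small, stable contract published by the Orders module – only the data other modules need.
public class OrderPlacedIntegrationEvent
{
    public OrderPlacedIntegrationEvent(Guid orderId, DateTime occurredOn)
    {
        OrderId = orderId;
        OccurredOn = occurredOn;
    }

    public Guid OrderId { get; }
    public DateTime OccurredOn { get; }
}

// Abstraction over the event bus – it can be implemented in memory or with an external broker.
public interface IEventBus
{
    void Publish<TEvent>(TEvent @event) where TEvent : class;
}

// Somewhere in the Orders module, after the order is placed:
// _eventBus.Publish(new OrderPlacedIntegrationEvent(order.Id, DateTime.UtcNow));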

In addition, asynchronous processing, although it leads to Eventual Consistency, brings performance advantages, scales better and is more reliable.

What are the disadvantages of Messaging? First, due to the nature of asynchronicity, the state of our entire system can be eventually consistent as described above. That is why it is so important to have the boundaries of the modules well defined. This is almost as important as in Microservices architecture, but the advantage of the Modular Monolith architecture is that it is much easier to change these boundaries.

The second disadvantage of Messaging is that it is more complex. To provide asynchronous processing and an Event-Driven Architecture, we’re going to need some sort of Event Bus. It can be an in-memory broker or a separate component (e.g. RabbitMQ). In addition, we will also need job-processing mechanisms for internal processing – outbox, inbox, internal command messages. Some of this infrastructure code has to be written. Fortunately, this is a generic problem – we do it only once, and there are a lot of libraries and frameworks which support it.
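
As a rough illustration of that infrastructure, the Outbox part usually boils down to an entity like the one below, saved in the same transaction as the module’s data and published later by a background job (the exact shape is an assumption):

using System;

// Row stored in the module's own schema, written in the same transaction as the business change.
public class OutboxMessage
{
    public Guid Id { get; set; }
    public DateTime OccurredOn { get; set; }
    public string Type { get; set; }             // e.g. "OrderPlacedIntegrationEvent"
    public string Data { get; set; }             // serialized event payload (JSON)
    public DateTime? ProcessedDate { get; set; } // set by the background job after publishing
}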

Comparison

Below I present a comparison of the Integration Styles, taking into account the 3 criteria – Coupling, Data Timeliness and Complexity.

Figure: Comparison – Coupling vs Complexity
Figure: Comparison – Coupling vs Data Timeliness

What can we deduce from these diagrams? First of all, there is no one perfect style. Fortunately, we can see some heuristics:

1. If the system is very, very simple (essential complexity is low) and you do not care about modularity: choose simplicity and use the Shared Database Data style.
2. If the system is complex (essential complexity is high), you must care about modularity. The options are:
– choose Messaging if you prefer the highest level of autonomy and eventual consistency between modules is acceptable
– choose Messaging if you care a lot about performance, reliability and scalability
– choose Direct Call if you must have strong consistency, don’t need the maximum level of module autonomy, or Data Timeliness is the primary factor.

Additionally, you can mix different styles. Most often, when we are talking about modular architecture, it is best to use Direct Call and Messaging together. Some modules can communicate synchronously and some asynchronously, depending on the need.

Summary

As I mentioned at the beginning, no module, component, or system lives in complete isolation. We must follow the “Divide and Conquer” principle so we divide our solution into smaller parts, but finally – we have to integrate these parts together to create a system.

Let’s summarize all 4 styles again:

File Transfer – provides low coupling, but its data timeliness is almost always unacceptable, so it is impractical in a monolith
Shared Database Data – the simplest and quickest, but couples modules together
Direct Call – provides lower coupling than Shared Database Data, encapsulates modules, relatively simple
Messaging – ensures the lowest coupling, modularity and autonomy, but at the cost of complexity

Which Integration Style should you choose? As usual, “it depends”. However, I hope that I managed, at least to some extent, to explain what it depends on. No silver bullet, again.

Related Posts

1. Modular Monolith: A Primer
2. Modular Monolith: Architectural Drivers
3. Modular Monolith: Architecture Enforcement
4. How to publish and handle Domain Events
5. Handling Domain Events: Missing Part
6. Processing multiple aggregates – transactional vs eventual consistency

Handling concurrency – Aggregate Pattern and EF Core

Introduction

In this post I would like to discuss a frequently overlooked, though in some circumstances very important, topic – concurrency handling in the context of protecting so-called Domain Invariants.

In other words, the question is the following: how do we ensure that, even in a multi-threaded environment, we are always able to guarantee immediate consistency for our business rules?

It’s best to describe this problem with an example…

Problem

Let’s assume that our domain has the concepts of an Order and an Order Line. A domain expert said that:

One Order can have a maximum of 5 Order Lines.

In addition, he said that this rule must be met at all times (without exception). We can model it in the following way:

Figure: Conceptual Model

The Aggregate

One of the main goals of our system is to enforce business rules. One way to do this is to use a Domain-Driven Design tactical building block – the Aggregate. The Aggregate is a concept created to enforce business rules (invariants). Its implementation may vary depending on the paradigm we use, but in object-oriented programming it is an object graph, as Martin Fowler describes it:

A DDD aggregate is a cluster of domain objects that can be treated as a single unit.

And Eric Evans describes it in the DDD Reference:

Use the same aggregate boundaries to govern transactions and distribution. Within an aggregate boundary, apply consistency rules synchronously

Back to the example. How can we ensure that our Order will never exceed 5 Order Lines? We must have an object that will take care of it. To do this, it must have information about the current number of Order Lines. So it must have state, and based on this state its responsibility is to decide whether the invariant is broken or not.

In this case, the Order seems to be the perfect object for this. It will become the root of our Aggregate, which will contain the Order Lines. It will be its responsibility to add an Order Line and check that the invariant has not been broken.

Figure: Order Aggregate

Let’s see what the implementation of such a construct may look like:
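
A minimal sketch (simplified for this article, with illustrative names) could be:

using System;
using System.Collections.Generic;

public class Order
{
    private const int MaxOrderLinesCount = 5;

    private readonly List<OrderLine> _orderLines = new List<OrderLine>();

    public Guid Id { get; private set; }

    public void AddOrderLine(Guid productId, int quantity)
    {
        // Business rule: one Order can have a maximum of 5 Order Lines.
        if (_orderLines.Count >= MaxOrderLinesCount)
        {
            throw new BusinessRuleValidationException(
                "Order cannot have more than 5 order lines.");
        }

        _orderLines.Add(new OrderLine(productId, quantity));
    }
}

public class OrderLine
{
    public OrderLine(Guid productId, int quantity)
    {
        ProductId = productId;
        Quantity = quantity;
    }

    public Guid ProductId { get; private set; }
    public int Quantity { get; private set; }
}

public class BusinessRuleValidationException : Exception
{
    public BusinessRuleValidationException(string message) : base(message) { }
}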

Everything is fine right now, but this is only a static view of our model. Let’s see what the typical system flow is when the invariant is not broken and when it is broken:

Figure: Process of adding an Order Line – success
Figure: Process of adding an Order Line – rule broken

A simplified implementation is shown below:
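
A sketch of the application-layer part of this flow, assuming a repository abstraction with illustrative names:

using System;
using System.Threading.Tasks;

// Application service handling the "add order line" use case.
public class AddOrderLineHandler
{
    private readonly IOrderRepository _orderRepository;

    public AddOrderLineHandler(IOrderRepository orderRepository)
    {
        _orderRepository = orderRepository;
    }

    public async Task Handle(Guid orderId, Guid productId, int quantity)
    {
        // 1. Load the whole aggregate.
        Order order = await _orderRepository.GetByIdAsync(orderId);

        // 2. Execute the domain behavior – the invariant is checked inside the aggregate.
        order.AddOrderLine(productId, quantity);

        // 3. Persist the aggregate (commits the database transaction).
        await _orderRepository.SaveChangesAsync();
    }
}

public interface IOrderRepository
{
    Task<Order> GetByIdAsync(Guid orderId);
    Task SaveChangesAsync();
}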

Concurrency problem

Everything works nice and clean as long as we operate in a lightly loaded environment. However, let’s see what can happen when two threads want to perform our operation at almost the same time:

Figure: Process of adding an Order Line – business rule broken

As you can see in the diagram above, two threads load exactly the same aggregate at the same time. Let’s assume that the Order has 4 Order Lines. The aggregate with 4 Order Lines will be loaded in both the first and the second thread. The exception will not be thrown, because 4 < 5. Finally, depending on how we persist the aggregate, the following scenarios are possible:

a) If we have a relational database and a separate table for Order Lines then 2 Order Lines will be added giving a total of 6 Order Lines – the business rule is broken.

b) If we store the aggregate in one atomic place (for example in a document database as a JSON object), the second thread will overwrite the first operation and we (and the User) won’t even know about it.

The reason for this behavior is that the second thread read data from the database (point 2.1) before the first one committed it (point 1.4.3).

Let’s see how we can solve this problem.

Solution

Pessimistic concurrency

The first way to ensure that our business rule will not be broken is to use Pessimistic Concurrency. In this approach, we allow only one thread at a time to process a given Aggregate. The processing thread must therefore block other threads from reading by creating a lock. Only when the lock is released can the next thread get the object and process it.

Figure: Pessimistic concurrency

The main difference from the previous approach is that the second thread waits for the previous one to finish (the time between points 2.1 and 1.3). This approach costs us performance, because we can process transactions only one after another. Moreover, it can lead to deadlocks.

How can we implement this behavior using Entity Framework Core and SQL Server?

Unfortunately, EF Core does not support Pessimistic Concurrency. However, we can do it easily by ourselves using raw SQL and the query hint mechanism of the SQL Server engine.

First, a database transaction must be started. Then the lock must be taken. This can be done in two ways – either by reading the data with a query hint (XLOCK, PAGLOCK) or by updating the record at the beginning:
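
A sketch of the first option using EF Core raw SQL (the DbContext, entity names and exact hints are illustrative):

using System;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public void AddOrderLineWithLock(OrdersContext context, Guid orderId, Guid productId, int quantity)
{
    using (var transaction = context.Database.BeginTransaction())
    {
        // XLOCK takes an exclusive lock on the read data until the transaction ends,
        // so no other transaction can read the same Order in the meantime.
        var order = context.Orders
            .FromSqlRaw("SELECT * FROM dbo.Orders WITH (XLOCK, PAGLOCK) WHERE Id = {0}", orderId)
            .Single();

        order.AddOrderLine(productId, quantity);

        context.SaveChanges();
        transaction.Commit();
    }
}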

In this way, the transaction on the first thread will acquire an Exclusive Lock (on write) and, until it releases it (through the transaction commit), no other thread will be able to read this record. Of course, this assumes that our queries operate in at least the Read Committed transaction isolation level, to avoid so-called dirty reads.

Optimistic concurrency

An alternative, and most often preferred, solution is to use Optimistic Concurrency. In this case, the whole process takes place without locking the data. Instead, the data in the database is versioned, and during the update it is checked whether the version has changed in the meantime.

Figure: Optimistic Concurrency

What does the implementation of this solution look like? Most ORMs support Optimistic Concurrency out of the box. It is enough to indicate which fields should be checked when writing to the database, and they will be added to the WHERE clause. If it turns out that our statement has updated 0 records, it means that the version of the record has changed and we need to roll back. Often, instead of using the current fields to check the version, a special column called “Version” or “Timestamp” is added.

Back to our example: does adding a version column and incrementing it every time the Order entity is changed solve the problem? Well, no. The aggregate must be treated as a whole – as a boundary of transaction and consistency.

Therefore, the version increment must take place whenever we change anything in our aggregate. If we have added an Order Line and we keep Order Lines in a separate table, the ORM’s support for optimistic concurrency will not help us, because it works on updates, while here we also have inserts and deletes.

How can we know that the state of our Aggregate has changed? If you follow this blog, you already know the answer – by Domain Events. If a Domain Event was raised from the Aggregate, it means that the state has changed and we need to increase the Aggregate version.

The implementation can look like this.

First, we add a _version field to each Aggregate Root and a method to increment that version.
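
For example (a sketch – the base class name is illustrative, and Order would inherit from it):

// Base class for Aggregate Roots.
public abstract class AggregateRoot
{
    // Protected so that an ORM mapping can reach the backing field from derived types.
    protected int _version;

    public void IncreaseVersion()
    {
        _version++;
    }
}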

Secondly, we add a mapping for the version attribute and indicate that it is an EF Core concurrency token:
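
With EF Core’s Fluent API, such a mapping could look roughly like this (table and column names are illustrative; the _version field is mapped as a field-only property):

using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Metadata.Builders;

public class OrderEntityTypeConfiguration : IEntityTypeConfiguration<Order>
{
    public void Configure(EntityTypeBuilder<Order> builder)
    {
        builder.ToTable("Orders");
        builder.HasKey(o => o.Id);

        // Map the _version backing field to a column and mark it as a concurrency token,
        // so EF Core adds it to the WHERE clause of every UPDATE.
        builder.Property<int>("_version")
            .HasColumnName("Version")
            .IsConcurrencyToken();
    }
}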

The last thing to do is to increment the version if any Domain Events have been published:
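
This is typically done just before SaveChanges – a sketch, assuming the Aggregate Root base class exposes the collection of raised Domain Events (as in the posts about publishing Domain Events) and that the DbContext name is illustrative:

using System.Linq;

// Called just before DbContext.SaveChanges(), e.g. in a SaveChanges override or a pipeline behavior.
private void IncreaseAggregatesVersions(OrdersContext context)
{
    var changedAggregates = context.ChangeTracker
        .Entries<AggregateRoot>()
        .Where(entry => entry.Entity.DomainEvents != null && entry.Entity.DomainEvents.Any())
        .Select(entry => entry.Entity);

    foreach (var aggregate in changedAggregates)
    {
        aggregate.IncreaseVersion();
    }
}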

With this setup, all updates will execute this statement:
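
Conceptually (the exact SQL generated by EF Core depends on the mapping and provider), the statement has this shape:

UPDATE [Orders]
SET ..., [Version] = @newVersion
WHERE [Id] = @id AND [Version] = @originalVersion;
SELECT @@ROWCOUNT;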

In our example, no record will be updated for the second thread (@@ROWCOUNT = 0), so Entity Framework will throw the following exception:

Microsoft.EntityFrameworkCore.DbUpdateConcurrencyException: Database operation expected to affect 1 row(s) but actually affected 0 row(s). Data may have been modified or deleted since entities were loaded. See http://go.microsoft.com/fwlink/?LinkId=527962 for information on understanding and handling optimistic concurrency exceptions.

and our Aggregate will be consistent – the 6th Order Line will not be added. The business rule is not broken, mission accomplished.

Summary

In summary, the most important issues here are:

  • The Aggregate’s main task is to protect invariants (business rules, the boundary of immediate consistency)
  • In a multi-threaded environment, when multiple threads are running simultaneously on the same Aggregate, a business rule may be broken
  • A way to solve concurrency conflicts is to use Pessimistic or Optimistic concurrency techniques
  • Pessimistic Concurrency involves the use of a database transaction and a locking mechanism. In this way, requests are processed one after the other, so basically concurrency is lost and it can lead to deadlocks.
  • Optimistic Concurrency technique is based on versioning database records and checking whether the previously loaded version has not been changed by another thread.
  • Entity Framework Core supports Optimistic Concurrency. Pessimistic Concurrency is not supported
  • The Aggregate must always be treated and versioned as a single unit
  • Domain Events are an indicator that the state has changed, so the Aggregate version should be changed as well
GitHub sample repository

Specifically for this article, I created a repository that shows the implementation of 3 scenarios:

– without concurrency handling
– with Pessimistic Concurrency handling
– with Optimistic Concurrency handling

Link: https://github.com/kgrzybek/efcore-concurrency-handling

Related Posts

1. Handling Domain Events: Missing Part
2. Domain Model Encapsulation and PI with Entity Framework 2.2
3. Attributes of Clean Domain Model

Strangling .NET Framework App to .NET Core

Introduction

Every technology becomes obsolete after some time. It is no different with the .NET Framework – it can safely be said that after the appearance of the .NET Core platform, the old Framework is slowly disappearing. Few people write about it, it has not been a conference topic for a long time, and nobody starts a new project in this technology. .NET Core is everywhere… except in our beloved legacy systems!

Well, despite the fact that new solutions use .NET Core, a huge number of systems still run on the old .NET Framework. If you are not participating in a greenfield project but rather in maintenance, then you are very likely stuck with the old framework. Who likes to play with old toys? Nobody. Everyone would like to use the new technology, especially if it is better, more efficient and better designed.

Is the only solution to change the project, or even the employer, in order to use the new technology? Certainly not. We have two other options: the “Big Bang Rewrite” or the use of the “Strangler Pattern”.

Big Bang Rewrite

Big Bang Rewrite means rewriting the whole application to a new technology/system. This is usually the first thought that comes to our minds: after rewriting the entire system, turn off the old system and start the new one. It sounds easy, but it is not.

First of all, users do not use the new system until we have the whole system rewritten. This approach resembles the methodology of running a Waterfall project with all its drawbacks.

Secondly, it is an “all or nothing” approach. If during the project it turns out that we have run out of budget, time or resources, the End User is left with nothing – we have not delivered any business value. Users use the old system, which is still difficult to maintain. We can put our unfinished product on the shelf.

Figure: Big Bang Rewrite

Can we do it better? Yes, and here comes the so-called “Strangler Pattern”.

Strangler Pattern

The origin of this pattern comes from an article by Martin Fowler called Strangler Fig Application. The general point is that in tropical forests there are fig plants that grow on trees and slowly “strangle” them.

Nature aside, the metaphor is quite interesting, and Martin Fowler transfers it to the context of rewriting a system as follows:

gradually create a new system around the edges of the old, letting it grow slowly over several years until the old system is strangled

The key word here is “gradually”. So, unlike the “Big Bang Rewrite”, we rewrite our system piece by piece. The functionalities implemented in the new system immediately replace the old ones. This approach has a great advantage because we immediately provide value and get immediate feedback. It is definitely a more “agile” approach, with all its advantages.

Figure: Strangler Pattern

Motivation

Before proceeding to the implementation of this approach, I would like to present the motivations for rewriting a system to a new technology. Because, as with everything – “there is no such thing as a free lunch” – rewriting the system, even in an incremental approach, will still cost money, time and resources.

So below is a list of Architectural Drivers that may affect this decision:

  • Development will be easier, more productive
  • We can use new techniques, methods, patterns, approaches
  • System will be more maintainable
  • New technology gives us more tools, libraries, frameworks
  • New technology is more efficient and we want to increase performance of our solution
  • New technology has more support from vendor, community
  • Support for the old technology has ended, no patches or upgrades are available
  • Old technology can have security vulnerabilities
  • Developers will be happy
  • It will be easier to hire new developers
  • We want to be innovative, increase Employer branding
  • We must meet some standards and the old technology doesn’t make it possible

As you can see, the list is long, and you could probably add other things. So if you think it’s worth it, here is one way to do it in the Microsoft ecosystem – rewriting a .NET Framework application to .NET Core using the “Strangler Pattern”.

Design

The main element of this pattern is the Proxy / Facade, which redirects requests to either the legacy system or the new one. The most common example is the use of a load balancer that knows where to direct the request.

If we have to rewrite the API, e.g. Web API, the matter seems to be simple. We rewrite each Web API endpoint and implement the appropriate routing. There is nothing to describe here.

Figure: Rewriting using a load balancer (Frontend and Backend are separate)

However, in .NET many systems are written with the Backend and Frontend combined. I am thinking of all applications like WebForms, ASP.NET MVC with server-side rendering, WPF, Silverlight etc. In this case, placing the facade in front of the entire application would require the new application to generate the Frontend as well.

Figure: Frontend and Backend combined

Of course, this can be done, but I propose a different solution – to rewrite only the Backend.

If we do not want to put a proxy in front of our old application, we can use the old application itself as a proxy. In this way, all requests will go through the old application, while some of them will be processed on the side of the new system. This approach has a number of implications, described below.

Let’s see what the architecture of such a solution looks like:

Figure: Strangling .NET Framework App to .NET Core

The request process is as follows:

1. Client sends a request

2. The old Frontend receives the request and forwards it either to the old Backend (the same application) or to the new Backend (the new .NET Core application).

3. Both backends communicate with the same database

Communication with the new Backend takes place only through the Gateway (see the Gateway pattern). This is very important because it highlights the synchronous API call (see Remote Procedure Call). Under no circumstances should we hide it! On the contrary – by using the Gateway we show clearly what is happening in our code.

The main goal is that our proxy should be as thin as possible. It should know as little as possible and have the least responsibility. To achieve this, when communicating with the new system we can use the CQRS architectural style, taking advantage of the fact that queries and commands can be serialized. In this way, our proxy is only responsible for invoking the appropriate query or command – nothing more. Let’s see how it can be implemented.

Implementation

Backend API

Let’s start by defining the API of the new system – the endpoints responsible for receiving queries and commands. Thanks to the MediatR and JSON.NET libraries it is very easy:
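
A sketch of such an endpoint for commands (a twin endpoint handles queries; the route, names and type-resolution details are illustrative):

using System.Threading.Tasks;
using MediatR;
using Microsoft.AspNetCore.Mvc;
using Newtonsoft.Json;

[Route("api/commands")]
[ApiController]
public class CommandsController : ControllerBase
{
    private readonly IMediator _mediator;

    public CommandsController(IMediator mediator)
    {
        _mediator = mediator;
    }

    [HttpPost]
    public async Task<IActionResult> Execute([FromBody] RequestDto request)
    {
        // Recreate the strongly-typed command from its type name and JSON payload
        // (assumes the type name is assembly-qualified or resolvable from the shared contract assembly)
        // and dispatch it to the proper handler via MediatR.
        var type = System.Type.GetType(request.Type);
        var command = JsonConvert.DeserializeObject(request.Data, type);

        var result = await _mediator.Send(command);

        return Ok(result);
    }
}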

We define a simple Data Transfer Object for these endpoints. It holds the type of object and data in serialized form (JSON):
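
For example (a sketch):

public class RequestDto
{
    // Assembly-qualified type name of the command or query from the shared contract assembly.
    public string Type { get; set; }

    // The command/query itself, serialized to JSON.
    public string Data { get; set; }
}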

At the moment, our API contract is just a list of queries and commands. For our old .NET Framework application to benefit from this contract with strong typing, we must put all of these objects into a separate assembly. We want both the .NET Framework and .NET Core to be able to reference it, so it must target .NET Standard.

Figure: .NET Standard Contract assembly sharing

The Gateway

As I wrote earlier, communication and integration with the new Backend must go through the Gateway. The Gateway’s responsibilities are:

1. Serializing the request object
2. Providing additional metadata for the request, like the user context. This is important because the new Backend should be stateless.
3. Sending the request to the appropriate address
4. Deserializing the response / error handling

However, for the client of our Gateway, we want to hide all these responsibilities behind an interface:
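
A possible shape of such an interface (the name and methods are illustrative):

using System.Threading.Tasks;

public interface INewBackendGateway
{
    // Executes a command in the new Backend and returns the deserialized result.
    Task<TResult> ExecuteCommandAsync<TResult>(object command);

    // Executes a query in the new Backend and returns the deserialized result.
    Task<TResult> ExecuteQueryAsync<TResult>(object query);
}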

The implementation of Gateway looks like this:
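
A sketch of the implementation, assuming HttpClient, JSON.NET and the endpoints shown above (addresses and error handling are simplified):

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;

public class NewBackendGateway : INewBackendGateway
{
    private readonly HttpClient _httpClient;
    private readonly string _baseAddress;

    public NewBackendGateway(HttpClient httpClient, string baseAddress)
    {
        _httpClient = httpClient;
        _baseAddress = baseAddress;
    }

    public Task<TResult> ExecuteCommandAsync<TResult>(object command)
        => SendAsync<TResult>("api/commands", command);

    public Task<TResult> ExecuteQueryAsync<TResult>(object query)
        => SendAsync<TResult>("api/queries", query);

    private async Task<TResult> SendAsync<TResult>(string path, object request)
    {
        // 1. Serialize the request object together with its type name.
        var dto = new RequestDto
        {
            Type = request.GetType().AssemblyQualifiedName,
            Data = JsonConvert.SerializeObject(request)
        };

        // 2. Additional metadata (e.g. user context) could be added here as headers.

        // 3. Send the request to the new Backend.
        var content = new StringContent(
            JsonConvert.SerializeObject(dto), Encoding.UTF8, "application/json");
        var response = await _httpClient.PostAsync(new Uri(new Uri(_baseAddress), path), content);
        response.EnsureSuccessStatusCode();

        // 4. Deserialize the response.
        var responseBody = await response.Content.ReadAsStringAsync();
        return JsonConvert.DeserializeObject<TResult>(responseBody);
    }
}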

Finally, we come to the usage of our Gateway. Most often it will be a Controller (in GRASP terms):

Represents the overall “system”, “root object”, device that the software is running within, or a major subsystem (these are all variations of a facade controller)

In other words, it will be the place in the old application that starts processing the request. In ASP.NET MVC applications this will be the MVC Controller, in a WebForms application – the code-behind class, and in a WPF application – the ViewModel (from the MVVM pattern).

For the purposes of this article, I have prepared a console application to make the example as simple as possible:
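
A sketch of what such a console application might do, using the Gateway defined above (the command, query and DTO types are illustrative):

using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

// Illustrative contract types from the shared .NET Standard assembly:
public class AddProductCommand { public AddProductCommand(string name) { Name = name; } public string Name { get; } }
public class GetProductsQuery { }
public class ProductDto { public Guid Id { get; set; } public string Name { get; set; } }

public static class Program
{
    public static async Task Main()
    {
        INewBackendGateway gateway = new NewBackendGateway(new HttpClient(), "http://localhost:5000/");

        // Add 3 products by executing 3 commands in the new Backend.
        await gateway.ExecuteCommandAsync<Guid>(new AddProductCommand("Product 1"));
        await gateway.ExecuteCommandAsync<Guid>(new AddProductCommand("Product 2"));
        await gateway.ExecuteCommandAsync<Guid>(new AddProductCommand("Product 3"));

        // Get the list of added products by executing a query.
        var products = await gateway.ExecuteQueryAsync<List<ProductDto>>(new GetProductsQuery());

        foreach (var product in products)
        {
            Console.WriteLine(product.Name);
        }
    }
}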

This sample application adds 3 products by executing 3 commands and finally gets the list of added products by executing a query. Of course, in a real application the Gateway interface is injected in accordance with the Dependency Inversion Principle.

Example repository

The entire implementation can be found in a GitHub repository I prepared specially for this article: https://github.com/kgrzybek/old-dotnet-to-core-sample

Summary

In this article I discussed the following topics:

  • Approaches and motivations for rewriting the system
  • Strangler Pattern
  • Designing the system for rewriting
  • Implementation of incremental migration from .NET Framework applications to .NET Core

The decision to rewrite the system is often not easy because it is a long-term investment. At the very beginning, the development of a new system rarely brings any business value – only costs.

That is why it is always important to clearly explain to all stakeholders what benefits rewriting the system will bring and what risks the lack of such a rewrite may entail. This way it will be much easier to convince anyone to rewrite the system, and we will not look like people who only care about playing with new toys. Good luck!

Related Posts

1. Simple CQRS implementation with raw SQL and DDD
2. GRASP – General Responsibility Assignment Software Patterns Explained
3. Processing commands with Hangfire and MediatR

Modular Monolith: Architecture Enforcement

This post is part of a series of articles about the Modular Monolith architecture:

1. Modular Monolith: A Primer
2. Modular Monolith: Architectural Drivers
3. Modular Monolith: Architecture Enforcement (this)
4. Modular Monolith: Integration Styles

Introduction

In previous posts we discussed what the Modular Monolith architecture is and the architectural drivers that can affect its choice. In this post, I would like to focus on ways to enforce the chosen architecture.

The methods described below are not specific to the Modular Monolith architecture – it can be said that they are universal. Nevertheless, due to the monolith’s nature – the size of the codebase and the ease of changing it – they are particularly important here for enforcing the architecture.

Model-code gap

Let’s assume that based on current architectural drivers, you decided on the architecture of a Modular Monolith. Let’s also assume that you have predefined your module boundaries and solution architecture. You chose the technology, approach, way of communication between modules, way of persistence.

Everything has been documented in the form of a solution architecture document/description (SAD) or you made just a few diagrams (using UML, C4 model or simply arrows and boxes). You’ve done enough up-front design, you can start the first iterations of implementation.

At the beginning it is very simple. The system does not have much functionality, there is little code, it is easily maintained and consistent with the modeled architecture. There is a lot of time, and even if something goes wrong it is easy to refactor. So far so good.

However, at some point it is not easy anymore. Functionality and code grow, requirement changes start to appear, deadlines are looming. We start taking shortcuts and our implementation begins to differ significantly from the design. In the case of the Modular Monolith architecture, what we most often lose this way is modularity and independence, and everything begins to communicate with everything. Another Big Ball of Mud is made:

It was supposed to be like never before, it ended as always

George Fairbanks, in his book Just Enough Software Architecture: A Risk-Driven Approach, defines the phenomenon described above as follows:

Your architecture models and your source code will not show the same things. The difference between them is the model-code gap.

and later:

Whether you start with source code and build a model, or do the reverse, you must manage two representations of your solution. Initially, the code and models might correspond perfectly, but over time they tend to diverge. Code evolves as features are added and bugs are fixed. Models evolve in response to challenges or planning needs. Divergence happens when the evolution of one or the other yields inconsistencies.

Are we always doomed to such an end in the long run? Well, no. It certainly requires a lot of discipline from us, but discipline is not everything. We need to apply appropriate practices and approaches that keep our architecture in check. What are these approaches, then?

Architecture enforcement

When describing the tools to check whether our implementation is consistent with the assumed design, we must take into account two aspects.

The first aspect is the possibilities that a given tool gives us. As we know, architecture is a set of rules at different levels of abstraction, sometimes hard even to define – not to mention to check.

The second aspect is how quickly we get feedback. Of course, the sooner the better because we are able to fix something faster. The faster we fix something, the less impact this error has on our architecture later.

Considering these two aspects, architecture enforcement can happen at 3 different levels: the compiler, automated tests and code review.

Figure: Architecture enforcement approaches

Compile-time

The compiler is your best friend. It is able to quickly check many things for you that would otherwise take you a long time. In addition, the compiler cannot be wrong; people can. Why, then, do we so rarely make the compiler responsible for compliance with our chosen architecture? Why do we not want to use its capabilities to the maximum?

The first main sin is the “everything is public” approach. According to the definition of modularity, modules should communicate through well-defined interfaces, which means they should be encapsulated. If everything is public, there is no encapsulation.

Unfortunately, the programming community reinforces this phenomenon through:

– tutorials
– sample projects
– IDE (creating public classes by default)

We should definitely change the approach to private by default. If something cannot be private, let it be available within the module’s scope, but still inaccessible to others.

How to do it? Unfortunately, we have limited options in .NET. The only thing we can do is to separate the module into a separate assembly and use the “internal” access modifier. There is almost a war between supporters of keeping all code in one project (assembly) and supporters of splitting it into many projects.

The former say that an assembly is an implementation unit. Yes, but since we have no other way to encapsulate our modules, the division into projects seems to be a sensible solution. Additionally, thanks to reference checking, adding incorrect dependencies (e.g. from the domain to the infrastructure) will be difficult or even impossible.
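
As an illustration, inside a module project only the contract is public, while the implementation stays internal (module and type names are illustrative):

using System;

// Payments module – public, well-defined contract (the only thing visible to other modules).
public interface IPaymentsModule
{
    void ProcessPayment(Guid orderId, decimal amount);
}

// Implementation details – internal, invisible outside the Payments assembly.
internal class PaymentsModule : IPaymentsModule
{
    public void ProcessPayment(Guid orderId, decimal amount)
    {
        // Domain logic of the module stays hidden here.
    }
}

internal class PaymentCalculator
{
    // Helper used only inside the module.
}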

Lack of encapsulation is one of the most common sins I see, but not the only one. Other examples are not using immutability (unnecessary setters) or strong typing (primitive obsession).

Generally speaking, we should use our language in such a way that the compiler can catch as many mistakes for us as possible. It is the most efficient approach to enforcing the architecture of a system.

Figure: Architecture enforcement – compile-time

Automated tests

Not everything can be checked using a compiler. This does not mean, however, that we must check it manually. On the contrary, the computer can still do it for us. In that case, we can use 2 mechanisms – static code analysis and automated tests.

Static code analysis

I will start with the more familiar and common method – a static code analyzer. Certainly, most of you have heard about tools such as SonarQube or NDepend. These tools automatically perform static analysis of our code and, based on it, provide metrics that may be very useful to us. Of course, we can connect static code analyzers to the CI process and get feedback on a regular basis.

Figure: Architecture Enforcement – static analysis

Architecture tests

Architecture tests are another way – less known, but gaining popularity. These are unit tests, but instead of testing business functionality they test our codebase in the context of the architecture. Most often, such tests are written with a library dedicated to this type of test. Such a test may look like this:
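
For example, using the NetArchTest library (namespaces and type names are illustrative):

using NetArchTest.Rules;
using Xunit;

public class LayersTests
{
    [Fact]
    public void DomainLayer_ShouldNot_DependOn_InfrastructureLayer()
    {
        // Any type from the module's assembly can be used to locate it.
        var result = Types.InAssembly(typeof(Order).Assembly)
            .That().ResideInNamespace("MyApp.Orders.Domain")
            .ShouldNot().HaveDependencyOn("MyApp.Orders.Infrastructure")
            .GetResult();

        Assert.True(result.IsSuccessful);
    }
}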

We can check many things with these tests. The libraries for this (such as NetArchTest for .NET or ArchUnit for Java) allow a lot, and writing something custom is not a difficult task. A complete example of using such tests can be found here.

Figure: Architecture Enforcement – architecture tests

Code review

If we are not able to check the compliance of our solution with the chosen architecture using a computer (compiler, automated tests), we have one last tool – code review. Thanks to code review we can check everything that a computer cannot check for us, but it has some disadvantages.

The first disadvantage is that people can be wrong, so the probability of missing a violation of an architectural decision is relatively high.

The second drawback, of course, is the large amount of time we need to spend on code review. This is not a waste of time and we cannot give it up, but it must always be included in the project estimates.

The conclusion is obvious – to enforce the architecture we should use the computer as much as possible and treat code review as the last line of defense. The question is how to strengthen this line of defense, i.e. how to reduce the time spent and the probability of missing something during code review? We can use Architecture Decision Records (ADRs).

Architecture Decision Records (ADR)

What is an Architecture Decision Record? Let me quote a definition from the most popular GitHub repository related to this topic:

An architectural decision record (ADR) is a document that captures an important architectural decision made along with its context and consequences.

Such a document is usually stored in the version control system, which (like the approach itself) is also recommended by the popular ThoughtWorks Technology Radar.

My advice is to start by describing your decisions as simply and quickly as possible. Without unnecessary ceremony, choose a simple template (e.g. the one proposed by Michael Nygard) that contains the most important elements – context, decision and consequences. But how does this relate to the discussed code review?

First, all decisions become public: everyone has access to them and they are described. There is no room for “I did not know”. As such decisions are by definition important, everyone must know and follow them.

Second, it speeds up the code review process: instead of explaining why something is wrong, what the decision was, and when and in what context it was made, you can just paste a link to the appropriate ADR.

Summary

Each system has some architecture. The question is: will you shape the architecture of your system, or will it shape itself? Certainly, the first option is better, because the second one may condemn us to a big failure.

Architecture enforcement is the responsibility of every team member (not just the architect), which is why the way we do it is so important. It is a process that requires commitment. The techniques I mentioned can significantly facilitate and improve the process of architecture enforcement while maintaining the quality of our system at the appropriate level.

Additional Resources

1. Unit Test Your Architecture with ArchUnit – Jonas Havers, article
2. Architecture Decision Records in Action – Michael Keeling, Joe Runde, presentation
3. Design It! – Michael Keeling, book
4. Modular Monolith with DDD – Kamil Grzybek, GitHub repository
5. “Modular Monoliths” – Simon Brown, video

Related Posts

1. Modular Monolith: A Primer
2. Modular Monolith: Architectural Drivers
3. Domain Model Encapsulation and PI with Entity Framework 2.2
4. Attributes of Clean Domain Model

Image credits: nanibystudio

Modular Monolith: Architectural Drivers

This post is part of a series of articles about the Modular Monolith architecture:

1. Modular Monolith: A Primer
2. Modular Monolith: Architectural Drivers (this)
3. Modular Monolith: Architecture Enforcement
4. Modular Monolith: Integration Styles

Introduction

In the first post about the Modular Monolith architecture, I focused on the definition of this architecture and the description of modularity. As a reminder, the Modular Monolith:

  • is a system that has exactly one deployment unit
  • is an explicit name for a Monolith system designed in a modular way
  • modularization means that module:
    – must be independent, autonomous
    – has everything necessary to provide desired functionality (separation by business area)
    – is encapsulated and has a well-defined interface/contract

In this post I would like to discuss some of the most popular, in my opinion, Architectural Drivers which can lead to either a Modular Monolith or Microservices architecture.

But what exactly are Architectural Drivers?

Architectural Drivers

In general, you can’t say that architecture X is better than architecture Y. You can’t say that a Monolith is better than Microservices, that Clean Architecture is better than Layered Architecture, that 3 layers are better/worse than 4 layers, and so on.

The same rule applies to other considerations such as ORM vs raw SQL, “Current State” Persistence vs Event Sourcing, Anemic Domain Model vs Rich Domain Model, Object-Oriented Design vs Functional Programming… and many more.

So how can we choose an architecture/approach/paradigm/tool/library if, unfortunately, there is no best one?

Context is king

Each of our decisions is made in a given context. Each project is different (this follows from the definition of a project), so each context is different. This implies that the same decision made in one context can bring great results, while in another it can cause a devastating failure. For this reason, adopting other people’s or companies’ approaches without critical thinking can cause a lot of pain, wasted money and, finally, the end of the project.

Figure: Every project is different and has a different Context

However, context is too general a concept, and we need something more specific to put into practice. That’s why the concept of Architectural Drivers was defined. Michael Keeling writes about them in his blog article in the following way:

Architectural drivers are formally defined as the set of requirements that have significant influence over your architecture.

Simon Brown in the book Software Architecture for Developers describes Architectural Drivers similarly:

Regardless of the process that you follow (traditional and plan-driven vs lightweight and adaptive), there’s a set of common things that really drive, influence and shape the resulting software architecture.

Architectural Drivers have their own categorization. The main categories are:

  • Functional Requirements – what problems the system solves and how
  • Quality Attributes – a set of attributes that determine the quality of the architecture, like maintainability or scalability
  • Technical Constraints – technology standards, tool limitations, team experience
  • Business Constraints – budget, hard deadlines

Figure: Architectural Drivers

Most importantly, all Architectural Drivers are connected to each other, and often focusing on one causes the loss of another (trade-offs everywhere, unfortunately). Let’s consider an example.

You have a service that calculates some important thing (Functional Requirement) in 3 seconds (Quality Attribute – performance). A new requirement appears, the calculation becomes more complex and now takes 5 seconds (performance decreased). To get back to 3 seconds, another technology could be used, but there is no time for it (Business Constraint – hard deadline) and nobody in the company has used it yet (Technical Constraint – team experience). The only option to increase performance is to move the calculation to a stored procedure, which decreases maintainability and readability (Quality Attributes).

Figure: Architectural Drivers example

As you can see, the software architecture is a continuous choice between one driver and another. There is no one “right” solution. There is No Silver Bullet.

With this in mind, let’s look at some of the popular Architectural Drivers and attributes that are discussed when considering Modular Monolith and Microservices architectures.

Level of Complexity

Let’s start with one of the greatest advantages of a Modular Monolith compared to distributed architectures – its lower complexity. The definition of complexity on Wikipedia is as follows:

Complexity characterizes the behavior of a system or model whose components interact in multiple ways and follow local rules, meaning there is no reasonable higher instruction to define the various possible interactions. The term is generally used to characterize something with many parts where those parts interact with each other in multiple ways, culminating in a higher order of emergence greater than the sum of its parts.

As you can see above, complexity is about components and their interactions. In the Modular Monolith architecture, interactions between modules are simple because every module is located in the same process. This means that for a module that wants to interact with another module:

  • it knows the exact address to which it will direct the request and is sure that this address will not change
  • the request is just a method call – no network is needed
  • the target module is always available
  • security is not a concern

Figure: Complexity – Modular Monolith

On the other hand, consider a distributed system architecture. In this architecture, the modules/services are located on different servers and communicate via the network. This means that when a service wants to communicate with another, it must deal with the following concerns:

  • It somehow needs to get the address of the target module, because it may change
  • Communication takes place over the network, which necessitates the use of protocols like HTTP and serialization
  • The network may be unavailable (CAP theorem)
  • Secure communication between modules must be ensured

Of course, you can find solutions for these issues. For example, to solve the addressing issue you can add a Service Registry and implement the Service Discovery pattern. However, this means adding more components and algorithms to the system, so complexity rapidly increases.

To be aware of the scale of the problems generated by the Microservices architecture, I recommend that you familiarize yourself with the patterns that are used to solve them. The list is large, and most of them are not needed at all in the Monolith architecture.

Figure: Complexity – Distributed System

In summary, the Modular Monolith architecture is definitely less complex than that of distributed systems. High complexity reduces maintainability, readability and observability; it requires an experienced team, advanced infrastructure, a specific organizational culture and so on. If simplicity is your key architectural driver, then consider Monolith First.

Productivity

The team’s productivity in delivering changes can be measured in two dimensions: in the context of the entire system and of a single module.

In the context of the whole system, the matter is clear. The Modular Monolith architecture is less complex; the less complex the system, the easier it is to understand, so productivity is higher.

From the point of view of the ease of running the entire system, the Modular Monolith ensures productivity at the maximum level – just download the code and run it on the local machine. In a distributed architecture, the matter is not so simple despite the technologies and tools (like Docker and Kubernetes) that facilitate this process.

Figure: Running the entire system – Monolith vs Distributed

On the other hand, we have productivity related to the development of a single module. In this case, the microservice architecture will be better, because we do not have to run the entire system to test one specific module.

Which architecture, then, supports the team’s productivity? In my opinion, for most systems it is the Modular Monolith, but for really large projects (tens or hundreds of modules) it is Microservices. If your architectural driver is development speed and the system is not huge, a better choice will be a Modular Monolith, and in the case of system growth, a later transition to microservices could be the right move.

Deployability

The deployability of a software system is the ease with which it can be taken from development to production. However, we must consider two situations here: the deployment of the entire system and the deployment of a single module.

In the context of the entire system, is it easier to deploy one application or several applications? Of course, one application is easier to deploy, so it seems that the Modular Monolith is the better option.

Figure: Deployment – Modular Monolith

On the other hand, in a Modular Monolith we always have to deploy the whole system. We cannot deploy one particular module, and this is one of its most important disadvantages. In this architecture we do not have deployment autonomy, so the deployment process must be coordinated and may be more difficult.

Figure: Deployability – Distributed System

In summary, if you do not mind deploying the whole system and you do not care about deployment autonomy – this is a point for the Modular Monolith. Otherwise, consider a distributed architecture.

Performance

Performance is about how fast something is, usually in terms of response time, duration of processing or latency.

Assuming a scenario in which all requests are processed sequentially, the Monolith architecture will always be more efficient than a distributed system. All modules operate in the same process, so there is no communication overhead between them.

A distributed system has overhead caused by communication over the network – serialization and deserialization, cryptography and the time needed to send packets.

Even in real scenarios, the Monolith will be more efficient, but only for some time. With the increase in users, requests, data, and complexity of calculations, it may turn out that performance decreases. Then we come to one of the main drivers of the Microservices architecture: scalability.

Scalability

What is scalability? Wikipedia says:

Scalability is the property of a system to handle a growing amount of work by adding resources to the system

In other words, scalability is about the ability for software to deal with more requests or data.

It’s best to show this by example. Let’s assume that one of our modules must now handle more requests than we initially assumed. To do this, we must increase the resources that are responsible for the operation of this module.

We can always do it in two ways: increase the node’s computing power (Vertical Scaling) or add new nodes (Horizontal Scaling). Let’s see how it looks from the point of view of the Monolith and Microservices architectures:

Figure: Scaling

As can be seen above, both architectures can be scaled. Vertical Scaling looks the same in both; the difference is in Horizontal Scaling. With this approach, we can scale the Modular Monolith only as a whole, which leads to inefficient resource utilization. In the Microservices architecture, we scale only those modules that need to be scaled, which leads to better utilization of resources. This is the main difference.

The more module instances that must run, the more significant the difference. On the other hand, if you don’t have to scale a lot, maybe it is better to accept less efficient resource utilization, stay with the Monolith and take advantage of its other benefits? This is a good question to ask ourselves in such a situation.

Failure impact

Sometimes our architectural driver may be limiting the impact of failure. Let’s say we have a very unstable module that crashes the entire process once in a while.

In the case of a Modular Monolith, as the whole system works in one process, the whole system suddenly stops working and our availability decreases.

In the case of Microservices architecture, the “risky” module can be moved into a separate process and if it is stopped the rest of the system will work properly.

Failure impact

To increase the availability of the Modular Monolith, you can increase the number of nodes, but as with scalability, resource utilization will not be at the highest level compared to Microservices architecture.

Heterogeneous Technology

One attribute of a Modular Monolith that cannot be worked around in any way is the inability to use heterogeneous technology. The whole system runs in the same process, which means it must run in the same runtime environment. This does not mean that it must be written in the same language, because some platforms support multiple languages (for example the .NET CLR or the Java JVM). However, the use of completely separate technologies is not possible.

Heterogeneous Technology

The need for heterogeneous technology can be a decisive reason to switch to the Microservices architecture, but it does not have to be. Often, companies use one technology stack and nobody even thinks about implementing components in different technologies because team competence or software licenses do not allow it.

On the other hand, larger companies and projects more often use different technologies to maximize productivity using tailor-made tools to solve specific problems.

A common case associated with heterogeneous technology is the maintenance and development of a legacy system. The legacy system is often written in old technology (and often in a very bad way). To use new technology, a new service/system is often created that implements new functionalities, and the old system only delegates requests to the new one. Thanks to this, development can be faster and it is easier to find people willing to work on it. The disadvantage is that with two systems instead of one, the whole system becomes distributed – with all the cons of that architecture.

Summary

This post was not intended to describe all architectural drivers in favor of a Modular Monolith or Microservices. Separate books are being created on this topic.

In this post, I wanted to describe the most common discussed architectural drivers (in my opinion) and make it very clear that the shape of the architecture of our system is influenced by many factors and everything depends on our context.

Summarizing:

  • There is no better or worse architecture – it all depends on the context and Architectural Drivers
  • Architectural Drivers have their categorization – Functional Requirements, Quality Attributes, Technical Constraints, Business Constraints
  • Monolith architecture is less complex than a distributed system. Microservices architecture requires many more tools, libraries, components, team experience, infrastructure management and so on
  • At the beginning, the Monolith implementation will be more productive (Monolith first approach). Later, migration to Microservices architecture can be considered, but only if an architectural driver for that migration exists
  • Deployment of Monolith is easier but does not support autonomous deployment.
  • Both architectures support scalability, but Microservices are way more efficient (resource utilization)
  • Monolith has better performance than Microservices until the need for scaling appears – then it depends on scaling possibilities
  • Failure impact is greater in Monolith because everything works in the same process. Risk can be mitigated by duplication but it will cost more than in Microservices architecture
  • Monolith by definition does not support heterogeneous technology

Additional Resources

1. Architectural Drivers – chapter from Designing Software Architectures: A Practical Approach book – Humberto Cervantes, Rick Kazman
2. Software Architecture for Developers book – Simon Brown
3. Design It! book – Michael Keeling
4. Collection of articles about Monolith and Microservices architectures named “When microservices fail…”
5. Modular Monolith with DDD – GitHub repository

Related Posts

1. Modular Monolith: A Primer

Modular Monolith: A Primer

This post is part of articles series about Modular Monolith architecture:

1. Modular Monolith: A Primer (this)
2. Modular Monolith: Architectural Drivers
3. Modular Monolith: Architecture Enforcement
4. Modular Monolith: Integration Styles

Introduction

Many years have passed since the rise of the popularity of microservice architecture and it is still one of the main topics discussed in the context of the system architecture. The popularity of cloud solutions, containerization and advanced tools supporting the development and maintenance of distributed systems (such as Kubernetes) is even more conducive to this phenomenon.

Observing what is happening in the community, companies and during conversations with programmers, it can be concluded that most of the new projects are implemented using the microservice architecture. Moreover, some legacy systems are also moving towards this approach.

OK, the subject of this post is the Modular Monolith, yet I dwell on microservices – why? Because I think that as an IT industry we made a false start by adopting microservice architecture to such an extent. Instead of focusing on architectural drivers, we believed that microservices are the medicine for all the evil that sits in monolithic applications. If you have participated in the development of a system that consists of more than one deployment unit, you already know that this is not the case. Each architecture has its pros and cons – microservices are no exception. They solve some problems while generating others in return.

With this entry, I would like to start a series of articles on the architecture of Modular Monolith. I do it for several reasons.

First of all, I would like to refute the myth that you cannot make a high-class system in monolithic architecture. Secondly, I would like to dispel doubts about the definition of this architecture and its appearance – many people interpret it differently. Thirdly, I treat this series of posts as an extension and addition to my implementation of Modular Monolith with DDD architecture, which I shared a few months ago on GitHub and which was very well received (1k stars a month after publication).

In this introductory post, I will focus on the definition of a Modular Monolith architecture.

What is Modular Monolith?

I always try to be precise when I talk or write about technical and business issues, especially when it comes to architecture. I believe that a clear and coherent message is very important. That is why I would like to clearly define what the architecture of the Modular Monolith means to me and how I perceive it.

Let’s start with the simpler concept, what is Monolith?

Monolith

Wikipedia describes “monolithic architecture” in terms of building construction and not computer science as follows:

Monolithic architecture describes buildings which are carved, cast or excavated from a single piece of material, historically from rock.

In terms of computer science, the building is the system and the material is our executable code. So in Monolith architecture, our system consists of exactly one piece of executable code and nothing more.

Let’s see 2 technical definitions: first one about Monolith System:

A software system is called “monolithic” if it has a monolithic architecture, in which functionally distinguishable aspects (for example data input and output, data processing, error handling, and the user interface) are all interwoven, rather than containing architecturally separate components.

Second one about Monolithic Architecture:

A monolithic architecture is the traditional unified model for the design of a software program. Monolithic, in this context, means composed all in one piece. Monolithic software is designed to be self-contained; components of the program are interconnected and interdependent rather than loosely coupled as is the case with modular software programs

These 2 definitions above (one of the first results in Google) have 2 shared assumptions.

First, they define that this architecture assumes that all parts of the system form one deployment unit – I will agree with that.

The second shared assumption of these definitions is that they assume a lack of modularity in such architecture and I will definitely disagree with that. The phrases “interwoven, rather than containing architecturally separate components” and “components of the program are interconnected and interdependent rather than loosely coupled” very negatively characterize this architecture, assuming that everything is mixed in them. It may be so, but it doesn’t have to be. It is not the ultimate attribute of the Monolith.

To sum up, a Monolith is nothing more than a system that has exactly one deployment unit. No less, no more.

Modularization

I’ve defined what Monolith means; let’s get to the second aspect: Modularity.

What does it mean that something is modular according to the English Dictionary?

Consisting of separate parts that, when combined, form a complete whole/made from a set of separate parts that can be joined together to form a larger object

and Modularization itself:

The design or production of something in separate sections

Because it is a general definition, it is not enough for the programming world. Let’s use a more specific technical one about Modular programming:

Modular programming is a software design technique that emphasizes separating the functionality of a program into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect of the desired functionality. A module interface expresses the elements that are provided and required by the module. The elements defined in the interface are detectable by other modules. The implementation contains the working code that corresponds to the elements declared in the interface.

Several important issues have been raised here. In order to have modular architecture, you must have modules and these modules:

  • a) must be independent and interchangeable and
  • b) must have everything necessary to provide desired functionality and
  • c) must have a defined interface

Let’s see what these assumptions mean.

Module must be independent and interchangeable

For the module to meet these assumptions, as the name implies, it should be independent. Of course, it is impossible for it to be completely independent because then it means that it does not integrate with other modules. The module will always depend on something, but dependencies should be kept to a minimum. According to the principle: Loose Coupling, Strong Cohesion.

In the diagram below, on the left we have a module that has a lot of dependencies and you definitely cannot say that it is independent. On the right, the situation is the opposite – the module has a minimum of dependencies and they are looser, so it is more independent:

Module independence

However, the number of dependencies is just one measure of how well our module is independent. The second measure is how strong the dependency is. In other words, do we call it very often using multiple methods or occasionally using one or a few methods?

Strong/Weak dependency

In the first case, it is possible that we have defined the boundaries of our modules incorrectly and we should merge both modules if they are closely related:

Modules merged

The last attribute affecting the independence of a module is the frequency of changes of the modules on which it depends. As you can guess, the less often they change, the more independent the module is. On the other hand, if changes are frequent, we must change our module often and it loses its independence:

Module stability

To sum up, the module’s independence is determined by three main factors:

  • number of dependencies
  • strength of dependencies
  • stability of the modules on which the module depends

Module must have everything necessary to provide desired functionality

“Module” is a very overloaded word and can be used in many contexts with different meanings. A common case is to call logical layers modules, e.g. the GUI module, the application logic module, the database access module. Yes, in this context these are also modules, but they provide technical, not business, functionality.

Thinking about a module in a technical context, only technical changes cause exactly one module to change:

Technical modules and technical change

Adding or changing business functionality usually goes through all layers causing changes in each technical module:

Technical modules – new/change business feature

The question we have to ask ourselves is: do we more often make changes related to the technical part of our system or changes in business functionality? In my opinion – definitely more often the latter. We rarely exchange the database access layer, logging library or GUI framework. For this reason, the module in the Modular Monolith is a business module that is able to fully provide a set of desired features. This kind of design is called “Vertical Slices” and we group these slices in the module:

Business modules and vertical slices

In this way, frequent changes affect only one module – it becomes more independent, autonomous and is able to provide functionality by itself.

Module must have defined interface

The last attribute of modularity is a well-defined interface. We can’t talk about modular architecture if our modules don’t have a Contract:

Modules without contract (interface)

A Contract is what we make available outside so it is very important. It is an “entry point” to our module. Good Contract should be unambiguous and contain only what clients of a given contract need. We should keep it stable (to not break our clients) and hide everything else behind it (Encapsulation):

Modules with contract

As you can see in the diagram above, the contract of our module can take different forms. Sometimes it is some kind of facade for synchronous calls (e.g. a public method or a REST service), sometimes it can be a published event for asynchronous communication. In any case, everything that we share outside becomes the public API of the module. Therefore, encapsulation is an inseparable element of modularity.
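To make this more concrete, below is a minimal sketch of what such a contract could look like in C#. The module and type names (IPaymentsModule, PaymentCreated and so on) are illustrative assumptions, not taken from any concrete system:

```csharp
using System;
using System.Threading.Tasks;

// Illustrative contract of a "Payments" module - the only types other modules see.
public interface IPaymentsModule
{
    // Synchronous entry point (facade): requests from other modules go through here.
    Task<Guid> CreatePaymentAsync(CreatePaymentRequest request);
}

// Part of the contract: the input data exposed to clients of the module.
public sealed class CreatePaymentRequest
{
    public CreatePaymentRequest(Guid orderId, decimal amount)
    {
        OrderId = orderId;
        Amount = amount;
    }

    public Guid OrderId { get; }
    public decimal Amount { get; }
}

// Part of the contract: a published event for asynchronous communication.
public sealed class PaymentCreated
{
    public PaymentCreated(Guid paymentId, Guid orderId)
    {
        PaymentId = paymentId;
        OrderId = orderId;
    }

    public Guid PaymentId { get; }
    public Guid OrderId { get; }
}

// Everything else (entities, repositories, handlers) stays internal to the module.
```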

Summary

1. Monolith is a system that has exactly one deployment unit.
2. Monolith architecture does not imply that the system is poorly designed, not modular or bad. It does not say anything about quality.
3. Modular Monolith architecture is an explicit name for a Monolith system designed in a modular way.
4. To achieve a high level of modularization, each module must be independent, have everything necessary to provide the desired functionality (separation by business area), be encapsulated and have a well-defined interface/contract.

In the next post I will discuss the pros and cons of Modular Monolith architecture comparing it to the microservices.

Additional resources

1. Modular Monoliths Video – Simon Brown
2. Majestic Modular Monoliths – Axel Fontaine
3. Modular programming – Wikipedia
4. Monolithic application – Wikipedia
5. Modular Monolith with DDD – GitHub repository
6. Vertical Slice Architecture – Jimmy Bogard

Related posts

1. GRASP – General Responsibility Assignment Software Patterns Explained
2. Attributes of Clean Domain Model
3. Domain Model Encapsulation and PI with Entity Framework 2.2
4. Simple CQRS implementation with raw SQL and DDD

Image credits: Magnasoma

Attributes of Clean Domain Model

Introduction

There is a lot of talk about clean code and architecture nowadays. There is more and more talk about how to achieve it. The rules described by Robert C. Martin are universal and in my opinion, we can use them in various other contexts.

In this post I would like to refer them to the context of the Domain Model implementation, which is often the heart of our system. We want to have a clean heart, don’t we?

As we know from the Domain-Driven Design approach, we have two spaces – problem space and solution space. Vaughn Vernon defines these spaces in DDD Distilled book as follows:

Problem space is where you perform high-level strategic analysis and design steps within the constraints of a given project.

Solution space is where you actually implement the solution that your problem space discussions identify as your Core Domain.

So the implementation of the Domain Model is a solution to a problem. To talk about whether the Domain Model is clean, we must define what a clean solution means first.

What is Clean Solution

I think that everyone has a definition of what a clean solution means or at least feels if what he does is clean or not. In any case, let me describe what it means to me.

First of all – a given solution has to solve a defined problem. This is mandatory. Even the most elegant solution which doesn’t solve our problem is worth nothing (in the context of that problem).

Secondly, a clean solution should be optimal from the point of view of time. It should take no more or less time than necessary.

Thirdly, it should solve exactly one defined problem with simplicity in mind but with room to adapt to change. We should always keep the balance – no less (no design or under-design), no more (over-design, see YAGNI). This can be summarized by a quote from Antoine de Saint-Exupéry’s Airman’s Odyssey:

Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.

Note: Of course we should strive for perfection by being aware that we will never achieve it.

Fourthly, we should solve problems based on the experience of others and learn from their mistakes. This is why architectural and design patterns or principles are so useful. We can take someone’s solution and adapt it to our needs. Moreover, we can take tools created by others and use them. This is a game-changer in the IT world.

The last, but not least, attribute is the level of understanding of the solution. We should be able to understand our solution even after a lot of time. More importantly, others should be able to understand it. Code is read much more often than it is written. Martin Fowler said:

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

This is one of my favorite IT quotes. I try to remember it constantly as I write the code.

To summarize, the clean solution should solve exactly our problem in optimal time with a good balance of simplicity vs complexity. It should be developed based on other experiences, common knowledge, using appropriate tools and should be understandable. Considering all these factors, how can we determine that our Domain Model is a clean solution?

Clean Domain Model

1. Uses Ubiquitous Language

Ubiquitous Language, one of the fundamentals of DDD strategic design, plays a key role in the Domain Model. I like to think about writing code in a Domain Model like writing a book for the business. Business people should be able to understand all the logic, terms and concepts which are used – even though this is a computer program and they don’t know the programming language! If we use the same language, concepts and meanings everywhere within a defined context, the shared understanding of our problems and solutions can only increase the probability of success of our project.

2. Is Rich in Behavior

Domain Model, as the name suggests, models the domain. We have behavior in the domain, so our model should also have it. If this behavior is absent, we are dealing with the so-called Anemic Domain Model anti-pattern. For me, this is really a Data Model, not a model of the domain. This kind of model is sometimes useful, but not when we have complex behavior and logic. It makes our code procedural, and all the benefits of object-oriented programming disappear. The real Domain Model is rich in behavior.
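As an illustration, here is a hypothetical sketch contrasting an anemic model with a rich one – the Order example and its rules are invented for this purpose:

```csharp
using System;

// Anemic model: just data, the logic lives somewhere else (e.g. in a service).
public class AnemicOrder
{
    public Guid Id { get; set; }
    public decimal Total { get; set; }
    public string Status { get; set; }
}

// Rich model: the data and the behavior that changes it live together.
public class Order
{
    private OrderStatus _status;
    private readonly decimal _total;

    private Order(decimal total)
    {
        Id = Guid.NewGuid();
        _total = total;
        _status = OrderStatus.Placed;
    }

    public Guid Id { get; }

    public static Order Place(decimal total)
    {
        // Business rule expressed in the model itself.
        if (total <= 0)
            throw new InvalidOperationException("Order total must be positive.");

        return new Order(total);
    }

    public void Cancel()
    {
        if (_status == OrderStatus.Shipped)
            throw new InvalidOperationException("A shipped order cannot be cancelled.");

        _status = OrderStatus.Cancelled;
    }

    private enum OrderStatus { Placed, Shipped, Cancelled }
}
```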

3. Is encapsulated

Encapsulation of the Domain Model is very important. We want to expose only the minimum information about our model to the outside world. We should prevent business logic from leaking into our application logic or, even worse, the GUI. Ideally, we should expose only public methods on our aggregates (the only entry to the Domain Model). You can read more about how to achieve a good level of Domain Model encapsulation in my earlier post Domain Model Encapsulation and PI with Entity Framework 2.2.

4. Is Persistence Ignorant

Persistence Ignorance (PI) is another well-known concept and a desirable attribute of the Domain Model. We don’t want high coupling between our model and the persistence store. These are different concepts, tasks and responsibilities. A change in one should have minimal impact on the other and vice versa. Low coupling in this area makes our system more adaptable to change. Again, you can read how to minimize this coupling in the .NET world using EF Core in the post linked above.

5. Sticks to SOLID Principles

As the Domain Model is object-oriented, all SOLID principles should be followed. Examples:

  • SRP – one Entity represents one business concept. One method does one thing. One event represents one fact.
  • O/C – business logic implementation should be easy to extend without changing other places. For that case, Strategy/Policy Pattern is most often used.
  • LSP – because of sticking to the composition over inheritance rule, sticking to LSP rule should be easy. Inheritance is the strongest level of coupling between classes and this is what we want to avoid
  • ISP – interfaces of our Domain Services or policies should be small – ideally should have one method
  • DI – to decouple our Domain Model from the rest of the application and infrastructure we need to use Dependency Injection and the Dependency Inversion principle. This combination gives us a powerful weapon – we can keep the same level of abstraction in our model, use business language in the whole model and hide implementation details from it.

6. Uses Domain Primitives

Most of the codebases I see operate on primitive types – strings, ints, decimals, guids. In the Domain Model, this level of abstraction is definitely too low.

We should always try to express significant business concepts using classes (entities or value objects). In this way, our model is more readable, easier to maintain and rich in behavior. In addition, our code is more secure because we can add validation at the class level. You can read more about Domain Primitives here.
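A minimal sketch of such a Domain Primitive, using an invented MoneyAmount concept:

```csharp
using System;

// Domain Primitive: a money amount instead of a raw decimal.
// The class guarantees its own invariants, so invalid values cannot exist.
public sealed class MoneyAmount : IEquatable<MoneyAmount>
{
    public MoneyAmount(decimal value, string currency)
    {
        if (value < 0)
            throw new ArgumentException("Amount cannot be negative.", nameof(value));
        if (string.IsNullOrWhiteSpace(currency) || currency.Length != 3)
            throw new ArgumentException("Currency must be a 3-letter ISO code.", nameof(currency));

        Value = value;
        Currency = currency.ToUpperInvariant();
    }

    public decimal Value { get; }
    public string Currency { get; }

    public MoneyAmount Add(MoneyAmount other)
    {
        if (other.Currency != Currency)
            throw new InvalidOperationException("Cannot add amounts in different currencies.");

        return new MoneyAmount(Value + other.Value, Currency);
    }

    public bool Equals(MoneyAmount other) =>
        other != null && Value == other.Value && Currency == other.Currency;

    public override bool Equals(object obj) => Equals(obj as MoneyAmount);

    public override int GetHashCode() => HashCode.Combine(Value, Currency);
}
```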

7. Is efficient

Even the best-written Domain Model that cannot handle a request in a satisfactory time is useless. That is why entity size and aggregate boundaries are so important.

When designing aggregates, you need to know how they will be processed – how often, how many users will try to use them at the same time, whether there will be periods of increased activity and so on. The smaller the aggregate, the less data needs to be loaded and saved and the shorter the transaction is. On the other hand, the smaller the aggregate, the smaller the consistency boundary, so the right balance is needed.

8. Is expressive

Class structure in Object-Oriented languages is a powerful tool to model business concepts. Unfortunately, it is often used incorrectly. The most common mistake is modeling many concepts with one class. For this (often unconsciously) statuses and bit flags are used.

Most often, such a class can be divided into several classes, which supports SRP. Good heuristics here are looking at the business language and at our entity’s behavior.

9. Is testable and tested

Domain Models are used to solve complicated problems. A complicated problem means complicated business logic to solve it. If you have a complicated solution you have to be sure that it works correctly and you can’t be afraid of changing this logic.

This is where unit tests come in – they are an integral part of a clean Domain Model. A clean Domain Model is testable and should have maximum test coverage. This is the heart of our application, and we should always be one hundred percent sure that it works as we expect.

10. It should be written with ease

This point is a summary of all the above. If writing your Domain Model code is a problem every time – it’s hard to change it, understand it, test it, use it – something is wrong. What should you do?

Review all the points described above and see which of them your model does not meet. Maybe it depends on the infrastructure? Maybe it speaks a non-business language? Maybe it’s anemic, poorly encapsulated or untestable? If your model meets all the attributes described in this post, then solving business problems should be easy and pleasant. I assure you.

So I have a question – were you relaxed writing your business logic today, or did you hit your head against the wall again? Is your Domain Model clean?

Related posts

Domain Model Encapsulation and PI with Entity Framework 2.2
Domain Model Validation
Handling Domain Events: Missing Part

Handling Domain Events: Missing Part

Introduction

Some time ago I wrote a post about publishing and handling domain events. In addition, in one of the posts I described the Outbox Pattern, which gives us At-Least-Once delivery when integrating with external components/services without using the 2PC protocol.

This time I wanted to present a combination of both approaches to complete previous posts. I will present a complete solution that enables reliable data processing in the system in a structured manner taking into account the transaction boundary.

Depth of the system

At the beginning I would like to describe what a Shallow System and a Deep System are.

Shallow System

A system is considered to be shallow when, most of the time, after performing some action on it, there is not much going on underneath.

A typical and the most popular example of this type of system is one with many CRUD operations. Most of the operations involve managing the data shown on the screen, and there is little business logic underneath. Sometimes such a system can also be called a database browser. 😉

Another heuristic that can point to a Shallow System is the ability to specify the requirements for such a system practically through the GUI prototype. The prototype of the system (possibly with the addition of comments) shows us how this system should work and it is enough – if nothing underneath is happening then there is nothing to define and describe.

From the Domain-Driven Design point of view, it will most often look like this: the execution of the Command on the Aggregate publishes exactly one Domain Event and… nothing happens. No subscribers to this event, no processors / handlers, no workflows defined. There is no communication with other contexts or 3rd systems either. It looks like this:

Shallow system in context of DDD.

Execute action, process request, save data – end of story. Often, in such cases, we do not even need DDD. Transaction Script or Active Record will be enough.

Deep system

The Deep System is (as one could easily guess) the complete opposite of the Shallow System.

A Deep System is one that is designed to solve problems in a non-trivial and complicated domain. If the domain is complicated, then the Domain Model will be complicated as well. Of course, the Domain Model should be simplified as much as possible, and at the same time it should not lose the aspects that are most important in a given context (in DDD terms – a Bounded Context). Nevertheless, it contains a lot of business logic that needs to be handled.

We do not specify a Deep System by the GUI prototype because too much is happening underneath. Saving or reading data is just one of the actions that our system does. Other activities are communication with other systems, complicated data processing or calling other parts of our system.

This time, much more is happening in the context of the Domain-Driven Design implementation. Aggregates can publish multiple Domain Events, and for each Domain Event there can be many handlers responsible for different behavior. This behavior can be communication with an external system or executing a Command on another Aggregate, which will again publish its events to which another part of our system will subscribe. This scheme repeats itself and our Domain Model behaves in a reactive manner:

Deep system in context of DDD.

Problem

In the post about publishing and handling domain events, a very simple case was presented, and the solution did not support re-publishing (and handling) events by another Aggregate whose processing resulted from a previous Domain Event. In other words, there was no support for complex flows and reactive data processing. Only the single Command -> Aggregate -> Domain Event -> handlers scenario was possible.

It will be best to consider this in a specific example. Let’s assume the requirements that after placing an Order by the Customer:
a) Confirmation email to the Customer about placed Order should be sent
b) New Payment should be created
c) Email about new Payment to the Customer should be sent

These requirements are illustrated in the following picture:

Let’s assume that in this particular case both Order placement and Payment creation should take place in the same transaction. If transaction is successful, we need to send 2 emails – about the Order and Payment. Let’s see how we can implement this type of scenario.

Solution

The most important thing we have to keep in mind is the transaction boundary. To make our life easier, we make the following assumptions:

1. The Command Handler defines the transaction boundary. The transaction is started when the Command Handler is invoked and committed at the end.
2. Each Domain Event handler is invoked in context of the same transaction boundary.
3. If we want to process something outside the transaction, we need to create a public event based on the Domain Event. I call it Domain Event Notification, some people call it a public event, but the concept is the same.

The second most important thing is: when should we publish and process Domain Events? Events may be created after each action on the Aggregate, so we must publish them:
– after each Command handling (but BEFORE committing the transaction)
– after each Domain Event handling (but WITHOUT committing the transaction)

The last thing to consider is the processing of Domain Event Notifications (public events). We need to find a way to process them outside the transaction, and this is where the Outbox Pattern comes into play.

The first thing that comes to mind is to publish events at the end of each Command handler and commit the transaction, and at the end of each Domain Event handler only publish events. We can, however, try a much more elegant solution here and use the Decorator Pattern. The Decorator Pattern allows us to wrap our handling logic in infrastructural code, similarly to how Aspect-Oriented Programming and .NET Core middlewares work.

We need two decorators. The first one will be for command handlers:

As you can see, the decorator first invokes the real Command handler (the processing of a given Command) and then performs a Unit of Work commit. The UoW commit publishes Domain Events and commits the existing transaction:
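A minimal sketch of such a decorator – the ICommand, ICommandHandler and IUnitOfWork abstractions are illustrative assumptions, not the original code:

```csharp
using System.Threading;
using System.Threading.Tasks;

// Assumed abstractions - names are illustrative.
public interface ICommand { }

public interface ICommandHandler<in TCommand> where TCommand : ICommand
{
    Task Handle(TCommand command, CancellationToken cancellationToken);
}

public interface IUnitOfWork
{
    // Publishes pending Domain Events and commits the database transaction.
    Task CommitAsync(CancellationToken cancellationToken);
}

// Decorator that defines the transaction boundary around every Command handler.
public class UnitOfWorkCommandHandlerDecorator<TCommand> : ICommandHandler<TCommand>
    where TCommand : ICommand
{
    private readonly ICommandHandler<TCommand> _decorated;
    private readonly IUnitOfWork _unitOfWork;

    public UnitOfWorkCommandHandlerDecorator(
        ICommandHandler<TCommand> decorated,
        IUnitOfWork unitOfWork)
    {
        _decorated = decorated;
        _unitOfWork = unitOfWork;
    }

    public async Task Handle(TCommand command, CancellationToken cancellationToken)
    {
        // 1. Invoke the real Command handler (application logic).
        await _decorated.Handle(command, cancellationToken);

        // 2. Publish Domain Events and commit the transaction.
        await _unitOfWork.CommitAsync(cancellationToken);
    }
}
```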

In accordance with the previously described assumptions, we also need a second decorator for the Domain Event handler, which will only publish Domain Events at the very end without committing the database transaction:
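A corresponding sketch, again with assumed IDomainEvent, IDomainEventHandler and IDomainEventsDispatcher abstractions:

```csharp
using System.Threading;
using System.Threading.Tasks;

// Assumed abstractions - names are illustrative.
public interface IDomainEvent { }

public interface IDomainEventHandler<in TEvent> where TEvent : IDomainEvent
{
    Task Handle(TEvent domainEvent, CancellationToken cancellationToken);
}

public interface IDomainEventsDispatcher
{
    // Publishes Domain Events raised so far, WITHOUT committing the transaction.
    Task DispatchEventsAsync(CancellationToken cancellationToken);
}

// Decorator for Domain Event handlers: after the real handler runs, any newly raised
// Domain Events are published again, but the database transaction is not committed here.
public class DomainEventsDispatcherEventHandlerDecorator<TEvent> : IDomainEventHandler<TEvent>
    where TEvent : IDomainEvent
{
    private readonly IDomainEventHandler<TEvent> _decorated;
    private readonly IDomainEventsDispatcher _dispatcher;

    public DomainEventsDispatcherEventHandlerDecorator(
        IDomainEventHandler<TEvent> decorated,
        IDomainEventsDispatcher dispatcher)
    {
        _decorated = decorated;
        _dispatcher = dispatcher;
    }

    public async Task Handle(TEvent domainEvent, CancellationToken cancellationToken)
    {
        await _decorated.Handle(domainEvent, cancellationToken);

        // Publish events raised by this handler so other handlers can react (reactive flow).
        await _dispatcher.DispatchEventsAsync(cancellationToken);
    }
}
```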

The last thing to do is to configure our decorators in the IoC container (Autofac example):
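A minimal registration sketch, assuming Autofac 4.9+ with its built-in generic decorator support and the handler and decorator names from the sketches above:

```csharp
using Autofac;

public class ProcessingModule : Autofac.Module
{
    protected override void Load(ContainerBuilder builder)
    {
        // Register all Command and Domain Event handlers from this assembly.
        builder.RegisterAssemblyTypes(typeof(ProcessingModule).Assembly)
            .AsClosedTypesOf(typeof(ICommandHandler<>))
            .InstancePerLifetimeScope();

        builder.RegisterAssemblyTypes(typeof(ProcessingModule).Assembly)
            .AsClosedTypesOf(typeof(IDomainEventHandler<>))
            .InstancePerLifetimeScope();

        // Wrap every handler with the corresponding decorator.
        builder.RegisterGenericDecorator(
            typeof(UnitOfWorkCommandHandlerDecorator<>),
            typeof(ICommandHandler<>));

        builder.RegisterGenericDecorator(
            typeof(DomainEventsDispatcherEventHandlerDecorator<>),
            typeof(IDomainEventHandler<>));
    }
}
```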

Add Domain Event Notifications to Outbox

The second thing we have to do is to save notifications about Domain Events that we want to process outside of the transaction. To do this, we use the implementation of the Outbox Pattern:

As a reminder – the data for our Outbox is saved in the same transaction, which is why At-Least-Once delivery is guaranteed.

Implementing flow steps

At this point, we can focus only on the application logic and do not need to worry about infrastructural concerns. Now we only implement the particular flow steps:

a) When the Order is placed then create Payment:
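For illustration, a hypothetical handler for this step could look like the sketch below; it reuses the IDomainEvent and IDomainEventHandler abstractions from the decorator sketch above, and the Payment types are invented:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Illustrative Domain Event raised by the Order aggregate.
public class OrderPlacedDomainEvent : IDomainEvent
{
    public OrderPlacedDomainEvent(Guid orderId, decimal amount)
    {
        OrderId = orderId;
        Amount = amount;
    }

    public Guid OrderId { get; }
    public decimal Amount { get; }
}

// Illustrative Payment aggregate and repository abstraction.
public class Payment
{
    private Payment(Guid orderId, decimal amount)
    {
        OrderId = orderId;
        Amount = amount;
    }

    public Guid OrderId { get; }
    public decimal Amount { get; }

    public static Payment CreateFor(Guid orderId, decimal amount) => new Payment(orderId, amount);
}

public interface IPaymentRepository
{
    Task AddAsync(Payment payment, CancellationToken cancellationToken);
}

// Step (a): creating the Payment happens in the same transaction as placing the Order.
public class CreatePaymentWhenOrderPlacedHandler : IDomainEventHandler<OrderPlacedDomainEvent>
{
    private readonly IPaymentRepository _paymentRepository;

    public CreatePaymentWhenOrderPlacedHandler(IPaymentRepository paymentRepository)
    {
        _paymentRepository = paymentRepository;
    }

    public async Task Handle(OrderPlacedDomainEvent domainEvent, CancellationToken cancellationToken)
    {
        var payment = Payment.CreateFor(domainEvent.OrderId, domainEvent.Amount);
        await _paymentRepository.AddAsync(payment, cancellationToken);
        // In the full flow, creating the Payment would raise its own Domain Event,
        // which the decorator publishes and step (c) reacts to.
    }
}
```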

b) When the Order is placed then send an email:

c) When the Payment is created then send an email:

The following picture presents the whole flow:

Flow of processing

Summary

In this post I described how it is possible to process Commands and Domain Events in a Deep System in a reactive way.

Summarizing, the following concepts have been used for this purpose:

– Decorator Pattern for events dispatching and transaction boundary management
– Outbox Pattern for processing events in separate transaction
– Unit of Work Pattern
– Domain Events Notifications (public events) saved to the Outbox
– Basic DDD Building Blocks – Aggregates and Domain Events
– Eventual Consistency

Source code

If you would like to see full, working example – check my GitHub repository.

Additional Resources

The Outbox: An EIP Pattern – John Heintz
Domain events: design and implementation – Microsoft

Related posts

How to publish and handle Domain Events
Simple CQRS implementation with raw SQL and DDD
The Outbox Pattern

GRASP – General Responsibility Assignment Software Patterns Explained

Introduction

I recently noticed that a lot of attention is paid to the SOLID principles. And this is a very good thing, because they are the absolute basis of Object-Oriented Design (OOD) and programming. For developers of object-oriented languages, knowledge of the SOLID principles is a prerequisite for writing good-quality code. There are a lot of articles and courses on these rules, so if you do not know them yet, learn them as soon as possible.

On the other hand, there is another, less well-known set of rules regarding object-oriented programming. It’s called GRASP – General Responsibility Assignment Software Patterns (or Principles). There are far fewer materials on the Internet about this topic, so I decided to bring it closer because I think the principles described in it are as important as the SOLID principles.

Disclaimer: This post is inspired by and based on Craig Larman’s awesome book Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development. Although the last edition was released in 2004, in my opinion the book is still up-to-date and explains perfectly how to design systems using object-oriented languages. It is hard to find a better book on this subject, believe me. It is not a book about UML, but you can learn UML from it because it is well explained too. A must-have for every developer, period.

The Responsibility in Software

Responsibility in software is a very important concept and concerns not only classes but also modules and entire systems. Thinking in terms of responsibilities is a popular way to think about software design. We can always ask questions like:

– What is the responsibility of this class/module/component/system?
– Is it responsible for this or is it responsible for that?
– Is Single Responsibility Principle violated in this particular context?

But to answer these kinds of questions, we should ask one more fundamental question: what does it mean that something is responsible in the context of software?

Doing and Knowing

As proposed by Rebecca Wirfs-Brock in her Object Design: Roles, Responsibilities, and Collaborations book and her Responsibility-Driven Design (RDD) approach, a responsibility is:

an obligation to perform a task or know information

As we see from this definition we have here a clear distinction between behavior (doing) and data (knowing).

Doing responsibility of an object is seen as:
a) doing something itself – create an object, process data, do some computation/calculation
b) initiate and coordinate actions with other objects

Knowing responsibility of an object can be defined as:
a) private and public object data
b) related objects references
c) things it can derive

Let’s see an example:
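The classes below are invented purely for illustration – the point is only to show where “knowing” ends and “doing” begins:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative example of the "knowing" and "doing" responsibilities of an object.
public class Order
{
    // Knowing: private object data and references to related objects.
    private readonly List<OrderLine> _lines = new List<OrderLine>();
    public DateTime PlacedOn { get; } = DateTime.UtcNow;

    // Doing: a computation derived from the data the object knows (things it can derive).
    public decimal CalculateTotal() => _lines.Sum(line => line.Price * line.Quantity);

    // Doing: doing something itself - creating an object it aggregates.
    public void AddLine(Product product, int quantity)
    {
        _lines.Add(new OrderLine(product.Price, quantity));
    }
}

public class OrderLine
{
    public OrderLine(decimal price, int quantity)
    {
        Price = price;
        Quantity = quantity;
    }

    public decimal Price { get; }
    public int Quantity { get; }
}

public class Product
{
    public decimal Price { get; set; }
}
```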

If you want more information about responsibilities in software and want to dig into Responsibility-Driven Design, you can read about it directly in Rebecca Wirfs-Brock’s book or this PDF.

OK, now we know what responsibility means in the context of software. Let’s see how to assign these responsibilities using GRASP.

GRASP

GRASP is a set of exactly 9 General Responsibility Assignment Software Patterns. As I wrote above, assigning object responsibilities is one of the key skills of OOD. Every programmer and designer should be familiar with these patterns and, what is more important, know how to apply them in everyday work (by the way, the same applies to the SOLID principles).

This is the list of 9 GRASP patterns (sometimes called principles but please, do not focus on naming here):

1. Information Expert
2. Creator
3. Controller
4. Low Coupling
5. High Cohesion
6. Indirection
7. Polymorphism
8. Pure Fabrication
9. Protected Variations

NOTE: All Problem/Solution paragraphs are quotes from Craig Larman’s book. I decided that it would be best to stick to the original.

1. Information Expert

Problem: What is a basic principle by which to assign responsibilities to objects?
Solution: Assign a responsibility to the class that has the information needed to fulfill it.

In the following example, the Customer class has references to all of the customer’s Orders, so it is a natural candidate to take responsibility for calculating the total value of the orders:
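The sketch below is illustrative – the exact classes are assumed, not quoted:

```csharp
using System.Collections.Generic;
using System.Linq;

// Customer holds the information (its Orders), so it is the Information Expert
// for calculating the total value of those orders.
public class Customer
{
    private readonly List<Order> _orders = new List<Order>();

    public decimal GetOrdersTotalValue() => _orders.Sum(order => order.Value);
}

public class Order
{
    public Order(decimal value)
    {
        Value = value;
    }

    public decimal Value { get; }
}
```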

This is the most basic principle, because the truth is – if we do not have the data we need, we would not be able to meet the requirement and assign responsibility anyway.

2. Creator

Problem: Who creates object A?
Solution: Assign class B the responsibility to create object A if one of these is true (more is better)
– B contains or compositely aggregates A
– B records A
– B closely uses A
– B has the initializing data for A

Going back to the example:
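Continuing the illustrative Customer/Order sketch, the Creator responsibility could look like this:

```csharp
using System;
using System.Collections.Generic;

// Customer compositely aggregates Orders, records them, closely uses them
// and has the initializing data - so it is the Creator of Order.
public class Customer
{
    private readonly List<Order> _orders = new List<Order>();

    public Order PlaceOrder(decimal value, string currency)
    {
        var order = new Order(Guid.NewGuid(), value, currency);
        _orders.Add(order); // Customer records the Order it has created.
        return order;
    }
}

public class Order
{
    internal Order(Guid id, decimal value, string currency)
    {
        Id = id;
        Value = value;
        Currency = currency;
    }

    public Guid Id { get; }
    public decimal Value { get; }
    public string Currency { get; }
}
```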

As you can see above, the Customer class compositely aggregates Orders (there is no Order without a Customer), records Orders, closely uses Orders and has the initializing data passed in by method parameters. An ideal candidate for the “Order Creator”. 🙂

3. Controller

Problem: What first object beyond the UI layer receives and coordinates (“controls”) a system operation?
Solution: Assign the responsibility to an object representing one of these choices:
– Represents the overall “system”, “root object”, device that the software is running within, or a major subsystem (these are all variations of a facade controller)
– Represents a use case scenario within which the system operation occurs (a use case or session controller)

The implementation of this principle depends on the high-level design of our system, but in general we always need to define an object which orchestrates our business transaction processing. At first glance, it would seem that the MVC Controller in web applications/APIs is a great example here (even the name is the same), but for me this is not true. Of course, it receives input, but it shouldn’t coordinate a system operation – it should delegate it to a separate service or Command Handler:
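A minimal sketch of this split, assuming MediatR as the dispatching mechanism; the exact shapes of CustomerOrdersController and AddCustomerOrderCommandHandler are illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using MediatR;
using Microsoft.AspNetCore.Mvc;

// The Command object - the only thing the controller and the handler share.
public class AddCustomerOrderCommand : IRequest<Unit>
{
    public AddCustomerOrderCommand(Guid customerId, List<Guid> productIds)
    {
        CustomerId = customerId;
        ProductIds = productIds;
    }

    public Guid CustomerId { get; }
    public List<Guid> ProductIds { get; }
}

// The Controller in the GRASP sense - it coordinates handling of the system operation.
public class AddCustomerOrderCommandHandler : IRequestHandler<AddCustomerOrderCommand, Unit>
{
    public Task<Unit> Handle(AddCustomerOrderCommand command, CancellationToken cancellationToken)
    {
        // Load the Customer aggregate, place the order, commit - omitted for brevity.
        return Task.FromResult(Unit.Value);
    }
}

// The MVC controller only receives the request and delegates it further.
[Route("api/customers")]
[ApiController]
public class CustomerOrdersController : ControllerBase
{
    private readonly IMediator _mediator;

    public CustomerOrdersController(IMediator mediator)
    {
        _mediator = mediator;
    }

    [HttpPost("{customerId}/orders")]
    public async Task<IActionResult> AddCustomerOrder(Guid customerId, [FromBody] List<Guid> productIds)
    {
        await _mediator.Send(new AddCustomerOrderCommand(customerId, productIds));
        return Ok();
    }
}
```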

4. Low Coupling

Problem: How to reduce the impact of change? How to support low dependency and increased reuse?
Solution: Assign responsibilities so that (unnecessary) coupling remains low. Use this principle to evaluate alternatives.

Coupling is a measure of how one element is related to another. The higher the coupling, the greater the dependence of one element on another.

Low coupling means our objects are more independent and isolated. If something is isolated, we can change it without worrying that we have to change something else or that we would break something (see Shotgun Surgery). Using the SOLID principles is a great way to keep coupling low. As you can see in the example above, the coupling between CustomerOrdersController and AddCustomerOrderCommandHandler remains low – they only need to agree on the command object structure. This low coupling is possible thanks to the Indirection pattern, which is described later.

5. High Cohesion

Problem: How to keep objects focused, understandable, manageable and as a side effect support Low Coupling?
Solution: Assign a responsibility so that cohesion remains high. Use this to evaluate alternatives.

Cohesion is a measure of how strongly all responsibilities of an element are related. In other words, it is the degree to which the parts inside an element belong together.

Classes with low cohesion have unrelated data and/or unrelated behaviors. For example, the Customer class has high cohesion because right now it does only one thing – it manages the Orders. If I added responsibility for managing product prices to this class, its cohesion would drop significantly, because a price list is not directly related to the Customer itself.

6. Indirection

Problem: Where to assign a responsibility to avoid direct coupling between two or more things?
Solution: Assign the responsibility to an intermediate object to mediate between other components or services so that they are not directly coupled.

This is where the Mediator Pattern comes into play. Instead of direct coupling:

we can use the mediator object and mediate between objects:

One note here. Indirection supports low coupling but reduces readability and reasoning about the whole system. You don’t know which class handles the command from the Controller definition. This is the trade-off to take into consideration.

7. Polymorphism

Problem: How to handle alternatives based on type?
Solution: When related alternatives or behaviors vary by type (class), assign responsibility for the behavior (using polymorphic operations) to the types for which the behavior varies.

Polymorphism is a fundamental principle of Object-Oriented Design. In this context, the principle is strongly connected with (among others) the Strategy Pattern.

As presented above, the constructor of the Customer class takes the ICustomerUniquenessChecker interface as a parameter:
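An illustrative sketch of the interface and two possible implementations (all details are assumed):

```csharp
using System;
using System.Collections.Generic;

// The Customer does not know which uniqueness check is used - the behavior
// varies by the type implementing the interface (Strategy-like polymorphism).
public interface ICustomerUniquenessChecker
{
    bool IsUnique(string email);
}

// One implementation could check the database...
public class DatabaseCustomerUniquenessChecker : ICustomerUniquenessChecker
{
    public bool IsUnique(string email)
    {
        // Query the customers table - omitted for brevity in this sketch.
        return true;
    }
}

// ...another can be used in tests or for a different data source.
public class InMemoryCustomerUniquenessChecker : ICustomerUniquenessChecker
{
    private readonly HashSet<string> _existingEmails = new HashSet<string>();

    public bool IsUnique(string email) => !_existingEmails.Contains(email);
}

public class Customer
{
    public Customer(string email, ICustomerUniquenessChecker uniquenessChecker)
    {
        if (!uniquenessChecker.IsUnique(email))
            throw new InvalidOperationException("Customer with this email already exists.");

        Email = email;
    }

    public string Email { get; }
}
```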

We can provide different implementations of this interface there, depending on the requirements. In general, this is a very useful approach when our systems have different algorithms with the same input and output (in terms of structure).

8. Pure Fabrication

Problem: What object should have the responsibility, when you do not want to violate High Cohesion and Low Coupling but solutions offered by other principles are not appropriate?
Solution: Assign a highly cohesive set of responsibilities to an artificial or convenience class that does not represent a problem domain concept.

Sometimes it is really hard to figure out where a responsibility should be placed. This is why in Domain-Driven Design there is the concept of a Domain Service. Domain Services hold logic which is not related to one particular Entity.

For example, in e-commerce systems we often need to convert one currency to another. Sometimes it is hard to say where this behavior should be placed, so the best option is to create a new class and interface:
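A minimal, assumed sketch of such a Pure Fabrication:

```csharp
using System.Collections.Generic;

// IForeignExchange does not represent any concept from the problem domain -
// it is a Pure Fabrication created to keep currency conversion in one cohesive place.
public interface IForeignExchange
{
    ConversionRate GetConversionRate(string fromCurrency, string toCurrency);
}

public class ForeignExchange : IForeignExchange
{
    // Rates are hard-coded purely for illustration.
    private readonly Dictionary<(string From, string To), decimal> _rates =
        new Dictionary<(string, string), decimal>
        {
            [("USD", "EUR")] = 0.9m,
            [("EUR", "USD")] = 1.1m
        };

    public ConversionRate GetConversionRate(string fromCurrency, string toCurrency) =>
        new ConversionRate(fromCurrency, toCurrency, _rates[(fromCurrency, toCurrency)]);
}

public class ConversionRate
{
    public ConversionRate(string fromCurrency, string toCurrency, decimal factor)
    {
        FromCurrency = fromCurrency;
        ToCurrency = toCurrency;
        Factor = factor;
    }

    public string FromCurrency { get; }
    public string ToCurrency { get; }
    public decimal Factor { get; }

    public decimal Convert(decimal amount) => amount * Factor;
}
```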

This way we support both High Cohesion (we are only converting currencies) and Low Coupling (client classes depend only on the IForeignExchange interface). Additionally, this class is reusable and easy to maintain.

9. Protected Variations

Problem: How to design objects, subsystems and systems so that the variations or instability in these elements does not have an undesirable impact on other elements?
Solution: Identify points of predicted variation or instability, assign responsibilities to create a stable interface around them.

In my opinion, this is the most important principle, and it is indirectly related to the rest of the GRASP principles. Currently, one of the most important software metrics is the ease of change. As architects and programmers, we must be ready for ever-changing requirements. This is not an optional, “nice to have” quality attribute – it is a “must-have” and our duty.

Fortunately, we are armed with a lot of design guidelines, principles, patterns and practices to support changes on different levels of abstraction. I will mention only a few (already beyond GRASP):

– SOLID principles, especially the Open-Closed principle (but all of them support change)
– Gang of Four (GoF) Design Patterns
– Encapsulation
– Law of Demeter
– Service Discovery
– Virtualization and containerization
– Asynchronous messaging, Event-Driven architectures
– Orchestration, Choreography

An iterative software development process is also more suitable today, because even if we are forced to change something once, we can draw conclusions and be prepared for future changes at a lower cost.

Fool me once shame on you. Fool me twice shame on me.

Summary

In this post I described one of the most fundamental Object-Oriented Design set of patterns and principles – GRASP.

Skilful management of responsibilities in software is the key to creating good-quality architecture and code. In combination with other patterns and practices, it is possible to develop well-crafted systems which support change and do not resist it. This is good, because the only thing that is certain is change. So be prepared.

Related posts

10 common broken rules of clean code

The Outbox Pattern

Introduction

Sometimes, when processing a business operation, you need to communicate with an external component in the Fire-and-forget mode. That component can be, for example:
– external service
– message bus
– mail server
– same database but different database transaction
– another database

Examples of this type of integration with external components:
– sending an e-mail message after placing an order
– sending an event about new client registration to the messaging system
– processing another DDD Aggregate in different database transaction – for example after placing an order to decrease number of products in stock

The question that arises is: are we able to guarantee the atomicity of our business operation from a technical point of view? Unfortunately, not always – and even if we can (using the 2PC protocol), it limits our system in terms of latency, throughput, scalability and availability. For details about these limitations, I invite you to read the article titled It’s Time to Move on from Two Phase Commit.

The problem I am writing about is presented below:
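A minimal reconstruction of that situation – the ICustomerRepository, IUnitOfWork and IEventBus abstractions and all names are illustrative assumptions:

```csharp
using System.Threading;
using System.Threading.Tasks;

// The commit and the publish are two separate steps - and that is the problem.
public class RegisterCustomerCommandHandler
{
    private readonly ICustomerRepository _customerRepository;
    private readonly IUnitOfWork _unitOfWork;
    private readonly IEventBus _eventBus;

    public RegisterCustomerCommandHandler(
        ICustomerRepository customerRepository,
        IUnitOfWork unitOfWork,
        IEventBus eventBus)
    {
        _customerRepository = customerRepository;
        _unitOfWork = unitOfWork;
        _eventBus = eventBus;
    }

    public async Task Handle(RegisterCustomerCommand command, CancellationToken cancellationToken)
    {
        var customer = new Customer(command.Email);
        await _customerRepository.AddAsync(customer, cancellationToken);

        // The database transaction is committed here...
        await _unitOfWork.CommitAsync(cancellationToken);

        // ...and only afterwards we try to publish the event. If the process crashes now,
        // or the event bus is unavailable, the event is lost although the data is committed.
        await _eventBus.Publish(new CustomerRegisteredEvent(customer.Email));
    }
}

public class RegisterCustomerCommand
{
    public RegisterCustomerCommand(string email) => Email = email;
    public string Email { get; }
}

public class CustomerRegisteredEvent
{
    public CustomerRegisteredEvent(string email) => Email = email;
    public string Email { get; }
}

public class Customer
{
    public Customer(string email) => Email = email;
    public string Email { get; }
}

public interface ICustomerRepository
{
    Task AddAsync(Customer customer, CancellationToken cancellationToken);
}

public interface IUnitOfWork
{
    Task CommitAsync(CancellationToken cancellationToken);
}

public interface IEventBus
{
    Task Publish(object @event);
}
```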

After the repository save, the transaction is committed. Then we want to send an event to the event bus, but unfortunately 2 bad things can happen:
– our system can crash just after the transaction commit and before sending the event
– the event bus can be unavailable at this moment, so the event cannot be sent

Outbox pattern

If we cannot provide atomicity or we don’t want to do that for the reasons mentioned above, what could we do to increase the reliability of our system? We should implement the Outbox Pattern.

Outbox pattern

The Outbox Pattern is based on the Guaranteed Delivery pattern and looks as follows:

Outbox pattern

When you save data as part of one transaction, you also save messages that you later want to process as part of the same transaction. The list of messages to be processed is called an Outbox, just like in e-mail clients.

The second element of the puzzle is a separate process that periodically checks the contents of the Outbox and processes the messages. After processing each message, the message should be marked as processed to avoid resending. However, it is possible that we will not be able to mark the message as processed due to a communication error with the Outbox:

Outbox messages processing

In this case, when the connection with the Outbox is recovered, the same message will be sent again. What does all this mean for us? The Outbox Pattern gives us At-Least-Once delivery. We can be sure that the message will be sent at least once, but it can be sent multiple times too! That’s why another name for this approach is Once-Or-More delivery. We should remember this and try to design the receivers of our messages to be Idempotent, which means:

In Messaging this concepts translates into a message that has the same effect whether it is received once or multiple times. This means that a message can safely be resent without causing any problems even if the receiver receives duplicates of the same message.

OK, enough theory – let’s see how we can implement this pattern in the .NET world.

Implementation

Outbox message

At the beginning, we need to define the structure of our OutboxMessage:
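A minimal sketch of such a class, assuming the payload is stored as JSON together with its type name:

```csharp
using System;

public class OutboxMessage
{
    public OutboxMessage(Guid id, DateTime occurredOn, string type, string data)
    {
        Id = id;
        OccurredOn = occurredOn;
        Type = type;
        Data = data;
    }

    public Guid Id { get; }

    // When the original Domain Event occurred.
    public DateTime OccurredOn { get; }

    // Fully qualified type name of the Notification Object - needed for deserialization.
    public string Type { get; }

    // The Notification Object serialized to JSON.
    public string Data { get; }

    // Note: ProcessedDate exists only in the Outbox table, not here - this class is used
    // solely to save the message as part of the transaction.
}
```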

Importantly, the OutboxMessage class is part of the Infrastructure and not the Domain Model! Try to talk with the business about the Outbox – they will think about the Outlook application instead of the messaging pattern. 🙂 I didn’t include the ProcessedDate property because this class is only needed to save the message as part of the transaction, so this property would always be NULL in this context.

Saving the message

I certainly do not want to write messages to the Outbox by hand in each Command Handler – that would be against the DRY principle. For this reason, the Notification Object described in the post about publishing Domain Events can be used. The following solution is based on the linked article with a small modification – instead of processing the notifications immediately, it serializes them and writes them to the database.

As a reminder, all Domain Events resulting from an action are processed as part of the same transaction. If the Domain Event should be processed outside of the ongoing transaction, you should define a Notification Object for it. This is the object which should be written to the Outbox. The code looks like:

Example of Domain Event:
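An illustrative shape (the CustomerRegisteredEvent details are assumed):

```csharp
using System;

public interface IDomainEvent
{
    DateTime OccurredOn { get; }
}

// Domain Event raised by the Customer aggregate after registration.
public class CustomerRegisteredEvent : IDomainEvent
{
    public CustomerRegisteredEvent(Guid customerId, string email)
    {
        CustomerId = customerId;
        Email = email;
        OccurredOn = DateTime.UtcNow;
    }

    public Guid CustomerId { get; }
    public string Email { get; }
    public DateTime OccurredOn { get; }
}
```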

And Notification Object:
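An illustrative Notification Object for the event above, assuming Json.NET for serialization and a simple IDomainEventNotification marker interface:

```csharp
using System;
using Newtonsoft.Json;

// Assumed marker interface for Domain Event Notifications (public events).
public interface IDomainEventNotification<out TEvent> { }

public class CustomerRegisteredNotification : IDomainEventNotification<CustomerRegisteredEvent>
{
    // First constructor: create the notification based on the Domain Event.
    public CustomerRegisteredNotification(CustomerRegisteredEvent domainEvent)
    {
        DomainEvent = domainEvent;
        CustomerId = domainEvent.CustomerId;
        Email = domainEvent.Email;
    }

    // Second constructor: used when deserializing the message back from JSON
    // during Outbox processing.
    [JsonConstructor]
    public CustomerRegisteredNotification(Guid customerId, string email)
    {
        CustomerId = customerId;
        Email = email;
    }

    [JsonIgnore]
    public CustomerRegisteredEvent DomainEvent { get; }

    public Guid CustomerId { get; }
    public string Email { get; }
}
```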

The first thing to note is the use of the Json.NET library. The second thing to note is the 2 constructors of the CustomerRegisteredNotification class. The first one is for creating a notification based on the Domain Event. The second one is for deserializing the message from a JSON string, which is presented in the following section about processing.

Processing the message

The processing of Outbox messages should take place in a separate process. However, instead of a separate process, we can also use the same process but another thread, depending on the needs. The solution presented below can be used in both cases.

At the beginning, we need a scheduler that will periodically run the Outbox processing. I do not want to create the scheduler myself (it is a known and solved problem), so I will use one of the mature solutions in .NET – Quartz.NET. The configuration of the Quartz scheduler is very simple:
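A minimal configuration sketch, assuming Quartz.NET 3.x and an Autofac-based job factory; AutofacJobFactory is illustrative and ProcessOutboxJob is sketched in the next section:

```csharp
using System.Threading.Tasks;
using Autofac;
using Quartz;
using Quartz.Impl;
using Quartz.Spi;

public static class OutboxProcessingStartup
{
    public static async Task StartAsync(ILifetimeScope container)
    {
        // 1. Create the scheduler using the factory.
        var schedulerFactory = new StdSchedulerFactory();
        var scheduler = await schedulerFactory.GetScheduler();

        // 2. Let Quartz resolve jobs (and their dependencies) from the IoC container.
        scheduler.JobFactory = new AutofacJobFactory(container);
        await scheduler.Start();

        // 3. Run the Outbox processing job every 15 seconds.
        var job = JobBuilder.Create<ProcessOutboxJob>().Build();
        var trigger = TriggerBuilder.Create()
            .StartNow()
            .WithSimpleSchedule(schedule => schedule
                .WithIntervalInSeconds(15)
                .RepeatForever())
            .Build();

        await scheduler.ScheduleJob(job, trigger);
    }
}

// Minimal job factory that resolves Quartz jobs from Autofac.
public class AutofacJobFactory : IJobFactory
{
    private readonly ILifetimeScope _lifetimeScope;

    public AutofacJobFactory(ILifetimeScope lifetimeScope)
    {
        _lifetimeScope = lifetimeScope;
    }

    public IJob NewJob(TriggerFiredBundle bundle, IScheduler scheduler) =>
        (IJob)_lifetimeScope.Resolve(bundle.JobDetail.JobType);

    public void ReturnJob(IJob job)
    {
        // Nothing to release in this simplified sketch.
    }
}
```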

Firstly, the scheduler is created using the factory. Then, a new instance of the IoC container for resolving dependencies is created. The last thing to do is to configure our job execution schedule. In the case above, the job is executed every 15 seconds, but its configuration really depends on how many messages you will have in your system.

This is what ProcessOutboxJob looks like:
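A minimal sketch of the job, assuming illustrative IOutbox and IEventsBus abstractions and the OutboxMessage class from the earlier sketch:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Newtonsoft.Json;
using Quartz;

[DisallowConcurrentExecution] // do not process the Outbox concurrently
public class ProcessOutboxJob : IJob
{
    private readonly IOutbox _outbox;
    private readonly IEventsBus _eventsBus;

    public ProcessOutboxJob(IOutbox outbox, IEventsBus eventsBus)
    {
        _outbox = outbox;
        _eventsBus = eventsBus;
    }

    public async Task Execute(IJobExecutionContext context)
    {
        // 1. Get all messages that have not been processed yet.
        var messages = await _outbox.GetUnprocessedMessages();

        foreach (var message in messages)
        {
            // 2. Deserialize the message back to its Notification Object.
            var type = Type.GetType(message.Type);
            var notification = JsonConvert.DeserializeObject(message.Data, type);

            // 3. Process the Notification Object (for example, send an event to the bus).
            await _eventsBus.Publish(notification);

            // 4. Mark the message as processed. If this step fails, the message
            //    will simply be processed again (At-Least-Once delivery).
            await _outbox.MarkAsProcessed(message.Id, DateTime.UtcNow);
        }
    }
}

// Assumed abstractions - names are illustrative.
public interface IOutbox
{
    Task<IReadOnlyCollection<OutboxMessage>> GetUnprocessedMessages();
    Task MarkAsProcessed(Guid messageId, DateTime processedDate);
}

public interface IEventsBus
{
    Task Publish(object notification);
}
```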

The most important parts are:
– the [DisallowConcurrentExecution] attribute means that the scheduler will not start a new instance of the job if another instance is still running. This is important because we don’t want to process the Outbox concurrently
– getting all messages to process
– deserializing each message to its Notification Object
– processing the Notification Object (for example, sending an event to the bus)
– setting the message as processed

As I wrote earlier, if there is an error between processing the message and setting it as processed, the job will simply try to process it again in the next iteration.

Notification handler template looks like this:

Finally, this is view of our Outbox:

Outbox view

Summary

In this post I described the problems with ensuring the atomicity of a transaction during business operation processing. I raised the topic of the 2PC protocol and the motivation not to use it. I presented what the Outbox Pattern is and how we can implement it. Thanks to this, our system can be much more reliable.

Source code

If you would like to see full, working example – check my GitHub repository.

Additional Resources

Refactoring Towards Resilience: Evaluating Coupling – Jimmy Bogard
Asynchronous message-based communication – Microsoft

Related posts

Domain Model Encapsulation and PI with Entity Framework 2.2
Simple CQRS implementation with raw SQL and DDD
How to publish and handle Domain Events