Software Architecture - The Hard Parts [Chapter 9] Data Ownership and Distributed Transactions [Part 2]
![Software Architecture - The Hard Parts [Chapter 9] Data Ownership and Distributed Transactions [Part 2]](https://cdn.hashnode.com/res/hashnode/image/upload/v1730554756504/918e3b25-5e0d-449f-86b4-874b706e6958.jpeg)
Hey folks! This is the second part of the previous article, where we discussed data ownership and which data belongs to which service. In this part we’ll dive into distributed transactions and talk specifics. If you didn’t read the first part, make sure you do so here. Let’s dive in.
Introduction
When architects think about transactions, they usually think of a single atomic unit of work in which multiple database updates are either committed together or all rolled back when an error occurs. ACID is an acronym describing the basic properties of such an atomic, single-unit-of-work database transaction: atomicity, consistency, isolation, and durability.
Let’s first briefly talk about ACID before moving on to distributed transactions.
ACID Properties
Atomicity means a transaction must either commit or rollback all of its updates in a single unit of work. All updates are treated as a collective whole so all changes either get committed or rolled back as one unit.
Consistency means that during the course of the transaction the database is never left in an inconsistent state and never violates any integrity constraints specified in the database.
Isolation refers to the degree to which individual transactions interact with each other. It prevents uncommitted transaction data from being visible to other transactions during the course of the business request.
Durability means that once a successful response from a transaction commit occurs, it is guaranteed that all the data updates are permanent regardless of further system failures.
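To make these properties concrete, here is a minimal sketch in Python using the standard library’s `sqlite3` module. The `accounts` table, names, and amounts are hypothetical; the point is that both updates inside the transaction commit together or roll back as one unit.

```python
import sqlite3

# In-memory database for illustration; the table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates commit together or roll back."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            # A consistency check: violating it aborts the whole unit of work.
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
    except ValueError:
        pass  # transaction rolled back; neither update is visible

transfer(conn, "alice", "bob", 30)   # succeeds
transfer(conn, "alice", "bob", 500)  # fails: both updates are rolled back
```

Note that the failed transfer leaves both balances exactly as they were: that is atomicity and consistency working together within a single database.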
ACID can exist within the context of each service in a distributed architecture, provided the corresponding database supports ACID properties: each service can perform its own commits and rollbacks on the tables it owns within the scope of the atomic business transaction. However, if the business request spans multiple services, the entire request cannot be a single ACID transaction; instead, it becomes a distributed transaction.
Distributed transactions occur when an atomic business request containing multiple database updates is performed by separately deployed remote services.
Distributed transactions DO NOT SUPPORT ACID properties:
Atomicity is not supported because each service commits its own data and performs only one part of the overall business request, so atomicity is bound to the service, not the entire request.
Consistency is not supported because a failure in one service leaves the data out of sync across the tables involved in the business request.
Isolation is not supported because each service commits its part of the transaction immediately, so data written as part of an in-flight distributed transaction is visible to other reads before the overall business request completes.
Durability is not supported at the business-request level because, as mentioned before, it is per service: multiple databases are involved and anything could go wrong in any of them. Each individual service, however, still provides durability for its own commits.
Distributed Transactions
Instead of ACID, distributed transactions support something called BASE.
Standing in contrast to ACID, BASE stands for:
Basic availability
Soft State
Eventual Consistency
Basic availability means all the involved services are expected to be available when a distributed transaction is pending.
Soft State describes the situation where a distributed transaction is in progress and the state of the atomic business request is not yet complete (or in some cases not even known). In other words, some services have committed while others have not, or there is a wait until we get an acknowledgment that everything has worked (or not).
Eventual consistency means that, given enough time, all parts of the distributed transaction will complete successfully and all of the data will be in sync with one another.
Moving on we’ll now talk about the patterns involved in eventual consistency and the caveats of each.
Eventual Consistency
Distributed architectures rely heavily on eventual consistency as a trade-off for better operational characteristics such as performance, scalability, elasticity, fault tolerance, and availability. There are numerous ways to achieve eventual consistency, but there are three main patterns in use today:
Background synchronization pattern
Orchestrated request-based pattern
Event based pattern
Let’s dive into them.
Background synchronization pattern
The background synchronization pattern uses a separate or external service or process to periodically check the data sources and keep them in sync with one another. The time for data to become eventually consistent depends on how the background process runs: a batch job running at night, a job running every hour, and so on.
This pattern has the longest time for data to become consistent. However, in some cases the data doesn’t need to be in sync at this very moment.
One of the challenges of this pattern is that the process must know which data has changed, which can be determined in different ways, such as querying the source tables, using a database trigger, or reading an event stream. Most importantly, it must have knowledge of all tables and data sources involved in the transaction.
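A minimal sketch of the idea, using in-memory dictionaries as stand-ins for each service’s database (the table shapes, a payment and a ticket store, and the timestamp-based change detection are all hypothetical choices for illustration):

```python
# Hypothetical in-memory "databases" standing in for each service's data store.
payment_db = {1: {"status": "PAID", "updated_at": 100}}
ticket_db = {}       # the out-of-sync data source the job keeps aligned
last_sync_at = 0     # high-water mark: only rows changed since then are read

def background_sync():
    """Periodic job: find rows changed since the last run and push them across.

    Note the coupling tradeoff: this job needs read access to the payment
    tables AND write access to the ticket tables, so it shares ownership
    of those tables with the services that own them.
    """
    global last_sync_at
    changed = {k: v for k, v in payment_db.items()
               if v["updated_at"] > last_sync_at}
    for row_id, row in changed.items():
        ticket_db[row_id] = {"payment_status": row["status"]}
    last_sync_at = max((v["updated_at"] for v in payment_db.values()),
                       default=last_sync_at)

# In production this would run on a schedule (nightly batch, hourly job, etc.):
background_sync()
```

The `updated_at` high-water mark is just one of the change-detection options mentioned above; a database trigger or an event stream would replace the timestamp comparison.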
As efficient as this pattern is, it has some serious tradeoffs:
All of the data sources are coupled together, breaking the bounded context between the data and the services. The background job needs access to the different databases to carry out the synchronization, including write access, which means the background process has shared ownership of the tables with the services that own them.
It might lead to duplicate business logic, because what the background job does may already be implemented in the services responsible for each table.
This pattern isn’t suitable for distributed architectures requiring tight bounded contexts (such as microservices), where the coupling between data ownership and functionality is a critical part of the architecture.
Orchestrated request-based pattern
A common approach for managing distributed transactions is to make sure all of the data sources are in sync during the course of the business request (while the end user is waiting).
This pattern attempts to process the entire business transaction during the course of the business request. Therefore requiring some sort of orchestrator to manage the distributed transaction.
The orchestrator is responsible for managing all of the work needed to process the request, including knowledge of the business process, knowledge of the participants involved, multicasting logic, error handling and contract ownership.
One common way to implement this is to designate a primary service to manage the distributed transaction. Although this approach avoids the need for a separate orchestration service, it tends to overload the designated service: in addition to acting as the orchestrator of the distributed transaction, it must also perform its own responsibilities. This approach also leads to tight coupling and synchronous dependencies between services.
Using a dedicated orchestration service for the business request is a better approach here.
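A minimal sketch of a dedicated orchestrator, including the compensating transaction it must issue when a later step fails. The services, their methods, and the order/payment/inventory domain are hypothetical; real participants would be remote calls rather than local objects.

```python
class ServiceError(Exception):
    pass

# Hypothetical participant services; real ones would be remote REST/gRPC calls.
class PaymentService:
    def __init__(self):
        self.charges = []
    def charge(self, order_id):
        self.charges.append(order_id)
    def refund(self, order_id):              # the compensating operation
        self.charges.remove(order_id)

class InventoryService:
    def __init__(self, stock):
        self.stock = stock
    def reserve(self, order_id):
        if self.stock <= 0:
            raise ServiceError("out of stock")
        self.stock -= 1

def place_order(order_id, payment, inventory):
    """Dedicated orchestrator: runs each step while the caller waits, and
    issues a compensating transaction when a later step fails."""
    payment.charge(order_id)
    try:
        inventory.reserve(order_id)
    except ServiceError:
        payment.refund(order_id)   # reverse the work already committed
        return "FAILED"
    return "COMPLETED"

payment, inventory = PaymentService(), InventoryService(stock=1)
place_order("o1", payment, inventory)  # succeeds
place_order("o2", payment, inventory)  # inventory fails, payment is refunded
```

Notice how the error handling grows with each participant: every step needs a matching compensating call, and (as the tradeoffs below point out) the compensation itself can also fail.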
As efficient as this pattern is, it has some serious tradeoffs:
Favors consistency over overall responsiveness.
Really complex error handling: if one service fails, you have to reverse what was performed on the others (a compensating transaction).
Failures might occur even during compensation, leaving data out of sync and requiring human intervention to repair.
Event based pattern
This pattern is one of the most popular and reliable eventual consistency patterns for modern distributed architectures. Events are used in conjunction with an asynchronous publish-and-subscribe messaging model to post events or command messages to a topic or event stream; the services involved in the transaction listen for these events and respond to them.
The time to achieve data consistency is usually short because of the parallel and decoupled nature of asynchronous message processing. Services are highly decoupled from one another, and responsiveness is good because the service triggering the eventual consistency doesn’t have to wait for the data synchronization to finish before returning a response to the customer.
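A tiny sketch of the publish-and-subscribe shape, with an in-memory `Topic` class standing in for a real broker topic (the class, the email-changed event, and the two subscriber stores are all hypothetical):

```python
class Topic:
    """Tiny in-memory stand-in for a message broker topic."""
    def __init__(self):
        self.subscribers = []
    def subscribe(self, handler):
        self.subscribers.append(handler)
    def publish(self, event):
        for handler in self.subscribers:
            handler(event)

# Hypothetical services keeping their own data in sync by reacting to events.
profile_db, contract_db = {}, {}
email_changed = Topic()
email_changed.subscribe(lambda e: profile_db.update({e["user"]: e["email"]}))
email_changed.subscribe(lambda e: contract_db.update({e["user"]: e["email"]}))

# The originating service publishes and returns immediately; subscribers
# converge on the new value without the publisher waiting for them.
email_changed.publish({"user": "u1", "email": "new@example.com"})
```

With a real broker the subscribers run asynchronously and in parallel, which is where the short consistency time and the loose coupling come from.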
The main tradeoff here is that failure handling becomes complex: what happens if a consumer fails while processing a message? Most brokers will attempt to deliver the message a number of times, and after repeated failures they will send it to a dead letter queue, which is either processed automatically or requires human intervention.
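The retry-then-dead-letter behavior can be sketched like this (the retry count, the `deliver` helper, and the always-failing consumer are hypothetical stand-ins for what a real broker does internally):

```python
from collections import deque

MAX_ATTEMPTS = 3
dead_letter_queue = deque()

def deliver(message, handler):
    """Broker-style delivery: retry a failing consumer, then dead-letter."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            handler(message)
            return True
        except Exception:
            continue  # redeliver the message
    dead_letter_queue.append(message)  # repaired automatically or by a human
    return False

def flaky_consumer(message):
    raise RuntimeError("downstream unavailable")

deliver({"id": 1}, flaky_consumer)
```

Anything sitting in the dead letter queue represents data that is still out of sync, which is exactly the soft-state window the BASE model describes.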
Summary
In distributed systems, traditional ACID transactions don’t work due to multiple services each handling their own data, leading to distributed transactions that rely on the BASE model: Basic Availability, Soft State, and Eventual Consistency. To handle eventual consistency, three main patterns exist:
Background Synchronization: Runs periodic jobs to sync data across services but can cause delays and duplicate business logic.
Orchestrated Request-Based Pattern: Uses a central orchestrator to ensure all data is consistent during a request, often with complex error handling but favors consistency.
Event-Based Pattern: Services respond to asynchronous events, allowing for quick, decoupled syncing. Failures may result in dead letter queues, needing human intervention at times.
Each pattern has its tradeoffs in terms of consistency, speed, and complexity.
That’s it for this chapter and watch out for the next one where we’ll be talking about distributed data access. Hope you enjoyed!