The Modular Monolith: A Practical Middle Ground

When you can skip microservices and still get most of the structure from a modular monolith.


In 2019, I worked on an exceptionally skilled team. The microservices hype was at its peak, and I was caught up in it too: I wanted to decompose new projects from the very beginning. The team did not want to start with microservices, though, and together we decided to start with a modular monolith. There were two main reasons. First, at the beginning of the project we did not yet know the exact boundaries of each microservice, and reshaping boundaries after they have been split into physical services is very painful. Second, our goal at that moment was only to bring some structure into the code, because the need to scale was still far away.

You don't need independent services initially; you need clear boundaries and a design that lets you move modules into separate services later. A modular monolith gives you these boundaries without forcing you to build a distributed system. As a result, you also avoid the long list of network problems that come with one, such as timeouts, retries, and half-committed writes across several services.

What it actually is

So what does it look like in practice? You have one application that you build once, deploy as one piece, and run as one process. Inside, the code is split into loosely coupled modules. Every module keeps a clear border and holds its internal logic private, so other modules cannot reach inside and touch it. Modules talk only through contracts that you define, not through shared globals that anybody can grab. And most importantly, every module owns its own data, even when all of it sits in the same database.

This last point about data ownership is the most important one, and I will return to it.

The cost microservices add

Distributed systems trade one set of problems for another that is bigger and more expensive. Calls that used to be in-process now go over the network, so they add latency. On top of that you get distributed transactions, service discovery, load balancing, and a separate deployment pipeline for each service. Tracing must also cross the boundaries between services. And then there is data consistency, which is the part people underestimate the most.
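To make that cost concrete, here is a minimal sketch of what a single cross-module call tends to grow into once it crosses the network. The helper names `withTimeout` and `callWithRetry`, and the default attempt count, timeout, and delay, are my assumptions for illustration and do not come from any particular library.

```typescript
// Hypothetical helpers: all names and default values are illustrative.
type RemoteCall<T> = () => Promise<T>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Rejects if the call does not settle within `ms` milliseconds.
function withTimeout<T>(call: RemoteCall<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    call().then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Retries a failing call, waiting a fixed delay between attempts.
async function callWithRetry<T>(
  call: RemoteCall<T>,
  { attempts = 3, timeoutMs = 1000, delayMs = 100 } = {},
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await withTimeout(call, timeoutMs);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) await sleep(delayMs);
    }
  }
  throw lastError;
}
```

Inside a monolith the same call is a plain function call, and none of this wrapping is needed. Note that a retry like this is only safe when the remote operation is idempotent, which is one more thing you have to design for.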

A modular monolith gives you the same organizational benefit, meaning clear ownership and separated concerns, without this operational cost.

There is a second benefit that is easy to overlook. A well-structured modular monolith is much easier to split into services later than a tangled one. In other words, you buy yourself time, and you use it to learn where the real boundaries are before you lock them in production.

Project structure

Imagine you are building an e-commerce platform. Instead of one flat directory of controllers, services, and models, you organize the code by domain:

src/
├── modules/
│   ├── catalog/
│   │   ├── catalog.module.ts
│   │   ├── catalog.service.ts
│   │   ├── catalog.repository.ts
│   │   └── catalog.types.ts
│   ├── orders/
│   │   ├── orders.module.ts
│   │   ├── orders.service.ts
│   │   ├── orders.repository.ts
│   │   └── orders.types.ts
│   ├── payments/
│   │   ├── payments.module.ts
│   │   ├── payments.service.ts
│   │   ├── payments.repository.ts
│   │   └── payments.types.ts
│   └── users/
│       ├── users.module.ts
│       ├── users.service.ts
│       ├── users.repository.ts
│       └── users.types.ts
├── shared/
│   ├── events.ts
│   └── types.ts
└── app.ts

Each module should expose a public API and hide everything else from the outside. But the folder structure by itself enforces nothing, and a boundary that nobody enforces is not really a boundary. So each module exports only what it wants to expose:

// modules/catalog/catalog.module.ts

// ...imports and the wiring of the service, I cut it here

// this object is the public API, nothing else leaves the module
export const catalogModule = {
  getProduct: (id: string) => service.getProductById(id),
  listProducts: (filters: ProductFilters) => service.listProducts(filters),
  onProductUpdated: service.productUpdatedEvent,
};

export type { Product, ProductFilters } from './catalog.types';

The orders module can call catalogModule.getProduct(), but it is not allowed to import CatalogRepository directly.
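The wiring I cut from the snippet above could look roughly like this. It is a minimal sketch with an in-memory repository; the `createCatalogModule` factory, the `InMemoryCatalogRepository` class, and the `title` field are hypothetical and exist only to make the example runnable.

```typescript
// Illustrative sketch: the factory, the in-memory repository, and the Product
// fields are assumptions, not the real catalog module.
interface Product {
  id: string;
  title: string;
  price: number;
}

// Internal: never exported from the module file.
class InMemoryCatalogRepository {
  private readonly products = new Map<string, Product>();

  save(product: Product): void {
    this.products.set(product.id, product);
  }

  findById(id: string): Product | null {
    return this.products.get(id) ?? null;
  }
}

// Internal: business logic that other modules cannot reach directly.
class CatalogService {
  constructor(private readonly repository: InMemoryCatalogRepository) {}

  async getProductById(id: string): Promise<Product | null> {
    return this.repository.findById(id);
  }
}

// The repository and the service live only inside this closure.
// Callers receive a plain object of functions and nothing else.
function createCatalogModule(seed: Product[] = []) {
  const repository = new InMemoryCatalogRepository();
  seed.forEach((product) => repository.save(product));
  const service = new CatalogService(repository);

  return {
    getProduct: (id: string) => service.getProductById(id),
  };
}

const catalogModule = createCatalogModule([{ id: 'p1', title: 'Mug', price: 12 }]);
```

Because the returned object holds only functions, there is no property through which another module could get hold of the repository, even by accident.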

An ESLint rule blocks cross-module internal imports before they reach review. It catches the typical "I will just import the repository directly, it is faster" shortcut as soon as the linter runs on the pull request:

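// eslint.config.js
export default [
  {
    rules: {
      'no-restricted-imports': ['error', {
        patterns: [
          {
            // Patterns use gitignore syntax, so exceptions are written as
            // separate "!" entries rather than an extglob like !(a|b).
            // They match import paths that contain a modules/ segment; purely
            // relative imports such as '../catalog/x' need an extra pattern.
            group: [
              '**/modules/*/*',
              '!**/modules/*/*.module',
              '!**/modules/*/*.types',
            ],
            message: 'Import from the module file, not internal files.',
          },
        ],
      }],
    },
  },
];

How modules talk to each other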

Modules still need to communicate with each other, and there are two ways to do it. A direct call is what you use when you need an answer back from the other module. An event is what you use when you only want to announce that something happened and you do not wait for any response.

Let's take the direct call first. One module asks another a question and waits for the answer:

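// modules/orders/orders.service.ts

import { catalogModule } from '../catalog/catalog.module';
// Assumed export name; the repository is injected so that `this.repository` exists.
import { OrdersRepository } from './orders.repository';

class OrdersService {
  constructor(private readonly repository: OrdersRepository) {}

  async createOrder(userId: string, productId: string, quantity: number) {
    // Direct call: orders needs the answer before it can continue.
    const product = await catalogModule.getProduct(productId);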

    if (!product) {
      throw new Error(`Product ${productId} not found`);
    }

    return this.repository.create({
      userId,
      productId,
      quantity,
      totalPrice: product.price * quantity,
    });
  }
}

Events are the second way. A module sends a notification and does not know or care who is listening:

// shared/events.ts

type EventHandler<T> = (payload: T) => void | Promise<void>;

export class EventBus {
  private handlers = new Map<string, EventHandler<any>[]>();

  on<T>(event: string, handler: EventHandler<T>) {
    const existing = this.handlers.get(event) || [];
    this.handlers.set(event, [...existing, handler]);
  }

  async emit<T>(event: string, payload: T) {
    const handlers = this.handlers.get(event) || [];
    // the real bus also catches errors per handler, here I keep it minimal
    await Promise.all(handlers.map((h) => h(payload)));
  }
}

export const eventBus = new EventBus();

// modules/orders/orders.service.ts
import { eventBus } from '../../shared/events';

// After creating an order:
await eventBus.emit('order.created', { orderId, userId, productId, quantity });

// modules/payments/payments.module.ts
import { eventBus } from '../../shared/events';
import { PaymentsService } from './payments.service';

const service = new PaymentsService();

eventBus.on('order.created', async (order) => {
  await service.initiatePayment(order);
});

The orders module doesn't know that the payments module is listening. If you later extract payments into a separate service, you replace the in-process event bus with a message broker, and the internal code of the module stays the same.
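One way to keep that swap cheap is to let module code depend on a small publishing interface instead of a concrete bus. The following is only a sketch under my own assumptions: `EventPublisher`, `InProcessPublisher`, `BrokerPublisher`, and `BrokerClient` are hypothetical names and do not mirror the API of any real broker library.

```typescript
// Sketch only: every name here is hypothetical.
interface EventPublisher {
  publish(eventName: string, payload: unknown): Promise<void>;
}

type Listener = (payload: unknown) => void | Promise<void>;

// Today: listeners run inside the same process.
class InProcessPublisher implements EventPublisher {
  private readonly listeners = new Map<string, Listener[]>();

  subscribe(eventName: string, listener: Listener): void {
    const existing = this.listeners.get(eventName) ?? [];
    this.listeners.set(eventName, [...existing, listener]);
  }

  async publish(eventName: string, payload: unknown): Promise<void> {
    const listeners = this.listeners.get(eventName) ?? [];
    await Promise.all(listeners.map((listener) => listener(payload)));
  }
}

// Later: the same event is serialized and handed to some broker client.
interface BrokerClient {
  send(topic: string, body: string): Promise<void>;
}

class BrokerPublisher implements EventPublisher {
  constructor(private readonly client: BrokerClient) {}

  async publish(eventName: string, payload: unknown): Promise<void> {
    await this.client.send(eventName, JSON.stringify(payload));
  }
}

// Module code sees only the interface, so it does not change on extraction.
async function announceOrderCreated(publisher: EventPublisher, orderId: string): Promise<void> {
  await publisher.publish('order.created', { orderId });
}
```

The one thing that does change is the payload: once it is serialized, it has to be plain data, so it is worth keeping event payloads free of class instances and functions from the start.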

Owning data without separate databases

You don't need a separate database for each module. A separate schema, or even a simple table-ownership convention, is already enough:

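// modules/catalog/catalog.repository.ts

import type { Product } from './catalog.types';
// `db` (the database client) and `toProduct` (the row mapper) are left out for brevity.
// This sketch assumes `db.query` resolves to a single row, or null when nothing matches.

class CatalogRepository {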
  // This module owns these tables. No other module touches them.
  private readonly TABLES = {
    products: 'catalog_products',
    categories: 'catalog_categories',
  } as const;

  async getById(id: string): Promise<Product | null> {
    const row = await db.query(
      `SELECT * FROM ${this.TABLES.products} WHERE id = $1`,
      [id]
    );
    return row ? this.toProduct(row) : null;
  }
}

Prefix each table with the name of the module that owns it: catalog owns the catalog_* tables, orders owns orders_*. When one module needs data from another module, it goes through the public API, never through a cross-module SELECT.

The first time someone joins directly against the tables of another module, you are already rebuilding the tangled monolith you were trying to avoid. So be strict about this rule.
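As an illustration of going through the public API instead of a JOIN, here is a sketch of how the orders module could attach product titles to its own rows. The `CatalogApi` interface, its `getProducts` batch method, and the row shapes are assumptions I made for the example.

```typescript
// Sketch only: the interface, the batch method, and the row shapes are assumed.
interface ProductSummary {
  id: string;
  title: string;
}

interface OrderRow {
  orderId: string;
  productId: string;
  quantity: number;
}

interface OrderView extends OrderRow {
  productTitle: string | null;
}

// The slice of the catalog's public API that orders relies on.
interface CatalogApi {
  getProducts(ids: string[]): Promise<ProductSummary[]>;
}

// Orders reads only its own rows, then asks catalog for the rest in one batch.
async function buildOrderViews(orders: OrderRow[], catalog: CatalogApi): Promise<OrderView[]> {
  const uniqueIds = Array.from(new Set(orders.map((order) => order.productId)));
  const products = await catalog.getProducts(uniqueIds);
  const titlesById = new Map(products.map((product) => [product.id, product.title] as const));

  return orders.map((order) => ({
    ...order,
    // A product may have been removed from the catalog since the order was placed.
    productTitle: titlesById.get(order.productId) ?? null,
  }));
}
```

This is more code than a JOIN and usually a bit slower, and that is the price of keeping the catalog tables private. Asking for all ids in one batch at least avoids one call per order.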

When you finally pull a module out

There are real reasons to extract a module into a separate service. The first is independent scaling: one module needs much more compute than the others, and you don't want to scale the whole application because of it. The second is a different runtime, for example a module that needs a GPU, another programming language, or its own release schedule. The third is team size: on larger teams the coordination overhead itself becomes the bottleneck, and a separate deployable for each team can be worth the cost. The fourth is fault isolation, when failures in one module constantly take down unrelated features.

If you have already done the modular-monolith work, the extraction is mostly mechanical. The module already has a defined interface. You put a network boundary where the function-call boundary was before. You replace the event bus with a message queue. And most of the work is done.
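Here is a rough sketch of what "putting a network boundary where the function-call boundary was" can mean in code. Both implementations satisfy the same contract, so callers do not change. The `CatalogClient` interface, the `/products/:id` URL layout, and the injected `HttpGet` function are assumptions for illustration, not a description of a real service.

```typescript
// Sketch only: the contract name, the URL layout, and HttpGet are assumed.
interface Product {
  id: string;
  price: number;
}

interface CatalogClient {
  getProduct(id: string): Promise<Product | null>;
}

// Before extraction: a plain in-process lookup.
class InProcessCatalogClient implements CatalogClient {
  constructor(private readonly products: Map<string, Product>) {}

  async getProduct(id: string): Promise<Product | null> {
    return this.products.get(id) ?? null;
  }
}

// Minimal shape of the HTTP function the client needs.
// In production this could be (url) => fetch(url).
type HttpGet = (url: string) => Promise<{
  status: number;
  ok: boolean;
  json(): Promise<unknown>;
}>;

// After extraction: the same contract, but the call crosses the network.
class HttpCatalogClient implements CatalogClient {
  constructor(
    private readonly baseUrl: string,
    private readonly httpGet: HttpGet,
  ) {}

  async getProduct(id: string): Promise<Product | null> {
    const response = await this.httpGet(`${this.baseUrl}/products/${encodeURIComponent(id)}`);
    if (response.status === 404) return null;
    if (!response.ok) throw new Error(`Catalog service responded with ${response.status}`);
    return (await response.json()) as Product;
  }
}

// Callers depend on the contract and cannot tell which implementation they got.
async function priceOf(catalog: CatalogClient, id: string, quantity: number): Promise<number> {
  const product = await catalog.getProduct(id);
  if (!product) throw new Error(`Product ${id} not found`);
  return product.price * quantity;
}
```

The sketch leaves out timeouts, retries, and response validation on purpose. Those are exactly the costs from the beginning of this article, and they arrive together with the network boundary.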

A few things to keep in mind before you start. Enforce the boundaries with linting and review from the very first commit; do not wait. Build the event bus early too, even while nothing needs it yet. Retrofitting these two things into a big codebase later is exactly the slow and painful part.