Decoupling a monolithic PHP application: a practical example

Decoupling a monolith is not a rare problem. It has cropped up in most of the companies I’ve worked at. This happens because so-called decision debt accumulates in the early stages of any startup. As a result, the chosen architecture is optimal for rapid development and experimentation, but not for a mature product environment.

Considering that this problem appears often, there are plenty of books and articles on how to approach it, but I feel like there are not enough practical examples. Therefore, I will focus on the technical details of how it looks, using a PHP application powered by Symfony as an example. Some of these insights and approaches have been utilized when developing our Lokalise TMS.

    Before we begin

    Before we start, what do we mean by “decoupling a monolith”? What is our end goal? To establish this, let’s consider which business goals might drive engineering to tackle this problem. For instance:

    1. Slow development speed. This happens due to highly coupled code, which includes:
      • The code that is too complicated (there is an important difference between complex and complicated code).
      • Parts of the code being too dependent, which leads to a higher chance of code conflicts and higher difficulty in establishing a proper CI/CD process.
      • Technological shifts (e.g., upgrading or migrating to another framework) are nearly impossible due to high time and effort costs.
    2. The monolith being a single point of failure (SPOF), which makes it a business risk.
    3. Horizontal scaling of the application is more challenging.

    Considering the above, we can state that our end goal might be to migrate to a better application architecture to improve development speed, as well as the scalability potential of the application, and to lower infrastructural risks.

    Now that we have the problem and the goal established, let’s figure out the high-level plan.

    Example application

    I will be using a food delivery application as an example. For the sake of simplicity, it will not contain any real logic but will include several cross-service and database calls. To quickly grasp the idea of the application, take a look at these diagrams:

    The source code for the initial version of the sample app can be found on GitHub: https://github.com/ilyachase/monolith-decoupling-example

    High-level plan

    Now, let’s focus on what exactly we call “coupled code” in terms of monolith architecture. First, even though domains are separated into so-called service classes in our code, they communicate with each other through direct method calls. Moreover, even if we change the way services communicate with each other, we still won’t be able to extract the code to a separate service because it is using classes from other domains directly (through the use keyword). Finally, all service classes have access to all data in the database, meaning there are no boundaries between domains in terms of data.

    In real-world applications, even if we conceptually understand the required changes, separating the domains takes a lot of code and effort. So, we have to make the process iterative and predictable. For that purpose, we will introduce preliminary steps. One of them is turning our monolith into a modular monolith.

    So, let’s introduce the definition of a module. We will call a part of our code a module when it follows two rules:

    1. Code separation. Modules do not use classes of other modules directly. When one module needs to call another module, a service client is used.
    2. Data separation. Each module uses its own database.

    Let’s take a bit of a deep dive:

    • Modules do not use classes of other modules directly. To enforce this rule, we will be using a library called Deptrac. The way it works is pretty simple: you need to create a configuration file (deptrac.yaml by default), define modules, and add the library executable to your CI/CD pipeline. It can be GitHub Actions or anything similar. The important part is that the check should be required (meaning it should not be possible to merge a PR while this action is failing).
    • When one module needs to call another module, a service client is used. The idea of this is simple: switching straight from monolith to services is usually a big leap, so we are preparing our code by using sub requests instead of real HTTP requests. The service client is just a helper class that forms a sub request, sends it to the corresponding module, and returns a response. We will look into the implementation of it soon.
    • Each module uses its own database. Depending on your application, you might opt for different database patterns, and “database per service” is just one of them. We will use it in our example because it’s one of the most common ones.

    Considering the above, let’s observe the stages of a product during migration to a better architecture:

    |                      | Big ball of mud            | Modular monolith                                     | Service oriented                                       | Event driven* |
    |----------------------|----------------------------|------------------------------------------------------|--------------------------------------------------------|---------------|
    | Communication method | Direct method calls        | Direct method calls of module’s service API          | HTTP requests                                          | Async messages |
    | Characteristics      | High coupling, low cohesion | Low coupling, high cohesion enforced by convention  | Low coupling, high cohesion enforced by API contract   | Coupling further decreased by relying on messages without defined single recipient instead of direct API calls |
    | Data storage         | Monolith database          | Database per module                                  | Database per service                                   | Database per service |

    Product architecture improvement path

    At the last stage, we won’t implement a full-fledged implementation of the event-driven architecture with things like event streams, bounded context models, the outbox pattern, and so on (hence the asterisk). However, we will change interservice communication to async messages because this is a typical change for applications that undergo this kind of architecture evolution and therefore worth showing in the example.

    Implementation

    Before we start, I cannot overstate the importance of test coverage. We are not going to focus on this part in our example application, but in real-world applications, the first step before any architectural changes should be creating a layer of tests that either work on near-HTTP level (e.g., application tests in Symfony) or are real E2E tests.

    From a big ball of mud to a modular monolith

    Grouping files

    First, we should move the files to the appropriate directories, without untangling the dependencies yet. Let’s look at the current file structure:

    src/
        Controller/
            CourierApiController.php
            CustomerApiController.php
        Dto/
            ChangeDeliveryStatusRequest.php
            CreateOrderRequest.php
        Entity/
            Delivery.php
            Order.php
            Restaurant.php
        Repository/
            DeliveryRepository.php
            OrderRepository.php
            RestaurantRepository.php
        Service/
            CourierService.php
            CustomerService.php
            RestaurantService.php

    We will create a new directory level representing each of our modules, and move the rest of the files to the Common domain:

    src/
        Customer/   <-- module level directory
            Controller/
                CustomerApiController.php
            Dto/
                CreateOrderRequest.php
            Entity/
                Order.php
            Repository/
                OrderRepository.php
            Service/
                CustomerService.php
        Restaurant/   <-- module level directory
            Entity/
                Restaurant.php
            Repository/
                RestaurantRepository.php
            Service/
                RestaurantService.php
        Courier/      <-- module level directory
            Controller/
                CourierApiController.php
            Dto/
                ChangeDeliveryStatusRequest.php
            Entity/
                Delivery.php
            Repository/
                DeliveryRepository.php
            Service/
                CourierService.php
        Common/
            Exception/
                EntityNotFoundException.php

    Tip: If your IDE supports PSR namespaces, you can leverage it to move classes between namespaces, and it should fix references. For example, PhpStorm supports it out of the box if you synchronize your IDE settings with Composer.

    Also, we have to slightly adjust the Symfony configs to support our new structure:

    # config/packages/doctrine.yaml
             naming_strategy: doctrine.orm.naming_strategy.underscore_number_aware
             auto_mapping: true
             mappings:
    -            App:
    +            App\Courier:
    +                is_bundle: false
    +                dir: '%kernel.project_dir%/src/Courier/Entity'
    +                prefix: 'App\Courier\Entity'
    +                alias: App\Courier
    +            App\Customer:
                     is_bundle: false
    -                dir: '%kernel.project_dir%/src/Entity'
    -                prefix: 'App\Entity'
    -                alias: App
    +                dir: '%kernel.project_dir%/src/Customer/Entity'
    +                prefix: 'App\Customer\Entity'
    +                alias: App\Customer
    +            App\Restaurant:
    +                is_bundle: false
    +                dir: '%kernel.project_dir%/src/Restaurant/Entity'
    +                prefix: 'App\Restaurant\Entity'
    +                alias: App\Restaurant
     
    
    # config/routes.yaml
    -controllers:
    +courier_controllers:
         resource:
    -        path: ../src/Controller/
    -        namespace: App\Controller
    +        path: ../src/Courier/Controller/
    +        namespace: App\Courier\Controller
    +    type: attribute
    +customer_controllers:
    +    resource:
    +        path: ../src/Customer/Controller/
    +        namespace: App\Customer\Controller
         type: attribute

    Tip: In real-world applications, since this step is technically just moving files, it can be split into multiple small PRs that are convenient to release, so that the effort is iterative and predictable.

    Splitting the databases

    Now we need to figure out natural boundaries between the parts of our application. They are pretty obvious in our example application (Customer, Restaurant, and Courier domains). Still, in real-world scenarios, we can use either common sense or domain-driven design (as a more advanced approach). One thing to keep in mind is to avoid services that are too small, because they come with a maintenance cost. As stated in Google’s article: “We recommend that you create larger services instead of smaller services until you thoroughly understand the domain.”

    Once the boundaries are decided, the next step is to split our single database into separate ones, one per module. In the case of our example application, how we’ll split the databases is quite obvious:

    • customer database, which will contain the order table.
    • restaurant database, which will contain the restaurant table.
    • courier database, which will contain the delivery table.

    In real-world applications, it is often not that straightforward and will require more effort to decide the ownership of the tables between modules. However, regardless of the scale of the application, the technical part remains the same — the trick is to perform a so-called “hot migration”, where we inject another database connection into our code and write into both databases while reading from the old one. In parallel, we need to run a script to migrate the rest of the data from the old table to the new one. In this article, we won’t focus on the implementation details of this part, as it’s worth an article of its own, but the typical algorithm is:

    1. Find all usages of the table under migration within the code.
    2. For reading, leave the old connection.
    3. For writing, send queries to both databases.
    4. Meanwhile, implement and run a script that will migrate the rest of the data from the old table to the new one.
    5. After the data migration is done, release a PR that will use only the new connection for everything, clean up the code, and delete the old table.
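Steps 2 to 4 of the algorithm above can be sketched as a small wrapper around two storages. This is only an illustration under assumptions: OrderStore, InMemoryOrderStore, DualWriteOrderStore, and backfill are hypothetical names that do not exist in the example repository, and the in-memory arrays stand in for the old and new databases.

```php
<?php

declare(strict_types=1);

// Hypothetical storage abstraction; in a real application this would wrap
// a connection or repository that talks to one database.
interface OrderStore
{
    public function save(int $id, array $row): void;

    public function find(int $id): ?array;
}

// In-memory stand-in for a database table, used only to keep the sketch runnable.
final class InMemoryOrderStore implements OrderStore
{
    /** @var array<int, array> */
    private array $rows = [];

    public function save(int $id, array $row): void
    {
        $this->rows[$id] = $row;
    }

    public function find(int $id): ?array
    {
        return $this->rows[$id] ?? null;
    }
}

// Steps 2 and 3: keep reading from the old database, write to both.
final class DualWriteOrderStore implements OrderStore
{
    public function __construct(
        private readonly OrderStore $old,
        private readonly OrderStore $new,
    ) {
    }

    public function save(int $id, array $row): void
    {
        $this->old->save($id, $row);
        $this->new->save($id, $row);
    }

    public function find(int $id): ?array
    {
        return $this->old->find($id);
    }
}

// Step 4: copy rows that were written before dual writes began.
function backfill(OrderStore $old, OrderStore $new, array $ids): void
{
    foreach ($ids as $id) {
        $row = $old->find($id);
        if ($row !== null && $new->find($id) === null) {
            $new->save($id, $row);
        }
    }
}

$old = new InMemoryOrderStore();
$new = new InMemoryOrderStore();
$old->save(1, ['status' => 'new']); // legacy row, exists only in the old database

$store = new DualWriteOrderStore($old, $new);
$store->save(2, ['status' => 'new']); // written to both databases

backfill($old, $new, [1, 2]);
```

Once the backfill has finished and both storages hold the same rows, the wrapper can be replaced by the new storage alone, which corresponds to step 5.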

    Some other things to consider:

    1. Joins. Since we are moving out certain tables to separate databases, joins between those tables will no longer be possible. Such places will have to be rewritten to separate queries (one per table).
    2. Foreign keys. It will not be possible for an RDBMS to enforce foreign keys since the tables are in separate databases. If your application relies on such logic, it will have to be moved to the application code. It’s worth noting that high-load projects commonly opt out of foreign keys anyway because they come with problems of their own.
    3. Transactions. This is probably the trickiest part because after the tables are split across multiple databases, an RDBMS will no longer be able to span transactions across such tables. Therefore, this process should be rethought and, depending on the business logic, such transactions will have to be either removed or rewritten within the application code using something like the saga pattern.
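To make the transactions point more concrete, here is a minimal sketch of the compensation idea behind the saga pattern: each step carries an undo action, and a failure triggers the undo actions of the steps that already succeeded. The runSaga function and the order and delivery steps are hypothetical and only illustrate the principle; a production saga would also need persistence and retries.

```php
<?php

declare(strict_types=1);

/**
 * Runs steps in order; if one fails, runs the compensations of the steps
 * that already succeeded, in reverse order, and rethrows the error.
 *
 * @param list<array{action: callable(): void, compensate: callable(): void}> $steps
 */
function runSaga(array $steps): void
{
    $completed = [];

    try {
        foreach ($steps as $step) {
            ($step['action'])();
            $completed[] = $step;
        }
    } catch (Throwable $e) {
        foreach (array_reverse($completed) as $step) {
            ($step['compensate'])();
        }

        throw $e;
    }
}

// Hypothetical scenario: creating an order and a delivery used to be one
// transaction; now each write goes to a different database.
$log = [];

$steps = [
    [
        'action' => function () use (&$log): void { $log[] = 'order created'; },
        'compensate' => function () use (&$log): void { $log[] = 'order cancelled'; },
    ],
    [
        'action' => function (): void { throw new RuntimeException('courier database unavailable'); },
        'compensate' => function (): void {},
    ],
];

try {
    runSaga($steps);
} catch (RuntimeException $e) {
    $log[] = 'saga failed: ' . $e->getMessage();
}
```

Here the second step fails, so the first step is compensated and the log ends up as: order created, order cancelled, saga failed.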

    Leaving the hot migration technical details aside, here is how the separated databases are going to look in our example application.

    First, we need to introduce separate entity managers and connections:

    # config/packages/doctrine.yaml
     doctrine:
         dbal:
    -        url: '%env(resolve:DATABASE_URL)%'
    -
    -        # IMPORTANT: You MUST configure your server version,
    -        # either here or in the DATABASE_URL env var (see .env file)
    -        #server_version: '15'
    -
    -        profiling_collect_backtrace: '%kernel.debug%'
    +        connections:
    +            courier:
    +                url: '%env(resolve:COURIER_DATABASE_URL)%'
    +            customer:
    +                url: '%env(resolve:CUSTOMER_DATABASE_URL)%'
    +            restaurant:
    +                url: '%env(resolve:RESTAURANT_DATABASE_URL)%'
         orm:
    -        auto_generate_proxy_classes: true
    -        enable_lazy_ghost_objects: true
    -        report_fields_where_declared: true
    -        validate_xml_mapping: true
    -        naming_strategy: doctrine.orm.naming_strategy.underscore_number_aware
    -        auto_mapping: true
    -        mappings:
    -            App\Courier:
    -                is_bundle: false
    -                dir: '%kernel.project_dir%/src/Courier/Entity'
    -                prefix: 'App\Courier\Entity'
    -                alias: App\Courier
    -            App\Customer:
    -                is_bundle: false
    -                dir: '%kernel.project_dir%/src/Customer/Entity'
    -                prefix: 'App\Customer\Entity'
    -                alias: App\Customer
    -            App\Restaurant:
    -                is_bundle: false
    -                dir: '%kernel.project_dir%/src/Restaurant/Entity'
    -                prefix: 'App\Restaurant\Entity'
    -                alias: App\Restaurant
    +        entity_managers:
    +            courier:
    +                report_fields_where_declared: true
    +                validate_xml_mapping: true
    +                connection: courier
    +                mappings:
    +                    App\Courier:
    +                        is_bundle: false
    +                        dir: '%kernel.project_dir%/src/Courier/Entity'
    +                        prefix: 'App\Courier\Entity'
    +                        alias: App\Courier
    +            customer:
    +                report_fields_where_declared: true
    +                validate_xml_mapping: true
    +                connection: customer
    +                mappings:
    +                    App\Customer:
    +                        is_bundle: false
    +                        dir: '%kernel.project_dir%/src/Customer/Entity'
    +                        prefix: 'App\Customer\Entity'
    +                        alias: App\Customer
    +            restaurant:
    +                report_fields_where_declared: true
    +                validate_xml_mapping: true
    +                connection: restaurant
    +                mappings:
    +                    App\Restaurant:
    +                        is_bundle: false
    +                        dir: '%kernel.project_dir%/src/Restaurant/Entity'
    +                        prefix: 'App\Restaurant\Entity'
    +                        alias: App\Restaurant
    
    # .env
     ###> doctrine/doctrine-bundle ###
    -DATABASE_URL="mysql://root:${MYSQL_ROOT_PASSWORD}@db:3306/delivery_service?serverVersion=8.0.33&charset=utf8mb4"
    +COURIER_DATABASE_URL="mysql://root:${MYSQL_ROOT_PASSWORD}@db:3306/courier_service?serverVersion=8.0.33&charset=utf8mb4"
    +CUSTOMER_DATABASE_URL="mysql://root:${MYSQL_ROOT_PASSWORD}@db:3306/customer_service?serverVersion=8.0.33&charset=utf8mb4"
    +RESTAURANT_DATABASE_URL="mysql://root:${MYSQL_ROOT_PASSWORD}@db:3306/restaurant_service?serverVersion=8.0.33&charset=utf8mb4"
     ###< doctrine/doctrine-bundle ###
    
    # config/doctrine_migrations_courier.yaml
    +migrations_paths:
    +  'CourierMigrations': 'src/Courier/Migrations'
    
    # config/doctrine_migrations_customer.yaml
    +migrations_paths:
    +  'CustomerMigrations': 'src/Customer/Migrations'
    
    # config/doctrine_migrations_restaurant.yaml
    +migrations_paths:
    +  'RestaurantMigrations': 'src/Restaurant/Migrations'
    
    # migrate command example: doctrine:migrations:migrate -n --em courier --configuration config/doctrine_migrations_courier.yaml

    Then, we have to use them accordingly when needed, for example:

    # src/Customer/Service/CustomerService.php
         public function __construct(
             private RestaurantService $restaurantService,
             private CourierService $deliveryService,
    -        private EntityManagerInterface $entityManager
    +        private EntityManagerInterface $customerEntityManager
         ) {
         }
    ...
    +        $this->customerEntityManager->persist($newOrder);
    +        $this->customerEntityManager->flush();

    Lastly, we need to resolve relations between entities, as they are now stored in separate databases and managed by different entity managers. The simplest way to resolve entity relations is to start using database columns directly instead of entities, for instance:

    # src/Courier/Entity/Delivery.php
     
     namespace App\Courier\Entity;
     
    -use App\Customer\Entity\Order;
     use App\Courier\Repository\DeliveryRepository;
     use Doctrine\ORM\Mapping as ORM;
     use Symfony\Component\Serializer\Annotation\Groups;
     
     #[ORM\Entity(repositoryClass: DeliveryRepository::class)]
     class Delivery
     {
         public const STATUS_NEW = 'new';
    @@ -28,9 +28,9 @@
         #[Groups(['api'])]
         private ?string $status = null;
     
    -    #[ORM\OneToOne(cascade: ['persist', 'remove'])]
    -    #[ORM\JoinColumn(nullable: false)]
    -    private ?Order $RelatedOrder = null;
    +    #[ORM\Column(name: 'related_order_id')]
    +    private ?int $relatedOrderId = null;
     
         public function getId(): ?int
         {
    @@ -49,14 +49,14 @@
             return $this;
         }
     
    -    public function getRelatedOrder(): ?Order
    +    public function getRelatedOrderId(): ?int
         {
    -        return $this->RelatedOrder;
    +        return $this->relatedOrderId;
         }
     
    -    public function setRelatedOrder(Order $RelatedOrder): static
    +    public function setRelatedOrderId(int $relatedOrderId): static
         {
    -        $this->RelatedOrder = $RelatedOrder;
    +        $this->relatedOrderId = $relatedOrderId;
     
             return $this;
         }

    As you can see, database separation already pushes our code in the modular direction. It will help us during the next steps.

    You can see the full code example following this step here.

    Enforcing boundaries between modules

    Once the files are moved and databases are split, we can start enforcing boundaries between modules. As mentioned above, we will be using Deptrac to enforce these boundaries. Deptrac reads the layers (modules) defined in deptrac.yaml, and its vendor/bin/deptrac executable reports the dependencies between these modules. This is useful in two scenarios:

    • When files are just moved to corresponding directories, we can define modules in the deptrac.yaml file, run the executable to see the full list of dependencies, and plan the work.
    • When dependencies are completely resolved, we can commit the deptrac.yaml file and include it in our CI/CD pipeline to ensure no new dependencies are introduced.

    Time to see how it looks in practice. First, let’s define our modules in deptrac.yaml without committing the config file yet:

    parameters:
      paths:
        - ./src
      layers:
        - name: Common
          collectors:
            - type: directory
              value: src/Common/.*
        - name: Courier
          collectors:
            - type: directory
              value: src/Courier/.*
        - name: Customer
          collectors:
            - type: directory
              value: src/Customer/.*
        - name: Restaurant
          collectors:
            - type: directory
              value: src/Restaurant/.*
      ruleset:
        Courier:
          - Common
        Customer:
          - Common
        Restaurant:
          - Common

    After running vendor/bin/deptrac, we will get a report explaining the dependencies between our modules.

    Deptrac report with violations

    We will resolve these issues by introducing service clients and DTOs. Together they will act as something like an SDK for our services, which we will put in our Common directory so that all our modules have access to it. The point is to have a layer where we can decide how exactly the inter-module communication takes place, so that later we can switch between direct function calls and, for example, HTTP calls much more easily. After the modules are split, the Common module can be moved to a separate repository and used as a Composer library, to enable integration into future services.
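The idea of such a layer can be sketched in plain PHP before bringing Symfony into the picture. Everything in this sketch is an assumption made for illustration: RestaurantDto, Transport, ArrayTransport, RestaurantClient, and the /restaurants/{id} URI are hypothetical and are not the classes used in the example repository.

```php
<?php

declare(strict_types=1);

// Hypothetical DTO shared through the Common directory.
final class RestaurantDto
{
    public function __construct(
        public readonly int $id,
        public readonly string $name,
    ) {
    }
}

// Hypothetical transport abstraction: the only place that knows how a call
// reaches another module (direct call, sub request, HTTP, and so on).
interface Transport
{
    /** @return array<string, mixed> decoded response payload */
    public function get(string $uri): array;
}

// Stand-in transport that answers from an array, so the sketch runs on its own.
final class ArrayTransport implements Transport
{
    /** @param array<string, array<string, mixed>> $responses */
    public function __construct(private readonly array $responses)
    {
    }

    public function get(string $uri): array
    {
        return $this->responses[$uri] ?? throw new RuntimeException("No route for {$uri}");
    }
}

// The client other modules depend on; it exposes DTOs, never entities.
final class RestaurantClient
{
    public function __construct(private readonly Transport $transport)
    {
    }

    public function getRestaurant(int $id): RestaurantDto
    {
        $data = $this->transport->get("/restaurants/{$id}");

        return new RestaurantDto((int) $data['id'], (string) $data['name']);
    }
}

$client = new RestaurantClient(new ArrayTransport([
    '/restaurants/7' => ['id' => 7, 'name' => 'Pasta Place'],
]));

$restaurant = $client->getRestaurant(7);
```

Because callers only see RestaurantClient and RestaurantDto, swapping ArrayTransport for a transport that performs sub requests or real HTTP calls would not require changes in the calling module.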

    To implement service clients, we will utilize a Symfony feature called sub requests, which will allow us to dispatch request objects to existing routes internally, without real over-the-network requests. In Node.js, there is a similar approach using mcollina/fastify-undici-dispatcher.

    There is one caveat worth noting: sub requests have a slight overhead compared to direct method calls, so excessive use can cause a slowdown. However, it takes thousands of calls in a loop for the overhead to become noticeable. Plus, remember that those calls will become real HTTP requests later on, where the overhead will be much bigger, so it is even useful to spot such cases beforehand and change the code so that the API is not called in a loop.
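One common way to remove an API call from a loop is to request all items at once. The following sketch counts requests to show the difference; CountingRestaurantClient and its getName and getNames methods are hypothetical and assume the target module could expose a batch endpoint.

```php
<?php

declare(strict_types=1);

// Hypothetical client that counts how many requests it receives; the
// endpoints it stands for are assumptions, not part of the example app.
final class CountingRestaurantClient
{
    public int $requests = 0;

    /** @param array<int, string> $names restaurant names keyed by ID */
    public function __construct(private readonly array $names)
    {
    }

    public function getName(int $id): ?string
    {
        ++$this->requests;

        return $this->names[$id] ?? null;
    }

    /**
     * @param list<int> $ids
     * @return array<int, string> names keyed by ID, unknown IDs omitted
     */
    public function getNames(array $ids): array
    {
        ++$this->requests;

        return array_intersect_key($this->names, array_flip($ids));
    }
}

$orderRestaurantIds = [1, 2, 3];

// One request per item: cheap as a sub request, expensive once it is HTTP.
$perItem = new CountingRestaurantClient([1 => 'A', 2 => 'B', 3 => 'C']);
foreach ($orderRestaurantIds as $id) {
    $perItem->getName($id);
}

// One batched request for the whole list.
$batched = new CountingRestaurantClient([1 => 'A', 2 => 'B', 3 => 'C']);
$names = $batched->getNames($orderRestaurantIds);
```

The per-item variant issues three requests for three IDs, while the batched variant issues one regardless of the list size.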

    Now, let’s first introduce the base class for all service clients:

    # src/Common/Client/AbstractSymfonyControllerResolvingClient.php
    <?php
    
    declare(strict_types=1);
    
    namespace App\Common\Client;
    
    use App\Common\Exception\BadPayloadException;
    use Symfony\Component\HttpFoundation\Request;
    use Symfony\Component\HttpFoundation\Response;
    use Symfony\Component\HttpKernel\HttpKernelInterface;
    use Symfony\Component\Serializer\Encoder\JsonEncoder;
    use Symfony\Component\Serializer\Normalizer\ObjectNormalizer;
    use Symfony\Component\Serializer\Serializer;
    
    abstract class AbstractSymfonyControllerResolvingClient
    {
        public const IS_INTERNAL_REQUEST_ATTRIBUTE_KEY = 'is-internal-request';
    
        protected readonly Serializer $serializer;
    
        public function __construct(
            private readonly HttpKernelInterface $httpKernel,
        ) {
            $encoders = [new JsonEncoder()];
            $normalizers = [new ObjectNormalizer()];
    
            $this->serializer = new Serializer($normalizers, $encoders);
        }
    
        protected function sendServiceRequest(
            string $uri,
            array $query = [],
            array $requestBody = [],
            string $method = Request::METHOD_GET
        ): Response {
            foreach ([$query, $requestBody] as $payload) {
                $this->validatePayload($payload);
            }
    
            $request = new Request(
                query: $query,
                request: $requestBody,
                content: json_encode($requestBody, JSON_THROW_ON_ERROR),
            );
    
            $request->setMethod($method);
            $request->server->set('REQUEST_URI', $uri);