Continuous Delivery in a Monolithic Repository
A monolithic repository is a single large repository that holds multiple projects. The move towards monolithic repositories has been popularised over the past couple of years by tech giants such as Facebook and Google, who cite advantages such as simplified dependency management and improved code reuse. However, this approach introduces new challenges for continuous delivery.
Traditional Continuous Delivery
Traditional continuous delivery pipelines are generally triggered by subscribing to, or polling for, changes to the contents of a repository. Once a change is detected anywhere in the repository, the build that is listening to that repository is triggered. The build clones the whole repository, including its history, and runs the tests. If the tests pass, new artefacts are created and deployed for developers to work with.
However, this is not ideal for a monolithic repository.
Firstly, no matter which files are changed by a merge into the repository, every build that is listening to the repository is triggered. As a monolithic repository holds multiple projects, even a project whose own code and dependencies are unchanged will have its builds triggered. This wastes resources on builds that have no reason to run.
Secondly, cloning a monolithic repository and its history takes longer than cloning a single-project repository. Depending on the combined size of all the files in the repository, and the size of the history of each of the projects, this can amount to a substantial volume of data to clone.
Monolithic Continuous Delivery
As mentioned above, there were two downsides that we needed to address before we could run our builds efficiently in the monolithic repository. We came up with the following solutions.
Instead of listening for changes anywhere in the repository, we decided to take a path-listening approach. When a merge is made into the monolithic repository, we determine which paths have changed. We then send tokens to the builds that depend on those paths, signalling that they should start running and should use the latest version of the repository. As we use Jenkins for continuous delivery, we are able to use its “Trigger builds remotely” option for this. To implement path listening, we use the Parameterized Builds for Jenkins plugin by Kyle Nicholls (https://github.com/KyleLNicholls/parameterized-builds).
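To make the path-listening idea concrete, here is a minimal Python sketch. It is not the plugin's implementation, only an illustration of the logic: the path prefixes, job names and Jenkins URL are hypothetical, and the URL shape assumes the standard Jenkins remote-trigger form (JENKINS_URL/job/JOBNAME/build?token=TOKEN).

```python
import subprocess
from urllib.parse import quote, urlencode

# Hypothetical mapping from a path prefix in the repository to the Jenkins
# jobs that depend on it. These names are illustrative, not from our setup.
PATH_TO_JOBS = {
    "libs/common/": ["app-a-build", "app-b-build"],
    "apps/app-a/": ["app-a-build"],
    "apps/app-b/": ["app-b-build"],
}


def changed_paths(old_rev, new_rev, repo_dir="."):
    """List the files that differ between two commits (git diff --name-only)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", old_rev, new_rev],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def jobs_to_trigger(paths, path_to_jobs=PATH_TO_JOBS):
    """Return the sorted set of jobs whose watched prefix matches a changed path."""
    jobs = set()
    for path in paths:
        for prefix, names in path_to_jobs.items():
            if path.startswith(prefix):
                jobs.update(names)
    return sorted(jobs)


def trigger_url(jenkins_url, job, token):
    """Build the URL for a job that has "Trigger builds remotely" enabled."""
    base = jenkins_url.rstrip("/")
    return f"{base}/job/{quote(job)}/build?{urlencode({'token': token})}"
```

A merge that touches only apps/app-a/ would then select app-a-build alone, while a change under libs/common/ would select both application builds. Sending an HTTP request to each URL returned by trigger_url is what would actually start the builds.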
To reduce the cloning time of the monolithic repository in our builds, we use shallow cloning. A shallow clone still fetches the current contents of the whole repository, but it only fetches the repository’s history to a specified depth. Given that many merges are made within our repository, and the history therefore grows quickly, this saves a great deal of time.
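As an illustration of shallow cloning with plain Git, the Python sketch below builds and runs a git clone command with the --depth option. The repository URL, destination and depth are placeholders, and a Jenkins job may well configure this through its Git settings rather than a script like this one.

```python
import subprocess


def shallow_clone_command(repo_url, dest, depth=1, branch=None):
    """Build a git clone command that truncates history to `depth` commits."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    cmd = ["git", "clone", "--depth", str(depth)]
    if branch:
        # With --depth, git fetches only a single branch by default.
        cmd += ["--branch", branch]
    cmd += [repo_url, dest]
    return cmd


def shallow_clone(repo_url, dest, depth=1, branch=None):
    """Run the shallow clone, raising CalledProcessError if git fails."""
    subprocess.run(shallow_clone_command(repo_url, dest, depth, branch), check=True)
```

For example, shallow_clone("https://example.com/mono.git", "mono", depth=10) would fetch the latest files plus only the ten most recent commits of history.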
These are a few approaches we have taken at Caplin to improve our continuous delivery process in our new monolithic repository. If you have experience of continuous delivery and/or monolithic repositories, it would be great to hear about it.