It’s been a while since I last discussed monorepos. A few things have changed: I’ve started working on the successor to full-build.
Well, full-build was probably a step forward for source management, but it was really a step backward in terms of compatibility. Modifying project files or dev workflows is bad:
Devs always forget to update those files
Most of the time, it leads to different workflows locally and on CI
In order to build a monorepo efficiently, it’s important to take into account how devs work:
They use an IDE most of the time
They use standard tools (make, Terraform, …)
They do not want to bother with stuff that interferes with their workflows
So OK, let’s not change those habits and just:
Use standard tools to build artifacts
Never modify project files
Just provide tooling to build the current branch and check that everything is OK
Terrabuild is the successor to full-build. It’s still being baked: only a few components are available in the Magnus Opera GitHub organization.
It will allow building a monorepo using standard tools without any modification to the sources. It will also make it possible to isolate builds and easily build with the same toolchains as on CI. Terrabuild will use extensions to deal with various languages or tools, and will support build caching for fast builds both locally and on CI.
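To make the build caching idea more concrete, here is a minimal sketch of how a cache key could be derived for a project. It is not how Terrabuild is implemented: the function names `project_hash` and `cache_key`, and the choice of hashing the toolchain name together with the project files and the keys of its dependencies, are assumptions made for illustration only.

```python
import hashlib
from pathlib import Path
from typing import List


def project_hash(project_dir: Path) -> str:
    """Hash every file under project_dir (relative path + content)."""
    digest = hashlib.sha256()
    files = sorted(p for p in project_dir.rglob("*") if p.is_file())
    for path in files:
        # Relative paths keep the hash identical on a dev machine and on CI.
        digest.update(path.relative_to(project_dir).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def cache_key(project_dir: Path, dependency_keys: List[str], toolchain: str) -> str:
    """Combine toolchain, project content and dependency keys into one key.

    Hypothetical scheme: if none of the three inputs changed, a previously
    built artifact stored under this key could be reused instead of rebuilt.
    """
    digest = hashlib.sha256()
    digest.update(toolchain.encode())
    digest.update(b"\0")
    digest.update(project_hash(project_dir).encode())
    # Sort so that the order in which dependencies are listed does not matter.
    for key in sorted(dependency_keys):
        digest.update(b"\0")
        digest.update(key.encode())
    return digest.hexdigest()
```

With such a key, a change in a library changes its own key, which in turn changes the key of every project depending on it, so stale artifacts are never picked up.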
I’ve been CTO at Tessan for about a year and a half. When I joined, I found the same problems that plague most engineering teams:
lack of delivery practices
shipping branches instead of versions (using a kind of gitflow - that was soooo scary)
no testing other than manual testing
no metrics in production
no cadence to deliver features
no way to enable a feature in production once mature
Like most companies, Tessan uses a multi-repository strategy. As for the vast majority of companies, this is simply the situation that was reached without much thought: start with a project, start a second one, discover builds are complicated, split into a second repository… rinse and repeat. That’s where most companies are: no strategy, no thinking about what could be done to improve things or improve delivery.
I must confess, we are still using multiple repositories. But things have improved a lot, and we are gearing towards a mono-repository.
What has been implemented is a gitops strategy:
each repository has a dedicated build and generates its own artifacts: Docker images or zip archives (part of our infrastructure runs on-premise in the field). Artifacts are archived and tracked using a git tag or branch name
everything is deployed using Terraform with strict versioning. Targets are Kubernetes and on-premise infrastructure
all deployments happen through GitHub Actions within protected environments, hence we are paying for the super expensive GitHub Enterprise only to get protection rules (this badly hurts)
we have more deployment pieces since splitting the big server-side monolith was a requirement to move faster across teams (yes, we went for micro-services using fbus)
Basically, this means we have more repositories than before 🤪.
But we are in good shape to switch to a mono-repository now. Everything is running quite smoothly. So why move away from that model?
Well, feature development is a pain in the neck. Most of our development involves modifying several applications, from front-end to back-end, while changing database models from time to time. This is difficult for most devs to handle:
isolating work from the main branch (especially when several repositories are impacted) until the feature stabilizes
understanding impacts for testing
understanding impacts for a release to ensure smooth communication with the support team
not missing anything when deploying to the test environment (micro-services again)
picking the right moment to merge in order to limit impacts
From an operational point of view, several things are also hard to track:
reviews are complicated: a change usually spans several repositories and it’s hell to understand what’s going on
tests are underestimated (due to incremental impacts)
communication is impaired as several repositories have to be tracked
And from a dev perspective, it’s not great either:
no motivation for changes/refactoring across repositories
lack of understanding of how things keep working despite being partially released (nullables, feature flags…)
no opportunity to learn by reading more code
I’m a strong advocate of the mono-repository for at least one reason: atomic feature implementation.
But if you think going mono-repository is easy, you are totally wrong. A single app per repository (aka multiple repositories) is easy to do: set up sources, build, and generate an artifact on changes. Done.
With a mono-repository, you will hit a wall for sure: build time and noise:
Most of the time, there is no need to rebuild everything. We only need to rebuild and release what has changed. Shipping the whole platform makes no sense.
Noise is a clear problem, as looking at history is not much fun. Fortunately, Meta has released a tool to understand what’s going on: Sapling. I haven’t tested it, but maybe it can help in the future. To be investigated.
So what to do? Well, your mono-repo must have tooling to keep builds fast and optimized, and to provide auditing features:
identify what has changed (a new commit, a new branch)
build only what has changed - and think about changes in libraries up to deployment
deliver what has changed - generate a release note for changes
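The first two points above can be sketched in a few lines. This is only an illustration under assumptions: projects are identified by their directory relative to the workspace root, the dependency graph is already known, and the names `owning_project` and `impacted_projects` are hypothetical. The list of changed files would typically come from something like `git diff --name-only` between the base branch and the current branch.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Set


def owning_project(path: str, projects: List[str]) -> Optional[str]:
    """Return the project directory containing path (longest match wins)."""
    matches = [p for p in projects if path == p or path.startswith(p + "/")]
    return max(matches, key=len) if matches else None


def impacted_projects(
    changed_files: Iterable[str], dependencies: Dict[str, Set[str]]
) -> Set[str]:
    """Projects to rebuild: those that changed plus everything depending on them.

    dependencies maps a project to the set of projects it depends on.
    """
    # Invert the graph: for each project, who depends on it?
    dependents: Dict[str, Set[str]] = {p: set() for p in dependencies}
    for project, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(project)

    projects = list(dependents)
    impacted: Set[str] = set()
    queue = deque()
    for changed in changed_files:
        owner = owning_project(changed, projects)
        if owner is not None and owner not in impacted:
            impacted.add(owner)
            queue.append(owner)

    # Breadth-first walk from libraries up to deployable applications.
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


if __name__ == "__main__":
    graph = {
        "libs/core": set(),
        "apps/api": {"libs/core"},
        "apps/web": {"apps/api"},
        "tools/cli": set(),
    }
    print(sorted(impacted_projects(["libs/core/src/model.py"], graph)))
    # prints ['apps/api', 'apps/web', 'libs/core']
```

The same set of impacted projects could also drive the release note, since it lists exactly what is about to be delivered.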
A mono-repository requires much more work to set up than multiple repositories. But the benefits are tremendous.
I learned about mono-repositories (or at least a unified view of multiple repositories) at Criteo: that was the MOAB (Mother Of All Builds). When I left, I decided to create full-build to help D-Edge move faster on the engineering side. full-build is not much maintained (publicly speaking), but it lives on under various names today (no longer public). I’ve considered doing a public v2, as I’m not really satisfied with the current state of affairs.
Anyway, there are several tools on the market:
Bazel (Google) / Buckbuild (Meta): probably the top offers to consider, but they require dedicated teams - they do not really fit startups or mid-size companies. They lack reuse of existing project metadata
But as I said, I’m not really satisfied. What I’m looking for:
be explicit with project declarations: no magic
use most of the metadata from projects (npm, .net projects, maven…) to get dependencies
leverage ecosystems instead of relying on plugins
ensure strong project isolation - all paths must be relative to the project, not the workspace
support for explicit tasks
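As an illustration of reusing existing project metadata to get dependencies, here is a minimal sketch for npm projects. It assumes every project has a `package.json` with a `name` field and that internal dependencies are declared under `dependencies` or `devDependencies`; the function name `read_npm_projects` is hypothetical, and a real tool would need equivalent readers for .net projects, maven and so on.

```python
import json
from pathlib import Path
from typing import Dict, Set


def read_npm_projects(workspace: Path) -> Dict[str, Set[str]]:
    """Map each npm package found in workspace to its in-workspace dependencies."""
    manifests = {}
    for manifest in workspace.rglob("package.json"):
        # Ignore installed third-party packages.
        if "node_modules" in manifest.parts:
            continue
        data = json.loads(manifest.read_text(encoding="utf-8"))
        name = data.get("name")
        if name:
            manifests[name] = data

    known = set(manifests)
    graph: Dict[str, Set[str]] = {}
    for name, data in manifests.items():
        declared = set(data.get("dependencies", {})) | set(data.get("devDependencies", {}))
        # Keep only dependencies that are themselves projects of the workspace.
        graph[name] = declared & known
    return graph
```

Nothing is written back to the projects here: the manifests are only read, which matches the goal of not relying on extra declaration files that devs would have to maintain.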
All in all, I think I will go for full-build v2 😃 I just need a tool that is a no-brainer and definitely open-source.
So here I am… restarting this blog again (must be the 4th or 5th I guess). Hope it will be better than past years…
Anyway, I’m planning to talk a little bit about what I will do in the future as I will manage IoT projects starting next week. I will try to share my experience discovering this world, implementing and managing sensors in the wild. It will be Azure related and as I do not know much about IoT on Azure, this will be for sure super interesting - at least for me ;-)
On personal projects, I will restart full-build and complete .net core support. I’m planning to implement the following features:
.net core project style only
NuGet as build artifacts
I’m just wondering whether I should restart this project using Golang or continue with F#… A full rewrite could be worthwhile, as several data structures and graph algorithms are required for this kind of tool.
For 2020, I also plan to complete the implementation of a new 3D engine for Amiga and hopefully release a demo with this engine. I started this project 2 years ago with a friend, but I stopped working on it due to lack of motivation and time.
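To give an idea of the kind of graph algorithm involved, here is a small sketch that orders projects so that each one is built after its dependencies, grouping independent projects into batches that could be built in parallel. It is written in Python for brevity (not Golang or F#) and relies on the standard `graphlib` module; the function name `build_batches` and the project names are made up for the example.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def build_batches(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group projects into batches; a batch only depends on earlier batches.

    dependencies maps a project to the set of projects it depends on.
    prepare() raises graphlib.CycleError if the graph contains a cycle.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    graph = {
        "libs/core": set(),
        "libs/http": {"libs/core"},
        "apps/api": {"libs/core", "libs/http"},
        "apps/web": {"libs/http"},
    }
    print(build_batches(graph))
    # prints [['libs/core'], ['libs/http'], ['apps/api', 'apps/web']]
```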