This post discusses why I am looking at a MonoRepo at all and what I see as the pros and cons of a MonoRepo versus multiple repositories. This topic is widely discussed elsewhere and these are just my views today given what I know right now.
Historically I have always worked with multiple repositories, usually one app, micro service, class library or (in Microsoft terminology) solution per repository. This has worked well for my team, but there are definitely some inefficiencies with working this way, and recently I’ve been looking at a MonoRepo as a way to address some of the process bottlenecks I see.
The evolution of the tooling we use, particularly TFS and Visual Studio 2017, has also made this a more realistic consideration.
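Just in case you aren't already aware, the difference is this: a MonoRepo keeps the code for all apps, services and libraries in a single repository, whereas with multiple repos each of them lives in a repository of its own.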
The oft-cited example of a company using a MonoRepo is Google. They have their own bespoke source control system to enable this. Off-the-shelf systems have not handled it well historically, but I believe they are starting to catch up now.
To describe the current workflow with a generic example, let’s assume we have an API and a UI that consumes it. Within the API repository there is a client class library that is used to test the API and is consumed by callers of the API. This client is built, packaged and pushed to our package feed on every CI build of the API.
In order for someone to create a new API method and the associated UI change, they need to:
I know it is possible to create a package locally for the API client and do a lot of testing prior to the CI build, but I generally don’t see that happening.
As you can see, this process is not optimal. The same applies to other types of shared libraries that are packaged (usually via NuGet or npm in our case), not just API clients.
The MonoRepo solves this by directly referencing the API client or other shared library in the consumer. This allows all development and testing to happen locally in a fast cycle. Only one pull request is required and the changes are part of the same version.
Don’t get me wrong, package managers like NuGet are fantastic. They have served us well over the years and will continue to do so for third-party components, or for components we develop that we will never change. However, for components that do change, managing them is becoming more onerous, especially as we continue to expand our code base.
Another benefit of a MonoRepo is that it is easier to genuinely share code. I find that despite good intentions, having separate teams working in separate repos leads to a lot of code duplication and a lack of shared libraries.
I also believe a MonoRepo gives better visibility of dependency consumers. It allows us to easily trigger CI builds for consumers when a dependency is updated.
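As a rough illustration of how that could work, here is a minimal sketch that lists which top-level folders were touched between two commits, so a build script could decide which consumers to rebuild. It assumes each project lives in its own top-level folder of the MonoRepo; the function name changed_projects is made up for this example, and mapping a changed folder to its consumers is left to your build tooling.

```shell
#!/bin/sh
# Sketch: list the top-level project folders touched between two commits.
# Assumption: every project lives in its own top-level folder of the MonoRepo.
# changed_projects is a hypothetical helper name, not part of Git or TFS.
changed_projects() {
    base="$1"   # e.g. the last commit that was built successfully
    head="$2"   # e.g. the commit being built now

    git diff --name-only "$base" "$head" |
        awk -F/ 'NF > 1 { print $1 }' |   # ignore files that sit in the root
        sort -u                           # report each folder only once
}
```

For example, running changed_projects HEAD~1 HEAD inside the MonoRepo prints one folder name per line for the most recent commit.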
One of the biggest benefits I've seen so far in my experimentation is refactoring. It is so much easier to refactor code that would normally be left alone through fear of breaking something that depends on it.
So, in list form, here are some benefits that I've noted so far:
I'm sure there are others I've not covered here.
As you can see, my main motivation for considering a move to a MonoRepo is removing waste and bottlenecks from our development process. However, I do have a number of concerns about making the switch that I fear may come back to bite me if I make the wrong call.
Some things that have made moving to a MonoRepo a real possibility for us are:
If you are worried about losing your Git history when you merge repositories, don’t. Here is an example:
git clone http://[TFSOrVSTSUrl]:8080/tfs/[Collection]/[Project]/_git/[MonoRepoName] e.g. git clone http://AcmeTFS:8080/tfs/Acme/Engineering/_git/MonoRepo
git remote add -f [RemoteName] [RemoteUrl] e.g. git remote add -f queueing http://AcmeTFS:8080/tfs/Acme/Engineering/_git/AcmeQueueing
git merge [RemoteName]/master --allow-unrelated-histories e.g. git merge queueing/master --allow-unrelated-histories
You now have the repository merged, but all of its files and folders are in the root directory. Create a new sub directory for the merged repository (e.g. md queueing).
Move the folders and files into the new sub directory. The easiest way to do this in Windows is by using TortoiseGit. Right click the folders and files and drag them into the directory. When the prompt comes up, select 'Git Move versioned files here'. Alternatively, you can use the 'git mv' command.
IMPORTANT! Make sure the move is done through Git and committed on its own, so that it is recorded as a rename. Otherwise it becomes much harder to follow the history of a file and you defeat the point of the merge!
git commit -m "Moved merged repository into a sub-folder"
git push
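The steps above can be sketched as a single shell function. This is only a sketch under a few assumptions: it is run from the root of the MonoRepo working copy, Git is version 2.9 or later (needed for --allow-unrelated-histories), the two repositories have no clashing names at the top level, and the function name merge_repo_into_subdir is made up for this example. It stops short of the push so you can inspect the result first.

```shell
#!/bin/sh
# Sketch: merge another repository into the current MonoRepo clone and move
# its content into a sub-folder, keeping the commit history.
# merge_repo_into_subdir is a hypothetical helper name, not a Git command.
merge_repo_into_subdir() {
    remote_name="$1"        # e.g. queueing
    remote_url="$2"         # URL or path of the repository to merge
    subdir="$3"             # target sub-folder, e.g. queueing
    branch="${4:-master}"   # branch to merge, master unless told otherwise

    # Add the other repository as a remote, fetch it and merge its branch.
    git remote add -f "$remote_name" "$remote_url" &&
    git merge --no-edit --allow-unrelated-histories "$remote_name/$branch" &&
    mkdir -p "$subdir" || return 1

    # Move every top-level entry of the merged repository with git mv.
    git ls-tree --name-only "$remote_name/$branch" |
    (
        while IFS= read -r entry; do
            git mv "$entry" "$subdir/" || exit 1
        done
    ) || return 1

    git commit -q -m "Moved merged repository into a sub-folder"
}
```

For the example used in this post the call would be merge_repo_into_subdir queueing http://AcmeTFS:8080/tfs/Acme/Engineering/_git/AcmeQueueing queueing. Afterwards, git log --follow on one of the moved files should still show the commits from the original repository, and a git push publishes the result.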
I’m afraid this has turned into a bit of a ramble, so I will try to conclude.
Based on my experimentation and the type of code base we work on, I think that a MonoRepo would suit us. The main benefits I see are shifting left and removing bottlenecks from the delivery pipeline. I have found no show-stopping reasons not to give it a go anyway. Is it going to be right for every team? No, probably not. My advice is to try it on a small scale and see what you think.