Git renames are not renames

I consider myself pretty git-confident: I’ve worked with it a lot, taught it, been a git consultant, and run engineering and various things-as-code teams. This week I had a spectacular git problem where merging one branch into another produced changes that didn’t exist on either branch. It turns out that renaming directories in a monorepo with multiple almost-identical boilerplate documentation files comes with surprises…

The short version is: Git doesn’t actually track renames at all. It sees a delete and an add, and presents them to the end user as a rename. By default, if the added file is at least 50% similar to the deleted one, it’s considered a rename. It’s a bit of a blunt instrument and so it goes wrong sometimes. You can adjust whether your client shows you renames, or how similar the files must be, but those settings are local to you and don’t change how other users see it.
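To make the threshold idea concrete, here is a toy sketch in Python. It is not git's algorithm: `difflib.SequenceMatcher` stands in for git's own similarity score, and the function name `detect_renames`, the paths and the file contents are all made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(old_text, new_text):
    """Stand-in similarity score from 0.0 to 1.0 (not git's real metric)."""
    return SequenceMatcher(None, old_text, new_text).ratio()


def detect_renames(deleted, added, threshold=0.5):
    """Pair each deleted path with its most similar added path.

    deleted and added map path -> file content. A pair only counts as a
    "rename" when its score reaches the threshold.
    """
    renames = {}
    unclaimed = dict(added)
    for old_path, old_text in deleted.items():
        best_path, best_score = None, 0.0
        for new_path, new_text in unclaimed.items():
            score = similarity(old_text, new_text)
            if score > best_score:
                best_path, best_score = new_path, score
        if best_path is not None and best_score >= threshold:
            renames[old_path] = best_path
            del unclaimed[best_path]
    return renames


# Hypothetical example: the file moved and gained one line on the way.
deleted = {"docs/guide.md": "Install the tool.\nRun the build.\n"}
added = {"team/docs/guide.md": "Install the tool.\nRun the build.\nCheck the output.\n"}

print(detect_renames(deleted, added))                 # {'docs/guide.md': 'team/docs/guide.md'}
print(detect_renames(deleted, added, threshold=0.9))  # {}
```

The same delete and add is reported as a rename at one threshold and as two unrelated changes at another, which is the sense in which the rename only exists in the eye of whoever is looking.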

The situation

I’m in a situation where I need to alter the directory structure of a monorepo with lots of similarly structured but actually different files that appear in every folder. I’m adding an additional folder level, to allow siblings of each project. Did I mention that git doesn’t support renames? It doesn’t track folders either.

The similar structures of the projects turned out to be even more interesting when I realised that every project folder contains two sets of mostly-boilerplate documentation. The two documentation folders (both hold markup content that gets published as documents) share some file names, and each project’s copies have either no changes, or very minor (but crucial!) changes.

I renamed every folder, committed the changes, updated ALL the build scripts to handle the changes, and thought I’d done the difficult bit (famous last words). Then when I started applying these updates to the existing work-in-progress branches, I noticed that git was deleting or updating files in other projects within the monorepo – changes that were not in either of the branches I was combining.

Why this happens

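Git simply has no idea which adds go with which deletes. It guesses by comparing content, and on a big project with a lot of almost-the-same files which were all renamed, a deleted file can look just as much like a file in a different project as like its own new location. Once a file is paired with the wrong path, changes can land in the wrong project, and it just fails in a big messy heap.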
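A toy illustration of that ambiguity, again in Python and again only a sketch: `difflib.SequenceMatcher` is a stand-in for git's similarity score, and the project names `alpha` and `beta`, the `group/` level and the file contents are hypothetical.

```python
from difflib import SequenceMatcher

# Boilerplate that every hypothetical project's README shares.
BOILERPLATE = (
    "How to build\n"
    "Run the build script.\n"
    "How to publish\n"
    "Run the publish script.\n"
)


def rename_candidates(old_text, added, threshold=0.5):
    """Return (path, score) for every added file that clears the threshold."""
    scored = [
        (path, SequenceMatcher(None, old_text, new_text).ratio())
        for path, new_text in added.items()
    ]
    candidates = [(path, score) for path, score in scored if score >= threshold]
    # Highest score first; near-ties are what make the pairing fragile.
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# The old alpha README, deleted by the restructure.
old_alpha_readme = BOILERPLATE + "Owner: alpha team\n"

# Files that appear at new paths; alpha's copy was also edited on a branch.
added = {
    "group/alpha/docs/README.md": BOILERPLATE + "Owner: alpha team\nReviewed: yes\n",
    "group/beta/docs/README.md": BOILERPLATE + "Owner: beta team\n",
}

for path, score in rename_candidates(old_alpha_readme, added):
    print(f"{score:.2f}  {path}")
```

Both new paths clear the threshold comfortably, so which one "is" the rename comes down to small differences in score. With dozens of projects and two documentation folders in each, there are a lot of plausible wrong answers.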

Things I tried that did not help at all:
– splitting up the restructure into small chunks so I only moved one directory at a time, per commit
– adjusting git’s rename settings when merging the branches
– swearing a lot

(I did have it on my list to make all the repeated content into templated files, because it’s annoying to maintain as it is – but I had no idea that was on the critical path for the directory restructure!)

This story has no ending

As things stand, I have no solution! I knew this was the way git worked, and I knew there was content that could be problematic. But the problem manifested in a way I didn’t expect: the changes had been applied apparently without any issue, and only went wrong when they were merged into the ongoing feature branches (it’s a slow-moving project with a lot of contributors, all the reasons I normally advise against doing this sort of repository surgery in the first place). I started seeing phantom changes that simply didn’t exist in any of the “before” branches, and it took me a moment to understand what had happened.

Things I am doing so we can make progress with the changes:
– lots of checks in CI for pull requests to help humans catch any weirdness at review time and before merge
– keeping a good attitude about fixing any minor documentation excitements that result from these changes, because they are absolutely a positive change for the project and it’s the least interesting files that are impacted
– manually applying the same folder restructure to each feature branch (it was a script in the first place so it’s repeatable) before allowing users to sync with the main branch, which seems to help git to know what’s happening
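As one example of the kind of CI check that can help at review time, here is a minimal sketch that flags changed files outside the projects a pull request is meant to touch. Everything here is an assumption for illustration: the function name `unexpected_changes`, the `group/project/...` layout, and the idea that the list of changed paths comes from something like `git diff --name-only` against the main branch.

```python
from pathlib import PurePosixPath


def unexpected_changes(changed_paths, expected_projects, depth=2):
    """Return changed paths that fall outside the expected projects.

    depth is how many leading path components identify a project
    (2 for a hypothetical layout like group/project/...).
    """
    unexpected = []
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        project = "/".join(parts[:depth])
        if project not in expected_projects:
            unexpected.append(path)
    return unexpected


# Hypothetical pull request that should only touch group/alpha.
changed = [
    "group/alpha/docs/README.md",
    "group/alpha/build.sh",
    "group/beta/docs/README.md",
]

for path in unexpected_changes(changed, {"group/alpha"}):
    print(f"unexpected change outside this PR's projects: {path}")
```

A check like this can't tell a phantom change from a deliberate one, so it is only a prompt for a human reviewer to look twice.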

My only advice is either not to do this sort of change, or to have a no-pull-requests interval. Neither of those was an option for me, though, so I’m sharing in case you find yourself in a similar situation some day! Be aware that “renames” (and in fact “directories”) are not always what they seem…

LornaJane Blog
