Git Submodules for Dependent or Common Code

Submodules are one of the most powerful and most mistrusted features in git, at least in the web development part of the internet where I spend my time. I’ve seen them go horribly wrong, but I’ve also had teams adopt submodules and have their development process run much more smoothly as a result – so I thought I’d take a moment out of my day to write down the process (and the gotchas) of development with submodules.

The Problem Submodules Solve

Git’s submodule feature allows you to have one repo which has another repo as a subdirectory. This is useful for code which is either common to multiple projects, or which is a library you’re using in this project but which is still under active development – you know it will change, and you may make those changes within the project you’re working on.

There is probably a whole other post on how to structure your project to support submodules and shared code!

.
├── lib
│   ├── theme88
│   └── mybesttools <-- submodule
├── README.md
└── src
    ├── config.php
    ├── controllers
    ├── inc
    ├── models
    ├── public
    ├── scripts
    ├── services
    └── views

Here, I've got a basic project layout, with a submodule inside the lib/ directory. This enables me to easily bring in changes to the mybesttools repository, but also to make changes here and then send them back upstream.

How do you create and work with a setup like this? Keep reading.

Creating Submodules

Creating the submodule is actually the easy part. Run this command from the top level of your repo:

git submodule add [repo url] lib/mybesttools

Now let's have a look at what actually happened. Run git status now and you'll see two things:

  • A new entry for lib/mybesttools
  • A new file called .gitmodules. Have a look in here to see the settings for the subdirectory and the repository that is linked there

You can commit these changes, and when you share them, other people will get them in their repo. However, git won't hydrate these submodules automatically. When you first clone a repo with submodules, or when a new submodule appears in an existing project, you need to ask git to put them in place with git submodule update --init (git submodule init on its own only registers the submodule; the update step is what checks out the code).
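To make the add-then-clone cycle concrete, here is a minimal sandbox sketch you can run in a scratch directory. Everything in it is hypothetical: the libsrc repo stands in for the real library, the file names are invented, and the protocol.file.allow override is only there because the demo uses local paths rather than a real repo URL (recent git versions refuse local-path submodules by default).

```shell
#!/bin/sh
# Sandbox demo: add a submodule, then clone the parent and hydrate it.
# All repo and file names below are made up for illustration.
set -eu

demo=$(mktemp -d)
cd "$demo"

# Identity for the throwaway commits in this demo.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Only needed because the submodule URL here is a local path.
allow="-c protocol.file.allow=always"

# A stand-in for the shared library repo.
git init -q -b main libsrc
( cd libsrc && echo 'echo tools' > tool.sh \
    && git add tool.sh && git commit -q -m "Add tool" )

# The parent project, with the library added as a submodule.
git init -q -b main parent
cd parent
echo demo > README.md
git add README.md && git commit -q -m "Initial commit"
git $allow submodule add "$demo/libsrc" lib/mybesttools
git status --short      # lists .gitmodules and lib/mybesttools as added
cat .gitmodules         # path and url of the submodule
git commit -q -m "Add mybesttools submodule"
cd ..

# A plain clone leaves the submodule directory empty...
git clone -q parent fresh
ls -A fresh/lib/mybesttools | wc -l

# ...until you initialise and update it.
( cd fresh && git $allow submodule update --init )
ls fresh/lib/mybesttools
```

With a real remote you can skip the separate step by cloning with git clone --recurse-submodules, which initialises and checks out the submodules as part of the clone.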

Accepting Changes into Submodules

So you're building a project with a library in a submodule. There are new changes in the submodule, and you want to bring them into your project. Your workflow will look something like this:

  1. Go into the submodule directory. From the inside, it works just like any other git repo, so pull in the new changes as you usually would.
  2. Now you need to tell the parent repo to use the new revision. git status will show you that your submodule has changed, and git diff will show the commit hashes involved. Add and commit the changed submodule, and push your changes.

Remembering not just to pull new code into your own project and check everything is okay, but also to commit and push in the parent project is the key here - and very easy to overlook!
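The two steps above can be sketched in a sandbox like this. The repo names, file names and branch name (main) are assumptions for the demo, and the protocol.file.allow override is only needed because the submodule URL is a local path.

```shell
#!/bin/sh
# Sandbox demo: pull new library commits into a submodule and record
# the new revision in the parent. All names are hypothetical.
set -eu

demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
allow="-c protocol.file.allow=always"   # local-path URLs only

# Setup: a library repo and a parent project that includes it.
git init -q -b main libsrc
( cd libsrc && echo v1 > tool.txt && git add tool.txt && git commit -q -m "v1" )
git init -q -b main parent
( cd parent && echo demo > README.md && git add README.md \
    && git commit -q -m "Initial commit" \
    && git $allow submodule add "$demo/libsrc" lib/mybesttools \
    && git commit -q -m "Add mybesttools submodule" )

# Someone publishes a new commit to the library.
( cd libsrc && echo v2 > tool.txt && git commit -q -am "v2" )

# Step 1: inside the submodule, pull as in any other repo.
cd parent/lib/mybesttools
git pull -q origin main
cd ../..

# Step 2: the parent now sees the submodule at a different commit.
git status --short       # lists lib/mybesttools as modified
git diff --submodule     # summarises the commits between old and new
git add lib/mybesttools
git commit -q -m "Update mybesttools to latest"
# git push               # in a real project, share the new pointer too
```

The commit in the parent contains no library code at all, just the new commit hash for lib/mybesttools, which is why forgetting it leaves your teammates on the old revision.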

Making Changes Within a Submodule

If you're working on a feature that needs changes in the submodule, you don't need to commit them upstream and pull them in; you can make them right in place in your project. This is one reason why I like this method over dependency management when working with a library that I know might need to evolve along with me. You do need to handle the changes between submodule and parent repo quite carefully, so try this as a recipe for success:

  1. Make all of the changes. Commit the ones in the parent project as normal
  2. Change into the submodule, and treat that like a normal repo, curating sane and atomic commits with meaningful commit messages (because that's how we always work, right?). You may need to check your submodule out onto a branch before you can commit to it; this happens because when you update your submodule, git just checks out the recorded revision and leaves you in a detached HEAD state.
  3. Make sure you now push your changes back to the upstream repo for the submodule
  4. Finally (and vitally) now go back to your parent project and add and commit the submodule. This lets the parent project know which revision of the submodule it is pointing to.
  5. Now push your parent repo changes and double check that you really did push that revision from the submodule as well!
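Here is a rough sandbox sketch of steps 2 to 5. The bare upstream.git repo stands in for the library's real remote, and the branch name (main) and file names are assumptions; the protocol.file.allow override is only needed because the demo uses local paths.

```shell
#!/bin/sh
# Sandbox demo: change code inside a submodule, push it upstream, then
# record the new revision in the parent. All names are hypothetical.
set -eu

demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
allow="-c protocol.file.allow=always"   # local-path URLs only

# Setup: a bare "upstream" library we can push to, and a parent using it.
git init -q -b main seed
( cd seed && echo v1 > tool.txt && git add tool.txt && git commit -q -m "v1" )
git clone -q --bare seed upstream.git
git init -q -b main parent
( cd parent && echo demo > README.md && git add README.md \
    && git commit -q -m "Initial commit" \
    && git $allow submodule add "$demo/upstream.git" lib/mybesttools \
    && git commit -q -m "Add mybesttools submodule" )

# Mimic the detached HEAD that a submodule update leaves behind.
cd parent/lib/mybesttools
git checkout -q --detach

# Step 2: get onto a branch, then commit inside the submodule.
git checkout -q main
echo v2 > tool.txt
git commit -q -am "Improve tool"

# Step 3: push the submodule commit upstream first.
git push -q origin main
cd ../..

# Step 4: record the new submodule revision in the parent.
git add lib/mybesttools
git commit -q -m "Use improved mybesttools"

# Step 5: before pushing the parent, confirm the recorded commit
# really exists upstream.
recorded=$(git rev-parse HEAD:lib/mybesttools)
git -C "$demo/upstream.git" cat-file -e "$recorded^{commit}" \
    && echo "submodule commit is upstream"
```

In a real project, git push --recurse-submodules=check on the parent does a similar double check for you: it refuses to push if the parent references a submodule commit that hasn't been pushed.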

Hopefully that gives you some idea of how the moving parts work with submodules. There are certainly some pitfalls, in terms of forgetting to update the parent repo when the submodule changes, or forgetting to push changes on one repo or the other, but hopefully I've labelled all those in a way that will help you to avoid any major problems - or at least to understand and untangle them quickly if anything does happen.
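When you do need to untangle things, git submodule status is a quick way to see whether a submodule is out of step with what the parent has recorded. The sketch below is a sandbox with made-up repo and file names (and the same local-path override as any demo that avoids a real URL); the prefix characters it relies on are the ones git prints at the start of each status line.

```shell
#!/bin/sh
# Sandbox demo: reading the prefix on "git submodule status" lines.
# All names are hypothetical.
set -eu

demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
allow="-c protocol.file.allow=always"   # local-path URLs only

git init -q -b main libsrc
( cd libsrc && echo v1 > tool.txt && git add tool.txt && git commit -q -m "v1" )
git init -q -b main parent
( cd parent && echo demo > README.md && git add README.md \
    && git commit -q -m "Initial commit" \
    && git $allow submodule add "$demo/libsrc" lib/mybesttools \
    && git commit -q -m "Add mybesttools submodule" )

cd parent
git submodule status    # leading space: checkout matches the recorded commit

# Commit inside the submodule but "forget" to add it in the parent.
( cd lib/mybesttools && echo v2 > tool.txt && git commit -q -am "v2" )
git submodule status    # leading "+": checkout differs from the recorded commit

# A fresh clone that has not been initialised shows a leading "-".
git clone -q . "$demo/fresh"
git -C "$demo/fresh" submodule status
```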

If you have any other tips or tricks, please share in the comments. I'm always looking for ways to improve my workflow and to make techniques like submodules easier for the teams I work with.
