What Goes in Source Control?

Short answer: everything! However, we need a good directory structure and some source control configuration to make that a practical answer, so this article is a quick outline of my usual advice on structuring source control for a standard web project. The examples are for a PHP project, but I’m sure you could apply this to your own language of choice.

Web Root

The web root of your project should never be the same thing as the root of your source control folder. I typically recommend a src directory for all code, with a subdirectory called something like public as your web root. The web root contains only the entry point for your application’s endpoints (typically an index.php file) plus any assets such as JavaScript, images and CSS that you want to serve.
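As a rough illustration of that layout, here is a minimal sketch of how an index.php inside the web root might locate code that lives outside it. The appPath helper and the bootstrap.php file name are hypothetical, invented for this example; the only thing taken from the advice above is that public sits inside src.

```php
<?php
/**
 * Build the path to a file in the code directory, which is assumed to be
 * the parent of the web root (src/ when the web root is src/public/).
 * Hypothetical helper, not part of any framework.
 */
function appPath(string $webRoot, string $relativePath): string
{
    return dirname($webRoot) . '/' . ltrim($relativePath, '/');
}

// In src/public/index.php this might be used as:
//   require appPath(__DIR__, 'bootstrap.php');   // bootstrap.php is hypothetical

// prints /var/www/project/src/bootstrap.php
echo appPath('/var/www/project/src/public', 'bootstrap.php'), PHP_EOL;
```

Because only index.php and the assets sit under public, nothing else in src can be requested directly through the web server.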

Library Code

This may or may not form part of your repository. If it does, it should go in a source folder such as the src example I gave above. You may choose to bring the code in from its own repository, using source control features like submodules in Git or externals in SVN. Alternatively, you may consider the libraries to be a platform dependency, place them on each server where they are needed, and either symlink to them or include them as appropriate. This is particularly useful where you have several sites building on the same shared libraries; just put them in a shared place.

Build Scripts

Using a tool like Phing or Ant to perform tasks repeatably within your project is an excellent practice, as it helps to make sure things are done quickly and correctly every time. Any project which takes advantage of these types of tools should include the configuration files (e.g. build.xml for Phing). These should live separately from any application code, perhaps in a tools directory, or even in the root of the project.

Configuration

Configuration might be different for every platform that an application runs on, but we still need a template to start from when we work with configuration. To achieve this, create files called something like config.php.dist which contain example settings that are needed for the config (for bonus points, make these settings correct for the live platform, so that if you ever get this wrong, this platform is fastest to fix!). Every installation will need to copy the file and call it something like config.php – this should then be ignored by your source control tools so that any changes to it, on any platform, are not shared.
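One way the application side of this could look is sketched below. It assumes config.php returns a PHP array and that config.php is listed in your source control ignore rules; the loadConfig function name and the db_host setting are made up for illustration.

```php
<?php
/**
 * Load the local, untracked config.php from $dir.
 * Fails loudly if nobody has copied config.php.dist yet.
 * Assumption: the config file returns an array of settings.
 */
function loadConfig(string $dir): array
{
    $local = rtrim($dir, '/') . '/config.php';

    if (!is_file($local)) {
        throw new RuntimeException(
            "Missing $local: copy config.php.dist to config.php and edit it."
        );
    }

    $config = require $local;

    if (!is_array($config)) {
        throw new RuntimeException("$local must return an array of settings.");
    }

    return $config;
}

// Example use (hypothetical setting name):
//   $config = loadConfig(__DIR__ . '/..');
//   $dbHost = $config['db_host'];
```

Failing with a clear message when the file is missing means a fresh checkout tells the developer exactly which step they skipped.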

For systems with automated deployments, it can be useful to keep the config file(s) separately on the server, and at deploy time insert a symlink at the point that the application expects the config file to be.
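The deploy-time step could be sketched roughly as follows. The paths and the linkSharedConfig function are hypothetical, and a real deployment tool may well do this in its own way; the sketch only uses PHP's built-in symlink(), is_link() and unlink().

```php
<?php
/**
 * Point <releaseDir>/config.php at a config file kept elsewhere on the server.
 * Replaces any existing file or link at that location.
 * Hypothetical helper for a deploy script.
 */
function linkSharedConfig(string $sharedConfig, string $releaseDir): string
{
    if (!is_file($sharedConfig)) {
        throw new RuntimeException("Shared config not found: $sharedConfig");
    }

    $link = rtrim($releaseDir, '/') . '/config.php';

    // Remove a stale link or a leftover copy from a previous deploy.
    if (is_link($link) || file_exists($link)) {
        unlink($link);
    }

    if (!symlink($sharedConfig, $link)) {
        throw new RuntimeException("Could not create symlink: $link");
    }

    return $link;
}

// Example use in a deploy script (paths are made up):
//   linkSharedConfig('/etc/myapp/config.php', '/var/www/myapp/releases/42/src');
```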

Auxiliary Tools

Many applications have extra tools around them. For example, I have an open source project that has a command-line tool for trying out the API, and a tool that generates sample data you can use with the application. Both of these are integral to the project and should be kept in the repository, perhaps each inside its own directory.

Database Patches

Most applications will have some way of keeping track of changes to the structure of their databases, and these are as much part of the project changes as the code is! There are tools to help with database changes, and I wrote a post about database patching strategies myself. Whichever approach you choose, you usually end up with both patch files and files to manage the patches. Both of these should be in a database directory or similar, as part of the repo.
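To give a flavour of what a "file to manage the patches" might do, here is a minimal sketch that works out which patch files still need to be run. It is an illustration only, not the strategy from the post mentioned above: the pendingPatches function and the numbered file naming scheme are assumptions.

```php
<?php
/**
 * Return the patch files that have not been applied yet, in natural
 * order (so patch2.sql runs before patch10.sql).
 * Assumption: $applied holds the file names recorded after each patch ran.
 */
function pendingPatches(array $patchFiles, array $applied): array
{
    $pending = array_diff($patchFiles, $applied);
    sort($pending, SORT_NATURAL); // also reindexes the array from 0
    return $pending;
}

// returns ['patch2.sql', 'patch10.sql']
$todo = pendingPatches(
    ['patch10.sql', 'patch2.sql', 'patch1.sql'],
    ['patch1.sql']
);
```

The list of applied patches would typically come from a table in the database itself, so each environment knows its own state.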

Tests

Tests are definitely part of your project, so keep those in the repo! This ties in nicely with the comments about phing files, which can be a great way to make it easy for people to run the various test suites that you have. Whether your project uses traditional PHPUnit testing, has functional or behavioural testing, API testing, or all of the above – check all those tests into the repository so that everyone can keep the versions up to date and run them easily.

Everything and the Kitchen Sink

Anything you need for your project belongs in the repo. It’s not unusual to also have documentation as well as everything mentioned above, plus several other things I’ve probably forgotten, so add a comment to tell me what you store in your repo that I didn’t mention.

4 thoughts on “What Goes in Source Control?”

Paul Rentschler on April 30, 2013 at 12:30 said:

Great post with lots of good advice. I am already doing many of those things, albeit in slightly different ways. In particular, with regard to configuration files, I have started to suffix my live configuration files with .local so that I can have a generic exclude rule for *.local.

Don’t forget, you really can put “everything” in source control. This includes shell scripts, Linux shell configuration files, Apache config files, the php.ini file, everything. If your repository is public, just be careful what you put in there so you don’t give away the keys to the kingdom!

Greg Militello on April 30, 2013 at 20:11 said:

Consider using your web server to configure your environment-specific variables for you. See: http://httpd.apache.org/docs/2.2/env.html For example, you could define your database connection via environment variables, keeping your DB host, port, username, and password out of version control completely. Some organizations even require this sort of deployment due to regulations and controls. This avoids the step of symlinking or copying a configuration file entirely. If I defined an environment variable in Apache2 with the name `DB_NAME`, I could access it via the $_SERVER superglobal array. Some frameworks, such as Symfony2, offer a transparent way to configure this: http://symfony.com/doc/2.0/cookbook/configuration/external_parameters.html.

I prefer to link dependencies when possible (Git submodules, etc.), but always have a way of accessing this code if you do not control the repository you are linking to. What is to stop the package maintainer running git://supercoolgithost.com/coolcode/goodstuff.git from removing your access to his repository? Or what happens when supercoolgithost.com goes down for unforeseen reasons? Always have a contingency for these cases.

Pingback: Four tips for DevOps migrating to the software-defined datacenter: techniques for managing technical debt | Real User Monitoring
Martin Hlaváč on May 1, 2013 at 15:01 said:

Greg Militello: We use the same approach as Paul Rentschler with config files, as our production and testing environment passwords and keys are not visible to developers, only to specific server admins. Each instance has its own config.yml.local file, which contains the specific settings for that instance. There is also a config.yml, which has default settings that can be overridden by config.yml.local.

Great article. I really enjoyed it even though I am already doing almost everything on the list. You can also add a Vagrant file with a provisioning script to your source control. This way everyone on the team has the same environment. Not only that… everyone can start working on the project in a few minutes (or in an hour if you are installing and setting up a lot of packages).

I am already working on virtual LAMP for virtualhost and Vagrant. It has some neat features, like automatically creating virtualhosts with DNS entries when a directory is created in /var/www… You can try it out here https://github.com/mhlavac/lamp-install-script. I hope to release the first stable version with complete documentation in late May.


LornaJane Blog