Counting Duplicate Commit Messages

When chatting about source control good practice the other day, I got a question about repeated git commit messages. In general, I would always advise that the same commit message appearing multiple times in a project’s history is a definite red flag, and if I’m responsible for that repository I will probably make fun of you for doing it. Workplace harassment aside, if you can repeat your commit messages, then they are probably not descriptive enough. There are always projects that break this rule for some reason, but in all the years I’ve worked in software, I have never worked on a project where I’d make an exception for something like that.

To support my point, I also checked one of the larger repos at work for duplicate commit messages. It was simple to do but I thought I’d share my script in case anyone else wants to use it on their own repos and offer constructive feedback to their own colleagues!

git log --format=%s | sort | uniq -c | sort -n

This shows every commit message in the history of the project, with a count of how many times it appears – and it sorts them by that count (increasing, so that the most repeated messages appear immediately above your cursor when the command completes). Typically I do see quite a few “Merge branch master into ….” type messages and we also have some automation that produces some very similar messages – all that is fair enough. When I find the person who thinks that “Update [filename]” is an acceptable commit message though, I will be taking some time to ~point and laugh~ offer some constructive advice.
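If the merge and automation noise gets in the way, a small variation can hide it. The following is only a sketch under a few assumptions: the function name count_duplicates is made up for illustration, it expects one commit subject per line on standard input, and it keeps only subjects that occur more than once.

```shell
# Hypothetical helper (not part of git): reads one commit subject per line
# on stdin and prints only the subjects seen more than once, with counts,
# least repeated first.
count_duplicates() {
  sort | uniq -c | sort -n | awk '$1 > 1'
}
```

One way to feed it is `git log --no-merges --format=%s | count_duplicates`, where `--no-merges` leaves out merge commits and `--format=%s` prints just the subject line of each commit, so nothing depends on how long the abbreviated hash happens to be in your repo.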

Also published on Medium.
