About a year ago, I wrote a bit about gearman and how I was using it to create points in my application where intensive tasks could be processed asynchronously. It occurred to me recently, as I was working on something quite similar, that I never came back and wrote about the tool I use to manage the workers: supervisord.
Supervisord is a Python tool which starts and stops other processes and can monitor them for when they need a restart. It also logs the output of these processes and tells you how long they have been running. That's kind of all I have to say about it - short and sweet, yet an indispensable part of my process.
For bitestats.com, supervisord keeps the worker running; if something should go horribly wrong while generating one of the reports and kill the process (I wrote this PHP code, it could happen!), or the machine reboots (that could happen too!), then supervisord will bring the worker back up again, and it will talk to gearman and figure out what to do.
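To make that a little more concrete, here is a minimal sketch of what a program entry for a worker like this could look like in the supervisord configuration. It is an illustration rather than the actual bitestats.com config: the program name report_worker, the script path and the log path are all made-up placeholders, and it assumes supervisord itself is started at boot by the system's init scripts.

```ini
; Hypothetical program entry: name and paths are placeholders, not the real setup.
[program:report_worker]
; The command to run; here, an assumed PHP gearman worker script.
command=/usr/bin/php /path/to/worker.php
; Start this program whenever supervisord itself starts (for example after a reboot).
autostart=true
; Bring the process back up whenever it exits, however it died.
autorestart=true
; Send stderr to the same log as stdout, so errors show up when tailing.
redirect_stderr=true
stdout_logfile=/var/log/supervisor/report_worker.log
```

After adding or changing an entry like this, the reread and update commands in supervisorctl should pick up the new configuration without restarting everything else.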
I even use supervisord to keep things like my IRC bot running. The interface for starting, stopping and checking on the status of things is called supervisorctl. When I run it, I see:
    [email protected]:~$ supervisorctl
    nerdita                          RUNNING    pid 1418, uptime 24 days, 1:35:33
    supervisor>
You can see nerdita (she's a nerdie IRC bot, written in node.js) listed here, with her PID and how long she's been running. Rather nicely, we can type help for help at this point:
    supervisor> help

    default commands (type help <topic>):
    =====================================
    add    clear  fg        open  quit    remove  restart   start   stop  update
    avail  exit   maintail  pid   reload  reread  shutdown  status  tail  version

    supervisor>
I mostly only use the stop/start/restart commands, and tail. You can tail -f [process name] and see the output from the programme - very handy when you want to look back and see what has happened recently or if there have been any errors.
If you sometimes find yourself logging into the server to check whether a process has gone away (and sometimes deleting a stubborn pid file!) and then restarting it, take a look at supervisord. For me it provides an easy way to keep my scripts running and looked-after, and to know what happened to them if things go wrong! I actually only started using supervisord after reading an excellent post about gearman by Matthew Weier O'Phinney, so thanks to him for once again pointing me in the direction of a great tool!