My requirements were simply to add both asynchronous (for processing things like recalculating counts) and periodic (mostly for garbage collection) tasks to a PHP application. The application has a separate web application and backend API, both built with PHP’s Slim framework, and the API talks to MySQL. It’s all very lightweight and scalable, and I was looking for something with good PHP support that would fit in with what we have.
Enter beanstalkd: it’s a super-simple job queue and has great PHP support in the shape of Pheanstalk (I’m saving my PHP + beanstalkd examples for another day because this post would get too long to read otherwise!). I’ve used Gearman in the past but beanstalkd seemed lighter, and when I started looking at its documentation I discovered that I had a working installation in about the time it would take me to fall off a log – which is always a good indicator of a tool that will be fun to work with :)
Beanstalkd is a job queue rather than a message queue, so when you put things on the queue (or “tube”, as beanstalkd calls them), they stay there until a worker comes along and processes each one successfully. If a worker reserves a job but doesn’t delete it or otherwise respond before the job’s time-to-run expires, beanstalkd puts the job back on the queue to be retried. The worker can also “bury” a job, which sets it aside as failed until someone kicks it back into the queue or deletes it. If there are more jobs than the workers can handle, the queue will just build up a backlog while they work through it; nothing will be lost or missed. Beanstalkd can also run with a binlog, so if the server goes down or beanstalkd crashes, it can pick up again with the queue intact.
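To make that lifecycle a bit more concrete, here is a toy in-memory model of a single tube, written in Python purely for illustration. It is not a beanstalkd client and doesn't talk to a server: the `ToyTube` class, the `run_worker` function and the `max_attempts` retry limit are all names and policy I've made up for this sketch. It also releases failed jobs explicitly, whereas a real beanstalkd additionally puts a job back itself when its time-to-run expires.

```python
from collections import deque


class ToyTube:
    """In-memory toy model of one tube. Illustrative only, not a beanstalkd client."""

    def __init__(self):
        self.ready = deque()  # jobs waiting for a worker
        self.reserved = {}    # jobs a worker is currently handling
        self.buried = {}      # jobs set aside as failed
        self._next_id = 1

    def put(self, body):
        """Add a job to the back of the ready queue and return its id."""
        job_id = self._next_id
        self._next_id += 1
        self.ready.append((job_id, body))
        return job_id

    def reserve(self):
        """Hand the next ready job to a worker, or return None if there is none."""
        if not self.ready:
            return None
        job_id, body = self.ready.popleft()
        self.reserved[job_id] = body
        return job_id, body

    def delete(self, job_id):
        """The worker finished successfully, so the job is gone for good."""
        del self.reserved[job_id]

    def release(self, job_id):
        """Put a reserved job back on the ready queue so it is retried."""
        self.ready.append((job_id, self.reserved.pop(job_id)))

    def bury(self, job_id):
        """Set a reserved job aside as failed; it is not retried automatically."""
        self.buried[job_id] = self.reserved.pop(job_id)


def run_worker(tube, handler, max_attempts=3):
    """Drain the tube: delete on success, retry on failure, bury after max_attempts."""
    attempts = {}
    while True:
        job = tube.reserve()
        if job is None:
            break
        job_id, body = job
        attempts[job_id] = attempts.get(job_id, 0) + 1
        try:
            handler(body)
        except Exception:
            if attempts[job_id] >= max_attempts:
                tube.bury(job_id)
            else:
                tube.release(job_id)
        else:
            tube.delete(job_id)
```

The point to notice is that a job only ever leaves the system through `delete`; every other path either puts it back in `ready` or parks it in `buried` where you can inspect it later.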
Working with Beanstalkd
I’m an Ubuntu user, so aptitude install beanstalkd worked very nicely. It has a selection of client libraries, so you can use whichever works best for the application you want to use it with.
By default beanstalkd listens on port 11300, so you can greet it by telnetting there:
telnet localhost 11300
How is your beanstalkd feeling today? You can ask it by typing stats – this gives quite a lot of output but will include how many jobs are waiting, in progress or failed. In fact, from this telnet prompt you can look at a bunch of things; here’s a handful of my favourite commands:
list-tubes – shows which tubes are available/in use
use [tube] – use a specific tube
stats-tube [tube] – get the stats for a single tube
peek-ready – shows the next job to be processed in the current tube
When you are done, type quit to return to your prompt. Using these commands you can check in with your queue if you need to, but mostly I find I just need to make sure it is running, and that the workers are running, and beyond that I don’t need to think about it too much! I have detailed logging on the workers so if there are any issues, the information that I need is there.
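If you would rather script that "is it running?" check than telnet in by hand, the same stats command can be sent over a plain socket. What follows is a minimal Python sketch, assuming beanstalkd is listening on localhost:11300 as above. The function names (`parse_stats`, `fetch_stats`, `queue_summary`) are my own invention rather than part of any client library, and mapping "waiting/in progress/failed" onto the current-jobs-ready, current-jobs-reserved and current-jobs-buried fields is my reading of the stats output.

```python
import socket


def parse_stats(payload):
    """Turn the simple 'key: value' YAML body of a stats reply into a dict."""
    stats = {}
    for line in payload.splitlines():
        if line == "---" or ": " not in line:
            continue  # skip the YAML document marker and anything unexpected
        key, value = line.split(": ", 1)
        # Whole numbers become ints; everything else stays a string.
        stats[key] = int(value) if value.isdigit() else value
    return stats


def fetch_stats(host="localhost", port=11300, timeout=5.0):
    """Send 'stats' to a beanstalkd server and return the parsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        with sock.makefile("rb") as reader:
            # The reply starts with a header line: OK <bytes>
            header = reader.readline().decode("ascii").split()
            if len(header) != 2 or header[0] != "OK":
                raise RuntimeError(f"unexpected reply: {header!r}")
            body = reader.read(int(header[1]))
        sock.sendall(b"quit\r\n")
    return parse_stats(body.decode("utf-8"))


def queue_summary(stats):
    """Pick out the three numbers most useful for a quick health check."""
    return {
        "waiting": stats.get("current-jobs-ready", 0),
        "in_progress": stats.get("current-jobs-reserved", 0),
        "failed": stats.get("current-jobs-buried", 0),
    }
```

With a server running, `print(queue_summary(fetch_stats()))` gives a one-line view of the queue; if the connection is refused, that in itself tells you beanstalkd is not up.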