Using Persistent Storage with Gearman

I’m using Gearman for the first time in a new project, and two things in particular were bothering me. Firstly, there doesn’t seem to be a built-in way to see what’s in the queue. Secondly, if the Gearman server dies (which seemed quite likely when I was first getting to grips with this stuff and writing really buggy code!) you lose your queue. I therefore decided to switch Gearman over to running with persistent storage. My config file (this is Ubuntu 10.10) is /etc/default/gearman-job-server, and it contains the following commented-out snippet:

# Use mysql as persistent queue store
# PARAMS="-q libdrizzle --libdrizzle-host=10.0.0.1 --libdrizzle-user=gearman \
#                       --libdrizzle-password=secret --libdrizzle-db=some_db \
#                       --libdrizzle-table=gearman_queue --libdrizzle-mysql"

Since I’m already using MySQL as application storage, this seemed like a great way forward. After looking around a bit, I found this great post about using PHP and Gearman, including instructions for persistent storage. We create a MySQL table like this:

CREATE TABLE gearman_queue(
`unique_key` VARCHAR(64) PRIMARY KEY,
`function_name` VARCHAR(255),
`priority` INT,
`data` LONGBLOB
);

Then we uncomment the block of config above and adapt it to point to our MySQL instance. I found that I also needed to add --libdrizzle-port=3306 to that configuration, along with my host, user, password and database details, to make this work. Once I had changed the config file, I restarted Gearman:

/etc/init.d/gearman-job-server restart
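
Before restarting, it is worth double-checking the edited line. As a rough illustration, here is a minimal Python sketch that assembles the uncommented PARAMS line, including the extra --libdrizzle-port flag. The flag names are the ones from the config snippet in this post; the helper name build_params and its validation rule are my own assumptions for illustration and are not part of Gearman.

```python
def _check(value):
    """Reject values that would break a double-quoted, space-separated shell string."""
    text = str(value)
    if not text or any(ch.isspace() or ch in '"$`\\' for ch in text):
        raise ValueError(f"unsafe value for PARAMS: {text!r}")
    return text


def build_params(host, user, password, db, table="gearman_queue", port=3306):
    """Return a PARAMS line for /etc/default/gearman-job-server (hypothetical helper)."""
    # Same flags as the commented-out snippet in the config file, plus
    # --libdrizzle-port, which had to be added to make the connection work.
    flags = [
        "-q", "libdrizzle",
        f"--libdrizzle-host={_check(host)}",
        f"--libdrizzle-port={int(port)}",
        f"--libdrizzle-user={_check(user)}",
        f"--libdrizzle-password={_check(password)}",
        f"--libdrizzle-db={_check(db)}",
        f"--libdrizzle-table={_check(table)}",
        "--libdrizzle-mysql",
    ]
    return 'PARAMS="' + " ".join(flags) + '"'


# Placeholder values taken from the example config; replace with your own.
print(build_params("10.0.0.1", "gearman", "secret", "some_db"))
```

The sketch simply refuses values containing whitespace or shell metacharacters rather than trying to quote them, since I have not checked how the init script expands PARAMS.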

Now when I add jobs to Gearman, I see them in the gearman_queue table until they have been processed, and if the job server does restart with an outstanding job queue, it won’t be lost.
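
Because the queue is now an ordinary table, "what's in the queue" becomes a plain SQL question. The sketch below shows the idea in Python, using an in-memory SQLite database as a stand-in so that it runs without a MySQL server. The table layout matches the CREATE TABLE statement in this post; the job names, payloads and the pending_by_function helper are made up for illustration. Against the real MySQL table, the same SELECT statement should apply.

```python
import sqlite3

# Same columns as the CREATE TABLE statement in this post.
SCHEMA = """
CREATE TABLE gearman_queue (
    unique_key    VARCHAR(64) PRIMARY KEY,
    function_name VARCHAR(255),
    priority      INT,
    data          LONGBLOB
)
"""


def pending_by_function(conn, table="gearman_queue"):
    """Return {function_name: number of queued jobs} (hypothetical helper)."""
    # The table name is interpolated directly, so it must come from trusted config.
    query = (
        f"SELECT function_name, COUNT(*) FROM {table} "
        "GROUP BY function_name ORDER BY function_name"
    )
    return dict(conn.execute(query).fetchall())


# Stand-in database with three fake jobs waiting to be processed.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO gearman_queue VALUES (?, ?, ?, ?)",
    [
        ("job-1", "resize_image", 1, b"photo-1.jpg"),
        ("job-2", "resize_image", 1, b"photo-2.jpg"),
        ("job-3", "send_email", 0, b"welcome"),
    ],
)

print(pending_by_function(conn))  # {'resize_image': 2, 'send_email': 1}
```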

18 thoughts on “Using Persistent Storage with Gearman”

  1. Thanks a lot. Without specifying the port number, I had this error:

    CRAZY libdrizzle query: SHOW TABLES
    ERROR _libdrizzle_query

  2. Hi,

    I have followed your explanation and some others but I don’t get any data in my MySQL DB.

    What can I check?

    Thanks,

    Matt

    • It seems there is a bug in Drizzle and maybe in Gearman too.

      I compiled the latest version of Gearman on my Ubuntu machine, but this didn’t fix it.

      Drizzle failed to compile… and it seems that you have to stay away from 7… but where can I find the old version?

  3. Matt: Sorry, I haven’t used Drizzle at all, so I’m not sure what to suggest. This example is for using Gearman with MySQL – if you get things working though, feel free to come back and add a comment in case anyone else is looking for the same thing!

  4. I have the same problems (thank you to everyone who commented on the issues, I was forewarned!). Since upgrading to Ubuntu 11.10 I found that there is no Drizzle (and therefore no MySQL) support in Gearman. I’ve switched over to storing the queue in SQLite instead, which is working really well.

      • There will be records in this table when there are jobs on the queue that haven’t been processed yet. Once the workers have dealt with the jobs, the table will be empty again. Does that help?

        • I have a client and a worker running together. My test scenario is as follows:
          a. The client sends one job (6 MB); the worker picks it up and runs some business logic, which takes about 2 minutes to complete.
          b. While the worker is still processing the first job (6 MB), the client sends a second job (1 MB). I can see in the log file that the worker is still processing the first job and has not picked up the second one yet.

          So I think I now have one pending job in the job server queue. Is that correct?
          Then I run:
          $ sqlite3 /tmp/example.db
          sqlite> select * from gearman_queue;
          sqlite> //no record

          Note: After finishing the first job (6 MB), the worker picks up the second job (1 MB).

          • Hmmm, sounds like something is wrong there. I think I would stop the workers, add a job, and then check whether it shows up in the database.

  5. I recommend NOT creating the queue table manually in the database. It took me a while to find out why there was an error “can’t find the column ‘when_to_run’”.

    When you launch gearmand for the first time, it will create an empty table with the structure it expects, and that structure can change between releases.

    Thanks for the tutorial, great help!

    Greets

  6. In addition, libmysqlclient support has been available since 0.38. I had some problems using 0.41 with MySQL, so I updated to the latest release, 1.1.4, and now it works fine on Ubuntu 12.04. These are my steps:

    [code]wget https://launchpad.net/gearmand/1.2/1.1.4/+download/gearmand-1.1.4.tar.gz
    tar zxfv gearmand-1.1.4.tar.gz
    cd gearmand-1.1.4
    ./configure --disable-libdrizzle
    make
    sudo make install[/code]

    and then just start the server:

    [code]gearmand --verbose DEBUG -l stderr --queue-type=MySQL --mysql-host=localhost --mysql-user=username --mysql-password=something --mysql-table=table_name --mysql-port=3306[/code]

    As suggested by Kim, the table will be created automatically. I had a few problems with this: after the install the job server worked fine, but on restarting it printed these errors:

    [code] INFO 2013-02-02 13:24:28.405948 [ main ] Initializing MySQL module
    INFO 2013-02-02 13:24:28.410278 [ main ] MySQL module: creating table jobqueue
    ERROR 2013-02-02 13:24:28.410620 [ main ] MySQL module: create table failed: Table 'jobqueue' already exists -> libgearman-server/plugins/queue/mysql/queue.cc:259
    ERROR 2013-02-02 13:24:28.410704 [ main ] Failed to initialize mysql initialize(QUEUE_ERROR) -> libgearman-server/queue.cc:198
    DEBUG 2013-02-02 13:24:28.411245 [ main ] Shutting down all threads -> libgearman-server/gearmand.cc:240
    INFO 2013-02-02 13:24:28.411304 [ main ] Shutdown complete[/code]

    After running:
    [code]mysql> drop table jobqueue;[/code]
    gearmand created the table and worked well, but the same error came back when I restarted the job server. So I restarted the machine and now it seems to work fine. Hope this is useful, bye!

    • In the config file – look at the first example in the post itself, and then uncomment those settings and edit them as appropriate for your system.

  7. Hi Lorna,

    It looks like since you wrote this (years ago) Gearman has added support for MySQL by name, so you can reference it directly rather than pointing libdrizzle at it (in fact, libdrizzle may be gone).

    [code]
    MySQL:
    --mysql-host arg (=localhost) MySQL host.
    --mysql-port arg (=3306) Port of server. (by default 3306)
    --mysql-user arg MySQL user.
    --mysql-password arg MySQL user password.
    --mysql-db arg MySQL database.
    --mysql-table arg (=gearman_queue) MySQL table name.
    [/code]

  8. Sorry for bringing up this old post again, but I found it really useful.

    The table is lacking a timestamp field.
    [code]
    -- Add the timestamp column
    ALTER TABLE `gearman_queue` ADD COLUMN `Added` TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP AFTER `when_to_run`;

    -- Remove it again
    ALTER TABLE `gearman_queue` DROP COLUMN `Added`;

    -- Original table definition
    CREATE TABLE `gearman_queue` (
    `unique_key` varchar(64) DEFAULT NULL,
    `function_name` varchar(255) DEFAULT NULL,
    `priority` int(11) DEFAULT NULL,
    `data` longblob,
    `when_to_run` int(11) DEFAULT NULL,
    UNIQUE KEY `unique_key` (`unique_key`,`function_name`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    [/code]

    [code]
    select count(*) from gearman.gearman_queue where Added < DATE_SUB(NOW(),INTERVAL 60 minute);
    [/code]
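
The stale-job check in the last comment can also be tried without a MySQL server. The following Python sketch reproduces the idea against an in-memory SQLite table: a timestamp column that defaults to the insert time, and a count of jobs older than a given number of minutes. This is an illustration under assumptions, not a tested Gearman setup: SQLite has no ON UPDATE CURRENT_TIMESTAMP, so the column is only set on insert; the column types, the job names and the stale_job_count helper are hypothetical; and I have not verified how gearmand behaves when its queue table carries an extra column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE gearman_queue (
        unique_key    TEXT,
        function_name TEXT,
        priority      INTEGER,
        data          BLOB,
        when_to_run   INTEGER,
        added         TEXT DEFAULT CURRENT_TIMESTAMP,  -- set once, at insert time
        UNIQUE (unique_key, function_name)
    )
    """
)


def stale_job_count(conn, minutes=60):
    """Count jobs that were queued more than `minutes` ago (hypothetical helper)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM gearman_queue WHERE added < datetime('now', ?)",
        (f"-{int(minutes)} minutes",),
    ).fetchone()
    return row[0]


# One job queued just now (uses the column default) ...
conn.execute(
    "INSERT INTO gearman_queue (unique_key, function_name, priority, data, when_to_run) "
    "VALUES ('fresh', 'demo_task', 1, x'00', 0)"
)
# ... and one back-dated by two hours to simulate a stuck job.
conn.execute(
    "INSERT INTO gearman_queue "
    "VALUES ('old', 'demo_task', 1, x'00', 0, datetime('now', '-2 hours'))"
)

print(stale_job_count(conn))  # 1: only the back-dated job is older than an hour
```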
