Multiple Search Keys in CouchDB
The Data
I’m using an example database of movie data, which includes information such as the year the film was released, which genres it belongs to and the ratings on IMDb.
One of the questions I wanted to answer was: how many films released since 2012 have had a rating of 9 or above?
There are a bunch of different ways to get the data out of CouchDB: since I’m using Cloudant, I could use Cloudant Query to have it search the database (which would be fine, it’s a small data set). I prefer to work with views since they (generally!) perform better.
In CouchDB, there isn’t an equivalent of the WHERE clause that you see in a traditional RDBMS. Views are created with keys, which define the sort order and also allow us to start and stop our results at particular points.
Views and Multiple Keys
My view simply indexes the records by year and rating (this gets updated when any record changes, making it quick to access as the data is already available), and the “reduce” function counts how many films have this year/rating combination.
Here is the code for the view:
function (doc) {
  // Key: [year, rating]; value unused.
  emit([doc.year, doc.imdb.rating], null);
}
This view outputs something like this (just a little bit of the output!):
{ key: [ 1971, 8.5 ], value: 3 },
{ key: [ 1971, 8.6 ], value: 1 },
{ key: [ 1971, 8.7 ], value: 2 },
{ key: [ 1972, 7.6 ], value: 13 },
{ key: [ 1972, 7.7 ], value: 6 },
{ key: [ 1972, 7.8 ], value: 8 },
...
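To see how rows like these come about, here is a rough local simulation of the view. It is only a sketch: the sample documents are made up, `emit` is a stand-in for the function CouchDB provides, and it assumes the reduce is a simple count (such as CouchDB's built-in `_count`) queried with `group_level=2`.

```javascript
// Hypothetical sample documents; field names follow the view's map function.
const docs = [
  { title: "A", year: 1972, imdb: { rating: 7.6 } },
  { title: "B", year: 1971, imdb: { rating: 8.5 } },
  { title: "C", year: 1972, imdb: { rating: 7.6 } },
  { title: "D", year: 1971, imdb: { rating: 8.5 } },
  { title: "E", year: 1971, imdb: { rating: 8.6 } },
];

// Stand-in for CouchDB's emit(): collect every emitted key/value pair.
const emitted = [];
function emit(key, value) {
  emitted.push({ key, value });
}

// The map function from the view.
function map(doc) {
  emit([doc.year, doc.imdb.rating], null);
}

docs.forEach(map);

// Sort by year, then rating. This only mimics CouchDB's ordering for
// two-element numeric keys, not its full collation rules.
emitted.sort((a, b) => a.key[0] - b.key[0] || a.key[1] - b.key[1]);

// Count identical keys, roughly what a counting reduce returns
// when grouped on the full key.
const rows = [];
for (const { key } of emitted) {
  const last = rows[rows.length - 1];
  if (last && last.key[0] === key[0] && last.key[1] === key[1]) {
    last.value += 1;
  } else {
    rows.push({ key, value: 1 });
  }
}

console.log(rows);
```

The real index is built and kept up to date by CouchDB itself; the sort and the grouping loop here just stand in for that work.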
Hopefully this shows what I said about the keys dictating the sort order: we get all the records sorted by year, and then by rating within the year. To filter the results we get from this view, we amend the request we send.
Using simple GET requests we can do:
/_design/rating/_view/year-rating?group_level=2
makes the basic request to the view, with output as shown above.
/_design/rating/_view/year-rating?group_level=2&startkey=[2012,9]
shows all films made in 2012 with a rating of 9 or more … but then goes on to return every film made after 2012 as well, whatever its rating.
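To see why a startkey on its own isn't enough, here is a small sketch that mimics how a key range is cut out of the sorted index. The rows are invented for illustration, and `compareKeys` and `range` are hypothetical helpers that only handle two-element numeric keys; CouchDB does this selection itself.

```javascript
// Hypothetical view rows, already in key order (year, then rating).
const viewRows = [
  { key: [2011, 9.1], value: 1 },
  { key: [2012, 8.9], value: 4 },
  { key: [2012, 9.0], value: 2 },
  { key: [2012, 9.3], value: 1 },
  { key: [2013, 6.5], value: 7 },
  { key: [2013, 9.2], value: 1 },
];

// Compare two [year, rating] keys element by element.
function compareKeys(a, b) {
  return a[0] - b[0] || a[1] - b[1];
}

// Keep rows whose key lies between startkey and endkey (both inclusive).
// A missing endkey means "run to the end of the index".
function range(rows, startkey, endkey) {
  return rows.filter(
    (row) =>
      compareKeys(row.key, startkey) >= 0 &&
      (endkey === undefined || compareKeys(row.key, endkey) <= 0)
  );
}

// Only a startkey: 2013's low-rated films come along too.
const openEnded = range(viewRows, [2012, 9]);

// With an endkey as well: only 2012's highly rated films remain.
const bounded = range(viewRows, [2012, 9], [2012, 10]);

console.log(openEnded.map((r) => r.key));
console.log(bounded.map((r) => r.key));
```

Because the year is the first element of the key, a single start/end pair can only bound the rating within one year, which is why one range per year is needed.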
A common pattern for solving this, if you use the same parameters all the time (e.g. only looking for records that aren’t “deleted” is one I use a lot!), is to create a view that only contains those records, so that you don’t need to filter them out when requesting the view. Here, we could create a view that only included films with a rating of 9 or more, and use the year as the key; that’s one way to solve it.
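As a sketch of that pre-filtered view, the map function might look something like the one below. It isn't taken from my database: the guard on `doc.imdb` and the local harness (sample documents and a stand-in `emit`) are assumptions added so the snippet can be run outside CouchDB.

```javascript
// Map function for a hypothetical "highly rated, by year" view: only films
// rated 9 or above are indexed, and the key is just the year.
function map(doc) {
  if (doc.imdb && doc.imdb.rating >= 9) {
    emit(doc.year, null);
  }
}

// --- Local harness below; not part of the view itself ---

// Stand-in for CouchDB's emit(): collect every emitted key/value pair.
const emitted = [];
function emit(key, value) {
  emitted.push({ key, value });
}

// Made-up documents, including one with no rating at all.
[
  { year: 2012, imdb: { rating: 9.1 } },
  { year: 2012, imdb: { rating: 7.4 } },
  { year: 2014, imdb: { rating: 9.0 } },
  { year: 2015 },
].forEach(map);

console.log(emitted);
```

With a counting reduce on a view like this, a request with `startkey=2012` should be enough to count the films from 2012 onwards, at the cost of baking the rating threshold into the view.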
Another alternative is to pass multiple key ranges into our CouchDB view. This is a relatively new feature, but for a situation like this one, you may find it handy. To achieve this, make a POST request rather than a GET request, and pass a JSON body including a "queries" parameter, like this:
{
"queries": [
{ "startkey": [ 2012, 9 ], "endkey": [ 2012, 10 ] },
{ "startkey": [ 2013, 9 ], "endkey": [ 2013, 10 ] },
{ "startkey": [ 2014, 9 ], "endkey": [ 2014, 10 ] },
{ "startkey": [ 2015, 9 ], "endkey": [ 2015, 10 ] }
]
}
This returns the films with a 9+ rating for each of the years. It took me some digging to find how to make this request and pass in the multiple ranges, so I thought I’d put it here so that I can find it again. If it helps you too, then that is awesome!
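Writing one range per year by hand gets tedious, so here is a minimal sketch of generating the request body instead. `buildQueries` and `queryView` are hypothetical helpers, the base URL is a placeholder, authentication is left out, and a global `fetch` (Node 18+ or a browser) is assumed. Depending on your CouchDB version, multi-query requests may need to go to a `/queries` sub-path of the view rather than the view itself, so check the documentation for your server.

```javascript
// Build the "queries" body: one key range per year, firstYear to lastYear inclusive.
function buildQueries(firstYear, lastYear, minRating, maxRating) {
  const queries = [];
  for (let year = firstYear; year <= lastYear; year++) {
    queries.push({ startkey: [year, minRating], endkey: [year, maxRating] });
  }
  return { queries };
}

// Hypothetical helper: POST the body to the view and return the parsed JSON.
// baseUrl is assumed to include the database name.
async function queryView(baseUrl, body) {
  const response = await fetch(`${baseUrl}/_design/rating/_view/year-rating`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  return response.json();
}

// Same body as the hand-written JSON above.
const body = buildQueries(2012, 2015, 9, 10);
console.log(JSON.stringify(body, null, 2));

// Example call (placeholder URL, not executed here):
// queryView("http://localhost:5984/movies", body).then(console.log);
```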
Can you show me an example?
Hello,
I just have a doubt.
Suppose you want to get the films that were released in 1972, or the movies with a rating above 7. How can you implement it?
In SQL one would write
select * from table where year=1972 or rating>7
Thank you.