Too often though, they don't think about how that data will be retrieved or what the implications are when it gets beyond the thousand records that were used for testing. This is where having an idea of how the data will be retrieved can really help application performance. (note: this article is aimed at users of traditional relational databases, and ignores all other possibilities). This post takes a look at the various index types and when to use them. Continue reading
Singular and Plural
This goes for table names, and also for the names of join tables. If you call your tables "user" and "group" then you probably want your linking tables to be "user_group". If you go for plurals (my personal favourite) then be consistent over whether the linking tables are called "user_groups" or "users_groups".
I've seen two main variations on the column names for primary keys, one is to call them all simply "id", and the other is to name them after their table name such as "user_id" or "group_id". It doesn't really matter but my recommendation is for the latter - that way, the user_id column in any other table clearly joins on to the user_id column in the users table, making it easy to read and understand.
Case and Capitalisation
Due to my EXtreme DOuble CApitalitis, I prefer everything to be lower case, but the key is consistency, so that it is easy for developers to get used to the patterns in the database setup and to develop against your schema without having to refer back to it all the time.
Consistency is Key
In general, I like database schemas which are predictable and well-laid-out. Although I have my own preferred conventions, I don't mind what is used so long as it is predominantly in step with itself - this makes my life as a developer so much easier! What's your top tip for sane database naming conventions? Leave a comment and let me know!
- add a table for meta data to the database, set a database version parameter to 0
- export structure and any essential data (lookup tables, etc) into an sql file
- for any changes, create a numbered patch file e.g. patch-1.sql, including the change and an update statement to bring the version meta data to match the patch file
- store all of the above in source control
- for bonus points, create another sql file with some nice sample data
Give the Database Some Version Awareness
Naming is completely changeable but mine looks something like this:
CREATE TABLE `meta_data` (entry varchar(255) primary key, value varchar(255)); INSERT INTO `meta_data` SET entry="version", value="0";
This new row will hold information about what patch version the database is up to. Every patch that is created will update this number so that it is possible to tell which patches have and have not been applied.
Create an Initial Data Structure
Firstly create a database directory for all these files to live in. This should be outside your web root but inside your source control project.
Take your database and dump just the structure of the tables using the --no-data switch to mysqldump like this:
-p --no-data > create.sql
You will also want to populate tables which hold things like lookup values, country lists, that sort of thing, so that these are set up. People starting to develop with this project, or if the application needs to be deployed to somewhere new, can use this file as a starting point.
Create Patches for Changes
This is where the discipline element comes in - no direct edits on the database are allowed! Instead, write the SQL for your change and place it in the patch file, then run it against the database. If that sounds too much like hard work then copy/paste the SQL you use to make changes, or the SQL generated by whatever SQL tool you use, and place it in the file.
Every file must have its own incrementing version number in its file name, e.g. patch-1.sql, patch-2.sql etc. Within the file the version must also be updated to match, with a statement like:
UPDATE `meta_data` SET value="1" WHERE entry = "version";
Here are a few pointers on getting the most out of something like this:
- Under no circumstances is it acceptable to edit a patch file that has been committed to source control. Someone might have run it already and you'll confuse everyone completely.
- Having matching undo scripts alongside the patches can be really useful in case a change gets deployed and needs to be undone.
- Make a script to look up the database settings in the config file of your application, query the database for what version it is at, and run any outstanding scripts. This makes life much easier especially if you have large numbers of scripts (I've seen the patch numbers hit the hundreds)
- A possible adaptation of this approach is to create patch files for use for updating a database, but to also update the install.sql file to make it correct at any point in time, this means a much shorter and easier setup time for new deployements/developers. The version awareness works in the same way regardless
- Creating a sample database which creates a few records in each table can really help for development purposes - its quicker for people to get set up and attain a working system that they can make progress with.
I'm sure there are many improvements and variations on this theme of simple database patching, leave a comment and let me know what works for you and why!