Mechanisms for tracking DB schema changes


Solution 1

In the Rails world, there's the concept of migrations: scripts in which changes to the database are written in Ruby rather than a database-specific flavour of SQL. Your Ruby migration code ends up being converted into the DDL specific to your current database, which makes switching database platforms very easy.

For every change you make to the database, you write a new migration. Migrations typically have two methods: an "up" method in which the changes are applied and a "down" method in which the changes are undone. A single command brings the database up to date, and can also be used to bring the database to a specific version of the schema. In Rails, migrations are kept in their own directory in the project directory and get checked into version control just like any other project code.
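
The up/down idea can be modeled in a few lines of plain Ruby. This is a toy sketch, not ActiveRecord: the class name, the hash-based "schema", and the method signatures are all illustrative, but the shape mirrors a real migration with its paired apply/undo methods:

```ruby
# Toy model of a migration: "up" applies a change, "down" reverses it.
# The "schema" here is just a hash of table name => array of column names.
class AddEmailToUsers
  def up(schema)
    schema["users"] << "email"
  end

  def down(schema)
    schema["users"].delete("email")
  end
end

schema = { "users" => ["id", "name"] }
migration = AddEmailToUsers.new
migration.up(schema)    # schema["users"] now includes "email"
migration.down(schema)  # change undone; schema back to its prior state
```

In real Rails, a migration runner records which migrations have been applied, so running the same command twice is safe; "down" is what lets you step the database back to an earlier version.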

This Oracle guide to Rails migrations covers migrations quite well.

Developers using other languages have looked at migrations and have implemented their own language-specific versions. I know of Ruckusing, a PHP migrations system that is modelled after Rails' migrations; it might be what you're looking for.

Solution 2

We use something similar to bcwoord's approach to keep our database schemata synchronized across five different installations (production, staging and a few development installations), and backed up in version control, and it works pretty well. I'll elaborate a bit:


To synchronize the database structure, we have a single script, update.php, and a number of files numbered 1.sql, 2.sql, 3.sql, etc. The script uses one extra table to store the current version number of the database. The N.sql files are crafted by hand, to go from version (N-1) to version N of the database.

They can be used to add tables, add columns, migrate data from an old to a new column format then drop the column, insert "master" data rows such as user types, etc. Basically, it can do anything, and with proper data migration scripts you'll never lose data.

The update script works like this:

  • Connect to the database.
  • Make a backup of the current database (because stuff will go wrong) [mysqldump].
  • Create bookkeeping table (called _meta) if it doesn't exist.
  • Read current VERSION from _meta table. Assume 0 if not found.
  • For all .sql files numbered higher than VERSION, execute them in order.
  • If one of the files produces an error, roll back to the backup.
  • Otherwise, update the version in the bookkeeping table to the highest .sql file executed.
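
The selection step above (find every N.sql file numbered higher than the recorded version and run them in order) can be sketched in a few lines. This is illustrative Ruby, not the actual update.php, and `pending_migrations` is a made-up name; note the numeric sort, so 10.sql correctly runs after 9.sql, which a plain string sort would get wrong:

```ruby
# Given the version recorded in the _meta table and the available N.sql
# files, return the files still to be applied, in ascending numeric order.
def pending_migrations(current_version, sql_files)
  sql_files
    .map    { |f| [File.basename(f, ".sql").to_i, f] }  # "10.sql" => [10, "10.sql"]
    .select { |number, _| number > current_version }     # skip already-applied files
    .sort_by { |number, _| number }                      # numeric, not lexicographic
    .map    { |_, f| f }
end

pending_migrations(2, ["1.sql", "3.sql", "2.sql", "10.sql"])
# => ["3.sql", "10.sql"]
```

The real script would then execute each file inside the backup/rollback wrapper described above and bump the version in _meta after each success.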

Everything goes into source control, and every installation has a script to update to the latest version with a single script execution (calling update.php with the proper database password etc.). We SVN update staging and production environments via a script that automatically calls the database update script, so a code update comes with the necessary database updates.

We can also use the same script to recreate the entire database from scratch; we just drop and recreate the database, then run the script which will completely repopulate the database. We can also use the script to populate an empty database for automated testing.


It took only a few hours to set up this system; it's conceptually simple, everyone gets the version numbering scheme, and it has been invaluable in letting us move forward and evolve the database design without having to communicate changes or execute them manually on every database.

Beware when pasting queries from phpMyAdmin though! Those generated queries usually include the database name, which you definitely don't want since it will break your scripts! Something like CREATE TABLE mydb.newtable(...) will fail if the database on the system is not called mydb. We created a pre-commit SVN hook that will disallow .sql files containing the mydb string, which is a sure sign that someone copy/pasted from phpMyAdmin without proper checking.
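
The core of that hook's check is a simple pattern match. A minimal sketch in Ruby (illustrative; "mydb" is a placeholder for your actual database name, and a real SVN hook would feed this from `svnlook cat` on each changed .sql file):

```ruby
# Reject SQL that qualifies table names with the database name
# (e.g. "mydb.newtable"), the telltale sign of an unedited phpMyAdmin paste.
def contains_db_prefix?(sql, db_name = "mydb")
  sql.match?(/\b#{Regexp.escape(db_name)}\s*\./)
end

contains_db_prefix?("CREATE TABLE mydb.newtable (id INT)")  # => true
contains_db_prefix?("CREATE TABLE newtable (id INT)")       # => false
```

If the check returns true for any .sql file in the commit, the hook exits non-zero and SVN rejects the commit with a message telling the developer to strip the database name.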

Solution 3

My team scripts out all database changes, and commits those scripts to SVN, along with each release of the application. This allows for incremental changes of the database, without losing any data.

To go from one release to the next, you just need to run the set of change scripts, and your database is up-to-date, and you've still got all your data. It may not be the easiest method, but it definitely is effective.

Solution 4

The issue here is really making it easy for developers to script their own local changes into source control to share with the team. I've faced this problem for many years, and was inspired by the functionality of Visual Studio for Database professionals. If you want an open-source tool with the same features, try this: http://dbsourcetools.codeplex.com/ Have fun, - Nathan.

Solution 5

If you are still looking for solutions: we offer a tool called neXtep Designer. It is a database development environment with which you can put your whole database under version control. You work on a version-controlled repository where every change can be tracked.

When you need to release an update, you can commit your components and the product will automatically generate the SQL upgrade script from the previous version. Of course, you can generate this SQL between any two versions.

Then you have many options: you can take those scripts and put them in your SVN with your app code so that they'll be deployed by your existing mechanism. Another option is to use the delivery mechanism of neXtep: scripts are exported in something called a "delivery package" (SQL scripts + XML descriptor), and an installer can understand this package and deploy it to a target server while ensuring structural consistency, dependency checks, registering the installed version, etc.

The product is GPL and is based on Eclipse, so it runs on Linux, Mac and Windows. It currently supports Oracle, MySQL and PostgreSQL (DB2 support is on the way). Have a look at the wiki, where you will find more detailed information: http://www.nextep-softwares.com/wiki

Author: pix0r

Updated on July 08, 2022

Comments

  • pix0r
    pix0r almost 2 years

    What are the best methods for tracking and/or automating DB schema changes? Our team uses Subversion for version control and we've been able to automate some of our tasks this way (pushing builds up to a staging server, deploying tested code to a production server) but we're still doing database updates manually. I would like to find or create a solution that allows us to work efficiently across servers with different environments while continuing to use Subversion as a backend through which code and DB updates are pushed around to various servers.

    Many popular software packages include auto-update scripts which detect DB version and apply the necessary changes. Is this the best way to do this even on a larger scale (across multiple projects and sometimes multiple environments and languages)? If so, is there any existing code out there that simplifies the process or is it best just to roll our own solution? Has anyone implemented something similar before and integrated it into Subversion post-commit hooks, or is this a bad idea?

    While a solution that supports multiple platforms would be preferable, we definitely need to support the Linux/Apache/MySQL/PHP stack as the majority of our work is on that platform.

  • pix0r
    pix0r almost 16 years
    Yeah, that's pretty much what we have in place right now. Unfortunately that doesn't give us an easy way to modify existing databases -- the SQL script generated by mysqldump assumes you're creating the table from scratch (or overwriting a table if it exists). We need something a bit more high-tech because it needs to apply a sequence of ALTER TABLE statements to the database, and in order to do that properly it needs to be aware of the current state of the database.
  • Osama Al-Maadeed
    Osama Al-Maadeed over 15 years
    The dump has to be in SQL, like mysqldump's output; Oracle's dumps are binary.
  • Piskvor left the building
    Piskvor left the building over 14 years
    Ruckusing FTW - we adapted it to our db system and are quite happy with it.
  • psp
    psp almost 14 years
    There is also a more fundamental problem with schema diffing: how do you differentiate a column drop + add from a column rename? The answer is simple: you can't. This is the reason why you need to record the actual schema change operations.
  • deadprogrammer
    deadprogrammer almost 14 years
    The diff will show that one column is gone while another appeared (unless they have the same name), and most of the time that's enough. Scripting every schema change is a good way to go, of course: in Drupal this is handled by a special hook, for instance.
  • Piskvor left the building
    Piskvor left the building over 13 years
    Looks interesting. Does it have command-line interface as well, or is one planned?
  • Asaf Mesika
    Asaf Mesika over 13 years
    How did you handle collisions? Multiple developers changing the same element in the DB, for instance a stored procedure? This can happen if you're working at the same time on the same branch, or you have two development lines going (two branches).
  • schoppenhauer
    schoppenhauer over 13 years
    Collisions were very rare; the only thing that happened really is that two people would try to create the same N.sql file. Of course, the first one wins and the second one is forced to rename to the next highest number and try again. We didn't have the database versioning on a branch, though.
  • Mark Schultheiss
    Mark Schultheiss over 11 years
    Seems to be a dead URL as of this point in time.
  • Admin
    Admin almost 9 years
    It is now located at github: github.com/ruckus/ruckusing-migrations
  • Smith
    Smith almost 7 years
    how do you script out all changes?