How to efficiently manage frequent schema changes using sqlalchemy?

30,657

Solution 1

Alembic is a new database migrations tool, written by the author of SQLAlchemy. I've found it much easier to use than sqlalchemy-migrate. It also works seamlessly with Flask-SQLAlchemy.

Auto generate the schema migration script from your SQLAlchemy models:

alembic revision --autogenerate -m "description of changes"

Then apply the new schema changes to your database:

alembic upgrade head

More info here: http://readthedocs.org/docs/alembic/

Solution 2

What we do.

  1. Use "major version"."minor version" identification of your applications. Major version is the schema version number. The major number is no some random "enough new functionality" kind of thing. It's a formal declaration of compatibility with database schema.

    Release 2.3 and 2.4 both use schema version 2.

    Release 3.1 uses the version 3 schema.

  2. Make the schema version very, very visible. For SQLite, this means keep the schema version number in the database file name. For MySQL, use the database name.

  3. Write migration scripts. 2to3.py, 3to4.py. These scripts work in two phases. (1) Query the old data into the new structure creating simple CSV or JSON files. (2) Load the new structure from the simple CSV or JSON files with no further processing. These extract files -- because they're in the proper structure, are fast to load and can easily be used as unit test fixtures. Also, you never have two databases open at the same time. This makes the scripts slightly simpler. Finally, the load files can be used to move the data to another database server.

It's very, very hard to "automate" schema migration. It's easy (and common) to have database surgery so profound that an automated script can't easily map data from old schema to new schema.

Solution 3

Use sqlalchemy-migrate.

It is designed to support an agile approach to database design, and make it easier to keep development and production databases in sync, as schema changes are required. It makes schema versioning easy.

Think of it as a version control for your database schema. You commit each schema change to it, and it will be able to go forwards/backwards on the schema versions. That way you can upgrade a client and it will know exactly which set of changes to apply on that client's database.

It does what S.Lott proposes in his answer, automatically for you. Makes a hard thing easy.

Share:
30,657
kabdulla
Author by

kabdulla

Updated on July 08, 2022

Comments

  • kabdulla
    kabdulla almost 2 years

    I'm programming a web application using sqlalchemy. Everything was smooth during the first phase of development when the site was not in production. I could easily change the database schema by simply deleting the old sqlite database and creating a new one from scratch.

    Now the site is in production and I need to preserve the data, but I still want to keep my original development speed by easily converting the database to the new schema.

    So let's say that I have model.py at revision 50 and model.py a revision 75, describing the schema of the database. Between those two schema most changes are trivial, for example a new column is declared with a default value and I just want to add this default value to old records.

    Eventually a few changes may not be trivial and require some pre-computation.

    How do (or would) you handle fast changing web applications with, say, one or two new version of the production code per day ?

    By the way, the site is written in Pylons if this makes any difference.

  • Rick
    Rick over 13 years
    Just to comment, this is a horrible practice in the real world. It forces me to check the database vs the code for all kinds of settings like lengths allowed on strings, etc. I don't get any autocomplete or source checks from my IDE. I really discourage people from going all reflective in any project. It's just a mess to work with.
  • vonPetrushev
    vonPetrushev over 13 years
    "all kinds of settings like lengths allowed on strings" - can you elaborate on this? What kinds of settings? And, just because you don't get any autocomplete in ide doesn't mean that the approach is wrong. After all, we are dealing with python - not java/c++.
  • vonPetrushev
    vonPetrushev over 13 years
    Also, i have apps 'in the real world' that work with db reflection. With complex ERs. It's not that messy for me.
  • Rick
    Rick over 13 years
    If you have a field name on a model and I'm working on code and see person.name = 'really long string' and I want to check out what is the max length of name, I have to load up the db to go look because it's not going to be in my model as person = Column(Unicode(100))
  • vonPetrushev
    vonPetrushev over 13 years
    You don't have to 'load up the db' - (I don't know what exactly this means but you don't have to do it). After the reflection of the schema, all the info you need is stored in the metadata reflected with sqlalchemy. Specifically, to get the length of the column you need this: length = metadata.tables['tablename']._columns._data[u'columnname'].t‌​ype.length. metadata being previously reflected of course. Note that this not call introspection to the db again - it is already stored.
  • Dexter
    Dexter over 11 years
    I agree but it's a bit difficult to get it working with your flask app. I think that new code 'flask-alembic' is what I need to test out. Check this problem: stackoverflow.com/questions/14682466/…
  • Dexter
    Dexter over 11 years
    How do you define a major schema change? Is adding a table / new-model considered a major schema change even if compatibility didn't change? Do you rival firefox in version numbers? (joke)
  • Alan Hamlett
    Alan Hamlett over 11 years
    After running alembic init alembic and changing my db username & password in alembic.ini, I changed my alembic/env.py to this: gist.github.com/alanhamlett/4721073 That was all that was needed to get alembic working with my flask app. Hope it helps.
  • Dexter
    Dexter over 11 years
    my problem is that I can't do line 10 in your gist code. I can't import my models or my "db" variable. It won't let me do it in env.py.
  • radix07
    radix07 almost 9 years
    Had a similar issue getting alembic to work. Line 8 and 9 from Alan's gist are very important and not really mentioned in the documentation. But you need to add the Project to the system path to allow alembic to access your models.