Database Migration Tips
Background
Our database must have a schema that matches the expectations of the code. If these expectations are not satisfied, the application may attempt to access fields or tables that do not exist or have different definitions than expected.
We maintain a series of "migrations" that apply DDL and DML sequentially to a database to align it with the current code expectations. These migrations are generated by Atlas, stored in git, and referenced from a table that Atlas maintains in the target database. These files, along with the database state, allow Atlas to advance the database through migrations safely.
This sounds straightforward, but it can become complicated quickly. Git histories can be non-linear, but when we push to production, the migration files must be consistent with the expectations of the code being pushed. In practice, this requires developers to carefully manage migration files, with special attention to ensuring compatibility between migrations.
How do we manage schema migrations?
We are using Atlas (non-Pro) to manage database schema migrations.
When you make a change to the SQLAlchemy tables in models, run this command to generate migration files for Postgres:
The files will be created in migrations/sa_postgres. Commit any new files alongside your changes to the models. Migrations in the Railway environments happen automatically upon deployment. See railway.json and Dockerfile.railway for more information.
Note: Your pushes will fail if you have changed the models but have not generated migrations. There is a GitHub Action check in migration.yamlto ensure that pushes have migrations that match the models.
Then, apply them to your local Postgres instance:
Note: Do not edit the migration files generated by Atlas. If you need a custom migration, see below.
How do I add a new field to a SQLAlchemy table?
- Add the new field to the SQLAlchemy definition.
- Run
task make-migrations
. - If your server is running, kill it.
- Run
task start
. This will apply your new migration automatically.
Don't forget to git add migrations
to ensure your new migration file and updates to atlas.sum
are committed.
How do I hand-craft a data migration?
Please try to avoid hand-crafted data migrations. If you must, here is how you can do so:
- Run
task make-manual-migration
. - Paste in your tested migration.
- Save and quit.
- Apply it against your local database with
task apply-migrations
. - If it performs as expected, run
git add migrations
.
How do I resolve a conflict on atlas.sum?
The specifics may vary, but the workflow below should resolve most migration conflicts.
The most common case is:
- Two developers make changes to the database schema on the same day.
- Each developer runs
task make-migrations
to generate a migration file. The filenames are timestamps, so they likely won't conflict with each other. - Each developer has an updated
atlas.sum
file. - Both developers push a PR for review. One gets reviewed and merged before the other.
- When the second PR is ready for merge, the conflict on
atlas.sum
will block the merge. The developer must now resolve this conflict.
The general solution for the owner of the 2nd PR is:
-
During the merge, "accept theirs" to resolve the conflict on
atlas.sum
. This drops your branch's changes so the atlas.sum file matches the contents in the main branch. Commit the merge to the development branch. -
Identify the name(s) of the migration file(s) no longer referenced by atlas.sum. Then, use the Atlas CLI to hash and rebase the migrations to update the Atlas sum file. For example:
How do I rollback a migration in my local development database?
In your local development database, often the easiest approach is to manually revert your migrations/ directory to match
the main branch and reset your development database with task drop-databases
.
How do I rollback a migration that has been pushed to main or to the production database?
Outside of local development environments, we do not rollback migrations. We only advance them forward.
All commits to the main branch are pushed to production automatically via a continuous deployment pipeline. Therefore,
any changes to the migrations/ directory that reach the main
branch on GitHub should be assumed to have been pushed to
production.
To revert a migration, you must create a new migration that does the opposite of the migration you wish to rollback. For
example, if you incorrectly added a column, you must add a new migration that removes that column and push it out.
Similarly, if you made a mistake on a hand-crafted data migration, you must fix it with another hand-crafted data
migration. The task make-manual-migration
helper can be useful for this purpose.