Continuous Deployment Of Databases : Part 1

This is part 1 in my series on continuous deployment of databases. In this post I’ll be covering the logic behind the different deployment options.

Part 2 covers SqlPackage.exe.
Part 3 covers ReadyRoll.
Part 4 covers FlyWay.
VSTS Sql Server Database Deployment Task can be found here.

Continuous deployment of databases is a lot more complicated than applications so before we can look at continuous deployment of databases we have to look at managing database changes first. I’m going to give a high level overview of the problem space and methods to deal with database changes before looking at practical examples of continuous deployment of databases in the next post.

The Problem Space
A few of the challenges we face with database changes are:

  • Rolling back: Rolling back files is easy, just restore the previous version from source control, with databases you have to explicitly back them up before hand and restore them if needed and it can take a while for large databases and no one can work while you are restoring.
  • Down Time: It is easy to swap servers in and out of your load balancer to update them but it is not so simple with a database, even in a cluster. It is not impossible with databases just more work, there are solutions like sharding for instance but that is a post on its own.
  • State: Your application files do not contain state, you can recreate them easily while your database contains precious business data. You have to make changes to the structure and existing data without loosing anything. This leads to some teams hand crafting the change scripts for each release which is error prone.
  • Control: In large organisations DBAs are often the only people allowed to make changes to a database and they scrutinise the change scripts provided by developers before implementing them manually. You can’t have automatic migrations or automatically generated scripts executing against the database.
  • Drift: It is not unheard of that databases in production do not match the version developers are using the create the change scripts, especially with applications that are installed at the client premises.

Keeping Track Of Database Changes
The most important step to keep track of your database is adding it to source control, whether you are using specialised database tools or not. This will enable tracking the changes made to your database at specific points in time. Even just having the database create scripts committed each time a change is made will be better than nothing. If you are using the state/declarative method it will also help to solve conflicting changes, for instance two developers renaming the same column in a table at the same time will lead to a merge conflict in your source control system. You can also branch the database just like you do with application code to fix a serious production bug and then later merge the changes back into trunk.

Approaches To Implement Database Changes
Currently there are two ways to handle database changes, state/declarative and migrations, each one with its own strengths and weaknesses:

State/Declarative Method:
The state method works by comparing the target database to a source database and then generating a script to bring the target database in sync with the source database. This is best accomplished using a database schema comparison tool.
Pro: Any number of changes made in any sequence by team members will be included since the current state of the source database in source control is the source of truth not the steps taken to reach that state. Works great for databases during the development phase when you can tolerate dropping and recreating tables or columns and large teams making multiple changes at the same time. Less error prone since the scripts are generated by a tool instead of hand crafting them.
Cons: Cannot handle data migrations since the tool works according to the current state, it has no way of knowing how that state was achieved. For example if a new column was created by manipulating data from two existing columns.

Migrations Method:
This method applies a sequence of changes to a database to migrate it to the new state. With this method the migration scripts are the source of truth, the scripts are created by hand (there are exceptions see the comparison of tools later in the post) and keeping the correct sequence is very important.
Pro: Handles data migrations very well since developers can include any data manipulation required in the scripts.
Cons: Most of the time the scripts will be created by hand which is cumbersome and error prone. You have to take all the change scripts created during development and scrutinise them to weed out conflicting changes and you have to keep the correct sequence. For example developer A renames a column in table X and checks in the alter script, later developer B also renames the same column to something different without getting the changes from developer A and checks in his alter script. When you execute all the migrations the change from developer B will fail since the column exists under the new name. It is also cumbersome for all developers to keep their development databases in sync with the changes.

As you can see there is no clear cut solution to the problem and most tools will fix some of the problems but not all, for example Sql Server Data Tools will handle the state based method very well and Entity Framework Code First will handle migrations very well. Some tools like ReadyRoll try to combine both by generating a sequentially numbered migration script based on the current state of your database when you check-in a change.

Francois Delport

Published by

Francois Delport

I am a cloud and devops consultant, technology fan and previously a professional C# developer with a keen interest in system design and architecture. Currently I am involved in projects using Azure, the Microsoft stack and DevOps. I am based in Melbourne, Australia. Email: [email protected]

Leave a Reply

Your email address will not be published. Required fields are marked *