Azure SQL Data Sync Best Practices And Tips

In this post I’m going to share some Azure SQL Data Sync best practices and tips. Most of the information comes from the official documentation, which is very comprehensive and quite a read; this is more of a summary for future me, to avoid reading through all the documentation again. You can find my two other posts on Azure SQL Data Sync here: part 1 and part 2.

Initial Synchronisation

It is best to start the sync group with only one database containing data and the others empty. If the other databases contain rows in the synced tables, every row will be treated as a potential conflict during the initial sync, which leads to back and forth traffic and extra processing to determine whether there is a conflict. If rows exist with the same primary key, it is a conflict and the conflict policy is applied to resolve it, causing even more back and forth traffic for every conflicting row.

Queuing Up Synchronisations

If you are using automatic sync, make sure your scheduled sync window is longer than the time it takes to complete a sync. For example, if a sync takes 30 minutes to complete but you have it set to sync every 10 minutes, the syncs will queue up indefinitely.

Stale Databases And Sync Groups

If a database is offline for 45 days it is considered stale and is removed from the sync group. When you reconnect this database to the sync group it will be marked as “out of date”. To start syncing again you have to remove it from the sync group and add it back, which triggers an initial sync that can take a very long time.

If a problem, for instance a schema mismatch, prevents the sync group from syncing successfully to all databases for 45 days, the sync group itself becomes stale. There is no way to recover an out of date sync group; you’ll have to delete the sync group and create it again.

Synchronisation Loops

A sync loop is essentially a circular reference within the same database or across databases. It usually happens when you have multiple sync groups configured to sync the same rows in a table. Every time a change from one sync group is synced to the hub, the other sync group sees it as a modified row and performs a sync of its own on the same row; the first sync group then sees that as a new modification and starts yet another sync. This goes on indefinitely. The same can happen if a single database is registered with more than one agent. The tool is capable of handling databases and tables that are part of multiple sync groups, but a row should never take part in more than one sync group. If you have to use multiple sync groups, apply row filters so the rows in each sync group don’t overlap.

Azure SQL Firewall

It may seem obvious, but if you are syncing on-premises databases the sync agents have to connect to the hub and sync metadata databases in Azure. You have to configure the Azure SQL firewall on both of them to whitelist your agent IP addresses.
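As a sketch of what that firewall rule looks like, Azure SQL exposes the sp_set_firewall_rule stored procedure, which you run against the master database of the server. The rule name and IP address below are placeholders for your own agent machine:

```sql
-- Run against the master database of the Azure SQL server hosting the hub,
-- and again for the server hosting the sync metadata database.
-- 'DataSyncAgent01' and the IP address are placeholders for your agent.
EXECUTE sp_set_firewall_rule
    @name = N'DataSyncAgent01',
    @start_ip_address = '203.0.113.10',
    @end_ip_address   = '203.0.113.10';
```

You can achieve the same thing in the Azure Portal; the stored procedure is just handy when scripting the setup.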

Troubleshooting

SqlException Error Code: -2146232060 – SqlError Number:3952 Message: Snapshot isolation transaction failed…

For Azure SQL Data Sync to work, ‘Allow Snapshot Isolation’ and ‘Read Committed Snapshot’ must be on. In my case ‘Allow Snapshot Isolation’ was off and setting it using ALTER DATABASE statements failed, but I was able to modify the setting in the properties window of the database in SQL Server Management Studio.
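For reference, these are the statements that would normally check and enable both options (they failed in my specific case, but they are the usual route). The database name is a placeholder:

```sql
-- Check the current settings ('MySyncDb' is a placeholder database name)
SELECT name,
       snapshot_isolation_state_desc,
       is_read_committed_snapshot_on
FROM sys.databases
WHERE name = N'MySyncDb';

-- Enable both options. WITH ROLLBACK IMMEDIATE rolls back open
-- transactions that would otherwise block the change, since
-- READ_COMMITTED_SNAPSHOT needs (near) exclusive access to the database.
ALTER DATABASE MySyncDb SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE MySyncDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
```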

Francois Delport

How To Handle Schema Changes In Azure SQL Data Sync

In this post I will show you how to handle schema changes in Azure SQL Data Sync. It follows on from my previous post, which gave a quick overview of Azure SQL Data Sync.

How To Handle Schema Changes In Azure SQL Data Sync

Since schema changes are not handled automatically by Azure SQL Data Sync, you have to perform some manual steps in the Azure Portal or automate them. I’m going to cover doing it manually, but the same sequence of steps can be automated using PowerShell cmdlets or direct calls to the REST API.

Refreshing The Schema

For Azure SQL Data Sync to pick up changes in the schema of Sync Group members you have to refresh the schema.

In the Azure Portal browse to your Sync Group.


Select properties and disable automatic sync to prevent errors while changes are taking place.


Select Tables and, for each database in the Sync Group, click on refresh schema to pick up any changes. Select the tables and columns to sync and click on save.


Adding A New Column To Sync

There are two scenarios here:

  • Adding a new column that is empty or the same in all databases

In this case you create the column in all the databases with a null or default value, so that the column contents end up the same across all the databases.

  • Adding an existing column with data that was not synced yet

In this case the column exists with possibly different data between databases. If the rows are different between databases your conflict resolution policy and client sync sequence will determine the end result. If the conflict resolution policy is set to “hub wins” the member databases will end up with the values from the hub. If the policy is set to “client wins” the last client to sync will set the values in the hub and those values will be synced to the other clients.

  1. Disable automatic sync.
  2. Add the column in all Sync Group databases.
  3. Refresh the schema for each database in the Sync Group.
  4. Enable automatic sync.
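Step 2 above might look like the following; the table, column and constraint names are hypothetical examples:

```sql
-- Add the same nullable column with the same default to EVERY database
-- in the Sync Group. WITH VALUES applies the default to existing rows,
-- so all databases end up with identical data in the new column.
ALTER TABLE dbo.Customers
    ADD LoyaltyPoints int NULL
        CONSTRAINT DF_Customers_LoyaltyPoints DEFAULT (0) WITH VALUES;
```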

Changing the data type of a column

You can change the data type of a column if the change will not cause any data loss, for instance changing int to bigint.

  1. Disable automatic sync.
  2. Change the column type in all Sync Group databases.
  3. Refresh the schema for each database in the Sync Group.
  4. Enable automatic sync.
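A lossless widening change (step 2 above) might look like this; the table and column names are hypothetical:

```sql
-- Widen the column in every database in the Sync Group.
-- int -> bigint cannot lose data, so this is safe to sync.
ALTER TABLE dbo.Orders
    ALTER COLUMN Quantity bigint NOT NULL;  -- was int NOT NULL
```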

If the change will lead to data loss you can treat it as removing and adding a new column to achieve the same outcome.

Changing the name of a column

Changing the name of a column basically involves removing the existing column and adding a new one.

  1. Disable automatic sync.
  2. Change the column name in all Sync Group databases.
  3. Refresh the schema for each database in the Sync Group and select the new column.
  4. Enable automatic sync.

NOTE: You cannot change a column if it is used in a filter.
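In SQL Server the rename itself (step 2 above) is usually done with sp_rename; the object and column names below are hypothetical:

```sql
-- Rename the column in every database in the Sync Group.
-- 'COLUMN' tells sp_rename what kind of object is being renamed.
EXEC sp_rename 'dbo.Customers.LoyaltyPoints', 'RewardPoints', 'COLUMN';
```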

Deleting a column

  1. Disable automatic sync.
  2. Delete the column from all databases in the Sync Group.
  3. Refresh the schema for each database in the Sync Group so the column is removed from the sync schema.
  4. Enable automatic sync.

NOTE: You cannot remove a column if it is used in a filter.
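Step 2 above is a plain column drop; the names below are hypothetical, and note that any default constraint on the column has to be dropped first:

```sql
-- Drop the column from every database in the Sync Group.
-- A default constraint bound to the column must be removed first.
ALTER TABLE dbo.Customers DROP CONSTRAINT DF_Customers_LegacyCode;
ALTER TABLE dbo.Customers DROP COLUMN LegacyCode;
```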


Azure SQL Data Sync

In this post I’m going to take a quick look at Azure SQL Data Sync: what it does, how it works, scenarios where you would use it and recent updates.

What Does It Do

As the name implies, you use Azure SQL Data Sync to sync data between MS SQL databases via Azure. The databases can be on-premises SQL Server or Azure SQL databases. Take note that this is data only: not schema changes or other database objects, just data in tables. The sync can be configured to run on a schedule, with the smallest recurrence being 5 minutes, or it can be a manual sync. Individual tables and columns can be selected for sync.

How Does It Do It

Azure SQL Data Sync uses a hub and spoke model. The hub in this case is an Azure SQL database; the spokes, or member databases, are connected to the hub. Changes flow from a member database to the hub and then to the other member databases. Azure SQL databases can connect directly to the hub, but on-premises databases have to use Sync Agents, which you install on-premises. When you connect a member database you can choose the sync direction:

  • bi-directional
  • to the hub
  • from the hub

To handle conflicts you select a conflict policy; the options are hub wins or client wins.

Usage Scenarios

By choosing different sync directions you can utilise Azure SQL Data Sync for a variety of scenarios:

  • Bi-directional sync for all members can be used to keep all connected databases and the hub in sync
  • Syncing to the hub for all members can be used to collect all records from the member databases into the hub, for example to create a central aggregated database for reporting
  • Syncing from the hub for all members can be used to create a single source of truth in the hub. Member databases will have a local copy of the data
  • Setting one member to sync to the hub and another to sync from the hub will keep the read-only member database in sync with the transactional member database. This is handy when moving a database that is constantly being updated into Azure SQL. You can continue to use the on-premises database and make the switch with very little downtime and without restoring from a backup or a bulk copy procedure. UPDATE: Transactional Replication is now available for Azure SQL and is the preferred method to sync database changes in a migration scenario link.

It is important to remember Azure SQL Data Sync will sync only data, schema changes are not handled automatically by the tool. In a future post I will dig deeper into handling schema changes.

Azure SQL Data Sync Update

Previously Azure SQL Data Sync was only available in the old Azure portal; after the update it is available in the new Azure portal. It also received PowerShell cmdlets and REST API enhancements; previously everything had to be done manually in the UI. The hub used to be a shared database maintained by the Azure SQL Data Sync service; users now have to provision their own database in their own subscription and use it as the hub.

Resources

The Azure SQL Data Sync team did a very good job providing comprehensive documentation, which can be downloaded here link.


Installing SQL Server Reporting Services On A Client Operating System

In this post I’m going to cover installing SQL Server Reporting Services on a client Operating System.

In my previous post I showed how to use a service principal to call the Azure Billing API without any user interaction. I used the results from that scheduled API call to populate a local database that I’ll be using for reporting. Technically you could call the API every time, passing it the required date range, and report on that, but it is much quicker to do it locally. Secondly, I wanted a way to schedule the delivery of the chart/report using SSRS instead of browsing to the report every day. The goal of all this is to catch any upward trend in my Azure usage instead of waiting for my Azure spending limit to kick in.

Side Note: You could use Billing Alerts. They are currently in preview but a bit simplistic: you can only set an alert on a specific amount, not an amount for a specific interval other than your billing period. For example, say I create an alert for $1000 since my average monthly usage is $900, and one of my team members forgets to shut down a few VMs while testing. A few days would pass before the $1000 limit is reached and I pick it up. It would be more useful if I could set a daily limit based on my average usage plus a bit; that way, if usage goes over, for example, $50 per day I will get an alert.

Anyway, back to the actual topic of this post. It has been a while since I used SQL Server Reporting Services and even longer since I installed it. I decided to install SSRS on my dev machine to test the scheduling of my Azure usage report, and I ran into a few problems that I want to document for future use. Bear in mind this was a Windows 10 PC, not a server; you won’t have most of these problems if you are using a server OS. Most of the solutions around permissions were already covered in this post.

  • First problem occurred at the very end of the installation; I had an existing SQL instance and I added SSRS to it.

I was running the installation as the local administrator account and the user had permissions to modify the Temp folder, yet the installer still reported an error. Turns out you can’t do much about it at this point; just continue with the installation and fix the related errors later on.

  • Second problem was the SSRS service wouldn’t start; I got “Access Denied” errors when starting the service. To fix this, give the account running the SSRS service permissions to the “C:\Program Files\Microsoft SQL Server\MSRS12.MSSQLSERVER\Reporting Services” folder.
  • Not really a problem, but keep in mind that when you configure SMTP settings, SSRS needs an account with access to your SMTP server; you can’t use Gmail, Hotmail, etc., which makes sense since this will usually be used in an enterprise environment. If you want to test emailing on a schedule and you don’t have a valid account, I recommend smtp4dev or setting up your own SMTP server in IIS.
  • I also saw the following errors in the event log
    “Report Server (MSSQLSERVER) cannot load the TERADATA extension.”
    and
    “Report Server (MSSQLSERVER) cannot load the SQLPDW extension.”
    These won’t prevent you from working but they do spam the event log, making it difficult to troubleshoot real problems. This link explains how to disable the extensions if you don’t need them, or how to install the required software components if you do.
  • You will also have trouble loading the SSRS website as administrator; this link explains how to fix that, as well as administrator access for SSMS.

Side Note: The SSRS version you get with SQL Express doesn’t support emailing and scheduling reports. I did a quick Google search and found some free solutions that email Crystal Reports on a schedule; I didn’t try them myself but it is something to keep in mind for home use. I also tried PowerBI but it doesn’t have the ability to schedule and email reports.
