Azure SQL Data Sync - My Story

This is a story about my experiences using Azure SQL Data Sync. You will learn how to use this to migrate a custom on-premise database to Azure.

Azure SQL Data Sync Use Case

Continuing my theme of migrating a BizTalk solution to Azure in recent posts, I need to move a custom database to Azure. The BTHub database is on the same SQL Server that our BizTalk Server databases is on. Furthermore, the tables are simple and it is about 3 Gb in size.

Firstly this database stores some BizTalk lookup data. These tables are typically small with less than 500 rows each. Secondly we store some metadata about different processing events in other tables. The metadata tables contain about 500, 000 rows each and we purge this data after a year. The main use of the metadata is for troubleshooting by support staff.

We plan to move the BTHub data to Azure slowly as we migrate different applications into Azure. Thus we want to keep data synchronized between your on-premises databases and Azure SQL databases. Hybrid data synchronisation is one of the sweet spots for Azure SQL Data Sync and this is why we have chosen this as part of our overall plan.

Before our migration starts we have a customer requirement to run SQL Reporting Service Reports against the BTHub tables. We could not allow this customer to run these against our on-premises database. Firstly uncontrolled queries could affect our BizTalk Server because it has databases on the same SQL server. As an aside Microsoft recommend that you should not have custom databases on the same SQL server instance as your BizTalk Server databases for the same reason. Notwithstanding, the BizTalk Health monitor raises a warning when it detects this situation. Thirdly the customer is not in the same domain as our SQL Server. Finally there is other data in the database that belongs to other customers.

One solution was to first create a customer specific view. Secondly replicate the database to another SQL Server. Thirdly create a SQL user that only has access to this view & finally run the SSRS report against the replicate database. Separating different workloads across different databases is another sweet spot for Azure SQL data services.

This approach minimizes the performance impact on your production workload. You can use Data Sync to keep these two databases synchronized.

In conclusion this solution achieves four things for us;

Prepares an Azure custom database for our BizTalk migration.
Allows us to move our custom database away from SQL Server instance that hosts our BizTalk databases. This will remove one warning from the BizTalk Health Monitor.
Allows a customer to set up reports against the custom database view without affecting the SQL Server performance for other users. More importantly this view only contains their data.
Allows us to move tables slowly to our Azure SQL database.

My Database Setup Steps

The Microsoft documentation that describes how to setup Azure SQL Azure Data Sync is pretty good. I will only try to highlight some gotchas here.

Prepare your source database and ensure any tables that you want to synchronise contain a primary key. In my case I had to add a primary key to some of the tables.
Prepare a SQL view and create a SQL user that can only access this view. This will ensure our customer can only see his data and nobody else’s.
Create a blank SQL database in Azure as described here. I choose the smallest database size S0 with 10 DTU’s. I talk about this choice a little bit later on.
Install Data Migration Assistant on the on-premise SQL Server. I think it is important to use this wizard to validate your database schemas first.
Open your firewall on the Azure SQL server and your on-premise SQL server as described here. The firewall on the on-premise SQL Server allows ports 1433. The firewall on the Azure SQL server allows the IP address of the other SQL server.
Follow the Microsoft instructions to migrate the on-premise SQL database schema to Azure. Take your time to fix all validation warnings and errors before attempting the migration. Trust me this step is important if you want to get a good final result. Finally I only migrate schemas and not the data in this step. Importantly, I don’t need all the data migrate immediately and I do not want to set up synchronisation right from the beginning.
Reset and enable passwords again for the SQL users on the azure database.

My Azure SQL Data Sync Setup steps

After deploying the database schemas to Azure we are now set to enable the data synchronization in the next steps;

Create an Azure Data Sync group using the Azure portal. A Sync group binds your Azure database called the hub and your on-premise database called a member. Follow these instructions but more importantly make sure you accept the recommendation to create a new Sync Metadata database. I did not and got into a terrible mess and had to start again. Notably you only get one chance to create a Sync Metadata database. Choose the name carefully and where it lives because you are only allowed one per region. I setup my first one up in my UAT environment with a name containing UAT. All subsequent Azure data sync group must use the same Sync metadata DB. Setting up production I had to unpick what I had done and start again using more suitable names and arrangements.
Download and install the client sync agent on the on-premise database server. I found it beneficial to do this before the next step in the Microsoft documentation.
Add the on-premises SQL Server database as per the documentation.
Choose windows authentication for the Sync agent.
Choose “To the Hub” for Sync direction because initially we want to synchronize data from on-premise to the Azure database. See later for a brief discussion about this choice.
Check that all tables display for both the hub and member database dropdowns on the configure sync group tab. If they do not display then you have an access issue and you must fix this before proceeding. Notably Hub database tables will not display unless you allow Azure services to access your Azure SQL server. I will give an example of a ARM template to do this later on.

7. Next select and manually synchronise the SupplierPO table first in the configure Sync screen. Next select and manually synchronous the SupplierPOStatus table. I did this because there is a foreign key in SupplierPOStatus from to SupplierPO. If I try to synchronise two tables like this to begin with you will get errors. These tables contain about 5000,000 rows each and the initial synchronisation took only ten minutes. At this stage I did not configure any other tables because the view built from only these two tables.

8. Enable automatic synchronization once the initial synchronisation is complete.

9. Finally the Azure data Sync screen looks like this.

Security Considerations

Follow the steps suggested by the security wizard. Restricting the IP addresses that can access the Azure SQL instance only allows our customer access. Next we created a SQL user for the customer that can only access the view.

Azure services allows applications from Azure to connect to your Azure SQL server. Synchronising your hub database requires this toggle to be turned on. This option configures the firewall to allow all connections from Azure including connections from the subscriptions of other customers. When selecting this option, make sure your login and user permissions limit access to only authorized users. You can add this to ARM template with something like;

{ “type”: “Microsoft.Sql/servers/firewallRules”, “apiVersion”: “2015-05-01-preview”, “name”: “[concat(variables(‘sql_server_name’), ‘/AllowAllWindowsAzureIps’)]”, “dependsOn”: [ “[resourceId(‘Microsoft.Sql/servers’, variables(‘sql_server_name’))]” ], “properties”: { “startIpAddress”: “0.0.0.0”, “endIpAddress”: “0.0.0.0” } },

Comments specific to this example

Gratifyingly the customer reported that they could run Select TOP (1000) from the view with a sub-second response. Our customer was in NZ and our Azure SQL database was in Australia. Thus our initial requirement is met and query latency is reasonable for this solution.

Furthermore the smallest Azure SQL database, S0 with 10 DTU is more than adequate for our needs. The beauty of this solution is that if we find that the we hit some resource constraint we can increase the number of DTU’s easily. in addition we get all the benefits of Azure SQL; automated backups, automated tuning for example.

What’s next

This prepares us for the next step, moving our custom database from the on-premise SQL server to the Azure SQL server. In this phase we foresee turning synchronisation on for the other tables to migrate the data. Additionally some of the tables , greyed out in the picture above do not have primary keys and the data will be migrated by export and import. This is not a big concern because some of these tables may be deprecated or contain few rows.

Once migration of the data is complete, BizTalk server will be reconfigured to use the Azure SQL custom database and Azure SQL Data Sync will be turned off. Preliminary test have shown that the latency between BizTalk server and the Azure custom SQL database is acceptable.

Finally let briefly consider why I think Azure Data Sync has to be turned off. There are three synchronisation modes, “To the Hub“, “Bi-directional Sync” or “From the Hub”. Could we choose “Bi-directional Sync” and have both or Hub and Member updating the custom database tables. It is unclear to me what would happen. One has to select whether the member or the hub wins , but what would happen if there is a conflict? Does some of the data get lost? I could not find any documentation that tells me exactly what would happen and thus I have decided to turn Azure SQL data Sync off in this phase.

The final phase of the migration is to remove BizTalk Server and now the entire integration and custom database resides in Azure.

Conclusions

This example shows how Azure SQL data sync can help you move a custom database to azure while your application remains on the ground. We used the example of a custom database on a BizTalk server database.

Using Azure SQL data sync was easy, cost effective and quick to set up.

Azure SQL Data Sync – My Story