
Distributed Availability Groups Illustrated

Remember Database Mirroring?

I do. Vaguely.

In particular, I remember working with Database Mirroring (DBM) as part of an interesting, and complex, HADR setup. Visualize a 4-node Windows failover cluster in Data Center 1 with SQL Server installed on each node (let's call it a 4XActive). We have high availability covered here via the clustered instance. Given an outage on any node, the SQL instance will move (fail over) to another node.

On the other side of town, in Data Center 2, is another 4XActive.

Now, here’s the DR part. Every database on each instance is replicated from Data Center 1 to Data Center 2 via DBM. To explain, here’s a ppt slide I’m borrowing from my recent presentation on this stuff:

This worked! Pretty well in fact. But of course, knowing what I know now, I clearly remember the limitations and pain points:

  • Single point of failure in the cluster’s shared disks hosting database files
  • No “readable” replica for scale-out reads (we used transactional replication to create a reporting instance)
  • No automatic fail-over – yes, DBM has auto-failover capability, but I’m thinking more about application connections… it was not easy to quickly redirect connections from DC1 to DC2.
  • No grouped database fail-over (in this case, up to 4 databases were related (tenant architecture) and there was no easy way to fail them over as a unit)

I bet you saw this coming, but the answer to these limitations, and others like them, came along when SQL Server introduced Availability Groups.

HADR and Availability Groups

A basic availability group will solve for HA via a synchronous fail-over replica partner, and for scale-out reads via a readable replica. But it gets a bit more complicated if you need to build in DR as well. Here’s another slide to illustrate one option for adding DR to your HA using Availability Groups:

In this case we add a cluster node in a remote data center to a standard AG configuration. This is the DR node (and it might also serve dual duty as the read replica). Seems legit, right? But what are the drawbacks? To my thinking, there is a lot of management complexity. Consider the configuration and administration of a cluster stretched over a WAN, plus multi-domain and/or multi-subnet clusters.
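To make that option concrete, here is a minimal sketch, using hypothetical server, domain, and database names, of a single AG with two synchronous local replicas in DC1 and an asynchronous, readable DR replica in DC2. All three nodes must belong to the same stretched WSFC, which is exactly where the management complexity comes from.

    -- Hypothetical names throughout; assumes the HADR (database mirroring) endpoints
    -- already exist on all three nodes and the secondaries will join and be seeded afterwards.
    CREATE AVAILABILITY GROUP [AG_Tenant]
    FOR DATABASE [TenantDb1]
    REPLICA ON
        N'DC1-SQL01' WITH (
            ENDPOINT_URL      = N'TCP://dc1-sql01.corp.local:5022',
            AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
            FAILOVER_MODE     = AUTOMATIC),
        N'DC1-SQL02' WITH (
            ENDPOINT_URL      = N'TCP://dc1-sql02.corp.local:5022',
            AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
            FAILOVER_MODE     = AUTOMATIC),
        -- The DR node sits in Data Center 2 but is still a member of the same WSFC:
        N'DC2-SQL01' WITH (
            ENDPOINT_URL      = N'TCP://dc2-sql01.corp.local:5022',
            AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
            FAILOVER_MODE     = MANUAL,
            SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));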

Which brings us to … Distributed Availability Groups

What if you could have the best of both worlds? Local availability groups in both the production and DR data centers, optional scale-out reads on the DR side, and replication between the two?

Tada!

Check it out, that is just what we get with Distributed Availability Groups.
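To show the shape of it, here is a minimal sketch, with hypothetical AG names, listener names, and ports, of how two independent availability groups might be joined into a distributed availability group. The CREATE runs on the primary of the production AG (the global primary) and the ALTER … JOIN runs on the primary of the DR AG (the forwarder).

    -- Hypothetical names: AG_DC1 and AG_DC2 already exist as ordinary AGs, each with its own listener.
    -- On the primary replica of AG_DC1 (the global primary):
    CREATE AVAILABILITY GROUP [DistAG_Tenant]
       WITH (DISTRIBUTED)
       AVAILABILITY GROUP ON
          'AG_DC1' WITH (
             LISTENER_URL      = 'TCP://ag-dc1-listener.corp.local:5022',
             AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
             FAILOVER_MODE     = MANUAL,
             SEEDING_MODE      = AUTOMATIC),
          'AG_DC2' WITH (
             LISTENER_URL      = 'TCP://ag-dc2-listener.dr.corp.local:5022',
             AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
             FAILOVER_MODE     = MANUAL,
             SEEDING_MODE      = AUTOMATIC);

    -- Then, on the primary replica of AG_DC2 (the forwarder), join it in with the same options:
    ALTER AVAILABILITY GROUP [DistAG_Tenant]
       JOIN AVAILABILITY GROUP ON
          'AG_DC1' WITH (
             LISTENER_URL      = 'TCP://ag-dc1-listener.corp.local:5022',
             AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
             FAILOVER_MODE     = MANUAL,
             SEEDING_MODE      = AUTOMATIC),
          'AG_DC2' WITH (
             LISTENER_URL      = 'TCP://ag-dc2-listener.dr.corp.local:5022',
             AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
             FAILOVER_MODE     = MANUAL,
             SEEDING_MODE      = AUTOMATIC);

In this sketch the cross-data-center link is asynchronous and the distributed AG itself fails over manually, while each local AG keeps its own synchronous replicas and automatic failover.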

Thanks for reading, please stay tuned for more on this topic…

 

Overview: SQL Database Mirroring High-Availability with Automatic-Failover

A High-Availability with Automatic-Failover SQL Database Mirroring configuration consists of a Principal Instance and Database, a Mirror Instance and Database, and a Witness Instance. Each instance must have a Database Mirroring Endpoint – a special TCP/IP endpoint with a defined listener IP and port. A mirroring session is configured between these servers utilizing the dedicated mirroring endpoints.
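As a rough sketch, assuming made-up server names, a made-up database, and port 5022, those three pieces look something like this in T-SQL:

    -- 1. On each of the three instances (Principal, Mirror, Witness): a dedicated mirroring endpoint.
    CREATE ENDPOINT [Mirroring]
        STATE = STARTED
        AS TCP (LISTENER_PORT = 5022, LISTENER_IP = ALL)
        FOR DATABASE_MIRRORING (ROLE = ALL, ENCRYPTION = REQUIRED ALGORITHM AES);

    -- 2. On the Mirror Instance: point the restored (WITH NORECOVERY) copy of the database at the principal.
    ALTER DATABASE [TenantDb1] SET PARTNER = 'TCP://principal.corp.local:5022';

    -- 3. On the Principal Instance: complete the session and add the witness
    --    (transaction safety defaults to FULL, which is what automatic failover requires).
    ALTER DATABASE [TenantDb1] SET PARTNER = 'TCP://mirror.corp.local:5022';
    ALTER DATABASE [TenantDb1] SET WITNESS = 'TCP://witness.corp.local:5022';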

A ‘ping’ request bounces around between the three servers as a ‘heart-beat’ or ‘is-alive’ type of check. Given a failure of the ‘is-alive’ test, any two servers still communicating can form a quorum. In the case where the quorum members are the Mirror Instance and the Witness Instance, these two instances can agree to automatically fail over the mirroring session so that the Mirror Instance becomes the ‘new’ Principal Instance.
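One way to confirm that a session actually has the pieces needed for automatic failover, a FULL safety level and a connected witness, is to query the sys.database_mirroring catalog view on either partner:

    SELECT  DB_NAME(database_id)         AS database_name,
            mirroring_role_desc,
            mirroring_state_desc,
            mirroring_safety_level_desc, -- FULL (high safety) is required for automatic failover
            mirroring_witness_name,
            mirroring_witness_state_desc -- CONNECTED means the witness can take part in quorum
    FROM    sys.database_mirroring
    WHERE   mirroring_guid IS NOT NULL;  -- only databases that are actually mirrored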

The following outline and accompanying diagram illustrate the transaction flow in a High-Availability with Automatic-Failover SQL Database Mirroring configuration:

[Diagram: synchronous Database Mirroring transaction flow]

  1. Transactions execute against a mirrored database on the Principal Instance
  2. Before the transactions are written to the transaction log, they persist in the log-buffer memory space, which is where Database Mirroring picks them up. The transactions remain there, unhardened, until the mirroring process completes.
  3. The transaction data is sent over the wire to the Mirror Instance via the Database Mirror Endpoint.
  4. The transactions land in the Mirror Instance log buffer and …
  5. … are hardened to the Mirror Instance Transaction Log.
  6. An acknowledgement message that the transaction is committed to the Mirror Instance t-log is sent to the Principal Instance. Meanwhile, the transactions are applied to the Mirror Database as the Mirror Instance Transaction Log is continuously restored.
  7. Finally, once the ‘ACK’ that the transaction is hardened on the Mirror Instance arrives, the transaction is written to the Principal Instance T-log (a quick way to observe this wait is shown below).
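Because the commit has to wait for that ACK (steps 6 and 7), synchronous mirroring shows up as measurable transaction delay on the Principal Instance. A rough way to watch it, assuming the standard Database Mirroring performance counters are exposed on your build, is to query sys.dm_os_performance_counters:

    SELECT  instance_name AS mirrored_database,
            counter_name,
            cntr_value
    FROM    sys.dm_os_performance_counters
    WHERE   object_name LIKE '%Database Mirroring%'
      AND   counter_name IN ('Log Send Queue KB', 'Transaction Delay', 'Redo Queue KB');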