T-SQL Tuesday #46 : Table Partitioning to Archive

This is my first T-SQL Tuesday post. Thanks to Rick Krueger (blog) for hosting.

On hearing this month’s subject I googled around a bit for the ‘official’ definition of a Rube Goldberg Machine. Wikipedia has it as “…a deliberately over-engineered or overdone machine that performs a very simple task in a very complex fashion, usually including a chain reaction.” I guess I’m not the only one this Tuesday who ran across this video from OK Go, which is an entertaining example.

If you are at all like me (read: DBA nerd), you tend to see any block-like units, from Legos to shipping containers, as somehow representing values in the all-encompassing, cosmic database schema of reality (there is one, right?). So perhaps you’ll understand how the first scene of the video (the falling dominoes), as well as the ‘chain reaction’ in the Wikipedia definition, got me thinking about table partitioning. More specifically, it reminded me of a sliding-window implementation of table partitioning I once configured to age out and roll off older date-based data from an active OLTP table into a sibling report table, thereby enforcing some segregation of conflicting query types.

Of course, there are a number of simple ways to age-out old data (e.g. a T-SQL INSERT/DELETE, SSIS, etc.) and table partitioning may be more typically thought of as a complex solution applied to large data-sets and enterprise infrastructure. Depending on the situation, table partitioning just might be overkill … Rube Goldberg indeed. However, several features of table partitioning really do make it a neat solution regardless of the size of the data, and hopefully the following examples will bring them to light.
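
For contrast, here is a minimal sketch of that ‘simple’ approach, written against the product and archive tables defined below with the same 25-hour cutoff this post uses. It works, but it is fully logged DML, which on a busy table can mean blocking and a lot of transaction log, whereas the partition switch we build toward is a metadata-only operation.

--the simple alternative: copy the aged rows, then delete them (fully logged)
INSERT INTO archive (asofdate, descr)
SELECT asofdate, descr
FROM product
WHERE asofdate < DATEADD(hh,-25,SYSDATETIME());

DELETE FROM product
WHERE asofdate < DATEADD(hh,-25,SYSDATETIME());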

At this point I want to mention that table partitioning is an Enterprise Edition feature. Also, thanks to Kendra Little and her post here; those ‘helper views’ are definitely helpful.

Gathering the material and building the contraption:

Example Tables:

This is our table of date-based data, product.

CREATE TABLE product (
asofdate datetime2, descr varchar(100)
)

And here we populate it with records for every minute of every hour for the last 24 hours (I chose to partition by the hour for this post, but it could be any unit of time). In the real world there could be any number of records for any hour, but to keep it simple for this example each hour’s worth of data will have 60 rows.

DECLARE @start datetime2 = DATEADD(d,-1,SYSDATETIME())
DECLARE @inc int = 1439

--seed the oldest row, 24 hours back
INSERT INTO product
VALUES (@start, CAST(CRYPT_GEN_RANDOM(50) AS varchar(100)))

--then add a row for each remaining minute of the 24-hour window
WHILE @inc > 0
BEGIN
INSERT INTO product
SELECT TOP 1 DATEADD(mi, @inc, asofdate),
CAST(CRYPT_GEN_RANDOM(50) AS varchar(100))
FROM product ORDER BY asofdate
SET @inc = @inc - 1
END
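
If you want to double-check the generated data before moving on, a quick (optional) query like this should show 1,440 rows spanning roughly the last 24 hours:

--expect 1440 rows covering the last 24 hours
SELECT COUNT(*) AS row_count,
MIN(asofdate) AS oldest,
MAX(asofdate) AS newest
FROM product;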

As each hour goes by, the oldest data in the product table becomes 25 hours old. This is the data we wish to age out and roll off to an archive table.

CREATE TABLE archive (
asofdate datetime2, descr varchar(100)
)

Partition Functions:

Next, we need two partition functions, one for each table.

For the product table we start with a single range value 25 hours prior to the last date in the product table to set an initial boundary.

DECLARE @StartHour datetime2 = DATEADD(hh,-25,(SELECT MAX(asofdate) FROM product));
CREATE PARTITION FUNCTION product_pf (datetime2)
AS RANGE RIGHT FOR VALUES (@StartHour);
GO

I find visualizing the partition ranges very helpful, so here is a simple text diagram to illustrate the range values of the partition function at this point. The bar represents the boundary (aka range value) and the number represents the hours since the newest value in the product table, i.e. the hours-old.

older than 25 | 25 and newer

Next we loop a split operation to create range values (i.e. boundaries) for each hour of the day, building the partition function into something like this …

older | 25 | 24 | 23 | … | 2 | 1 | newer

DECLARE @StartHour datetime2 = DATEADD(hh,-24,(SELECT MAX(asofdate) FROM product));
DECLARE @i int = 0
--split off a new boundary for each hour of the day
WHILE @i < 25
BEGIN
ALTER PARTITION FUNCTION product_pf()
SPLIT RANGE (DATEADD(hh,@i,@StartHour));
SET @i = @i + 1
END

We can examine our partition function to be sure it has the range values we want …

SELECT pf.name, rv.boundary_id, rv.value,
DATEDIFF(hh,CONVERT(datetime2,rv.value),(SELECT MAX(asofdate) FROM product)) AS HoursOld
FROM sys.partition_range_values rv
JOIN sys.partition_functions pf
ON rv.function_id = pf.function_id
WHERE pf.name = 'product_pf'
ORDER BY rv.boundary_id

…and we can add another (or another few) as needed. There is no harm in creating extra boundaries on the leading end; in fact, you are building in a margin of safety for later split operations.

older | 25 | 24 | 23 | … | 2 | 1 | 0 | -1 | -2 | newer

--find the max range value
DECLARE @maxpartbound datetime2
SELECT @maxpartbound = CAST(MAX(value) AS datetime2)
FROM sys.partition_range_values rv
JOIN sys.partition_functions pf
ON rv.function_id = pf.function_id
WHERE pf.name = 'product_pf'

--add an hour to the max range value and split off a new range
ALTER PARTITION FUNCTION product_pf()
SPLIT RANGE (DATEADD(hh,1,@maxpartbound))

For the archive table we initially need only two range values, defining the boundaries between data that is 23 hours old (and newer), 24-hour-old data, and data older than 24 hours.

older than 24 | 24 | 23 and newer

As you will see, once we apply this function to the archive table, the partition housing the 24-hour-old data will always be empty since that data is still part of the product table. It needs to be there, however, as a place-holder for when the 24-hour-old data turns 25.

DECLARE @StartHour datetime2 = DATEADD(hh,-24,(SELECT MAX(asofdate) FROM product));
CREATE PARTITION FUNCTION archive_pf (datetime2)
AS RANGE RIGHT FOR VALUES
(@StartHour, DATEADD(hh,1,@StartHour));

Partition Schemes:

We require two partition schemes, one for each partition function. In short, the partition scheme maps each table partition to a filegroup, ideally so as to spread the IO across multiple disks. Although multiple disks might be nice, in this scenario, where we are focused solely on an archiving solution, we don’t need them; we need a simple, efficient, repeatable archive process. When this came to light for me I was very excited, since it simplifies things to a large degree and makes this otherwise complex feature a viable Rube Goldberg solution. The partition schemes will map all partitions to the primary filegroup.

--create the product partition scheme
CREATE PARTITION SCHEME product_ps 
AS PARTITION product_pf
ALL TO ([PRIMARY]);

--create the archive partition scheme
CREATE PARTITION SCHEME archive_ps 
AS PARTITION archive_pf
ALL TO ( [PRIMARY] );

Indexes:

Lastly, we need two clustered indexes, one for each table. Building the clustered index on a table using the partition scheme effectively applies the partitioning to the table.

CREATE CLUSTERED INDEX [cix_product-asofdate]
ON product(asofdate ASC ) 
ON product_ps(asofdate)

CREATE CLUSTERED INDEX [cix_archive-asofdate] 
ON archive(asofdate ASC )
ON archive_ps(asofdate)
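
With the indexes in place, an optional peek at sys.partitions confirms the rows have landed in the expected product partitions, 60 per hourly partition:

--row counts per partition (index_id 1 = the clustered index)
SELECT p.partition_number, p.rows
FROM sys.partitions p
WHERE p.object_id = OBJECT_ID('product')
AND p.index_id = 1
ORDER BY p.partition_number;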

Tipping the domino:

As each hour passes the oldest data in the product table becomes eligible for archival. To archive that partition we’ll alter the partition functions to split off new boundaries or merge old ones, and use the SWITCH argument of the ALTER TABLE command.

As a point of reference, here is a diagram of the usual, between-the-hours state of partitioning in the system. This is T1.

T1 – Standard configuration

At T2 the oldest data in the product table is 25 hours old and ready to be moved to the archive table. 

T2 – At the turn of the hour

This is the time to initiate the archive process beginning with the switch.

Switch:

T3 – Post switch

To run the switch we need a source and target partition number from the product and archive tables. We can get those via the $PARTITION function, but first we need the minimum range value of the product table’s partitions. While we’re at it, we can also grab the minimum range value of the archive table’s partitions, which we’ll use later for the merge.

--find the min archive boundary; we will use it for the merge later
DECLARE @MinArchiveBoundary datetime2;
SELECT @MinArchiveBoundary = CAST(MIN(rv.value) AS datetime2)
FROM sys.partition_functions pf
JOIN sys.partition_range_values rv
ON pf.function_id = rv.function_id
WHERE pf.name = 'archive_pf'

--find the minimum range value in product so we can pass it to the $PARTITION function
DECLARE @MinProductBoundary datetime2;
SELECT @MinProductBoundary = CAST(MIN(rv.value) AS datetime2)
FROM sys.partition_functions pf
JOIN sys.partition_range_values rv
ON pf.function_id = rv.function_id
WHERE pf.name = 'product_pf'

--use the $PARTITION function to figure the correct partition numbers based on the minimum range value in product
DECLARE @ProductPartitionToArchive int, @ArchivePartition int
SELECT @ProductPartitionToArchive = $PARTITION.product_pf(@MinProductBoundary)
SELECT @ArchivePartition = $PARTITION.archive_pf(@MinProductBoundary)

--switch the oldest product partition over to archive
ALTER TABLE product SWITCH PARTITION @ProductPartitionToArchive TO archive PARTITION @ArchivePartition;
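
The switch is a metadata-only operation, so it should be near-instant. If you care to verify, the switched hour of rows (60 of them) now shows up in archive:

--optional check: the switched hour now lives in the archive table
SELECT COUNT(*) AS archived_rows, MIN(asofdate) AS oldest, MAX(asofdate) AS newest
FROM archive;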

Merge:

T4 – Before merge

Using the same variables we set to the minimum range values during the switch, we can merge the trailing partitions of each table.

--merge the archive table's trailing partitions
ALTER PARTITION FUNCTION archive_pf() MERGE RANGE (@MinArchiveBoundary);

--merge the product table's trailing partitions
ALTER PARTITION FUNCTION product_pf() MERGE RANGE (@MinProductBoundary);
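
Re-running a variation of the earlier boundary query against both functions (optional) should now show one fewer boundary on the trailing end of each:

--list the remaining boundaries for both partition functions
SELECT pf.name, rv.boundary_id, rv.value
FROM sys.partition_functions pf
JOIN sys.partition_range_values rv
ON rv.function_id = pf.function_id
WHERE pf.name IN ('product_pf','archive_pf')
ORDER BY pf.name, rv.boundary_id;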

T5 – Post merge

Split:

T6 – Pre split

It is expensive to split a partition containing data, since it results in disk IO, so we always want to split empty partitions. Since we lost a partition boundary in the merge, the last step of the archive process is to split a new range off the leading (and empty) end of each table. This results in an empty place-holder partition and prepares the table for the next switch-merge-split. Note the ALTER PARTITION SCHEME call here, which would normally set the next filegroup; in our case it’s not necessary since all partitions are on the primary filegroup, but I have included it.

--split off some new empty future partitions
ALTER PARTITION SCHEME archive_ps NEXT USED [PRIMARY]
--find the most recent boundary
DECLARE @MaxArchiveBoundary datetime2;
SELECT @MaxArchiveBoundary = CAST(MAX(rv.value) AS datetime2)
FROM sys.partition_functions pf
JOIN sys.partition_range_values rv
ON pf.function_id = rv.function_id
WHERE pf.name = 'archive_pf'
--add an hour to the most recent boundary and split the partition
ALTER PARTITION FUNCTION archive_pf()
SPLIT RANGE (DATEADD(hh,1,@MaxArchiveBoundary))

ALTER PARTITION SCHEME product_ps NEXT USED [PRIMARY]
--find the most recent boundary
DECLARE @MaxProductBoundary datetime2;
SELECT @MaxProductBoundary = CAST(MAX(rv.value) AS datetime2)
FROM sys.partition_functions pf
JOIN sys.partition_range_values rv
ON pf.function_id = rv.function_id
WHERE pf.name = 'product_pf'
--add an hour to the most recent boundary and split the partition
ALTER PARTITION FUNCTION product_pf()
SPLIT RANGE (DATEADD(hh,1,@MaxProductBoundary))

At T7 the split has been applied to both partitioned tables and we’ve come full circle back to the standard operating configuration.

T7 – Back to the standard configuration
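
To run this every hour, you could wrap the whole switch-merge-split chain in a stored procedure and call it from an hourly SQL Agent job. Here is a rough sketch consolidating the statements above; the procedure name is just for illustration.

--a sketch of the hourly switch-merge-split chain as one procedure
CREATE PROCEDURE usp_hourly_archive
AS
BEGIN
SET NOCOUNT ON;

DECLARE @MinProductBoundary datetime2, @MaxProductBoundary datetime2;
DECLARE @MinArchiveBoundary datetime2, @MaxArchiveBoundary datetime2;
DECLARE @SourcePartition int, @TargetPartition int;

--capture the trailing and leading boundaries of both functions
SELECT @MinProductBoundary = CAST(MIN(rv.value) AS datetime2),
@MaxProductBoundary = CAST(MAX(rv.value) AS datetime2)
FROM sys.partition_functions pf
JOIN sys.partition_range_values rv ON pf.function_id = rv.function_id
WHERE pf.name = 'product_pf';

SELECT @MinArchiveBoundary = CAST(MIN(rv.value) AS datetime2),
@MaxArchiveBoundary = CAST(MAX(rv.value) AS datetime2)
FROM sys.partition_functions pf
JOIN sys.partition_range_values rv ON pf.function_id = rv.function_id
WHERE pf.name = 'archive_pf';

--switch: move the oldest product partition into archive
SELECT @SourcePartition = $PARTITION.product_pf(@MinProductBoundary);
SELECT @TargetPartition = $PARTITION.archive_pf(@MinProductBoundary);
ALTER TABLE product SWITCH PARTITION @SourcePartition TO archive PARTITION @TargetPartition;

--merge: drop the emptied trailing boundary on each function
ALTER PARTITION FUNCTION archive_pf() MERGE RANGE (@MinArchiveBoundary);
ALTER PARTITION FUNCTION product_pf() MERGE RANGE (@MinProductBoundary);

--split: add a new empty leading partition to each table
ALTER PARTITION SCHEME archive_ps NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION archive_pf() SPLIT RANGE (DATEADD(hh,1,@MaxArchiveBoundary));
ALTER PARTITION SCHEME product_ps NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION product_pf() SPLIT RANGE (DATEADD(hh,1,@MaxProductBoundary));
END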

Well, there you have it…

…a walk-through and working example of a simple data archive solution using table partitioning; a Rube Goldberg contraption in its own right.

If only I had just a little more time, I’d put a trigger on the archive table that inserts a row into another table, where the insert fires off a service broker message that runs a procedure which fails over a mirror session which generates an alert which causes a job to run which ….. …. a yoga ball which turns on the coffee maker which kicks off sp_send_dbmail and finally, finally, finally sends you an email to say…..

“Thank you for reading my post!”