Thursday, December 13, 2012

0 MSSQL: Spilt SQL Server Datafile to multiple files

If you have one big primary data file and want to split into multiple small data files. How do we do it?

I think the first question we need to ask is why we want to do it? I can think of couple of reasons:

  1. We have the IO bottleneck on the IO subsystem, by split the files and place them to the different disk can improve the IO performance.
  2. We want to separate the certain Index to the different files so if we do need to perform the index maintenance, it wont impact other data as much.
  3. By diligent design and placing the table in the file group properly, we can perform the piece meal restore
  4. Migrate the database to the different Host and the target host does not have enough disk to sustain the biggest datafile in the source database.

Check the Data files and space used

Below query shows the current data file size and the space used.

Use AdventureWorks;
SELECT df.name ,df.size/128.0 - CAST(FILEPROPERTY(df.name, 'SpaceUsed') AS int)/128.0 AS AvailableSpaceInMB,df.size/128.0 as CurrenSizeinMB,mf.size*8192/1024 AS InitialSizeinKB
FROM sys.database_files df join
sys.master_files mf on df.name = mf.name and database_id = DB_ID()
go

 

image

DBCC SHOWFILESTATS

Show the space information in the extents context.

Use AdventureWorks;
DBCC SHOWFILESTATS;

image

Check tables/index belong to which file group

Below Query I get it from here and I modify a little bit to give me more information.  Meaning of sys.indexes type column (  0 = Heap,1 = Clustered,2 = Nonclustered,3 = XML etc)

select  t.name as TableName,  
        i.name as IndexName, 
        fg.name as FielGroup,
        i.type,
        i.type_desc,
        t.type,
        p.rows as Rows
    from sys.filegroups fg join sys.database_files df 
        on fg.data_space_id = df.data_space_id join sys.indexes i 
        on df.data_space_id = i.data_space_id join sys.tables t 
        on i.object_id = t.object_id join sys.partitions p 
    on t.object_id = p.object_id and i.index_id = p.index_id  
    --where fg.name = 'primary' and t.type = 'U'  
    order by rows desc

image

ADD new file group and new datafiles

 

USE [master]
GO
ALTER DATABASE [AdventureWorks] ADD FILEGROUP [SALES]
GO
ALTER DATABASE [AdventureWorks] ADD FILE 
( NAME = N'Sales_Data', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.SQLEXPRESS\MSSQL\DATA\AdventureWorks_SalesData.ndf' , SIZE = 3072KB , FILEGROWTH = 1024KB ) TO FILEGROUP [SALES]
GO

 

image

 

Here are couple ways to accomplish the mission:

Create/recreate the cluster index and place it into file group.

Each SQL Server table can be either heap or cluster index. On the leaf level of the cluster index is the data itself. It is not possible to have table with heap and cluster index. Therefore by recreate the cluster index on the different file group, we are moving the actual data to the different file group.

create the test table
SELECT *
INTO newsales 
FROM sales.Individual;
 
CREATE CLUSTERED INDEX IX_CustomerID
    ON newsales (customerID); 
GO

Check the filegroup size and the location of the filegroup
Use AdventureWorks;
SELECT df.name ,df.size/128.0 - CAST(FILEPROPERTY(df.name, 'SpaceUsed') AS int)/128.0 AS AvailableSpaceInMB,df.size/128.0 as CurrenSizeinMB,mf.size*8192/1024 AS InitialSizeinKB
FROM sys.database_files df join
sys.master_files mf on df.name = mf.name and database_id = DB_ID()
go
 
select  t.name as TableName,  
        i.name as IndexName, 
        fg.name as FielGroup,
        i.type,
        i.type_desc,     
        t.type,
        p.rows as Rows
    from sys.filegroups fg join sys.database_files df 
        on fg.data_space_id = df.data_space_id join sys.indexes i 
        on df.data_space_id = i.data_space_id join sys.tables t 
        on i.object_id = t.object_id join sys.partitions p 
    on t.object_id = p.object_id and i.index_id = p.index_id  
    where t.name='newsales'

image

Add new filegroup
USE [master]
GO
ALTER DATABASE [AdventureWorks] ADD FILEGROUP [SALES]
GO
ALTER DATABASE [AdventureWorks] ADD FILE 
( NAME = N'Sales_Data', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.SQLEXPRESS\MSSQL\DATA\AdventureWorks_SalesData.ndf' , SIZE = 3072KB , FILEGROWTH = 1024KB ) TO FILEGROUP [SALES]
GO
 
Recreate the cluster on the different filegroup.
USE [AdventureWorks];
CREATE CLUSTERED INDEX IX_CustomerID
    ON newsales (customerID)
    WITH(
        DROP_EXISTING = ON
        -- ONLINE = ON
        )
    ON SALES
GO

 

We can recheck the file group and cluster index and compare with previous screenshot.

The IX_CLustomer has moved to the SQLES file group and Primary file group now has 67MB free space ( used to be 43MB).

image

 

DBCC Shrink file emptyfile

Another way to move the data to other files is via DBCC SHRINKFILE emptyfiles. This would only move the data between data files in the same file group.

Below SQL add 2 new files in the primary file group.

ALTER DATABASE [AdventureWorks] ADD FILE 
( NAME = N'AdventureWorks_Data_new1', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.SQLEXPRESS\MSSQL\DATA\AdventureWorks_Data1.ndf' , SIZE = 3072KB , FILEGROWTH = 1024KB ) TO FILEGROUP [PRIMARY]
GO
ALTER DATABASE [AdventureWorks] ADD FILE 
( NAME = N'AdventureWorks_Data_new2', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.SQLEXPRESS\MSSQL\DATA\AdventureWorks_Data2.ndf' , SIZE = 3072KB , FILEGROWTH = 1024KB ) TO FILEGROUP [PRIMARY]
GO

 

Lets use DBCC SHRINKFILE with emptyfile option to move the data.

USE [AdventureWorks];
DBCC SHRINKFILE ( AdventureWorks_Data,emptyfile);

 

We can use DBCC SHOWFILESTATS to monitor the extend move status.

image

image

image

image

  • DBCC Shrinkfile can be cancel at any time. SQL Server engine will retain whatever change has been made.
  • Rebuild/Reorg index is recommend because after the data page move, the index could become highly fragment
  • If we run the DBCC SHRINKFILE on the first primary datafile, we would get the “
  • Msg 2555, Level 16, State 1, Line 1 Cannot move all contents of file "AdventureWorks_Data" to other places to complete the emptyfile operation.” This is because the primary file has contain the metadata and can not be moved but the file other data has been move to the new files.

image

Restore one big datafile to the target database with multiple ndf files.

When restore the database, the target database needs to have the same number of the datafile as source. ( look up the restore database syntax, with move option needs to have one to one mapping for the target location). In this case, we have to use above method to split out the data file and take the new backup. Then restore to the target.

0 comments:

Post a Comment

 

SQL Panda Copyright © 2011 - |- Template created by O Pregador - |- Powered by Blogger Templates