Archive for the ‘SQL Server’ Category

Exploring backup read io performance

January 24, 2010 Leave a comment

I was recently exploring how to increase the backup read throughput on one of our SQL servers. Below are some interesting facts i found.

I would say that one of the most important reminders that came from the exercise is, do not assume that 2 databases being backup up on the same server using an identical backup command means that the processes are identical under the hood.

  1. Backup read threads are spawned 1 per physical device used by the database. (This is documented in Optimising Backup & Restore Performance in SQL Server)
  2. Multiple database files on 1 disk will not increase throughput because 1 disk = 1 thread and the thread works through the database files 1 at a time.
  3. Backup read buffers are evenly distributed across the number of read threads.
  4. Backups are pure IO operations, they do not read pages from the buffer pool.
  5. When passing in @MaxTransferSize it appears to be a suggestion rather than implicit and SQLServer will assign the requested value if it can otherwise it can pick another lower value.
    So, quite a few statements there… Where is the proof? Well the best find had to be trace flag 3213 which outputs information regarding backup decisions made. Below is an extract of this output which i will then talk through.
    2010-01-22 12:00:02.45 spid78      Backup/Restore buffer configuration parameters
    2010-01-22 12:00:02.45 spid78      Memory limit: 32765MB
    2010-01-22 12:00:02.45 spid78      Buffer count:               40
    2010-01-22 12:00:02.45 spid78      Max transfer size:          448 KB
    2010-01-22 12:00:02.45 spid78      Min MaxTransfer size:       64 KB
    2010-01-22 12:00:02.45 spid78      Total buffer space:         17 MB
    2010-01-22 12:00:02.45 spid78      Buffers per read stream:    10
    2010-01-22 12:00:02.45 spid78      Buffers per write stream:   8
    2010-01-22 12:00:02.45 spid78      Tabular data device count:  4
    2010-01-22 12:00:02.45 spid78      FileTree data device count: 0
    2010-01-22 12:00:02.45 spid78      Filesystem i/o alignment:   512
  • Red covers the max transfer size. I actually asked for 1mb but only got 448 KB. Additionally i also noticed that where i kick of multiple backups (but all requesting 1mb) that the transfer size tends to decrease the more backups you have. So, no one backup is necessarily the same.
  • Green covers buffer distribution. So, i asked for 40 buffers. The database being backed up has data devices on 4 physical disks so gets 4 read threads. Buffers per read stream is 10 which is (40 buffers / 4 threads).
  • Blue covers read threads. The database backed up had data files on 4 physical disks. This is exposed as the Tabular data device count and confirms the statement in point 1 that you get 1 read thread per physical device as documented by MS.

So, what about statements 2 & 4? Well, i monitored the reads to the individual files using  sys.dm_virtual_io_file_stats and took a number of snapshots whilst performing a backup. There are plenty of scripts you can download to take the snapshots yourself such as this one. Once the backup completed i looked at the time slices and you can see the following.

  • Total mb read during backup = total data held in the file. From this i drew the conclusion its not reading any of the data held in the buffer pool which makes a lot of sense as the backup includes the transaction log.
  • Querying the statistics at different time intervals you see the first datafile MB’s growing and then the second data file mb’s don’t start growing until first is complete hence its going 1 file at a time. However, if you have multiple files on multiple disks you do see 1 file on each disk being read from. I’ve not mentioned increasing the number of backup devices and changing block sizes as my case specifically did not call for it but you can read about that here.

The last thing i want to say since i have touched on single threaded backup reads is that i`m keen not to spawn any new urban legends. Whilst this is true for the backups in the context of per physical disk device, that’s it! Its worth reading this article about urban legends around SQL threads.

Categories: Backup, Performance, SQL Server

The evils of implicit conversions

January 18, 2009 Leave a comment

I wanted to put up a brief post showing the impact of an implicit conversion on the performance of a query (or not……). In the example i will show an implicit conversion negatively impacting query performance and an implicit conversion that does not impact performance…..

So, we need to setup the test environment using the code below.

--********      Create Test Data           ******************
CREATE TABLE #data_test(ukey INT IDENTITY(1,1) PRIMARY KEY , first   VARCHAR(200),second VARCHAR(200))
DECLARE    @first INT@second INT
@first = 1
WHILE @first < 250000
SELECT  REPLICATE(@first,@first), REPLICATE(@first,@first)
SELECT  @first = @first +1
stuf_1 ON #data_test (first)
--********      End of Test Data            ******************


Now, with the test data in place we can run the following 2 queries and observe the differences.

/* This uses a variable declared as an NVARCHAR */
EXEC sp_executesql N'SELECT * FROM #data_test WHERE first = (@p0) ', N'@p0 nvarchar(200)',@p0 = N'1'
--Scan count 1, logical reads 3093, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

/* This uses a variable declared as an VARCHAR */
EXEC sp_executesql N’SELECT * FROM #data_test WHERE first = (@p0)’, N’@p0 varchar(200)’,@p0 = ‘1’
–Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Below each statement is the io incurred and the difference on this tiny little table is > 3000 IO’s just because we used unicode (nvarchar) instead of non-unicode (varchar) and i`m sure you can imagine that on a larger table this becomes a significant overhead. So, why has this happened? Lets take a look at the plans.

Read more…

Categories: Performance, SQL Server

The overhead of a non-unique clustered index

October 20, 2008 Leave a comment

So, we all know that if we create a clustered index that is not unique that we will incur a 4 byte overhead right? Well not always because as usual, it depends….. Geeked

When you create a non-unique clustered index SQL server must maintain uniqueness so it adds a hidden 4 byte column which is populated for each non-unique row (not every row) but what many people may not realise is that this is actually a variable length column so if your table has no variable length columns you have to incur another 4 bytes to maintain the variable offset data giving you a total of 8 bytes per row instead of 4 bytes.

A few bytes may not sound much but when dealing with multi billion row tables it soon adds up so its important to know how the space consumption breaks down.

Below is an extract from Books Online 2005 "Estimating the size of a clustered index"

"The uniqueifier is a nullable, variable-length column. It will be nonnull and 4 bytes in size in rows that have nonunique key values. This value is part of the index key and is required to make sure that every row has a unique key value."

Its great to see that this hidden column is now documented but a bit of additional clarity around its potential variable length property storage overhead would nice.

Finally I thought I would visual this hidden data overhead for you with a screen shot from Danny’s awesome Internals Viewer


Update: Thanks to Christian Bolton for clarification that the overhead is for each non-unique row which i have now reflected in the post.

Changing the Data Files Location after Installation

August 10, 2008 Leave a comment

The other day i wanted to change the “Data Files” location for a 2005 database engine installation and a 2005 Analysis Services installation which you can specify under the advanced options during installation. I quickly found out that there appears to be no documented ways to do this other than uninstall SQL Server and install again specifying a new location for data files. It’s also not as simple as moving your system databases as “Data files” covers things like server errors logs, sql agent logs, replication default directory etc. So, as the uninstall route was not one i was prepared to go down i sat down and worked out how to do it and below are the results.

Read more…

MS SQL Server Book of Wisdom

May 21, 2008 Leave a comment

I was chatting with a friend today and he asked “Have you ever seen those little books of wisdom?”. We quickly decided that we could write a MS SQL Book of Wisdom, Below is a summary of what ensued for your amusement. Now, some of the statements are actually based on bad real life advice and many we just made up. Can you tell which is which? Also please comment if you have got any good entries for the Book……

Categories: SQL Server, Uncategorized

Transaction log backup deadlock

February 23, 2008 1 comment

Recently we started to see deadlock errors when backing up our transaction logs. The "important" part of the error is shown below.

Could not insert a backup or restore history/detail record in the msdb database. This may indicate a problem with the msdb database. The backup/restore operation was still successful.

What this meant was that the transaction log backup was occurring but the entry in the msdb was not being made as it was being chosen as a deadlock victim so we investigated the cause of the problem as we had some processes that used this information to copy transaction logs to other servers and we needed it to be complete.

We used trace flag 1222 to output the deadlock information to the error log and found the culprit to be a Microsoft stored procedure called "sp_delete_backuphistory" that is called by SQL 2005 maintenance plans when you use the "History cleanup task" and tick the "Backup and Restore History". Having a look at the stored procedure it was obvious why it was deadlocking so we decided to log our findings with Microsoft. Microsoft have confirmed the bug and have stated it will be fixed in SQL 2008 but have stated they will not be issuing a KB in the immediate future which is one of the reasons i decided to blog about it.

I have had a look at the latest 2008 CTP and can confirm that Microsoft has updated the stored procedure to avoid the deadlocking and i noticed they also added a non-clustered index on the backup_finish_date in the backupset table (finally). I would also like to point out that the changes made to the stored procedure could easily be ported back to SQL 2005 so i`m a little surprised they have not been.

Work Around

To avoid this specific issue, we took the approach of identifying a generic window when transaction log backups would not be running on 95% of our server estate and changed the "History cleanup task" to run at this time. For the remaining 5% we worked out per server windows and now we do not see the issue on any of our servers.

A brief history of msdb backup history tables

Now, for those of you who want to know more about this problem and are wondering why this problem seems to only just be appearing in SQL 2005 here is a brief history lesson.

Prior to SQL 2005, maintenance plans never gave the ability to delete old backup history but the procedure "sp_delete_backuphistory" did exist. So, many DBA’s would find that their MSDB’s were growing rather large and if they used Enterprise Manager to do a restore it would hang for ages as it read the large backup tables. So, people would then find out about "sp_delete_backuphistory" and schedule it as a job but quite often the first time it was run it would take ages (some time days) to run due to poor coding and volume of data so people then implemented their own more efficient code (Google sp_delete_backuphistory and you will see what i am talking about, for example, see here).

So, what does this lead to? Well, not so many people using the MS stored procedure prior to 2005! But then SQL 2005 rocks up and we have the ability to call the procedure via the gui! Well, lets tick that puppy!!! 😀 We do need to keep that msdb trim after all and that is how we get to where we are now!

Categories: Backup, Deadlock, Errors, SQL Server

SSMS Log file viewer and Deadlock Graphs

January 23, 2008 1 comment

Firstly I must say a big thank you to Microsoft for the new deadlock trace flag 1222. Compared to the trace flag output for 1204 & 1205 that you had to use in SQL 2000 it’s a walk in the park to interpret.

Anyway, back to the post at hand! This is a quick FYI as i`m not going to go through how to interpret a deadlock graph because Bart Duncan does a fantastic job of it here.

When you enable 1222, it will output the deadlock information to the error log. If your using the log file viewer and steam on in and do your analysis you will probably find you get your deadlock victim the wrong way round like I did in the first cut of my analysis. Fortunately I did realise my mistake which made me look at the output again and I realised that the output in the log is upside down! This is because the log file viewer sorts the log so that the most recent entry is first and as such reverses the deadlock output. I`m not aware of any way to configure the sort order of the log file viewer and exporting the log exports it in the same order its displayed….

So, when looking at your deadlock information you have 2 options.

  • Find the occurrence of the words “deadlock – list” and read upwards.
  • Grab a copy of the error log from the servers log directory and open it in notepad.
    Categories: Deadlock, Errors, SQL Server