Archive

Posts Tagged ‘Performance’

When is SELECT INTO not minimally logged?

June 20, 2015 3 comments

If you had asked me that question a month ago I would have said "Never, it is always minimally logged and this behaviour is documented in BOL". But now I know better…

Recently I have been working on an application migration from SQL 2008 R2 to SQL 2014, and we found a huge drop in performance for some SELECT INTO statements. I traced the issue down to the fact that the operation was being fully logged, and we then engaged Microsoft to work out why. Read more…
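If you want to check this on your own system, here is a minimal sketch (the database name and recovery model are my assumptions, not details from the post) that looks at how much log a SELECT INTO writes:

BEGIN TRAN;

SELECT o.* INTO dbo.SelectIntoTarget
FROM sys.objects AS o;

-- Log generated by the current transaction: a minimally logged SELECT INTO
-- writes far fewer bytes than a fully logged one for the same data.
SELECT dt.database_transaction_log_bytes_used
FROM sys.dm_tran_database_transactions AS dt
JOIN sys.dm_tran_session_transactions AS st
  ON st.transaction_id = dt.transaction_id
WHERE st.session_id = @@SPID
  AND dt.database_id = DB_ID();

COMMIT;
DROP TABLE dbo.SelectIntoTarget;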

The anatomy of a process update – Part 2

February 21, 2013 10 comments

In part 1 of this two-part post I shared an overview of the procedure for optimising process update. I will now walk through the optimisation process on one of our multi-terabyte systems.

Part 1 shows a visualisation of process update. I have captured timings in Profiler for all the stages shown, and my starting point for tuning is presented below.

[Image: Profiler timings for each stage of process update]

We have 3 stages that can be optimised.

  • Dimension build – This comes in at 92 seconds, which is acceptable, so I won't be discussing it further, but you can refer to part 1 for tuning options.
  • Partition index processing – Almost 25 minutes is spent here so tuning is required.
  • Commits – Almost 7 agonising minutes are spent waiting for the phase 2 commit with the server lock in place.

Read more…

The Anatomy of a process update – Part 1

February 9, 2013 13 comments

I have been working on optimising process update on one of our multi-terabyte systems, and this two-part post will break down the optimisation process and share some very interesting results.

In part 1 I will share an overview of the procedure to optimise process update and in part 2 I will apply this to our system and share some startling discoveries.

Overview

Internally, process update is a complex operation that can be difficult to describe, so to aid the explanation I have created the visualisation of the process shown below, which also highlights the single-threaded and multi-threaded execution paths.

Read more…

SSAS Multithreaded sync with Windows 2008 R2

July 7, 2012 10 comments

We have been happily running some of our systems on Windows 2003 and have had an upgrade to W2K8 R2 on the list for quite some time. The upgrade has now completed, and we can start taking advantage of some of the new features, which is the reason for this post.

For a long time we have used the sample Robocopy script from the SQLCat team to synchronize some of our larger SSAS databases. If you're wondering what I mean by large: around 5 TB with a good few thousand partitions.

The script works like a dream, but at this scale you soon find the copy process starts to chug when it hits the fact data: because the bulk of the data being transferred is fact data, this fantastic multi-threaded script effectively becomes single-threaded again.

Read more…

SQL Server 2008 uses half the CPUs

July 6, 2012 Leave a comment

I recently got my hands on a couple of 4-socket servers with Intel E7-4870s (10 cores per CPU), and with hyper-threading enabled that gave me 80 logical CPUs.

The server has Windows 2008 R2 SP1 along with SQL 2008 (currently we cannot deploy SQL 2008 R2 for the application being hosted).

When SQL Server started, I noticed only 2 NUMA nodes and 40 logical cores were configured, where there should have been 4 NUMA nodes and 80 logical cores (see below).

[Image: NUMA node and logical processor configuration detected by SQL Server]
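For reference, a quick way to see what SQL Server itself thinks it has is to query the SQLOS DMVs (a sketch using standard views, not the exact check from the post):

-- NUMA/memory nodes known to SQLOS (excluding the DAC node).
SELECT node_id, node_state_desc, online_scheduler_count
FROM sys.dm_os_nodes
WHERE node_state_desc NOT LIKE '%DAC%';

-- Visible online schedulers grouped by NUMA node; the total should match
-- the logical CPU count you expect the instance to be using.
SELECT parent_node_id, COUNT(*) AS visible_online_schedulers
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE'
GROUP BY parent_node_id;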

Read more…

Categories: Configuration, Numa, SQL Server

SSAS 2008 R2 Operations Guide

June 2, 2011 Leave a comment

What can I say but WOO HOO! The SSAS 2008 R2 Operations Guide has been published, so download your copy now!

This document should be considered essential reading and contains over 100 pages of high quality information for configuring, testing and operating an Analysis Services server.

Finally, I consider this document so important that I have made a rare exception and gone back and added it to my Deployment Essentials post.

Categories: Analysis Services, Tips

SSAS Native v .net Provider

November 30, 2010 Leave a comment

Recently I was investigating why a new server, which is in its parallel-running phase, was taking significantly longer to process the daily data than the server it's due to replace.

The server has SQL & SSAS installed, so the problem was unlikely to be in the network transfer as it is using shared memory. As I dug around the SQL DMVs, I noticed in sys.dm_exec_connections that the SSAS connection had a packet size of 8000 bytes instead of the usual 4096 bytes, and from there I found that the datasource had been configured with the wrong provider. What was really interesting, and the point of this post, is the performance difference, which I have shown below.

  • .Net SqlClient Data Provider – 30,000 rows per second
  • SQL Server Native Client 10 – 240,000 rows per second

That’s right! For a single partition, the native client was able to process 240,000 rows per second, whereas the .net client maxes out at 30,000. That means the SQL Native Client is eight times as fast! I knew that the .net providers were slower, but I had never gathered any metrics before. If you're looking after an SSAS server I would definitely recommend taking a few minutes to check which provider is configured in the datasource.

Another point to consider: you may have a custom solution doing your ETL that utilises the .net providers. This would also be impacted by the .net provider's throughput limits, and a switch over to SSIS could dramatically improve your ETL.
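If you want to look for the same symptom I found, this is roughly the check (a sketch; 4096 bytes is simply the default I observed for the native client here):

-- Connections whose network packet size differs from the usual 4096 bytes;
-- the 8000-byte SSAS connection was the clue that the wrong provider was in use.
SELECT session_id, net_transport, net_packet_size, client_net_address
FROM sys.dm_exec_connections
WHERE net_packet_size <> 4096;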

Exploring backup read IO performance

January 24, 2010 Leave a comment

I was recently exploring how to increase the backup read throughput on one of our SQL servers. Below are some interesting facts I found.

I would say that one of the most important reminders to come out of the exercise is this: do not assume that two databases being backed up on the same server using an identical backup command means the processes are identical under the hood.

  1. Backup read threads are spawned one per physical device used by the database (this is documented in "Optimising Backup & Restore Performance in SQL Server").
  2. Multiple database files on one disk will not increase throughput, because one disk = one thread and the thread works through the database files one at a time.
  3. Backup read buffers are evenly distributed across the number of read threads.
  4. Backups are pure IO operations; they do not read pages from the buffer pool.
  5. When passing in @MaxTransferSize it appears to be a suggestion rather than a hard requirement: SQL Server will assign the requested value if it can, otherwise it will pick a lower value.
So, quite a few statements there… where is the proof? Well, the best find had to be trace flag 3213, which outputs information about the backup decisions made. Below is an extract of this output, which I will then talk through.
    2010-01-22 12:00:02.45 spid78      Backup/Restore buffer configuration parameters
    2010-01-22 12:00:02.45 spid78      Memory limit: 32765MB
    2010-01-22 12:00:02.45 spid78      Buffer count:               40
    2010-01-22 12:00:02.45 spid78      Max transfer size:          448 KB
    2010-01-22 12:00:02.45 spid78      Min MaxTransfer size:       64 KB
    2010-01-22 12:00:02.45 spid78      Total buffer space:         17 MB
    2010-01-22 12:00:02.45 spid78      Buffers per read stream:    10
    2010-01-22 12:00:02.45 spid78      Buffers per write stream:   8
    2010-01-22 12:00:02.45 spid78      Tabular data device count:  4
    2010-01-22 12:00:02.45 spid78      FileTree data device count: 0
    2010-01-22 12:00:02.45 spid78      Filesystem i/o alignment:   512
  • Max transfer size (highlighted red in the original output): I actually asked for 1 MB but only got 448 KB. I also noticed that when I kick off multiple backups (all requesting 1 MB), the transfer size tends to decrease the more backups you run, so no two backups are necessarily the same.
  • Buffer distribution (green): I asked for 40 buffers. The database being backed up has data devices on 4 physical disks, so it gets 4 read threads, and "Buffers per read stream" is 10 (40 buffers / 4 threads).
  • Read threads (blue): the database backed up had data files on 4 physical disks. This is exposed as the "Tabular data device count" and confirms the statement in point 1 that you get one read thread per physical device, as documented by MS.
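If you want to generate the same output on a test server, something along these lines should do it (a sketch: trace flag 3605 is commonly paired with 3213 to push the output to the error log, and the database and path names are placeholders):

DBCC TRACEON(3605, 3213, -1);

-- Request a 1 MB transfer size and 40 buffers; check the error log afterwards
-- to see what SQL Server actually decided to use.
BACKUP DATABASE MyDatabase
TO DISK = N'G:\Backups\MyDatabase.bak'
WITH MAXTRANSFERSIZE = 1048576, BUFFERCOUNT = 40, STATS = 10;

DBCC TRACEOFF(3605, 3213, -1);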

So, what about statements 2 & 4? Well, I monitored the reads to the individual files using sys.dm_io_virtual_file_stats and took a number of snapshots whilst performing a backup. There are plenty of scripts you can download to take the snapshots yourself, such as this one. Once the backup completed I looked at the time slices and you can see the following (a sketch of the snapshot query appears after the list).

  • Total MB read during the backup = total data held in the file. From this I drew the conclusion that it is not reading any of the data held in the buffer pool, which makes a lot of sense as the backup includes the transaction log.
  • Querying the statistics at different time intervals, you see the first data file's MB read growing, and the second data file's does not start growing until the first is complete, hence it works through one file at a time. However, if you have multiple files on multiple disks you do see one file on each disk being read from. I've not mentioned increasing the number of backup devices or changing block sizes, as my case specifically did not call for it, but you can read about that here.
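The snapshot query itself is nothing special; below is a minimal sketch of the kind of thing to run repeatedly during the backup (the database name is a placeholder):

-- Per-file cumulative read counters; diff successive runs to watch the backup
-- reader work through one file per disk at a time.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.name                  AS logical_file_name,
       vfs.num_of_reads,
       vfs.num_of_bytes_read
FROM sys.dm_io_virtual_file_stats(DB_ID(N'MyDatabase'), NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
ORDER BY mf.file_id;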

The last thing I want to say, since I have touched on single-threaded backup reads, is that I'm keen not to spawn any new urban legends. Whilst this is true for backups in the context of one read thread per physical disk device, that's it! It's worth reading this article about urban legends around SQL Server threads.

Categories: Backup, Performance, SQL Server

Analysis Server appears to hang…..

April 5, 2009 4 comments

We had an ongoing problem whereby users would suddenly start complaining that the Analysis Server was hanging. When investigating the box we could see that there appeared to be no physical constraints: disk activity was a bit above average, CPU was very low, and there was plenty of memory.

When looking at the server through the SSAS Activity Viewer (part of the community samples) we could see that users were definitely queuing up; many of the queries should have returned in less than a second but were hanging around forever (30 minutes or more). It was as if we were experiencing some form of blocking…

To complement our monitoring we use ASTrace (also part of the community samples) to log what's happening on the SSAS server to a SQL Server in real time. Amongst other things, it gives us the ability to run a little procedure showing what MDX/XMLA is running, with durations etc. (it's much more friendly than the Activity Viewer). So, when we were experiencing the trouble, our script showed that an MDX query touching multiple partitions totalling hundreds of GBs appeared to be at the head of the chain every time.

Read more…

The evils of implicit conversions

January 18, 2009 Leave a comment

I wanted to put up a brief post showing the impact of an implicit conversion on the performance of a query (or not…). In the example I will show an implicit conversion negatively impacting query performance, and an implicit conversion that does not impact performance…

So, we need to set up the test environment using the code below.


SET ANSI_WARNINGS OFF
--********      Create Test Data           ******************
CREATE TABLE #data_test (ukey INT IDENTITY(1,1) PRIMARY KEY, first VARCHAR(200), second VARCHAR(200))

DECLARE @first INT, @second INT
SELECT @first = 1

WHILE @first < 250000
BEGIN
    INSERT #data_test
    SELECT REPLICATE(@first, @first), REPLICATE(@first, @first)

    SELECT @first = @first + 1
END

CREATE NONCLUSTERED INDEX stuf_1 ON #data_test (first)
--********      End of Test Data            ******************

 

Now, with the test data in place we can run the following 2 queries and observe the differences.

/* This uses a variable declared as an NVARCHAR */
EXEC sp_executesql N'SELECT * FROM #data_test WHERE first = (@p0) ', N'@p0 nvarchar(200)',@p0 = N'1'
--Scan count 1, logical reads 3093, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

/* This uses a variable declared as a VARCHAR */
EXEC sp_executesql N'SELECT * FROM #data_test WHERE first = (@p0) ', N'@p0 varchar(200)',@p0 = '1'
--Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Below each statement is the IO incurred. The difference on this tiny little table is over 3,000 logical reads, just because we used Unicode (nvarchar) instead of non-Unicode (varchar), and I'm sure you can imagine that on a larger table this becomes a significant overhead. So why has this happened? Let's take a look at the plans.

Read more…

Categories: Performance, SQL Server

The overhead of a non-unique clustered index

October 20, 2008 Leave a comment

So, we all know that if we create a clustered index that is not unique we will incur a 4-byte overhead, right? Well, not always, because as usual, it depends…

When you create a non-unique clustered index, SQL Server must still maintain uniqueness, so it adds a hidden 4-byte column (the uniquifier) which is populated for each non-unique row (not every row). What many people may not realise is that this is actually a variable-length column, so if your table has no other variable-length columns you incur a further 4 bytes to maintain the variable-length offset data, giving you a total of 8 bytes per row instead of 4.

A few bytes may not sound like much, but when dealing with multi-billion-row tables it soon adds up, so it's important to know how the space consumption breaks down.

Below is an extract from Books Online 2005 "Estimating the size of a clustered index"

"The uniqueifier is a nullable, variable-length column. It will be nonnull and 4 bytes in size in rows that have nonunique key values. This value is part of the index key and is required to make sure that every row has a unique key value."

It's great to see that this hidden column is now documented, but a bit of additional clarity around the storage overhead of its variable-length property would be nice.
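If you want to see the overhead for yourself without a page dump, a rough sketch is to compare average record sizes between a unique and a non-unique clustered index on otherwise identical fixed-length rows (table names are illustrative; this is not the exact test behind the screenshot below):

CREATE TABLE dbo.cx_unique    (id INT NOT NULL, val INT NOT NULL);
CREATE TABLE dbo.cx_nonunique (id INT NOT NULL, val INT NOT NULL);

CREATE UNIQUE CLUSTERED INDEX cix_u  ON dbo.cx_unique (id);
CREATE        CLUSTERED INDEX cix_nu ON dbo.cx_nonunique (id);

-- Unique keys in one table; the same key repeated in the other, so duplicate
-- key values pick up the uniquifier plus the variable-length row overhead.
INSERT dbo.cx_unique    (id, val) SELECT number, number FROM master.dbo.spt_values WHERE type = 'P';
INSERT dbo.cx_nonunique (id, val) SELECT 1,      number FROM master.dbo.spt_values WHERE type = 'P';

-- Compare leaf-level record sizes between the two tables.
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       ps.avg_record_size_in_bytes,
       ps.max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ps
WHERE ps.object_id IN (OBJECT_ID('dbo.cx_unique'), OBJECT_ID('dbo.cx_nonunique'))
  AND ps.index_level = 0;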

Finally, I thought I would visualise this hidden data overhead for you with a screenshot from Danny's awesome Internals Viewer.

[Image: Internals Viewer showing the hidden uniquifier data on a page]

Update: Thanks to Christian Bolton for clarifying that the overhead applies to each non-unique row, which I have now reflected in the post.

My old mate sp_recompile

October 12, 2007 Leave a comment

As soon as I saw the error messages in the logs I thought to myself "Oh my, that did not happen in testing" (ok, maybe it was more colourful than that).

We were creating a clustered index on a tiny little table and the index went through fine. However, the application started to generate the message "Could not complete cursor operation because the table schema changed after the cursor was declared". My gut reaction was to restart each application server in the cluster, but restarting the first one made no difference. It suddenly clicked that SQL Server must be dishing out the cursor plan from cache.

Now, I did not want to restart the SQL servers because only a small part of the application was affected and I did not want a more significant outage. So, how do we get the plan out of cache? The table below details your options with the corresponding impact.

EXEC sp_recompile 'object'

Pros: Minimal impact. When passing a table name, all procedures referencing it will be recompiled; plans in cache that do not reference the table will stay in cache.

Cons: You have to know the name of the object(s) needing to be recompiled.

DBCC FREEPROCCACHE

Pros: Quick and dirty.

Cons: The procedure cache for the whole server is cleared, so the server will slow down whilst the cache populates again.

Restart SQL Server

Pros: I suppose you could say you are 100% sure you have nailed the sucker.

Cons: You have a system outage and you have to wait for your procedure and buffer caches to repopulate.

The lesson to take away here is to always use sp_recompile when making any kind of DDL change; I also tend to use it on stored procs and views too. I normally always have it in my scripts, so believe you me, I gave myself a good talking-to about forgetting to put it in this time.

And on a related note, have you come across sp_refreshview? No? Well, it's worth knowing about.
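For completeness, the usage of both is as simple as it sounds (the object names here are placeholders):

-- Recompile every cached plan that references the table we just altered.
EXEC sp_recompile N'dbo.MyTable';

-- Refresh a view's metadata after changing an underlying table.
EXEC sp_refreshview N'dbo.MyView';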

Categories: Performance, SQL Server

Server level VLF report

July 25, 2007 Leave a comment

I read Tony Rogerson’s blog on Virtual Log Files today, and it reminded me that I really should knock up a little report listing all the databases on a server and the number of VLFs per database. Since I had been busy writing some other operational reports I was in the right frame of mind, so I knocked up what you see below.

The Report code

CREATE TABLE #VLFS (fileid INT, filesize BIGINT, startoffset BIGINT, fseqno BIGINT, status INT, parity INT, createlsn VARCHAR(1000))
CREATE TABLE #Results (SRV_Name NVARCHAR(500), Database_Name NVARCHAR(500), VLFS INT)

EXEC MASTER.dbo.sp_msforeachdb
    @command1 = 'USE ? INSERT INTO #VLFS EXEC(''DBCC LOGINFO WITH TABLERESULTS'')',
    @command2 = 'INSERT INTO #Results SELECT @@SERVERNAME, ''?'', COUNT(*) FROM #VLFS',
    @command3 = 'TRUNCATE TABLE #VLFS'

-- For this example I've just returned the results, but you can
-- just as easily write them to a local or central server.
-- I write mine to a central server for reporting on.
SELECT * FROM #Results

DROP TABLE #VLFS
DROP TABLE #Results

Example output

[Image: example report output showing server name, database name and VLF count]

For me the beauty of this little report is that I have set it up to run on all 150 of my servers distributed around the world, logging to my central logging server. From there I get one report telling me which databases have excessive VLFs and which servers they are hosted on. Having said that, in a well-managed environment I should never find any databases with lots of VLFs…

You may also be wondering "How many VLFs are too many?". Tony did not cover this, but Kimberly Tripp did in a post she put up a few years ago (see point 8), which was when I first learned about VLFs. I don't, however, think this is a hard and fast number, and obviously the performance gains will be smaller if you're not far off it.

The last thing I am going to mention in this post is part of the code I used in the report: the undocumented procedure sp_msforeachdb. This is a great little procedure that will cycle through your databases executing up to three commands. To use it, where you would have put a database name you put a '?', and it's as simple as that! Incidentally, there is also an sp_msforeachtable.

SQL 2005: Procedure cache cleared

June 5, 2006 Leave a comment

I don't normally write about a TechNet article, but I must say that 917828 caught my eye and I felt it was worth highlighting.

The reason it caught my eye is that it describes a number of scenarios whereby the whole procedure cache is flushed, which will clearly lead to a decrease in query performance whilst the cache builds up again. The most alarming point is that there are at least two scenarios that a DBA may carry out, and even schedule during core production hours, without realising the impact until users start complaining about mysterious slowdowns…

These are

  • Deleting a database snapshot.
  • Restoring a database backup.

I’d recommend reading the article as there are 5 other situations in which this occurs that may affect you.
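A crude way to see the effect for yourself is to check the size of the plan cache before and after one of these operations (a sketch using the SQL 2005 DMV):

-- Number of cached plans and their total size; both drop sharply
-- when the procedure cache is flushed.
SELECT COUNT(*) AS cached_plans,
       SUM(CAST(size_in_bytes AS BIGINT)) / 1024 / 1024 AS cache_size_mb
FROM sys.dm_exec_cached_plans;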

Categories: Performance, SQL Server

What is that SPID doing?

March 27, 2006 Leave a comment

This is certainly not a new topic, but it is something I felt was worth revisiting and reminding people about, especially because I find many people are not aware of the ability to extract the statement within a stored procedure being executed.

So, prior to SQL 2000 SP3, when we wanted to know what code a SPID was executing we had to use DBCC INPUTBUFFER(). The problem with this command is that it is restricted to the first 256 characters and only returns the outermost statement executing. That means that if we have stored procedure 1 calling stored procedure 2, the input buffer will only tell us about stored procedure 1, and it may even get truncated!

I clearly remember reading the SP3 readme and thinking Christmas had come early as I found the entry about a new function called ::fn_get_sql. I've got to say, this is the best present yet! It has made troubleshooting so much easier. Why? Well, read on.

Running the code below will show you the innermost statement executing, and we now get up to 4000 characters. In the case of a stored procedure you see the whole procedure.

DECLARE @Handle BINARY(20)
SELECT @Handle = sql_handle FROM MASTER..sysprocesses WHERE spid = 109 --CHANGE SPID NUMBER
SELECT * FROM ::fn_get_sql(@Handle)

The part that really got me? When it returns the code of a stored procedure you could be left wondering which statement within the stored procedure is executing. Fear not! They have that covered too. All you have to do is run the modified code below and the exact statement causing your problem is identified.

DECLARE @Handle BINARY(20), @start INT, @end INT

SELECT @Handle = sql_handle, @start = stmt_start, @end = stmt_end
FROM MASTER..sysprocesses
WHERE sql_handle <> 0x00 AND spid = 226  --CHANGE SPID NUMBER

IF NOT EXISTS (SELECT * FROM ::fn_get_sql(@Handle))
    PRINT 'Handle not found in cache'
ELSE
    SELECT 'Current statement' =
        SUBSTRING(TEXT, (@start + 2)/2,
                  CASE @end
                      WHEN -1 THEN (DATALENGTH(TEXT))
                      ELSE (@end - @start + 2)/2
                  END)
    FROM ::fn_get_sql(@Handle)

So, now I've reminded you of this really useful tool, it's time for a few "battle tips".

1) Make sure that in Query Analyzer your "maximum characters per column" is set to 4000; otherwise you are only going to get 256.
2) If SQL Server generates a "zero cost plan" it will not put it in cache, because it's not worth it. As it's not in cache, ::fn_get_sql won't return any information.
3) This one comes with a health warning… If for some reason you need SQL Server to cache the zero cost plans so that ::fn_get_sql will always work, you can enable trace flag 2861. I would recommend you do not use this in production, as the overhead is substantial and Profiler should be used instead, but some people might find it beneficial on a dev box.
4) I've not adjusted the script above to deal with parallel SPIDs, where you get multiple handles for one SPID. Just a note; I keep on forgetting. 😀

There are many uses for this function, and I encourage you to explore them. SQL 2005 makes life even easier through DMVs, but that's another blog…
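For reference, a minimal sketch of the DMV route on SQL 2005 and later (the SPID is just an example, as in the scripts above):

-- Innermost statement currently executing for a given session.
SELECT r.session_id,
       SUBSTRING(t.text, (r.statement_start_offset / 2) + 1,
                 ((CASE r.statement_end_offset
                       WHEN -1 THEN DATALENGTH(t.text)
                       ELSE r.statement_end_offset
                   END - r.statement_start_offset) / 2) + 1) AS current_statement
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id = 109;  --CHANGE SPID NUMBER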

A performance analysis example

January 15, 2006 2 comments

Last week I was dragged kicking and screaming away from my allocated SQL 2005 days to spend two days diagnosing performance problems on one of the firm's systems and write a report.
Due to the sensitive nature of the data on the system it has been managed by another team, and this was the first time I had looked at it. I thought I'd share my thought process for identifying the problem on the server, and reference the tools I used. This is the first iteration of the review; once my recommendations are implemented there will be another review to see what else may need dealing with. Please note, however, that I have not included every single step I took, otherwise I'd be writing a novel!

Scenario

The users of the system have been reporting intermittent timeouts in the front-end application. The current system administrators have noted periods of high CPU usage. The server is Windows 2000 SP4 running SQL 2000 SP3a with 2 physical processors, hyper-threading enabled, 2 GB of memory, SAN attached (EVA 5000).

Analysis

The first port of call was to review the event logs and SQL Server logs for any errors that might indicate what was happening. These logs revealed nothing. The next port of call was to connect via Query Analyzer and see what was going on. A quick look at current activity showed nothing much happening, so I did the standard SELECT * FROM sysprocesses. This revealed that 80% of the connected SPIDs' last waits had been for a latch, and specifically the waitresource was 2:3:69 or 2:1:24.
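The check itself was nothing more sophisticated than this kind of query (a sketch; database id 2 is tempdb, which is where those waitresource values point):

-- SPIDs whose last wait was on a resource in tempdb (database id 2),
-- e.g. waitresource values such as 2:3:69 or 2:1:24.
SELECT spid, status, cmd, lastwaittype, waitresource, waittime
FROM master..sysprocesses
WHERE waitresource LIKE '2:%';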

Read more…

Categories: Performance, SQL Server