An Awfully Big SQL Adventure: sql server


Monday, 23 March 2015

Parallelism and a big performance difference between temp tables and table variables

Interesting challenge the other day while working on my French Vocab Game, "French Tiger".

I needed to create a table variable to hold some data for use later on within a more complex query. Running the table variable declaration and the select statement was very fast (less than one second). Not surprising because the number of results returned was typically small (0-50) and the "French Word" column was nicely indexed.

DECLARE @p TABLE (Id int PRIMARY KEY)
SELECT Id FROM Vocab WHERE FrenchWord LIKE @Value + '%'

However, when I added the simple step of inserting those 50 rows into my table variable, the operation slowed down to 2 seconds.

DECLARE @p TABLE (Id int PRIMARY KEY)
INSERT INTO @p
SELECT Id FROM Vocab WHERE FrenchWord LIKE @Value + '%'

This seemed bizarre. How could such a simple operation take so long? I use this approach in many places to help optimize the elements of complex queries, and I had never seen anything like this before. After spending hours experimenting with different indexes and minor adjustments to the query (not easy to come up with, because it is so simple), I finally found an answer that brought the time for the operation back down to effectively zero seconds. The answer was to use a temporary table instead of the table variable:

CREATE TABLE #p (Id int PRIMARY KEY)
INSERT INTO #p
SELECT Id FROM Vocab WHERE FrenchWord LIKE @Value + '%'

I wasn’t happy to discover this solution, because I have always favoured the table variable wherever there is a choice between a table variable and a temp table (there is nothing to clear up afterwards). It seems the key reason the temp table was superior in this particular circumstance is that a statement which modifies a table variable, such as my INSERT ... SELECT, cannot use a parallel execution plan. My server has numerous CPUs, and it seems that for whatever reason it was necessary to have several of them involved in order to run this insert quickly.

In order to prove that parallelism was the deciding factor I ran the temp table version again with the use of

option (maxdop 1)

… to effectively switch off parallelism, and bam! – back to 2 seconds for my insert.
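
For anyone wanting to repeat the comparison, the three variants can be timed side by side with SET STATISTICS TIME. This is only a rough sketch: the sample value for @Value and the nvarchar type are assumptions, and the timings will depend entirely on your own data and hardware.

```sql
-- Sketch only: @Value gets a hypothetical sample value here.
DECLARE @Value nvarchar(50) = N'cha';

SET STATISTICS TIME ON;   -- elapsed and CPU time appear in the Messages tab

-- 1. Table variable: the INSERT cannot use a parallel plan.
DECLARE @p TABLE (Id int PRIMARY KEY);
INSERT INTO @p
SELECT Id FROM Vocab WHERE FrenchWord LIKE @Value + '%';

-- 2. Temp table: the optimizer is free to choose a parallel plan.
CREATE TABLE #p (Id int PRIMARY KEY);
INSERT INTO #p
SELECT Id FROM Vocab WHERE FrenchWord LIKE @Value + '%';

-- 3. Temp table again, restricted to a single CPU.
TRUNCATE TABLE #p;
INSERT INTO #p
SELECT Id FROM Vocab WHERE FrenchWord LIKE @Value + '%'
OPTION (MAXDOP 1);

SET STATISTICS TIME OFF;
DROP TABLE #p;
```

Comparing the actual execution plans for statements 2 and 3 should also show whether parallelism operators appear in one and not the other.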

Lesson learned for this table variable fan – always have to consider the alternative.

Wednesday, 20 March 2013

SQL Server Connection Timeout

I had a problem recently on play-free-games.com whereby I was hitting connection timeouts while connecting to SQL Server 2008 from my .NET application using ADO.NET.

I would stress the point that these were connection timeouts rather than command timeouts. In many cases the timeout occurred when calling SqlConnection.Open(), without even defining a command to execute.

What surprised me (I'm easily surprised) was that the timeouts were almost instantaneous - within 1-2 seconds. SQL Server connection strings include a parameter called "Connection Timeout", which I had not set, but the default is listed as 15 seconds.

I wasted a lot of time asking network nerds to trace the connection, check for dodgy switches etc., only to find a Microsoft blog post stating that in some circumstances the timeout applied will only be 8% of what is specified if your SQL Server database is mirrored. 8% of the 15-second default is 1.2 seconds, which matches the near-instant timeouts I was seeing:

http://msdnrss.thecoderblogs.com/2011/05/ado-net-application-connecting-to-a-mirrored-sql-server-database-may-timeout-long-before-the-actual-connection-timeout-elapses-sometimes-within-milliseconds/
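
If you are not sure whether any of your databases are mirrored, a query along these lines against the sys.database_mirroring catalog view should tell you. It is a sketch rather than a full diagnosis, and it needs to be run on the SQL Server instance your application connects to.

```sql
-- Lists the databases on this instance that take part in database mirroring.
-- mirroring_guid is NULL for databases that are not mirrored.
SELECT DB_NAME(database_id) AS DatabaseName,
       mirroring_role_desc,
       mirroring_state_desc,
       mirroring_partner_name
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL;
```

No rows back means no mirrored databases on that instance, in which case the shortened timeout described in the post above is probably not your problem.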

With this in mind I increased the timeout value in my connection string to 200 seconds (8% of which is 16 seconds), and the problem disappeared except at peak usage times.

I then added a "Min Pool Size" setting of 30, and the problem disappeared altogether.

Top hole!

Thursday, 21 February 2013

Multiple Column Primary Keys on Table Variables

I luuurrrve table variables in SQL Server - in fact they are my third favorite thing after otters and custard.

I actually used them for a couple of years without realizing that I could define a primary key, boosting performance:

DECLARE @Cakes table (CakeId int PRIMARY KEY, CakeName varchar(30), HasFrosting bit)

For some of my recent work on Giant Panda Planet I found I needed my table variable to have a multiple column primary key... was it an impossible dream?

Not really:

DECLARE @PandaFriends table (
PandaId int,
FriendId int,
IsBestFriend bit,
PRIMARY KEY(PandaId, FriendId)
)
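
Here is a small sketch of the composite key in use. The panda ids are made-up sample data, and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
DECLARE @PandaFriends table (
    PandaId int,
    FriendId int,
    IsBestFriend bit,
    PRIMARY KEY (PandaId, FriendId)
);

-- A panda can appear many times, as long as each
-- (PandaId, FriendId) pair is unique.
INSERT INTO @PandaFriends (PandaId, FriendId, IsBestFriend)
VALUES (1, 2, 1), (1, 3, 0), (2, 1, 1);

-- Filtering on the leading key column lets the query use the key's index.
SELECT FriendId, IsBestFriend
FROM @PandaFriends
WHERE PandaId = 1;

-- Repeating an existing pair would fail with a primary key violation:
-- INSERT INTO @PandaFriends VALUES (1, 2, 0);
```

The SELECT returns the two friends of panda 1. The order of the columns in the key matters: a filter on FriendId alone would not get the same benefit.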

Side Pocket!

Sunday, 13 March 2011

Importing data into SQL Server from a text file using BULK INSERT

As a .NET / SQL Server developer it is often necessary to import data into your databases from other sources, such as text files. Ideally you would leave this type of data manipulation activity to the kind of weirdos who enjoy it (DBAs). Sadly, however, this is not always possible.

Therefore, in the old days of SQL Server 2000 I often created DTS packages to suck in my data. DTS was OK, especially the funky grey arrows, but it was a little time consuming. Then - disaster - DTS was phased out in favour of SSIS. At No Frills Corp our DBAs have not been able to give me a server that SSIS actually works on, so I was a bit stuck.

It was then that I came across the beautifully simple SQL command, BULK INSERT.

Just point a BULK INSERT command at your text file, tell it where to put the data, and you are all set. It can't be that easy, can it? Yes, it really can:

BULK INSERT Chimps
   FROM 'D:\primates\chimps.csv'
   WITH 
      (
         FIELDTERMINATOR =',',
         ROWTERMINATOR ='\n'
      )
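
A slightly fuller sketch, in case it helps: the target table has to exist before the import, with columns that line up with the fields in the file. The column list below is an assumption for illustration, as is the header row that FIRSTROW skips.

```sql
-- Hypothetical target table; columns must match the order of fields in the file.
CREATE TABLE Chimps (
    Name   varchar(50),
    Age    int,
    Weight int
);

BULK INSERT Chimps
   FROM 'D:\primates\chimps.csv'
   WITH
      (
         FIELDTERMINATOR = ',',
         ROWTERMINATOR = '\n',
         FIRSTROW = 2        -- skip a header row, if the file has one
      );
```

Bear in mind that the file path is resolved on the machine running SQL Server, not on the machine you run the command from.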

Cashback!

Friday, 4 March 2011

Finding the active node in a SQL Server Cluster

DBAs are an eccentric breed aren't they?

At my place of work, No Frills Corp, we have 2 types of DBA: DBAs who are perpetually on the edge of a nervous breakdown and will happily go a whole day without saying anything except "No", "I'm busy" and "Must be a problem with your app", and DBAs whom you should not engage in the simplest of conversations unless you have a couple of hours to spare.

I much prefer the taciturn DBAs, but in both cases the best policy is not to talk to them unless you really have to. So, what if you just need to know something simple but important, like which node of our active-passive cluster is currently active?

The other day I came across the very simple answer to this question:

SELECT ServerProperty('ComputerNamePhysicalNetBIOS')
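
A slightly expanded sketch, using two other standard SERVERPROPERTY values, shows the active node alongside the cluster's own name. The column aliases are my own labels.

```sql
SELECT
    SERVERPROPERTY('IsClustered')                 AS IsClustered,       -- 1 if the instance is clustered
    SERVERPROPERTY('MachineName')                 AS VirtualServerName, -- the cluster's network name
    SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS ActiveNode;        -- the physical node hosting the instance right now
```

On a clustered instance the last two values differ; on a standalone server they should be the same.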

Bosh!

Monday, 28 February 2011

Dormant Full Text Index in SQL Server

I recently created a full text index to improve search performance on some large text fields. The resulting performance was great, so I took it to show my boss (an IT "Bruno").

I had to wait a few minutes before I could see him, and then when I fired up my lovely new search page, disaster struck in the form of a timeout - making me look like an imbecile.

I covered by showing him something shiny, and hastily hit refresh. The search came back in less than a second, as did each subsequent search. Happy boss.

When I got back to my desk I did another search and got another timeout. It seems my Full Text Index is going to sleep if not used for 5 minutes, and then takes 30 seconds or more to wake up. Rubbish!

So, I have had to set up a job to hit the FTI with a simple query every 5 minutes, keeping it "awake".

Something like this:

SELECT * FROM Dreams WHERE CONTAINS(SubText, '"fear of trombones"')
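
One way to schedule the keep-awake query is a SQL Server Agent job, created with the standard msdb procedures. Treat this as a sketch under assumptions: the job, step and schedule names and the database name DreamDb are invented for illustration, and SQL Server Agent must be running. The search phrase is wrapped in double quotes, which CONTAINS expects for a multi-word phrase.

```sql
USE msdb;
GO

EXEC dbo.sp_add_job
    @job_name = N'Keep full text index awake';

EXEC dbo.sp_add_jobstep
    @job_name      = N'Keep full text index awake',
    @step_name     = N'Run a cheap full text query',
    @subsystem     = N'TSQL',
    @database_name = N'DreamDb',   -- hypothetical database name
    @command       = N'SELECT TOP 1 1 FROM Dreams WHERE CONTAINS(SubText, ''"fear of trombones"'')';

EXEC dbo.sp_add_schedule
    @schedule_name        = N'Every 5 minutes',
    @freq_type            = 4,     -- daily
    @freq_interval        = 1,     -- every day
    @freq_subday_type     = 4,     -- sub-day unit: minutes
    @freq_subday_interval = 5;     -- run every 5 minutes

EXEC dbo.sp_attach_schedule
    @job_name      = N'Keep full text index awake',
    @schedule_name = N'Every 5 minutes';

-- Target the local server so the Agent actually runs the job.
EXEC dbo.sp_add_jobserver
    @job_name = N'Keep full text index awake';
```

The job step only selects TOP 1 of a constant, so it touches the index without dragging any real data back.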

Doesn't feel great having to do that, but now my Full Text Index is always ready for action.

Optional Search Parameters in SQL Server Stored Procedures

I often create Stored Procedures in SQL Server for searching - but what is the best way to handle optional search parameters?

In the old days I wrote queries like this:

SELECT *
FROM Chimps
WHERE (@Age = 0 OR Age = @Age)
AND (@Weight = 0 OR Weight = @Weight)

... defaulting @Age and @Weight to zero if not specified.

However, I later found that this performs a lot better:

SELECT *
FROM Chimps
WHERE Age = ISNULL(@Age, Age)
AND Weight = ISNULL(@Weight, Weight)

... with the params defaulting to NULL if not specified. Lovely.

But wait! What if there is a chance that some of our chimps have a NULL value for Age or Weight? Then we would have to go a bit further:

SELECT *
FROM Chimps
WHERE ISNULL(Age, 0) = COALESCE(@Age, Age, 0)
AND ISNULL(Weight, 0) = COALESCE(@Weight, Weight, 0)

This still performs well in most of my scenarios, but let's face it - ideally our chimps will not have a NULL weight. Weigh those chimps!
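
To see the NULL problem in action, here is a small sketch using a table variable with made-up chimps. The last query is not one of the patterns above but a commonly used alternative: test the parameter for NULL explicitly and add OPTION (RECOMPILE), so the plan is built for the values actually supplied. Whether it performs better for you is something to measure rather than assume.

```sql
-- Hypothetical sample data, including a chimp that was never weighed.
DECLARE @Chimps table (Name varchar(50), Age int NULL, Weight int NULL);
INSERT INTO @Chimps VALUES ('Cheeta', 12, 45), ('Bubbles', 12, NULL), ('Ham', 4, 17);

DECLARE @Age int = 12, @Weight int = NULL;   -- search by age only

-- ISNULL pattern: Bubbles is dropped, because NULL = NULL is not true.
SELECT Name FROM @Chimps
WHERE Age = ISNULL(@Age, Age)
AND Weight = ISNULL(@Weight, Weight);

-- Explicit pattern: an unspecified parameter places no condition on the column.
SELECT Name FROM @Chimps
WHERE (@Age IS NULL OR Age = @Age)
AND (@Weight IS NULL OR Weight = @Weight)
OPTION (RECOMPILE);
```

The first query returns only Cheeta; the second returns Cheeta and Bubbles. One caveat with the ISNULL(Age, 0) = COALESCE(@Age, Age, 0) version: a search for an Age or Weight of exactly 0 will also match chimps whose value is NULL.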