Monday, 23 March 2015

Parallelism and a big performance difference between temp tables and table variables

Interesting challenge the other day while working on my French Vocab Game, "French Tiger".

I needed to create a table variable to hold some data for use later on within a more complex query. Running the table variable declaration and the select statement was very fast (less than one second). Not surprising because the number of results returned was typically small (0-50) and the "French Word" column was nicely indexed.

SELECT Id FROM Vocab WHERE FrenchWord LIKE @Value + '%'

However, when I added the simple step of inserting those 50 rows in to my table variable, the operation slowed down to 2 seconds.

SELECT Id FROM Vocab WHERE FrenchWord LIKE @Value + '%'

This seemed bizarre – how could such a simple operation take so long? I use this approach in many places to help optimize the elements of complex queries, and I had never seen anything like this before. After spending hours experimenting with different indexes and minor adjustments to the query (not easy to come up with because it is so simple) I finally found an answer that brought the speed of the operation back down to zero seconds. The answer was to use a temporary table instead of the table variable:

SELECT Id FROM Vocab WHERE FrenchWord LIKE @Value + '%'

I wasn’t happy to discover this solution because I have always felt drawn to favour the table variable in most cases where there is a choice between a table variable or a temp table (because of not having to clear up after it). It seems the key reason that in this particular circumstance the temp table was superior is that any query including a table variable cannot benefit from parallelism. My server has numerous CPUs, and it seems that for whatever reason it was necessary to have several of them involved in order to run this insert quickly.

In order to prove that parallelism was the deciding factor I ran the temp table version again with the use of

option (maxdop 1)

… to effectively switch off parallelism, and bam! – back to 2 seconds for my insert.

Lesson learned for this table variable fan – always have to consider the alternative.