Query Plans | Thomas Kejser's Database Blog

Myth Busting Query Optimisation

July 1, 2013 8 comments

It’s been some time since I posted here, but recent customer events prompted me to write up some notes I have been taking the past years.

The problem I would like to talk to you about is one that is well examined and well understood, yet continues to create frustrations with customers. Unfortunately, some of these frustrations come from “best practices” that people apply consistently, which in turn consistently gets them into trouble.

When you are done reading this post, I hope you will have a better understanding on how to avoid these pitfalls. It is also my hope that some of your expectations about query optimisation will have been adjusted.

How Vertical Partitioning and Deep Joins Kill Parallelism

July 16, 2012 17 comments

Query plans with deep joins trees are often the result of high levels of normalisation in the data model. There are large advantages to normalising data as it minimizes the amount of data that must be written when a change happens. In traditional OLTP systems, this can be a boon.

However, normalisation is not without its costs – especially not in read intensive workloads like data warehouses.

Modeling Dimensions with History Tracked, Generic Attributes

July 6, 2012 5 comments

Sometimes, you need to model a database in such a way that you can dynamically extend the model without altering any tables. Perhaps the attributes changes faster than you can add new columns or the data you store has a “ragged” structure that does not lend itself well to being described as a traditional table structure.

The typical approach taken in these cases is to “pivot” the required “flexible columns” of a table into rows instead and dynamically reconstruct the schema at runtime. When you model like this, adding new attributes to the model is simply a question of inserting rows into the database.

As I sure you are aware, there are issues with this approach. Tuning columns into rows can cause interesting issues for database engines. The flexibility requirement and performance loss to be balanced carefully. In this blog, I will walk you through an example of how to handle this generic case in a star schema.

Why “Date BETWEEN FromDate AND ToDate” is a dangerous join criteria

June 29, 2012 10 comments

I have been meaning to write this blog post for some time and the discussion about Data Vault finally prompted me to do it.

Sometimes, you find yourself in situations where you have to join a table that has a structure like this:

CREATE TABLE TemporalTracking (
SomeKey INT
, FromDate DATETIME
, ToDate DATEIME
, <more columns>
)

The join criteria is expressed:

FROM <OtherTable> OT
INNER JOIN TemporalTracking T
ON OT.SomeTimeColumn BETWEEN T.FromDate AND T.ToDate
AND OT.SomeKey = T.SomeKey

Or more commonly, this variant with a semi open interval:

FROM <OtherTable> OT
INNER JOIN TemporalTracking T
ON OT.SomeTimeColumn >= T.FromDate
AND OT.SomeTimeColumn < T.ToDate
AND OT.SomeKey = T.SomeKey

Data models that promote these types of joins are very dangerous to relational optimizers and you have to step carefully when executing queries with many of these joins. Let us have a look at why this is so.

An Interesting Blog to Follow

September 1, 2011 2 comments

I know some of you track Conor Cunningham’s blog, which I highly recommend. The information on query optimizers is rather sparse out there – and it is a real privilege when people who know what they are talking about share their information freely on a blog. There are just not that many of them.

Here is something to think about in that context: Even if you are only interested in SQL Server, there are good things to be learned from studying other database engines – they are after all very similar. One of the good blogs for that purpose is the Oracle optimizer blog: http://blogs.oracle.com/optimizer/. It is delightfully free from marketing, and packed with good stuff. Don’t worry about the fact that they like textual query plans over there in Oracle. You might be used to viewing plans graphically, but I am sure you agree that such minor difference in display preferences can be set aside in the name of reading good material.

The Ascending Key Problem in Fact Tables– Part one: Pain!

July 1, 2011 4 comments

Time for another deep dive into SQL Server. In the last couple of days, I have been having coffee with Joe Chang in the afternoon, who some of you may know. He is visiting Sweden a lot by the way, so shoot him an email if you are interested in having him present. Joe and I ended up talking about an interesting problem in data warehousing: the ascending key problem. I think it is best illustrated by an example.

Thomas Kejser's Database Blog

Archive

Myth Busting Query Optimisation

How Vertical Partitioning and Deep Joins Kill Parallelism

Modeling Dimensions with History Tracked, Generic Attributes

Why “Date BETWEEN FromDate AND ToDate” is a dangerous join criteria

An Interesting Blog to Follow

The Ascending Key Problem in Fact Tables– Part one: Pain!

Categories