Archive

Posts Tagged ‘Histogram’

Why “Date BETWEEN FromDate AND ToDate” is a dangerous join criteria

June 29, 2012 10 comments

I have been meaning to write this blog post for some time and the discussion about Data Vault finally prompted me to do it.

Sometimes, you find yourself in situations where you have to join a table that has a structure like this:

CREATE TABLE TemporalTracking (
  SomeKey INT
, FromDate DATETIME
, ToDate DATEIME
, <more columns>
)

The join criteria is expressed:

FROM <OtherTable> OT
INNER JOIN TemporalTracking T
  ON OT.SomeTimeColumn BETWEEN T.FromDate AND T.ToDate
  AND OT.SomeKey = T.SomeKey

 

Or more commonly, this variant with a semi open interval:

FROM <OtherTable> OT
INNER JOIN TemporalTracking T
  ON OT.SomeTimeColumn >= T.FromDate
  AND OT.SomeTimeColumn < T.ToDate
  AND OT.SomeKey = T.SomeKey

Data models that promote these types of joins are very dangerous to relational optimizers and you have to step carefully when executing queries with many of these joins. Let us have a look at why this is so.

Read more…

What Structure does your Data Have?

May 15, 2012 2 comments

I am currently thinking about measuring compression rates for different column store indexing strategies. In order for me to get some realistic data, I am looking for people who can share a very small anonymised data sample with me.

Read more…

When Statistics are not Enough – Search Patterns

February 21, 2012 4 comments

Co-author: Lasse Nedergaard

Yesterday, Lasse ran into an issues with a query pattern in the large database that he is responsible for. Based on our conversation, we wrote up this blog and created a repro.

The troublesome query we were debugging executed like this:

  1. Find a list of keys values to search for
  2. Insert these keys in a temp table – lets call this the SearchFor table
  3. Join the temp table to a large table (lets call it BigTable) and retrieve the full row from the large table

Why not use a correlated sub query in step 2? In this case, the customer in question had multiple code paths (including one accepting XML queries) that all needed to pass thousands of key to a final search procedure. They wanted a generic way to pass these key filters to to the final access of BigTable.

Read more…

The Ascending Column Problem in Fact Tables –Part two: Stat Job!

July 7, 2011 3 comments

In my last post I described a common problem with statistics in data warehouses. I also directed you to a trace flag that partially solves the problem. There is an issue with this trace flag: at the time of writing this blog, it does not work on a partitioned tables. Considering that large tables tend to be partitioned, this could be an issue.

Read more…