Relational | Thomas Kejser's Database Blog

How do Column Stores Work?

July 4, 2012 9 comments

In this blog post, I will provide you with some basic information about column stores. Nothing I am writing here is vendor-specific IP; it is merely taken from the papers published throughout history. One of the best papers that serves as an introduction is by Stonebraker:

Stonebraker et al., Proceedings of the 31st VLDB Conference, 2005: “C-Store: A Column-oriented DBMS”

The idea is older than that though, with the first papers published in the 1970s.

Shamelessly standing on the shoulders of giants, I will walk you through a simple example which illustrates one of the key principles of column stores: Run Length Encoding (RLE).
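To give a feel for what RLE does before reading the full post, here is a minimal sketch in Python. It is not taken from the C-Store paper or from any vendor's engine; the function names `rle_encode` and `rle_decode` and the sample column are made up for illustration. The idea is simply that consecutive repeats of a value in a column are stored once, together with a count.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse each run of consecutive equal values into (value, run_length)."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


# Hypothetical column of country codes, already sorted so equal values are adjacent.
column = ["DK", "DK", "DK", "SE", "SE", "UK"]
runs = rle_encode(column)
restored = rle_decode(runs)
```

The encoding only pays off when equal values sit next to each other, which is why the sort order of the column matters so much for how well it compresses.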

Implementing Message Queues in Relational Databases

May 25, 2012 18 comments

At the last SQL Bits X I held the FusionIO fireside chat during the launch party. During this presentation, I demonstrated how it is possible to build a table structure inside a relational engine that will act as a message queue and deliver nearly 100K messages/second.

Why You Need to Stop Worrying about UPDATE Statements

April 27, 2012 4 comments

There seems to be a myth perpetuated out there in the database community that UPDATE statements are somehow “bad” and should be avoided in data warehouses.

Let us have a look at the facts for a moment and weigh up if this myth has any merit.

Why Surrogate Keys are not Good Keys

October 22, 2011 33 comments

History tracking in warehouses is a controversial discipline. In this post, I will begin to unravel some of the apparent complexities by taking apart the history tracking problem, piece by piece.

Good keys, what are they like?

May 13, 2011 28 comments

A central value add of data warehouses is their ability to restore the sanity that comes from using good keys. Taking a model-agnostic view of keys, they refer to “something” that is uniquely identifiable. Depending on what you are modeling, those “somethings” have different names, for example: entities, rows, tuples, cells, members, documents, attributes, object instances, and relations between any of the latter. Because this article is about relational databases and because I don’t want to move up to the “logical” level with you (as I promised before), I will use the term “row” as the “something” that a key refers to, and the term “column” as the data structure that implements the key.

Moving on, the definition of a “good key” is a column that has the following properties:

It is forced to be unique
It is small
It is an integer
Once assigned to a row, it never changes
Even if deleted, it will never be re-used to refer to a new row
It is a single column
It is stupid
It is not intended to be remembered by users
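Several of these properties can be illustrated with a small sketch. The Python below is only an in-memory stand-in for what a database would do with a sequence or identity column; the class `KeyAllocator` and its methods are hypothetical names invented for this example. It hands out small integers that are unique, carry no meaning, and are never handed out again, even after the row they pointed to is deleted.

```python
import itertools


class KeyAllocator:
    """Toy table that assigns a meaningless, ever-increasing integer key per row."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)  # only ever moves forward
        self._rows = {}                          # key -> row payload

    def insert(self, row):
        key = next(self._counter)  # unique, integer, single value
        self._rows[key] = row
        return key

    def delete(self, key):
        # Removing the row does not rewind the counter, so the key is retired.
        del self._rows[key]

    def keys(self):
        return sorted(self._rows)


table = KeyAllocator()
first = table.insert("customer A")
second = table.insert("customer B")
table.delete(second)
third = table.insert("customer C")  # gets a fresh key, not the deleted one
```

Note that the key says nothing about the row it identifies: that is the “stupid” property, and it is what allows the key to stay unchanged when the row's attributes change.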
