Archive

Posts Tagged ‘ETL’

The Information Staircase

July 1, 2012 1 comment

With the Big Data wave rolling over us these days, it seems everyone is trying to wrap their heads around how these new components fit into the overall information architecture of the enterprise.

Not only that, there are also organisational challenges in staffing the systems drinking the big data stream. We are hearing about new job roles such as “Data Scientist” being coined (the banks have had them for a long time; they call them Quants) and old names such as “Data Steward” being brought back.

While thinking of these issues, I have tried to put together a visual representation of the different architecture layers and the roles interacting with them:

[Image: the architecture layers and the roles interacting with them]

Read more…

Don’t Become a One-trick Architect

December 8, 2011 12 comments

We are near the dawn of a new workload: BigData. While some people say that “it is always darkest just before the dawn”, I beg to differ: I think it is darkest just before it goes pitch black. Have a cup of wakeup coffee, get your eyes adjusted to the new light, and get used to flying blind a bit, because the next couple of years are going to be really interesting.

In this post, I will be sharing my views on where we have been and a bit about where we are heading in the enterprise architecture space. I will say in advance that my opinions on BigData are just crystallizing, and I will most likely be adjusting them and changing my mind.

Read more…

Why Surrogate Keys are not Good Keys

October 22, 2011 33 comments

History tracking in warehouses is a controversial discipline. In this post, I will begin to unravel some of the apparent complexities by taking apart the history tracking problem, piece by piece.

Read more…

Guest Posts Coming up

September 20, 2011 2 comments

Lately, I have been discussing data modeling with a lot of people and there is simply so much ground to cover. Lots of exciting developments – it seems I was on to something when I started a blog about this topic.

To make faster progress, I have teamed up with Marcel Franke and Cardory Van Rij. I am very excited about this development. We will be doing some blogging together in the near future, both here on blog.kejser.org and on their individual blogs.

Based on feedback on my previous post, I will be writing about Type2 dimensions and history tracking for the next post in the modeling series. It will be a little delayed, since I am traveling a lot over the next month.

Stay tuned…

The Big Picture – EDW/DW architecture

August 30, 2011 36 comments

Now that the cat is out of the bag on the Kimball forum, I figured it would be a good idea to present the full architecture that I will be arguing for. I was hoping to build up to it slowly, by establishing each premise on its own before moving on to the conclusion.

But perhaps it is better to start from the conclusion and then work my way down to the premises and show each one in turn.

Read more…

Physically Placing the Maps in the architecture

August 19, 2011 2 comments

Before we leave the maps behind, I need to live up to my promise of describing the storage characteristics of tables visited during the journey through the warehouse architecture. This must include the physical location of maps. Since I believe form must follow function in a DW, let us just recall their function:

From a functional perspective, I have shown you how map tables can be used to both track and correct source system keys. Maps are not visible to the end user, but they are a necessary part of the data’s journey from the source to the final data model. Maps also provide the abstraction of, or interface to, master data sources. In the absence of those sources, the maps can even serve as a makeshift master data repository.
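To make the tracking and correcting roles concrete, here is a minimal Python sketch of a map. It is an illustration under stated assumptions, not the table design from the posts themselves: the `KeyMap` class, its method names, and the sample source systems and keys are all hypothetical, and an in-memory dictionary stands in for what would be a database table.

```python
from dataclasses import dataclass, field


@dataclass
class KeyMap:
    """Hypothetical in-memory stand-in for a map table.

    Each row maps (source_system, source_key) to a warehouse key.
    """
    _rows: dict = field(default_factory=dict)
    _next_key: int = 1

    def lookup_or_add(self, source_system: str, source_key: str) -> int:
        """Track a source key: return its warehouse key, assigning one if new."""
        row = (source_system, source_key)
        if row not in self._rows:
            self._rows[row] = self._next_key
            self._next_key += 1
        return self._rows[row]

    def correct(self, source_system: str, source_key: str, warehouse_key: int) -> None:
        """Correct a source key by pointing it at an existing warehouse key."""
        if warehouse_key not in self._rows.values():
            raise ValueError(f"unknown warehouse key: {warehouse_key}")
        self._rows[(source_system, source_key)] = warehouse_key


# Assumed example: two source systems hold the same customer under different keys.
customer_map = KeyMap()
crm_key = customer_map.lookup_or_add("crm", "C-100")
erp_key = customer_map.lookup_or_add("erp", "0100")

# The map initially treats them as two entities; a correction merges them.
customer_map.correct("erp", "0100", crm_key)
merged_key = customer_map.lookup_or_add("erp", "0100")
```

After the correction, both source keys resolve to the same warehouse key, and nothing downstream of the map needs to know that the sources disagreed.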

Read more…

Transforming Source Keys to Real Keys – Part 2: Using Maps To Fix Key Problems

August 15, 2011 11 comments

In part 1 of this post, I introduced the idea of map tables. These tables serve as an abstraction between the source systems and the entities in the data warehouse. In this post, I will describe how you can use the maps to correct for source key errors.

Using the taxonomy of key pathologies described earlier, I will walk you through some examples of map usage.

Read more…