Archive

Posts Tagged ‘Scale’

Clustered Indexes vs. Heaps

January 12, 2014 21 comments

At Stack Overflow the other day, I once again found myself trying to debunk a lot of the “revealed wisdom” in the SQL Server community. You can read the discussion in this post: Indexing a PK GUID in SQL Server 2012. However, this post is not about GUIDs or sequential keys, which I have written about elsewhere; it is about clustered indexes and the love affair that SQL Server DBAs seem to have with them.
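To make the debate concrete, here is a minimal sketch (the table and column names are my own, purely for illustration) of the two designs in question: the same GUID primary key declared as a clustered index, and declared nonclustered on a heap.

-- Hypothetical tables, for illustration only: identical GUID primary keys,
-- once clustering the rows and once leaving them on a heap.
CREATE TABLE OrdersClustered (
    OrderID UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID(),
    Payload CHAR(100) NOT NULL DEFAULT 'x',
    -- the rows themselves are stored in key order inside the index
    CONSTRAINT PK_OrdersClustered PRIMARY KEY CLUSTERED (OrderID)
);

CREATE TABLE OrdersHeap (
    OrderID UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID(),
    Payload CHAR(100) NOT NULL DEFAULT 'x',
    -- the base rows stay in an unordered heap; the index is a separate structure
    CONSTRAINT PK_OrdersHeap PRIMARY KEY NONCLUSTERED (OrderID)
);

Both tables enforce the same uniqueness; the difference is only where the rows themselves live.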

Read more…

Synchronisation in .NET – Part 4: Partitioned Data Structures

January 5, 2014 5 comments

In this final instalment of the synchronisation series, we will look at fully scalable solutions to the problem first stated in Part 1 – adding monitoring that is scalable and minimally intrusive.

Thus far, we have seen that there is an upper limit on how fast you can access cache lines shared between multiple cores. We have tried different synchronisation primitives to get the best possible scale.

Throughout this series, Henk van der Valk has generously lent me his 4-socket machine and been my trusted lab manager and reviewer. Without his help, this blog series would not have been possible.

And now, as is tradition, we are going to show you how to make this thing scale.

Read more…

Synchronisation in .NET – Part 2: Unsafe Data Structures and Padding

December 27, 2013 9 comments

In the previous blog post we saw how the lock() statement in .NET scales very poorly when there is contention on a data structure. It was clear that a performance logging framework that relies on an array with a lock on each member to store data will not scale.

Today, we will try to quantify just how much performance we should expect to get from the data structure if we somehow solve locking. We will also see how the underlying hardware primitives bubble up through the .NET framework and break the pretty object-oriented abstraction you might be used to.

Because we have already proven that ConcurrentDictionary adds too much overhead, we will focus on arrays as the backing store for the data structure in all future implementations.

Read more…

Synchronisation in .NET – Part 1: lock(), Dictionaries and Arrays

December 25, 2013 10 comments

As part of our tuning efforts at Livedrive, I ran into a deceptively simple problem that beautifully illustrates some of the scale principles I have been teaching the SQL Server community for years.

In this series of blog entries, I will use what appears to be a simple .NET class to explore how modern CPU architectures handle high-speed synchronisation. In the first part of the series, I set the stage and explore the .NET lock() statement as a way of coordinating access to data.

Read more…

My First Course Is Now Available

June 25, 2012 Leave a comment

I am proud to announce that my first course, Tuning – Diagnosing and Fixing Hard Problems, is now available. This is the first in what I expect to be a series of 5 courses that I am developing to share my knowledge with the field (for a price this time).

You can find details of the course on my Courses Page.

Reading Material: Abstractions, Virtualisation and Cloud

May 1, 2012 9 comments

When speaking at conferences, I often get asked questions about virtualisation and how fast databases will run on it (and even whether they are “supported” on virtualised systems). This is a complex question to answer, because it requires a very deep understanding of CPU caches, memory and I/O systems to fully describe the tradeoffs.

Read more…

Thread Synchronization in SQL Server

November 9, 2011 4 comments

Any code optimized for highly concurrent workloads must worry about thread synchronization. SQL Server is no exception, because in a database system, synchronization is one of the core functionalities you rely on the engine to provide (the NoSQL crowd may ponder that a bit). In this blog post, I will describe the synchronization primitives, such as locks, latches and spinlocks, used by SQL Server to coordinate access to memory regions between threads.
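As a taste of what is to come, each of these primitives surfaces in the DMVs SQL Server exposes. Here is a quick, hedged sketch (the DMV and column names are standard; the filters and TOP counts are my own illustrative choices):

-- Lock and page latch waits, as reported by the general wait statistics:
SELECT TOP (10) wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'LCK%' OR wait_type LIKE 'PAGELATCH%'
ORDER BY wait_time_ms DESC;

-- Latches, broken down by class:
SELECT TOP (10) latch_class, waiting_requests_count, wait_time_ms
FROM sys.dm_os_latch_stats
ORDER BY wait_time_ms DESC;

-- Spinlocks (spins and collisions, since a spinning thread never yields a wait):
SELECT TOP (10) name, collisions, spins, spins_per_collision
FROM sys.dm_os_spinlock_stats
ORDER BY spins DESC;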

Read more…

Exploring Hash Functions in SQL Server

November 6, 2011 16 comments

Hash distributing rows is a wonderful trick that I often apply. It forms one of the foundations for most scale-out architectures. It is therefore natural to ask which hash functions are most efficient, so we can choose intelligently between them.

In this blog post, I will benchmark the built-in hash functions in SQL Server. I will focus on answering two questions (see the sketch after the list):

  • How fast is the hash function?
  • How well does the hash function spread data over a 32-bit integer space?
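
To illustrate the second question, here is a minimal sketch (the input value is my own arbitrary example) that reduces each built-in hash to a 32-bit integer, so the distributions can be compared on an equal footing:

-- Reduce each built-in hash to a 32-bit integer bucket for comparison.
DECLARE @v NVARCHAR(50) = N'Hello, World';

SELECT
    CHECKSUM(@v)                                        AS checksum_value,
    BINARY_CHECKSUM(@v)                                 AS binary_checksum_value,
    CAST(SUBSTRING(HASHBYTES('MD5',  @v), 1, 4) AS INT) AS md5_as_int32,
    CAST(SUBSTRING(HASHBYTES('SHA1', @v), 1, 4) AS INT) AS sha1_as_int32;

CHECKSUM and BINARY_CHECKSUM already return 32-bit integers; the HASHBYTES results are truncated to their first four bytes.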

Read more…

SQLBits and Phones

October 6, 2011 2 comments

My presentation from SQLBits, “Finding the Limits: The Grade of The Steel”, should be online soon. There is a lot of stuff to blog about and so little time to do it. It was a few fun days of tuning, as the picture shows.

I am curious to hear comments on my session. Was it useful? What other tests would you like to see? Do you prefer this presentation style over other styles (no, I won’t do demos!).

Special thanks to the good people over at Fusion-io for letting me use their kit to run tests. You guys rock!

In other news: I finally found a phone that is just a phone. It is called the Nokia X2. I have had it for only a few days and I am already liking it a lot. So far, it has survived on a single charge.

Boosting INSERT Speed by Generating Scalable Keys

October 5, 2011 20 comments

Throughout history, similar ideas tend to surface at about the same time. Last week, at SQLBits 9, I did some “on stage” tuning of the Paul Randal INSERT challenge.

It turns out that at almost the same time, a lab run was being done that demonstrated, on a real-world workload, a technique similar to the one I ended up using. You can find it at this excellent blog: Rick’s SQL Server Blog.

Now, to remind you of the Paul Randal challenge: it consists of doing as many INSERT statements as possible into a table of this format (the test does 160M inserts in total):

CREATE TABLE MyBigTable (
    c1 UNIQUEIDENTIFIER ROWGUIDCOL DEFAULT NEWID()
    ,c2 DATETIME DEFAULT GETDATE()
    ,c3 CHAR(111) DEFAULT 'a'
    ,c4 INT DEFAULT 1
    ,c5 INT DEFAULT 2
    ,c6 BIGINT DEFAULT 42);

Last week, with this test fully tuned, I was able to achieve 750K rows/sec (runtime: 213 seconds) on a SuperMicro, 48-core AMD machine with 4 Fusion-io cards. I used 48 data files for best throughput, the subject of a future blog.
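For readers who want the flavour of the technique before clicking through, here is a hedged sketch (the table and column names are mine, and this is not necessarily the exact key used on stage): prefix the key with a small bucket value, so concurrent inserts land on different B-tree pages instead of all fighting over the last one.

-- Hypothetical sketch: spread inserts across the index by prefixing the
-- key with a bucket derived from the inserting session, giving each of
-- the 48 threads its own insert point instead of the last page.
CREATE TABLE MyBigTableScalable (
    BucketID TINYINT NOT NULL,      -- e.g. @@SPID % 48 at insert time
    ID BIGINT IDENTITY NOT NULL,
    c3 CHAR(111) DEFAULT 'a',
    CONSTRAINT PK_MyBigTableScalable PRIMARY KEY CLUSTERED (BucketID, ID)
);

-- Each session inserts under its own bucket:
INSERT INTO MyBigTableScalable (BucketID)
VALUES (@@SPID % 48);

The bucket bounds the number of concurrent insert points, which is why it is sized to match the degree of parallelism.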

Read more…