
Default Configuration of SQL Server (and query hints)

January 15, 2014

Over the years, I have become convinced that the default settings used in SQL Server are often wrong for systems that need scale or are properly managed. There is a series of settings I find myself consistently changing when I audit or install a new server. They are “my defaults”.

I thought I would share those here. Bear in mind that these are settings that assume a certain level of rational behaviour in the organisation you find yourself in. If you are working in a bank, they may not apply to you.

Server Configuration

Here is what I always do:

- Grant Lock Pages in Memory – Paging the buffer pool is bad.
- Set Max Memory – So you don’t take too much of the above.
- Grant Perform Volume Maintenance Tasks to the SQL Server account – Allows instant file initialisation. See: http://technet.microsoft.com/en-us/library/ms175935(v=sql.105).aspx
- tempdb #files = #cores – I have shown that this is the most scalable configuration.
- model in simple mode and autogrowth off – Prevents users from creating a new database without consciously making good backup choices for it.
- Add a new filegroup named DATA to model, set it as the default – All my databases have at least two filegroups (more about this later). As pointed out in the comments (thanks, commenters), this has to be done the hard way, at database creation time.
- Enable async stats in model – I have yet to find a case where async stats are not better than the default.
- Named Pipes off, TCP/IP on – I have never found a real use case for named pipes. Turning them off is one less surface area for malicious attackers to exploit.
- max worker threads – On a modern 2-socket box, I generally set this to 8192. Typically, you can run more threads than the default configuration allows. Threads that are unused only burn a bit of memory, which is cheap.
- -T1118 – Mixed extents are a joke. Off with them!
- -T1117 – If you are managing your data files right, you need them to grow at the same rate. Failure to do so means you are doing something wrong – and I don’t configure for people who do things wrong.
- -T4199 – Always enable the latest optimiser fixes. There is enough bad stuff happening with the optimiser all the time (nature of the beast). Being on the latest version is much better than being stuck in the past.
- -T3226 – Logging successful backups to the error log is a royal pain. That log is reserved for error conditions and other things you need to carefully manage. Logging of backup status belongs in the agent or in the backup system you run.
- -T845 – Enables Locked Pages on Standard Edition (corrected from -T835 – thanks, Jon).

For documentation of many of these trace flags, see: http://technet.microsoft.com/en-us/library/ms188396.aspx
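
For reference, here is a minimal sketch of how some of these settings could be scripted. The numbers are illustrative only – size them for your own hardware. The trace flags themselves are not set here; they belong in the service startup parameters (e.g. -T1117;-T1118;-T4199;-T3226;-T845), configured via SQL Server Configuration Manager so they survive a restart.

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 122880;  -- leave headroom for the OS and thread stacks
EXEC sp_configure 'max worker threads', 8192;
RECONFIGURE;

-- Options set on model are inherited by new databases
ALTER DATABASE model SET RECOVERY SIMPLE;
ALTER DATABASE model SET AUTO_UPDATE_STATISTICS_ASYNC ON;
ALTER DATABASE model MODIFY FILE (NAME = modeldev, FILEGROWTH = 0);  -- autogrowth off

-- One tempdb data file per core (path and size are placeholders)
ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev2, FILENAME = 'T:\tempdb2.ndf', SIZE = 8GB);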

Database / Table Configuration

In addition to the server configuration, there is a series of settings I automatically change on databases and tables:

- Async stats – As described above. In my experience, always the superior choice. You need a good reason to have it off, not the other way around.
- New filegroup DATA, set as the default – I never put objects in the PRIMARY filegroup. PRIMARY contains the metadata and needs extra protection (I normally put it on the TLOG drives). The additional filegroups allow me to freely allocate new tables and move things around between disks.
- sp_autostats OFF on large tables – In nearly every case I have seen on large tables, auto stats are harmful. They always kick in at the most inconvenient times, and when they do, they cause the one thing that is worse than bad performance and data loss: unpredictability. I leave auto create stats enabled, as it happens rarely enough to be useful and not disruptive.
- Table lock escalation OFF on large tables – Same problem as auto stats: when lock escalation kicks in, it nearly always spells trouble. If you are running out of locks, you are doing something wrong that needs to be addressed. Lock escalation is not the solution – it typically makes the problem worse or hides the underlying issue. See background details on the trade-offs: http://technet.microsoft.com/en-us/library/ms184286(v=sql.105).aspx
- ALLOW_PAGE_LOCKS = OFF, ALLOW_ROW_LOCKS = ON for small, and sometimes large, tables – The lock manager often becomes the bottleneck at scale. For small tables, getting rid of page locks helps. For large tables, make sure you can live with the impact on table scans before turning page locks off.
- #files = #cores – Yes, even for user databases!
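
By way of illustration, the table-level settings above map to statements like these (a sketch only – dbo.BigFact and dbo.SmallLookup are made-up names):

EXEC sp_autostats 'dbo.BigFact', 'OFF';                   -- no automatic stats updates on the big table
ALTER TABLE dbo.BigFact SET (LOCK_ESCALATION = DISABLE);  -- never escalate to a table lock
ALTER INDEX ALL ON dbo.SmallLookup
    SET (ALLOW_PAGE_LOCKS = OFF, ALLOW_ROW_LOCKS = ON);   -- row locks only on the small, hot table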

 

On Important Queries and Hinting

I would like to wrap this post up with another place where I deviate from the standard configuration and “best practices” in the SQL Server world.

If I find a stored procedure or query that either

A) Gets run often… or

B) Is very important and needs to be consistent (“important” is nearly always the same as consistent)

… I simply don’t trust the optimiser with these types of queries. Mission critical queries are much too important to leave in the hands of an unpredictable process like that.

Once I have seen a good plan for the query (and validated that I can’t make a better one myself), I force the query using hints. I don’t wait for it to go wrong; I do it when I see the query for the first time (even if the plan is good). This has already saved our production systems from trouble, and I am teaching the technique to my colleagues.

However, I do restrict myself to these hints:

- OPTION (<join type> JOIN) – Pretty safe, even under changing index conditions. In an OLTP system, if OPTION (LOOP JOIN) can’t give you a good plan, you are likely missing an index (and it will be obvious which one when you see the forced plan).
- LEFT/INNER <join type> JOIN – Allows me to force ordering and really control execution, to harvest filters in the right order.
- FORCESCAN – The optimiser often gets this wrong for reporting queries. I find that it is a little too fond of seeking when a scan is better. Scans are often “safer” in the sense that you know their cost is proportional to the size of the table being scanned. Seeks gone wrong (especially if they cause table spools) can really wreak havoc.
- FORCESEEK – While this hint can be risky (because it assumes the existence of an index), there are rare cases where the optimiser needs a bit of manhandling. This is a hint of last resort, for when the join strategies don’t give me enough.
- CROSS APPLY – Yes, I use this as a hint. If I know that a result will return a specific number of rows (and the optimiser does not), I can do a CROSS APPLY to a SELECT TOP 1. A good example of this is BETWEEN joins on dates.
- OPTIMIZE FOR UNKNOWN – My favourite hint. Most of the time, I don’t want a plan to assume anything about the boundary values of stats in another table (my fondness for GUIDs helps here, as they are more likely to be nicely spread out). This hint is great for stability under changing conditions.
- MAXDOP – Very few queries (except those tuned by master Machanic) take advantage of more than 8-16 cores.
- OPTION (HASH GROUP) – Sometimes the optimiser will trigger a massive sort before going into an aggregate operation. If I know that the aggregate will have a small result (often the case), I force the hashing strategy instead, to avoid the excessive memory usage of the sort.
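
To give a feel for how these combine, here is a hedged example of a fully hinted query – the procedure, tables and row-count assumptions are all made up:

CREATE PROCEDURE dbo.GetCustomerTotals @CustomerId INT
AS
SELECT D.CustomerName, SUM(F.Amount) AS Total
FROM dbo.FactSales AS F
INNER LOOP JOIN dbo.DimCustomer AS D   -- forces both the join type and the join order
    ON D.CustomerKey = F.CustomerKey
WHERE D.CustomerId = @CustomerId
GROUP BY D.CustomerName
OPTION (HASH GROUP,             -- hash aggregate instead of a sort-based one
        MAXDOP 8,               -- cap parallelism
        OPTIMIZE FOR UNKNOWN);  -- plan for the statistical average, not the first @CustomerId seen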

Hints I avoid:

- Index hints – I don’t want queries to break if I change indexes. In my entire career, I think I have found only 1-2 cases where I needed an index hint and forcing of join orders wasn’t good enough. Index hints make queries very brittle – which I suspect is why query hints in general have such a bad rap.
- FORCE ORDER – If I want to force the order, I force the join strategy as well (as above), so I just haven’t found a need for this hint.
- OPTIMIZE FOR (@var = x) – Making something dependent on the contents of a column is dangerous. Data changes faster than code.

However, Adam Machanic has some really cool ways to force interesting side effects by using this hint together with a TOP (@var); a rough sketch follows below. (See his blog: http://sqlblog.com/blogs/adam_machanic/)
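
As I understand that pattern, the outline looks roughly like this (a sketch with made-up names): the runtime value of @rows lets every row through, while the OPTIMIZE FOR value tells the optimiser exactly how many rows to plan for.

DECLARE @rows BIGINT = 9223372036854775807;  -- at runtime: effectively "all rows"

SELECT TOP (@rows) F.*
FROM dbo.FactSales AS F                      -- hypothetical table
WHERE F.SaleDate >= '20140101'
OPTION (OPTIMIZE FOR (@rows = 1000));        -- the optimiser plans for about 1000 rows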

  1. January 16, 2014 at 09:22

    Hi, great info, but adding filegroups is not allowed on the model database as far as I know. (I wish we could though)

  2. January 16, 2014 at 21:10

    Reblogged this on Gianluca's space and commented:
    Test?

  3. January 17, 2014 at 10:53

    Hi Thomas,
    Would you mind elaborating a bit on “CROSS APPLY”?
    We do a lot of queries on date ranges, and anything that can help is much appreciated.
    Best regards,
    Henrik

    • January 18, 2014 at 09:50

      Hi Henrik

      Certainly (almost the subject of a new blog entry). There are a few tricks you can play with CROSS APPLY. First of all, imagine you have a query like this:

      SELECT F.*, D.*
      FROM Fact AS F
      JOIN Dimension AS D ON F.SomeDate BETWEEN D.FromDate AND D.ToDate

      As I have described here: http://blog.kejser.org/2012/06/29/why-date-between-fromdate-and-todate-is-a-dangerous-join-criteria/ this join is poison to the query optimiser, because it has no way of knowing that the join returns only one row.

      One way to rewrite is this:

      SELECT F.*, D.*
      FROM Fact AS F
      CROSS APPLY (SELECT TOP 1 *
                   FROM Dimension
                   WHERE F.SomeDate BETWEEN FromDate AND ToDate) AS D

      This guarantees an estimate of 1 row from the join.

      The other trick you can play involves increasing parallelism. AFAIK, this technique was invented by Adam Machanic and you can find the details in his blog: http://sqlblog.com/blogs/adam_machanic/archive/2012/04/03/sqlbits-london-2012-demos.aspx

      If you are still running the warehouse I am thinking of, you may find Adam’s tricks of particular interest 🙂

  4. January 17, 2014 at 11:09

    Hi Thomas,
    Interesting post – a few things I am not doing and a few that I am too. I have always wanted to add an additional filegroup to model, especially on development servers. I have tried in the past and generated an error. Just tried again…

    “Alter failed for Database ‘model’. (Microsoft.SqlServer.Smo)
    ….
    User-defined filegroups are not allowed on “model”. (Microsoft SQL Server, Error:1826)

    Could you clarify the steps that you take to achieve this as I am keen to get this in place.

    • January 18, 2014 at 09:42

      Hi Ally

      You are right, it is not possible to add a filegroup to model. We recently changed our database layouts to use the additional filegroup; it seems we must have included it in the defaults of the DB create script instead.
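
      Something along these lines – a sketch only, with made-up names, paths and sizes:

      CREATE DATABASE MyDatabase
      ON PRIMARY
          (NAME = MyDatabase_meta, FILENAME = 'L:\MyDatabase_meta.mdf', SIZE = 256MB),
      FILEGROUP DATA
          (NAME = MyDatabase_data1, FILENAME = 'D:\MyDatabase_data1.ndf', SIZE = 8GB)
      LOG ON
          (NAME = MyDatabase_log, FILENAME = 'L:\MyDatabase_log.ldf', SIZE = 8GB);

      ALTER DATABASE MyDatabase MODIFY FILEGROUP DATA DEFAULT;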

      • January 18, 2014 at 11:27

        That’s a shame. I am going to suggest having it configured in a template SSDT project for our Devs to use instead.

      • allyruk
        January 18, 2014 at 11:28

        Having just thought about it a bit more I guess it would mess with tempdb too and perhaps that explains the restriction.

      • January 18, 2014 at 12:21

        One would think that adding a single line of code saying: only clone primary when creating tempdb would be possible in the SQL code base 🙂

  5. Jon
    February 7, 2014 at 13:19

    Hi Thomas

    Many thanks for this. I have a few questions about the trace flag recommendations in the server configuration section:

    Traces are a bit new to me. I’ve found that the following syntax works:
    DBCC TRACEON (4199, -1)

    However, you’ve included -T in all cases. Presumably, with the command above, the trace is lost if the SQL Server service is restarted?

    So some further searching suggests that I should:
    – run cmd
    – navigate to C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Binn\ (or thereabouts)
    – and (for each flag) run: sqlservr.exe -T4199

    Is that correct?

    Thanks very much!
    Jon

  6. Jon
    February 7, 2014 at 16:57

    I believe -T835 should be -T845

  7. February 14, 2014 at 08:11

    Hi Thomas,

    I would add setting the power plan to maximum performance, and configuring tran logs to be 8 GB in size (initially) and grow in equal 8 GB increments (as recommended by Kimberly Tripp) – in the absence of any empirical evidence on which to base the configuration, that is good advice. Regarding hints, there have been some scenarios under which QUERYTRACEON(2301) can yield some very impressive performance gains. You mention Adam Machanic and the TOP hint; this can also be used to “brute force” the cardinality of SELECT statements in the source query for MERGE statements, and – in the absence of the cardinality hint which Oracle has – it can be used to achieve something similar.

    On how clever / dumb you should assume people are: a lot of my engagements are at places where there is a mindset of “It’s a Microsoft product, therefore developers who can do everything from the GUI right through to the back end can do the job”. Another chestnut was “I don’t believe that working with SQL Server is real work, as it’s just a bloated version of Access”. I agree with your sentiment; however, the reality of a lot of places – the small and medium-sized businesses beneath the CAT team’s radar – plays into the scenarios Brent mentions. In fact, I see so many recurring problems from one site to another that I could create a standard report and only have to change the name of the customer to make it applicable to a specific customer.

    There is a wider debate as to whether Microsoft should enforce a certain number of “best practices” on people out of the box, i.e. not allow people to create databases on the system drive, not allow auto-growth settings to be specified as percentages, not allow max memory to be set to the total memory in the server (where does the thread stack go?).

    On your comment about the land of Uncle Sam: it is also the land of some institutions which are absolute temples to our trade – Carnegie Mellon, MIT and Berkeley. I also believe that Donald Knuth hailed from there, and John von Neumann carried out most of his pioneering work in the States.

    The really funny thing about blogs in general is that the comments section can sometimes be more interesting and insightful than the actual articles that precipitate them.

    Regards,

    Chris

    • February 14, 2014 at 14:57

      Hi Chris

      I am not a big fan of setting the power plan to high performance. While it does yield some performance, the cost in power is non-trivial. For a high-performance, guaranteed-response-time system I agree, but not as a generic recommendation.

      With regards to the clever bit on MS: the “best practice” mindset and catering to the lowest common denominator only reinforces this image – it does nothing to dispel it. The defaults could obviously be better than they are, so there are steps MS can take to avoid too many common mistakes. In particular, putting recommendations in place that favour scale at the sacrifice of some performance would greatly help refocus DBAs on what matters.

      I am happy to see you like the comments – always good with some additional entertainment on the blog. It is indeed true that a lot of great computer science has come out of the States. This makes catering to the lowest common denominator particularly insulting to the pioneers and their hard work.

      Best regards -TK

