Default Configuration of SQL Server (and query hints)
Over the years, I have become convinced that the default settings used in SQL Server are often wrong for systems that need to scale or are properly managed. There is a series of settings I find myself consistently changing when I audit or install a new server. They are "my defaults".
I thought I would share those here. Bear in mind that these are settings that assume a certain level of rational behaviour in the organisation you find yourself in. If you are working in a bank, they may not apply to you.
Server Configuration
Here is what I always do:
| Configuration | Reason |
| --- | --- |
| Grant Lock Pages in Memory | Paging the buffer pool is bad. |
| Set Max Memory | So you don't take too much of the above. |
| Grant Perform Volume Maintenance Tasks to the SQL Server account | Allows instant file initialisation. See: http://technet.microsoft.com/en-us/library/ms175935(v=sql.105).aspx |
| tempdb #files = #cores | I have shown that this is the most scalable configuration. |
| model in simple mode, autogrowth off | Prevents new users from allocating a database without consciously making good backup choices for it. |
| Async stats enabled in model | I have yet to find a case where async stats are not better than the default. |
| Named Pipes off, TCP/IP on | I have never found a real use case for named pipes, and turning them off is one less surface area for malicious attackers to exploit. |
| max worker threads | On a modern two-socket box, I generally set this to 8192. Typically, you can run more threads than the default configuration allows; unused threads only burn a bit of memory, which is cheap. |
| -T1118 | Mixed extents are a joke. Off with them! |
| -T1117 | If you are managing your data files right, you need them to grow at the same rate. Failure to do so means you are doing something wrong – and I don't configure for people who do things wrong. |
| -T4199 | Always enable the latest optimiser fixes. There is enough bad stuff happening with the optimiser all the time (the nature of the beast); being on the latest version is much better than being stuck in the past. |
| -T3226 | Logging successful backups to the error log is a royal pain. That log is reserved for error conditions and other things you need to carefully manage; logging of backup status belongs in the agent or in the backup system you run. |
| -T835 | Enable Locked Pages on Standard Edition. |
For documentation of many of these trace flags, see: http://technet.microsoft.com/en-us/library/ms188396.aspx
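As a rough sketch, the `sp_configure` part of the table above can be scripted as below. The memory value is a placeholder you would size for your own server, and the trace flags themselves belong in the startup parameters (shown as a comment), not in a script:

```sql
-- Sketch only: server-level settings from the table above.
-- 100000 MB max memory is a placeholder; size it for your own box.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'max server memory (MB)', 100000;
EXEC sp_configure 'max worker threads', 8192;
RECONFIGURE;

-- Trace flags go in the startup parameters (SQL Server Configuration
-- Manager), e.g.: -T1117;-T1118;-T4199;-T3226

-- model in simple mode with autogrowth off:
ALTER DATABASE model SET RECOVERY SIMPLE;
ALTER DATABASE model MODIFY FILE (NAME = modeldev, FILEGROWTH = 0);

-- Async stats in model:
ALTER DATABASE model SET AUTO_UPDATE_STATISTICS_ASYNC ON;
```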
Database / Table Configuration
In addition to server configuration, there is a series of settings I automatically change on databases and tables:
| Setting | Reason |
| --- | --- |
| Async stats | As described above. In my experience, always the superior choice; you need a good reason to have it off, not the other way around. |
| New filegroup DATA, set as default | I never put objects in the PRIMARY filegroup. PRIMARY contains the metadata and needs extra protection (I normally put it on the TLOG drives). The additional filegroups allow me to freely allocate new tables and move things around between disks. |
| sp_autostats OFF on large tables | In nearly every case I have seen on large tables, auto stats are harmful. They always kick in at the most inconvenient times, and when they do, they cause the one thing that is worse than bad performance and data loss: unpredictability. I leave auto create stats enabled, as this happens rarely enough to be useful and not disruptive. |
| Table lock escalation OFF on large tables | Same problem as auto stats: when lock escalation kicks in, it nearly always spells trouble. If you are running out of locks, you are doing something wrong that needs to be addressed. Lock escalation is not the solution – it typically makes the problem worse or hides the underlying issue. See background details on the trade-offs: http://technet.microsoft.com/en-us/library/ms184286(v=sql.105).aspx |
| ALLOW_PAGE_LOCKS = OFF, ALLOW_ROW_LOCKS = ON for small (and sometimes large) tables | The lock manager often becomes the bottleneck at scale. For small tables, getting rid of page locks helps. For large tables, make sure you can live with the impact on table scans before turning it off. |
| #files = #cores | Yes, even for user databases! |
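A minimal sketch of these database and table settings follows; the database, table, and file names (BigDb, dbo.BigFact, the D:\data path) are invented placeholders:

```sql
-- Async stats on the user database:
ALTER DATABASE BigDb SET AUTO_UPDATE_STATISTICS_ASYNC ON;

-- Separate DATA filegroup, made the default so PRIMARY stays metadata-only:
ALTER DATABASE BigDb ADD FILEGROUP DATA;
ALTER DATABASE BigDb ADD FILE
    (NAME = BigDb_data1, FILENAME = 'D:\data\BigDb_data1.ndf')
    TO FILEGROUP DATA;
ALTER DATABASE BigDb MODIFY FILEGROUP DATA DEFAULT;

-- On large tables: no auto stats updates, no lock escalation:
EXEC sp_autostats 'dbo.BigFact', 'OFF';
ALTER TABLE dbo.BigFact SET (LOCK_ESCALATION = DISABLE);

-- Row locks only, where the lock manager is the bottleneck:
ALTER INDEX ALL ON dbo.BigFact
    SET (ALLOW_PAGE_LOCKS = OFF, ALLOW_ROW_LOCKS = ON);
```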
On Important Queries and Hinting
I would like to wrap up this post with another area where I deviate from the standard configuration and "best practices" in the SQL Server world.
If I find a stored procedure or query that either
A) Gets run often… or
B) Is very important and needs to be consistent (“important” is nearly always the same as consistent)
… I simply don’t trust the optimiser with these types of queries. Mission critical queries are much too important to leave in the hands of an unpredictable process like that.
Once I have seen a good plan for the query (and validated that I can't make a better one myself), I force the query using hints. I don't wait for it to go wrong; I do it when I see the query the first time (even if the plan is good). This has already saved our production systems from trouble, and I am teaching this technique to my colleagues.
However, I do restrict myself to these hints:
| Hint | Reason |
| --- | --- |
| OPTION (<join type> JOIN) | Pretty safe, even under changing index conditions. In an OLTP system, if OPTION (LOOP JOIN) can't give you a good plan, you are likely missing an index (and it will be obvious which one when you see the forced plan). |
| LEFT/INNER <join type> JOIN | Allows me to force ordering and really control execution to harvest filters in the right order. |
| FORCESCAN | The optimiser often gets this wrong for reporting queries; I find it a little too fond of seeking when a scan is better. Scans are often "safer" in the sense that you know their cost is proportional to the size of the table being scanned. Seeks gone wrong (especially if they cause table spools) can really wreak havoc. |
| FORCESEEK | While this hint can be risky (because it assumes the existence of an index), there are rare cases where the optimiser needs a bit of manhandling. This is a hint of last resort if the join strategies don't give me enough. |
| CROSS APPLY | Yes, I use this as a hint. If I know that a result will return a specific number of rows (and the optimiser does not), I can do a CROSS APPLY to a SELECT TOP 1. A good example of this is BETWEEN joins on dates. |
| OPTIMIZE FOR UNKNOWN | My favourite hint. Most of the time, I don't want a plan to assume anything about the boundary values of stats in another table (my fondness for GUIDs helps here, as they are more likely to be nicely spread out). This hint is great for stability under changing conditions. |
| MAXDOP | Very few queries (except those tuned by master Machanic) take advantage of more than 8-16 cores. |
| OPTION (HASH GROUP) | Sometimes the optimiser will trigger a massive sort before going into an aggregate operation. If I know that the aggregate will have a small result (often the case), I force the hashing strategy instead to avoid the excessive memory usage of the sort. |
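To illustrate how several of these hints combine, here is a hypothetical OLTP query (the Orders/Customers tables and columns are invented for the example). The join type is forced in the join syntax itself, which also pins the join order, while the OPTION clause stabilises the plan against parameter sniffing and caps parallelism:

```sql
-- Hypothetical query; table and column names are made up.
DECLARE @FromDate DATETIME = '20140101';

SELECT O.OrderId, C.CustomerName
FROM dbo.Orders AS O
INNER LOOP JOIN dbo.Customers AS C
    ON C.CustomerId = O.CustomerId        -- forced loop join (and join order)
WHERE O.OrderDate >= @FromDate
OPTION (OPTIMIZE FOR UNKNOWN, MAXDOP 8);  -- stable estimates, bounded parallelism
```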
Hints I avoid:
| Hint | Reason |
| --- | --- |
| Index hints | I don't want queries to break if I change indexes. In my entire career, I think I have found only 1-2 cases where I needed an index hint and forcing of join orders wasn't good enough. Index hints make queries very brittle – which I suspect is why query hints in general have such a bad rap. |
| FORCE ORDER | If I force order, I also force the join strategy (as above), so I just haven't found a need for this hint. |
| OPTIMIZE FOR (@var = x) | Making something dependent on the contents of a column is dangerous. Data changes faster than code. However, Adam Machanic has some really cool ways to force interesting side effects by using this hint together with a TOP (@var) (see his blog: http://sqlblog.com/blogs/adam_machanic/). |
Hi, great info, but adding filegroups is not allowed on the model database as far as I know. (I wish we could though)
Hi Thomas,
Would you mind elaborating a bit on “CROSS APPLY”?
We do a lot of queries on date ranges, and anything that can help is much appreciated.
Best regards,
Henrik
Hi Henrik
Certainly (almost the subject of a new blog entry). There are a few tricks you can play with CROSS APPLY. First of all, imagine you have a query like this:
```sql
SELECT F.*, D.*
FROM Fact AS F
JOIN Dimension AS D ON F.SomeDate BETWEEN D.FromDate AND D.ToDate
```
As I have described here: http://blog.kejser.org/2012/06/29/why-date-between-fromdate-and-todate-is-a-dangerous-join-criteria/ this join is poison to the query optimiser, because it has no way of knowing that the join returns only one row.
One way to rewrite is this:
```sql
SELECT F.*, D.*
FROM Fact AS F
CROSS APPLY (SELECT TOP (1) *
             FROM Dimension AS Dim
             WHERE F.SomeDate BETWEEN Dim.FromDate AND Dim.ToDate) AS D
```
This guarantees an estimate of 1 row from the join.
The other trick you can play involves increasing parallelism. AFAIK, this technique was invented by Adam Machanic and you can find the details in his blog: http://sqlblog.com/blogs/adam_machanic/archive/2012/04/03/sqlbits-london-2012-demos.aspx
If you are still running the warehouse I am thinking of, you may find Adam's tricks of particular interest 🙂
Hi Thomas,
Interesting post – a few things I am not doing, and a few that I already am. I have always wanted to add an additional filegroup to model, especially on development servers. I have tried in the past and generated an error. Just tried again…
“Alter failed for Database ‘model’. (Microsoft.SqlServer.Smo)
….
User-defined filegroups are not allowed on “model”. (Microsoft SQL Server, Error:1826)
”
Could you clarify the steps that you take to achieve this as I am keen to get this in place.
Hi Ally
You are right, it is not possible to modify model with a filegroup. We recently changed our database layouts to use the additional filegroup; it seems we must have included it in the defaults of the DB create script instead.
That’s a shame. I am going to suggest having it configured in a template SSDT project for our Devs to use instead.
Having just thought about it a bit more I guess it would mess with tempdb too and perhaps that explains the restriction.
One would think that adding a single line of code saying: only clone primary when creating tempdb would be possible in the SQL code base 🙂
Hi Thomas
Many thanks for this. I have a few questions about the trace flag recommendations, in the server configuration section:
Traces are a bit new to me. I’ve found that the following syntax works:
DBCC TRACEON (4199, -1)
However, you’ve included -T in all cases. Presumably, with the command above, the trace is lost if the SQL Server service is restarted?
So some further searching suggests that I should:
– run cmd
– navigate to C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Binn\ (or thereabouts)
– and (for each flag) run: sqlservr.exe -T4199
Is that correct?
Thanks very much!
Jon
I believe -T835 should be -T845
Hi Thomas,
I would add setting the power plan to maximum performance, and configuring transaction logs to be 8 GB in size initially and grow in equal 8 GB increments (as recommended by Kimberly Tripp) – in the absence of any empirical evidence on which to configure these, that is good advice. Regarding hints, there have been some scenarios under which QUERYTRACEON(2301) can yield some very impressive performance gains. You mention Adam Machanic and the TOP hint; this can also be used to "brute force" the cardinality of SELECT statements in the source query for MERGE statements, **and** in the absence of a cardinality hint (which Oracle has), it can be used to achieve something similar.
On how clever/dumb you should assume people are: a lot of my engagements are at places with a mindset of "It's a Microsoft product, therefore developers who can do everything from the GUI right to the back end can do the job". Another chestnut was "I don't believe that working with SQL Server is real work, as it's just a bloated version of Access". I agree with your sentiment; however, the reality of a lot of places – the small and medium-sized businesses beneath the CAT team's radar – plays into the scenarios Brent mentions. In fact, I see so many recurring problems from one site to another that I could create a standard report and only have to change the name of the customer to make it applicable to a specific customer.
There is a wider debate as to whether Microsoft should enforce a certain number of "best practices" on people out of the box, i.e. not allow people to create databases on the system drive, not allow people to specify autogrowth settings as percentages, not allow people to set max memory to the total memory in the server (where does the thread stack go?).
On your comment about the land of Uncle Sam: it is also the land of some institutions which are absolute temples to our trade – Carnegie Mellon, MIT and Berkeley. I also believe that Donald Knuth hails from there, and John von Neumann carried out most of his pioneering work in the States.
The really funny thing about blogs in general is that the comments sections can sometimes be more interesting and insightful than the actual articles that precipitate them.
Regards,
Chris
Hi Chris
I am not a big fan of setting the power plan to high performance. While it does yield some performance, the cost in power is non-trivial. For a high-performance system with guaranteed response times I agree, but not as a generic recommendation.
With regard to the clever bit on MS: the "best practice" mindset and catering to the lowest common denominator only reinforce this image – they do nothing to dispel it. The defaults could obviously be better than they are, so there are steps MS could take to avoid too many common mistakes. In particular, putting recommendations in place that favour scale at the sacrifice of peak performance would greatly help refocus DBAs on what matters.
I am happy to see you like the comment – always good with some additional entertainment on the blog. It is indeed true that a lot of great computer science has come out of the States. This makes taking the lowest common denominator particularly insulting to the pioneers and their hard work.
Best regards -TK