Learn 10 useful tips and tricks that help boost SQL Server application performance—and streamline development and deployment as well.
学习10个有用的技巧来提高SQL Server应用程序的运行效率,简化开发和部署。
-by Roman Rehak, 本文属于转载,谢谢原作者Roman Rehak.
Developers love database programming tips, tricks, and workarounds—especially those that slash development time or boost application performance. Here's a collection of such nuggets for developing SQL Server applications.
1) Replace COUNT(*) With EXISTS When Checking for Existence (当检查记录是否存在时,使用EXISTS代替COUNT(*))
Developers often use the value from the COUNT(*) function when enforcing business rules in Transact-SQL code. However, try the EXIST clause instead if you're using the COUNT(*) value only to evaluate whether you have at least one row that meets certain conditions. For example, consider this code from the Northwind database:
IF (SELECT COUNT(*) FROM Orders
WHERE ShipVia = 3) > 0
PRINT 'You cannot delete this shipper'
The execution plan shows that SQL Server has to read all 255 rows in the Orders table before evaluating the IF expression. You can achieve the same result more efficiently with EXISTS because the IF condition evaluates to true as soon as SQL Server finds the first occurrence of 3 in the ShipVia column:
IF EXISTS (SELECT * FROM Orders
WHERE ShipVia = 3)
PRINT 'You cannot delete this shipper'
The difference in total execution time isn't much in a sample database such as Northwind, but use this efficient query against an Orders table with millions of rows and you'll see a major speed improvement.
2) Be Careful When Using WHERE IN and WHERE NOT IN(谨慎使用WHERE IN 和 WHERE NOT IN)
SQL Server doesn't always choose an optimal execution plan when you have a substantial list of values in the WHERE IN clause. Using WHERE IN and WHERE NOT IN clauses in T-SQL code can produce an execution plan involving one or more nested loops. This increases the number of comparisons SQL Server must perform exponentially. Use the WHERE IN clause only if you have a short list of values you need to evaluate:
USE Northwind
--This query takes 9 ms to execute
SELECT *
FROM Customers
WHERE CustomerID NOT IN
(SELECT CustomerID FROM Orders)
Replace the WHERE IN clause with OUTER JOIN if you're using a subquery to generate a potentially large list. Doing so can improve performance significantly:
USE Northwind
--This query takes 3 ms to execute
SELECT c.*
FROM Customers c
LEFT OUTER JOIN Orders o
ON o.CustomerID = c.CustomerID
WHERE o.CustomerID IS NULL
In this case, the second query uses LEFT OUTER JOIN, producing an execution plan that lets it run about three times faster than the first query.
The LEFT OUTER JOIN selects all rows from the Customer table—whether or not a customer placed any orders—and joins them with the Orders table. Then the WHERE clause filters out the rows where the columns from the Orders table have NULL values. Either way, you get a list of customers who placed no orders, but the second way gives SQL Server a lot less work to do. I rewrote a query recently using this technique, and the execution time went from 50 seconds to about 500 ms.
3) Randomize Resultset Orders With NewID() (使用NewID()来产生随机记录集)
You occasionally might need to randomize the order of the resultset retrieved from SQL Server. This is often the case in database searches where certain products or services would gain unfair advantage against others based simply on their name. I've seen a few clever (and not so clever) solutions for randomizing resultsets, but the solution is actually simple. You can use the NewID() function in Transact-SQL to generate a GUID for each row, then order the results by the generated GUID:
SELECT * FROM Products
ORDER BY NEWID()
SQL Server returns products in a different order every time you run the query. You also can use this technique to return a random row from a table:
SELECT TOP 1 * FROM Products
ORDER BY NEWID()
However, be careful when using this technique with large tables. You're only asking for one random row, but the execution plan shows that SQL Server gives you that random row only after reading each row in the table, generating a GUID for each row, then sorting all the rows. Consequently, SQL Server needs several seconds to give you a random row from a table with a few million rows. So don't use the "SELECT TOP 1…" technique on huge tables. Instead, restrict the random selection to a subset of the large table. Select a random range, then use "SELECT TOP 1…" within that range.
4) Increase Default Packet Size for Large Data Fields(对大的数据字段,提高缺省的数据包容量)
SQL Server client applications communicate with instances of SQL Server through Net-Libraries. Client Net-Libraries and Server Net-Libraries communicate over the network by exchanging network packets. The size of network packets depends on the data access API you're using. The default packet size is 4,096 bytes in ADO and 8,192 bytes in ADO.NET.
These sizes work well in most scenarios, but sometimes you can improve data exchange velocity greatly by increasing packet size—especially if you're sending or receiving large amounts of XML data, or if you're storing images in SQL Server. The client and server libraries must exchange and process fewer packets when you increase packet size. Maximum packet size is 32,767 bytes. You can increase packet size in ADO or ADO.NET simply by including the Packet Size property in the connection string:
"…;Packet Size=32767;…"
The speed gained in data upload and download depends on the size of your data fields and on your network topology. One of my applications stores XML data in SQL Server, with each XML field about 500K in size. My benchmarks show that the application exchanged XML data with SQL Server about twice as fast after increasing the packet size to 32,767 bytes.
5) Name Your SQL Server Applications(命名SQL Server应用程序)
You can often easily identify what process belongs to which application when you examine SQL Server activity in Enterprise Manager or with SQL Server Profiler, because the Application column contains a value instead of being blank. SQL Server receives the application name from the client data access API. You've probably noticed that .NET applications often show ".Net SqlClient Data Provider" as their name. This happens when you don't specify the application name when you create an ADO.NET connection. ".Net SqlClient Data Provider" becomes the useless value ADO.NET winds up sending to SQL Server.
You can easily populate the Application Name property with a unique name for your application. Set the application name in ADO or ADO.NET by including Application Name in the connection string:
"…;Application Name=MyApplication;…"
Populating the Application Name property simplifies identifying your application activity on the server; you can use this name as the value of the Application Name filter in SQL Server Profiler traces. This can help you with distributed applications where your app might be running on multiple servers and possibly using multiple databases. Use Application Name as the filter, then run a single trace and capture the workload specific to your app coming from multiple client computers.