Wednesday, May 5, 2010

Optimizing Designing Procedure



1. Use stored procedures instead of heavy-duty queries.

This can reduce network traffic as your client will send to the server only the stored procedure name (perhaps with some parameters) instead of all the text from a large heavy-duty query. Stored procedures can be used to enhance security and conceal underlying data objects as well. For example, you can give the users permission to execute the stored procedure to work with restricted sets of columns and data.


2. Include the SET NOCOUNT ON statement in your stored procedures to stop the message indicating the number of rows affected by a Transact-SQL statement.

This can reduce network traffic due to the fact that your client will not receive the message indicating the number of rows affected by a Transact-SQL statement.


3. Call stored procedures using their fully qualified name.

The complete name of an object consists of four identifiers: the server name, database name, owner name, and object name. An object name that specifies all four parts is known as a fully qualified name.

Using fully qualified names eliminates any confusion about which stored procedure you want to run and can boost performance because SQL Server has a better chance to reuse the stored procedures execution plans if they were executed using fully qualified names.


4. Consider returning the integer value as a RETURN statement instead of returning an integer value as part of a recordset.

The RETURN statement exits unconditionally from a stored procedure, so the statements following RETURN are not executed. Though the RETURN statement is generally used for error checking, you can use this statement to return an integer value for any other reason. Using the RETURN statement can boost performance because SQL Server will not create a recordset.


5. Don't use the prefix "sp_" in the stored procedure name if you need to create a stored procedure to run in a database other than the master database.

The prefix "sp_" is used in the system stored procedures names. Microsoft does not recommend using the prefix "sp_" in user-created stored procedure names as SQL Server always looks for a stored procedure beginning with "sp_" in the following order: the master database, the stored procedure based on the fully qualified name provided, followed by the stored procedure using dbo as the owner (if one is not specified).

When you have the stored procedure with the prefix "sp_" in a database other than master, the master database is always checked first. If the user-created stored procedure has the same name as a system stored procedure, the user-created stored procedure will never be executed.


6. Use the sp_executesql stored procedure instead of the EXECUTE statement.

The sp_executesql stored procedure supports parameters. So, using the sp_executesql stored procedure instead of the EXECUTE statement improves readability of your code when many parameters are used.

When you use the sp_executesql stored procedure to execute a Transact-SQL statement that will be reused many times, the SQL Server query optimizer will reuse the execution plan it generates for the first execution when the change in parameter values to the statement is the only variation.


7. Use the sp_executesql stored procedure instead of temporary stored procedures.

Microsoft recommends using temporary stored procedures when connecting to earlier versions of SQL Server that do not support the reuse of execution plans. Applications connecting to SQL Server 7.0 or SQL Server 2000 should use the sp_executesql system stored procedure instead of temporary stored procedures in order to have a better chance of reusing the execution plans.


8. If you have a very large stored procedure, try to break down the stored procedure into several sub-procedures, and call them from a controlling stored procedure.

The stored procedure will be recompiled when any structural changes are made to a table or view referenced by the stored procedure (an ALTER TABLE statement, for example), or when a large number of INSERTS, UPDATES or DELETES are made to a table referenced by a stored procedure. So, if you break down a very large stored procedure into several sub-procedures, there's a chance that only a single sub-procedure will be recompiled, while other sub-procedures will not.


9. Try to avoid using temporary tables inside your stored procedures.

Using temporary tables inside stored procedures reduce the chance to reuse the execution plan.


10. Try to avoid using DDL (Data Definition Language) statements inside your stored procedure.

Using DDL statements inside stored procedures also reduce the chance to reuse the execution plan.


11. Add the WITH RECOMPILE option to the CREATE PROCEDURE statement if you know that your query will vary each time it is run from the stored procedure.

The WITH RECOMPILE option prevents reusing the stored procedure execution plan, so SQL Server does not cache a plan for this procedure and the procedure is always recompiled at run time. Using the WITH RECOMPILE option can boost performance if your query will vary each time it is run from the stored procedure, because in this case the wrong execution plan will not be used.


12. Use SQL Server Profiler to determine which stored procedures have been recompiled too often.

To check if a stored procedure has been recompiled, run SQL Server Profiler and choose to trace the event in the "Stored Procedures" category called "SP:Recompile". You can also trace the event "SP:StmtStarting" to see at what point in the procedure it is being recompiled. When you identify these stored procedures, you can take some correction actions to reduce or eliminate the excessive recompilations.


Data Type Performance Tuning Tips for Microsoft SQL Server

 

Always specify the narrowest columns you can. The narrower the column, the less amount of data SQL Server has to store, and the faster SQL Server is able to read and write data. In addition, if any sorts need to be performed on the column, the narrower the column, the faster the sort will be. [6.5, 7.0, 2000, 2005] 

*****

If you need to store large strings of data, and they are less than 8,000 characters, use a VARCHAR data type instead of a TEXT data type. TEXT data types have extra overhead that drag down performance. [7.0, 2000, 2005] 

*****

If you have a database running on SQL Server 7.0, 2000, or 2005 that used to run under version 6.5, and because of the limited row size had to split a column into two or more columns because the column width exceed what SQL Server version 6.5 was able to support, consider altering the table so that the multiple columns now fit back into one column again. This of course assumes that your column width is 8,000 characters or less for ASCII data. This will reduce server overhead and boost performance. [6.5, 7.0, 2000, 2005] 

*****

Don't use the NVARCHAR or NCHAR data types unless you need to store 16-bit character (Unicode) data. They take up twice as much space as VARCHAR or CHAR data types, increasing server I/O and wasting unnecessary space in your buffer cache. [7.0, 2000, 2005] 

*****

If the text data in a column varies greatly in length, use a VARCHAR data type instead of a CHAR data type. The amount of space saved by using VARCHAR over CHAR on variable length columns can greatly reduce I/O reads cache memory used to hold data, improving overall SQL Server performance.

Another advantage of using VARCHAR over CHAR columns is that sorts performed on VARCHAR columns are generally faster than on CHAR columns. This is because the entire width of a CHAR column needs to be sorted. [6.5, 7.0, 2000, 2005]

*****

If a column's data does not vary widely in length, consider using a fixed-length CHAR field instead of a VARCHAR. While it may take up a little more space to store the data, processing fixed-length columns is faster in SQL Server than processing variable-length columns. [6.5, 7.0, 2000, 2005] 

*****

Always choose the smallest data type you need to hold the data you need to store in a column. For example, if all you are going to be storing in a column are the numbers 1 through 10, then the TINYINT data type is more appropriate that the INT data type. The same goes for CHAR and VARCHAR data types. Don't specify more characters in character columns that you need. This allows you to store more rows in your data and index pages, reducing the amount of I/O needed to read them. It also reduces the amount of data moved from the server to the client, reducing network traffic and latency. And last of all, it reduces the amount of wasted space in your buffer cache.  [6.5, 7.0, 2000, 2005] Updated 8-21-2005

*****

If you have a column that is designed to hold only numbers, use a numeric data type, such as INTEGER, instead of a VARCHAR or CHAR data type. Numeric data types generally require less space to hold the same numeric value as does a character data type. This helps to reduce the size of the columns, and can boost performance when the columns is searched (WHERE clause), joined to another column, or sorted. [6.5, 7.0, 2000, 2005] Updated 10-16-2005

*****

Don't use FLOAT or REAL data types for primary keys, as they add unnecessary overhead that hurts performance. Use one of the integer data types instead. [6.5, 7.0, 2000, 2005] 

*****

When specifying data types during table creation, always specify NULL or NOT NULL for each column. If you don't, then the column will default to NOT NULL if the ANSI NULL DEFAULT database option is not selected (the default), and will default to NULL of the ANSI NULL DEFAULT database option is selected.

For best performance, and to reduce potential code bugs, columns should ideally be set to NOT NULL. For example, use of the IS NULL keywords in the WHERE clause makes that portion of the query non-sargable, which means that portion of the query cannot use an index. [6.5, 7.0, 2000, 2005] 

*****

If you are using fixed length columns (CHAR, NCHAR) in your table, do your best to avoid storing NULLs in them. If you do, the entire amount of space dedicated to the column will be used up. For example, if you have a fixed length column of 255 characters, and if you place a NULL in it, then 255 characters have to be stored in the database. This is a large waste of space that will cause SQL Server to have to perform extra disk I/O to read data pages. It also wastes space in the data cache buffer. Both of these contribute to reduced SQL Server performance.

Instead of using NULLs, use a coding scheme similar to this in your databases:

  • NA: Not applicable
  • NYN: Not yet known
  • TUN: Truly unknown

Such a scheme provides the benefits of using NULLs, but without the drawbacks.

If you really must use NULLs, use a variable length column instead of a fixed length column. Variable length columns only use a very small amount of space to store a NULL. [7.0, 2000, 2005]


Optimizing Designing Tables


  1. Normalize your tables to the third normal form.
    A table is in third normal form (3NF) if it is in second normal form (2NF) and if it does not contain transitive dependencies. In most cases, you should normalize your tables to the third normal form. The normalization is used to reduce the total amount of redundant data in the database. The less data there is, the less work SQL Server has to perform, speeding its performance.

  2. Consider the denormalization of your tables from the forth or fifth normal forms to the third normal form.
    Normalization to the forth and fifth normal forms can result in some performance degradation, especially when you need to perform many joins against several tables. It may be necessary to denormalize your tables to prevent performance degradation.

  3. Consider horizontal partitioning of the very large tables into the current and the archives versions.
    The less space used, the smaller the table, the less work SQL Server has to perform to evaluate your queries. For example, if you need to query only data for the current year in your daily work, and you need all the data only once per month for the monthly report, you can create two tables: one with the current year's data and one with the old data.

  4. Create the table's columns as narrow as possible.
    This can reduce the table's size and improve performance of your queries as well as some maintenance tasks (such as backup, restore and so on).

  5. Try to reduce the number of columns in a table.
    The fewer the number of columns in a table, the less space the table will use, since more rows will fit on a single data page, and less I/O overhead will be required to access the table's data.

  6. Try to use constraints instead of triggers, rules, and defaults whenever possible.
    Constraints are much more efficient than triggers and can boost performance. Constraints are more consistent and reliable in comparison to triggers, rules and defaults, because you can make errors when you write your own code to perform the same actions as the constraints.

  7. If you need to store integer data from 0 through 255, use tinyint data type.
    The columns with tinyint data type use only one byte to store their values, in comparison with two bytes, four bytes and eight bytes used to store the columns with smallint, int and bigint data types accordingly. For example, if you design tables for a small company with 5-7 departments, you can create the departments table with the DepartmentID tinyint column to store the unique number of each department.

  8. If you need to store integer data from -32,768 through 32,767, use smallint data type.
    The columns with smallint data type use only two bytes to store their values, in comparison with four bytes and eight bytes used to store the columns with int and bigint data types respectively. For example, if you design tables for a company with several hundred employees, you can create an employee table with the EmployeeID smallint column to store the unique number of each employee.

  9. If you need to store integer data from -2,147,483,648 through 2,147,483,647, Use int data type.
    The columns with int data type use only four bytes to store their values, in comparison with eight bytes used to store the columns with bigint data types. For example, to design tables for a library with more than 32,767 books, create a books table with a BookID int column to store the unique number of each book.

  10. Use smallmoney data type instead of money data type, if you need to store monetary data values from 214,748.3648 through 214,748.3647.
    The columns with smallmoney data type use only four bytes to store their values, in comparison with eight bytes used to store the columns with money data types. For example, if you need to store the monthly employee payments, it might be possible to use a column with the smallmoney data type instead of money data type.

  11. Use smalldatetime data type instead of datetime data type, if you need to store the date and time data from January 1, 1900 through June 6, 2079, with accuracy to the minute.
    The columns with smalldatetime data type use only four bytes to store their values, in comparison with eight bytes used to store the columns with datetime data types. For example, if you need to store the employee's hire date, you can use column with the smalldatetime data type instead of datetime data type.

  12. Use varchar/nvarchar columns instead of text/ntext columns whenever possible.
    Because SQL Server stores text/ntext columns on the Text/Image pages separately from the other data, stored on the Data pages, it can take more time to get the text/ntext values.

  13. Use char/varchar columns instead of nchar/nvarchar if you do not need to store unicode data.
    The char/varchar value uses only one byte to store one character, the nchar/nvarchar value uses two bytes to store one character, so the char/varchar columns use two times less space to store data in comparison with nchar/nvarchar columns.

  14. Consider setting the 'text in row' SQL Server 2000 table's option.
    The text, ntext, and image values are stored on the Text/Image pages, by default. This option specifies that small text, ntext, and image values will be placed on the Data pages with other data values in a data row. This can increase the speed of read and write operations and reduce the amount of space used to store small text, ntext, and image data values. You can set the 'text in row' table option by using the sp_tableoption stored procedure.

  15. If you work with SQL Server 2000, use cascading referential integrity constraints instead of triggers whenever possible.
    For example, if you need to make cascading deletes or updates, specify the ON DELETE or ON UPDATE clause in the REFERENCES clause of the CREATE TABLE or ALTER TABLE statements. The cascading referential integrity constraints are much more efficient than triggers and can boost performance.


Transact-SQL Optimization Tips


1. Try to restrict the queries result set by using the WHERE clause.

This can result in a performance benefit, as SQL Server will return to the client only particular rows, not all rows from the table(s). This can reduce network traffic and boost the overall performance of the query.


2. Try to restrict the queries result set by returning only the particular columns from the table, not all the table's columns.

This can result in a performance benefit as well, because SQL Server will return to the client only particular columns, not all the table's columns. This can reduce network traffic and boost the overall performance of the query.


3. Use views and stored procedures instead of heavy-duty queries.

This can reduce network traffic as your client will send to the server only stored procedures or view name (perhaps with some parameters) instead of large heavy-duty queries text. This can be used to facilitate permission management also, because you can restrict user access to table columns they should not see.


4. Whenever possible, try to avoid using SQL Server cursors.

SQL Server cursors can result in some performance degradation in comparison with select statements. Try to use correlated subquery or derived tables, if you need to perform row-by-row operations.


5. If you need to return the total table's row count, you can use an alternative way instead of the SELECT COUNT(*) statement.

Because the SELECT COUNT(*) statement makes a full table scan to return the total table's row count, it can take an extremely long time for large tables. There is another way to determine the total row count in a table. In this case, you can use the sysindexes system table. There is a ROWS column in the sysindexes table. This column contains the total row count for each table in your database. So, you can use the following select statement instead of SELECT COUNT(*):

SELECT rows FROM sysindexes WHERE id = OBJECT_ID('table_name') AND indid < 2

This way, you can improve the speed of such queries by several times. See this article for more details: Alternative way to get the table's row count.


6. Try to use constraints instead of triggers, whenever possible.

Constraints are much more efficient than triggers and can boost performance. So, whenever possible, you should use constraints instead of triggers.


7. Use table variables instead of temporary tables.

Table variables require fewer locking and logging resources than temporary tables, so table variables should be used whenever possible. The table variables are available in SQL Server 2000 only.


8. Try to avoid the HAVING clause, whenever possible.

The HAVING clause is used to restrict the result set returned by the GROUP BY clause. When you use GROUP BY with the HAVING clause, the GROUP BY clause divides the rows into sets of grouped rows and aggregates their values, and then the HAVING clause eliminates undesired aggregated groups. In many cases, you can write your select statement so that they will contain only WHERE and GROUP BY clauses without the HAVING clause. This can improve the performance of your query.


9. Whenever possible, try to avoid using the DISTINCT clause.

Because using the DISTINCT clause will result in some performance degradation, you should use this clause only when it is absolutely necessary.


10. Include SET NOCOUNT ON statement into your stored procedures to stop the message indicating the number of rows affected by a T-SQL statement.

This can reduce network traffic, as your client will not receive the message indicating the number of rows affected by a T-SQL statement.


11. Use select statements with the TOP keyword or the SET ROWCOUNT statement if you need to return only the first n rows.

This can improve performance of your queries, as a smaller result set will be returned. This can also reduce the traffic between the server and the clients.


12. Use the FAST number_rows table hint if you need to quickly return 'number_rows' rows.

You can quickly get the n rows and can work with them when the query continues execution and produces its full result set.


13. Try to use UNION ALL statement instead of UNION, whenever possible.

The UNION ALL statement is much faster than UNION, because UNION ALL statement does not look for duplicate rows, while the UNION statement does look for duplicate rows, whether they exist or not.


14. Do not use optimizer hints in your queries.

Because the SQL Server query optimizer is very clever, it is highly unlikely that you can optimize your query by using optimizer hints; more often than not, this will hurt performance.


Optimization tips

  • Every index increases the time in takes to perform INSERTS, UPDATES and DELETES, so the number of indexes should not be very much. Try to use maximum 4-5 indexes on one table, not more. If you have read-only table, then the number of indexes may be increased.

  • Keep your indexes as narrow as possible. This reduces the size of the index and reduces the number of reads required to read the index.

  • Try to create indexes on columns that have integer values rather than character values.

  • If you create a composite (multi-column) index, the order of the columns in the key are very important. Try to order the columns in the key as to enhance selectivity, with the most selective columns to the leftmost of the key.

  • If you want to join several tables, try to create surrogate integer keys for this purpose and create indexes on their columns.

  • Create surrogate integer primary key (identity for example) if your table will not have many insert operations.

  • Clustered indexes are more preferable than nonclustered, if you need to select by a range of values or you need to sort results set with GROUP BY or ORDER BY.

  • If your application will be performing the same query over and over on the same table, consider creating a covering index on the table.

  • You can use the SQL Server Profiler Create Trace Wizard with "Identify Scans of Large Tables" trace to determine which tables in your database may need indexes. This trace will show which tables are being scanned by queries instead of using an index.

  • You can use sp_MSforeachtable undocumented stored procedure to rebuild all indexes in your database. Try to schedule it to execute during CPU idle time and slow production periods.

    sp_MSforeachtable @command1="print '?' DBCC DBREINDEX ('?')"



 



No comments:

Post a Comment