Thursday, March 28, 2013

Data table as PARAMETER to Stored Procedure in SQL SERVER 2008

Here i will explain how to pass Data-table as parameter to a Stored Procedure in SQL Server 2008.

Create a table:

Create table ShoppingList
(ShoppingId INT IDENTITY(1,1) PRIMARY KEY,
Shoppingdate DATETIME DEFAULT GETDATE(),
ProductID INT,
Quantity INT,
ProductPrice MONEY)
 

Create the table data type:

CREATE TYPE dbo.TblShoppingList AS TABLE
(
Product INT,
Quantity INT,
ProductPrice MONEY
)
 

Create the Stored Procedure that has the table variable as the parameter:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE dbo.InsertShoppingList
(
    @ShoppingVAR dbo.TblShoppingList READONLY
)
AS
BEGIN
 SET NOCOUNT ON;
 INSERT INTO ShoppingList(ProductID, Quantity, ProductPrice)
 SELECT Product, Quantity, ProductPrice FROM @ShoppingVAR
 SET NOCOUNT OFF; 
END
GO

Here is the C# code for a console application:

static void Main(string[] args)
{
    DataSet dataset = new DataSet();
    DataTable dataTable = new DataTable();
    dataTable.Columns.Add("ProductId");
    dataTable.Columns.Add("Quantity");
    dataTable.Columns.Add("ProductPrice"); 

    var dr = dataTable.NewRow(); 
    dr["ProductId"] = 1; 
    dr["Quantity"] = 1;
    dr["ProductPrice"] = 1; 

    dataTable.Rows.Add(dr);
    using (SqlConnection conn = new SqlConnection(
      @"Data Source=.\SqlExpress;Initial Catalog=Shopping;Integrated Security=True;"))
    {
        conn.Open();       
        using (SqlCommand cmd = new SqlCommand("[dbo].[InsertShoppingList]"))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Connection = conn;
            cmd.Parameters.AddWithValue("@ShoppingVAR", dataTable);
            using (SqlDataAdapter adp = new SqlDataAdapter(cmd))
            {
                adp.Fill(dataset);
            }
        }
    }
}

Clustered Tables vs Heap Tables



Problem
One very important design aspect when creating a new table is the decision to create or not create a clustered index.  A table that does not have a clustered index is referred to as a HEAP and a table that has a clustered index is referred to as a clustered table.  A clustered table provides a few benefits over a heap such as physically storing the data based on the clustered index, the ability to use the index to find the rows quickly and the ability to reorganize the data by rebuilding the clustered index.  Depending on the INSERT, UPDATE and DELETE activity against your tables your physical data can become very fragmented.  This fragmentation can lead to wasted space in your database, because of partly full pages as well as the need to read several more pages in order to satisfy the query.  So what can be done?
Solution
The primary issue that we want to address is the fragmentation that occurs with normal database activity.  Depending on whether your table has a clustered index or not will determine if you can easily address the fragmentation problem down to the physical data level.  Because a heap or a clustered index determines the physical storage of your table data, there can only be one of these per table.  So a table can either have one heap or one clustered index.
Let's take a look at the differences between a heap and clustered table.
HEAP
  • Data is not stored in any particular order
  • Specific data can not be retrieved quickly, unless there are also non-clustered indexes
  • Data pages are not linked, so sequential access needs to refer back to the index allocation map (IAM) pages
  • Since there is no clustered index, additional time is not needed to maintain the index
  • Since there is no clustered index, there is not the need for additional space to store the clustered index tree
  • These tables have a index_id value of 0 in the sys.indexes catalog view

Clustered Table
  • Data is stored in order based on the clustered index key
  • Data can be retrieved quickly based on the clustered index key, if the query uses the indexed columns
  • Data pages are linked for faster sequential access
  • Additional time is needed to maintain clustered index based on INSERTS, UPDATES and DELETES
  • Additional space is needed to store clustered index tree
  • These tables have a index_id value of 1 in the sys.indexes catalog view
source: SQL Server 2005 books online
So based on the above you can see there are a few fundamental differences on whether a table has a clustered index or not.
Fragmentation
A problem that occurs on all tables is the issue of becoming fragmented.  Depending on the activity performed such as DELETES, INSERTS and UPDATES, your heap tables and clustered tables can become fragmented.  A lot of this depends on the activity as well as the key values that are used for your clustered index. 
  • If your heap table only has INSERTS occurring, your table will not become fragmented, since only new data is written.
  • If your clustered index key is sequential, such as an identity value, and you only have INSERTS, again this will not become fragmented since the new data is always written at the end of the clustered index.
  • But if your table is either a heap or a clustered table and there are a lot of INSERTS, UPDATES and DELETES the data pages can become very fragmented.  This results in wasted space as well as additional data pages to read to satisfy the queries. 
    • When a table is created as a heap, SQL Server does not force where the new data pages are written.  Whenever new data is written this data is always written at the end of the table or on the next available page that is assigned to this table.  When data is deleted the space becomes free in the data pages, but it is not reused because new data is always written to the next available page.
    • With a clustered index, depending on the index key, new records may be written to existing pages where free space exists or there may be need to split a page into multiple pages in order to insert the new data.  When deletes occur the same issue occurs as with a heap, but this free space may be used again if data needs to be inserted into one of the existing pages that has free space.
    • So based on this, your heap table could become more fragmented then your clustered table.
Identifying Fragmentation
To identify whether your clustered table or heap table is fragmented you need to either run DBCC SHOWCONTIG (2000 or 2005) or use the new DMV sys.dm_db_index_physical_stats (2005).  These commands will give you insight into the fragmentation problems that may exist in your table.  For further information on this take a look at this past tip: SQL Server 2000 to 2005 Crosswalk - Database Fragmentation.
Resolving Fragmentation
Clustered Tables
Resolving the fragmentation for a clustered table can be done easily by rebuilding or reorganizing your clustered index.  

Heap Tables
For heap tables this is not as easy.  The following are different options you can take to resolve the fragmentation:
  1. Create a clustered index
  2. Create a new table and insert data from the heap table into the new table based on some sort order
  3. Export the data, truncate the table and import the data back into the table
Additional Info
When creating a new table via Enterprise Manager or Management Studio when you specify a primary key for the table, the management tools automatically make this a clustered index, but this can be overridden.  When creating a new table via scripts you need to identify that the table be created with a clustered index.   So based on this most of your tables are going to have a clustered index, because of the primary key, but if you do not specify a primary key or build a clustered index the data will be stored as a heap.

Best Practices while creating Stored Procedures



  1. Use proper indentation for the statements in SQL Server. It will improve the readability.
  2. Write the proper comments between the logics. So the others can understand quickly.
  3. Write all the SQL Server keywords in the CAPS letter. For example SELECT, FROM and CREATE.
  4. Write the stored procedure name with full qualified names.

    CREATE PROCEDURE [dbo].EmployeeSalaryCalculation
     
  5. Always try to declare the DECLARATION and initialization at the beginning of the stored procedure.
  6. It is not recommended to use more variables in the procedure. It will occupy more space in the memory. 
  7. Do not write the stored procedure name beginning with sp_. It is reserved for the system stored procedures in SQL Server and when the request comes to the SQL Server engine, it will be considerd to be a system stored procedure and looks for it in the master database. After it understands that this is a user defined stored procedure, it requires a bit more response time. So name the procedure name with another prefix such a proc_.
  8. Set the SET NOCOUNT ON option in the beginning of the stored procedure to avoid the unnecessary message like number of rows affected by the SQL Server.
  9. Try to avoid the temp table in the stored procedure. Stored procedures usually use a cached execution plan to increase the performance. When you use the temp table it will do the compilation every time.
  10. Do not use the select all columns (SELECT *) option; use only specific columns to query the result.
  11. Try to avoid the cursor in the stored procedure. It will consume more memories. It will degrade the performance of the stored procedure. Try to use the table variable and WHILE loop statement to iterate the query result set.
  12. Set the default value to the parameter and always set the size of the variable to be equivalent to or more than the table field column length. For example Name (10) in the table, but if you give Name(25) in the procedure then you will get the run time error time "string truncated  error".
  13. Use the Try catch statement properly in the stored procedure to handle the errors in the runtime.
  14. Move the complex query into views.
  15. If you want to return the single column result then prefer to use the output statement to return the result to the calling program rather than table result.
  16. Avoid the sub-queries and use the INNER JOIN. Try to avoid the filtering condition in the where clause and it can be written in the joining time itself. When joins the table itself it will be filtered and it will filter again from the joined result table.
  17. Use the SELECT TOP 1 in the exists condition checking.
  18. Do not do like this:

    SELECT @name=name FROM employees WHERE name like '%rob%'

    This will give the run time error when returns more than one result.

    SELECT TOP 1 @name=name FROM employees WHERE name like '%rob%'

    It is always recommended to use the TOP 1 in that case. The result may differ from what is  expected.
  19. Avoid the nested IF statements and use the CASE statement. It will execute the matching part immediately.
  20. Dynamic Queries - Try to minimize the usage of dynamic queries. If you are using a dynamic query like:

    SELECT * FROM mydb.dbo.emp where empid = @eid then there is no problem.

    You can supply a value for the @eid parameter and there is no recompilation of the execution plan in the database cache. But if you are using a SQL query like SELECT * FROM emp where empid = " + @eid and supply a parameter.
     
  21. Use the ORDER BY and DISTINCT, TOP only when requires. The SQL Server engine will get the result first and it will do again the query execution for these operations.
  22. It is recommended to use a Table variable when the result set is small. It is always in the memory and when the limit exceeds it will be created as a table in the temp. But the temp table will be created on the temp database and that makes it slower.
  23. Use the proper indexing to the columns in the table. Do not create an index on the columns that are not used anywhere in the where clause. It will require an extra roundtrip to query the result.

Find all tables not referenced in stored procedures

SELECT SCHEMA_NAME(t.schema_id) [Owner],
       t.name [TableName]
FROM   sys.tables t
WHERE  is_ms_shipped = 0
AND not  EXISTS (SELECT *
                       FROM   sys.sql_expression_dependencies d
                       WHERE  d.referenced_entity_name = t.name
                              AND (( is_ambiguous = 1 or is_caller_dependent=1)
                                     OR
                          d.referenced_id = t.object_id)  )