
Thursday, January 28, 2010

What the Exec?

I’ve been working with SQL Server for 10 years and have worked in production with versions 6.5 – 2005 and used 2008 personally.  I know I haven’t seen everything, but a couple of weeks ago a friend of mine asked me to look at some stored procedures he was working on and I saw something I’d never seen before.  The code was something like this (I’m using the AdventureWorks database, 2005 version):
CREATE PROCEDURE dbo.exec_example
(
    @parameter VARCHAR(10) = 'employee',
    @ID INT = 10
)
AS

SET NOCOUNT ON;

DECLARE @sp_name NVARCHAR(100);

IF @parameter = 'employee'
BEGIN
    SET @sp_name = 'dbo.uspGetEmployeeManagers';
END
ELSE
BEGIN
    SET @sp_name = 'dbo.uspGetManagerEmployees';
END;

EXEC @sp_name @ID;

RETURN;
The interesting part is where the stored procedure name is put in a variable and then you just use EXEC @sp_name with the parameters listed after it.  I’d never seen this done before, and my first comment on the code was, “I don’t think that should work”, but it does.  I also looked it up in BOL, and here are a couple of snippets from the EXECUTE (Transact-SQL) entry:

From the Arguments section:



@module_name_var

Is the name of a locally defined variable that represents a module name.

From the Examples section:
E. Using EXECUTE with a stored procedure variable
The following example creates a variable that represents a stored procedure name.

DECLARE @proc_name varchar(30);
SET @proc_name = 'sys.sp_who';
EXEC @proc_name;
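
Translated to the procedure above, the pattern looks like this (a sketch only; note that the variable must contain just a module name, not an arbitrary SQL string, and that parameters are listed after the variable exactly as with a literal procedure name):

```sql
-- Sketch: the variable must resolve to a module name; you cannot put
-- an arbitrary SQL string in it (that is what EXEC('...') is for).
DECLARE @sp_name NVARCHAR(100);
SET @sp_name = N'dbo.uspGetEmployeeManagers';

-- Parameters follow the variable, positionally or by name, as usual
EXEC @sp_name 1;
```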
I had been contacted to help clean up and improve the code my friend was writing, and this was a time when I learned something.  This is one of the things I love about SQL Server, and technology in general: you can learn from anyone, even when you’ve been called in as the expert.

Tuesday, December 8, 2009

T-SQL Tuesday #001 – Dates and Times

T-SQL Tuesday was started by Adam Machanic (@AdamMachanic)  on his blog to encourage SQL Bloggers to share their tips and tricks about a specific topic once a month.  A great idea and a great way to get a topic to blog about!
I’ve been working with SQL Server since 1999 and it wasn’t until the last couple of years that I finally learned a better way to retrieve a date range.  I used to do:
SELECT
columns
FROM
TABLE
WHERE
date_column BETWEEN start_date AND end_date;

I’m guessing that there are people reading this who are saying, “I do that all the time and it works fine for me, so what’s the problem?”.  I understand where they are coming from, but I finally really understood that, until the specific DATE data type (date only) in SQL Server 2008, datetime and smalldatetime columns ALWAYS have a time part.  This can and does affect date range queries.  Here’s an example, albeit slightly contrived, but I’ve seen it happen in the real world:
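
You can see the time part at work with a quick sketch:

```sql
-- A date-only string cast to DATETIME still carries a time part: midnight
SELECT CAST('2009-11-30' AS DATETIME) AS end_of_month;
-- 2009-11-30 00:00:00.000

-- So a sale stamped later that day, e.g. 2009-11-30 11:09:30,
-- falls outside BETWEEN @start_date AND '2009-11-30'
```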

DECLARE @sales TABLE (sale_id INT IDENTITY(1,1) PRIMARY KEY, sale_date DATETIME, sale_amt FLOAT);

WITH cteNums AS 
(
SELECT TOP 50
ROW_NUMBER() OVER (ORDER BY AC.NAME) AS N
FROM
sys.all_columns AS AC 
)
INSERT INTO @sales (
sale_date,
sale_amt
)
SELECT
DATEADD(DAY, -N, GETDATE()) AS sale_date,
N * ABS(CHECKSUM(NEWID()))/10000.00 AS sale_amt 
FROM
cteNums;

/* 
Setup for contrived example.
Make sure the first sale of the month is at midnight 
*/     
UPDATE @sales
SET sale_date = DATEADD(DAY, DATEDIFF(DAY, 0, sale_date), 0)
WHERE
DATEPART(DAY, sale_date) = 1

/*
Task is to get sales for the previous full month
*/
/*
Set up variables for the first and last day of the month
*/
DECLARE @start_date DATETIME,
@end_date DATETIME

/*
Set the variables to the first and last day of last month, at the time
of writing '2009-11-01' and '2009-11-30'.  See this blog post by Lynn
Pettis for why I am using DATEADD and DATEDIFF with 0.
*/        
SELECT
@start_date = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE())-1, 0),
@end_date = DATEADD(DAY, -1, DATEADD(MONTH, 1, @start_date))

SELECT @start_date, @end_date, DATEADD(DAY, 1, @end_date)

/*
My old method
*/
SELECT
COUNT(*) AS sales,
MIN(sale_date) AS first_sale,
MAX(sale_date) AS last_sale,
SUM(sale_amt) AS total_sales 
FROM
@sales
WHERE
sale_date BETWEEN @start_date AND @end_date 

/*
My new method - accurate
*/    
SELECT
COUNT(*) AS sales,
MIN(sale_date) AS first_sale,
MAX(sale_date) AS last_sale,
SUM(sale_amt) AS total_sales 
FROM
@sales
WHERE
sale_date >= @start_date AND 
/*First of the next month */
sale_date < DATEADD(DAY, 1, @end_date)

Now I'm sure someone out there will say, “Hey, in your last example you are using the first day of the next month, and you could do that with BETWEEN.”  Well, yes, I could, and here is the query:

SELECT
COUNT(*) AS sales,
MIN(sale_date) AS first_sale,
MAX(sale_date) AS last_sale,
SUM(sale_amt) AS total_sales 
FROM
@sales
WHERE
sale_date BETWEEN @start_date AND DATEADD(DAY, 1, @end_date)

But this does not return the correct results either.  Here are the results for each query:


Method        Count  First Sale               Last Sale                Total
1st Between   29     2009-11-01 00:00:00.000  2009-11-29 11:09:30.107  86362080.49
>= and <      30     2009-11-01 00:00:00.000  2009-11-30 11:09:30.107  86362080.49
2nd Between   31     2009-11-01 00:00:00.000  2009-12-01 00:00:00.000  86362080.49

All that to show that I believe you are better off using >= and < instead of BETWEEN when comparing dates.

Thursday, October 8, 2009

Maintaining Security and Performance using Stored Procedures Part I – Using EXECUTE AS

Anyone who knows me, or has worked with me, knows that I am a proponent of using stored procedures for all database access.  I believe that using stored procedures makes your database more secure and makes it simpler to maintain a well performing system.  One area where stored procedures are more difficult to work with than building queries in the GUI or business layer is dynamic search queries.  Here are some traditional issues with dynamic searches in stored procedures:

  1. If you use traditional IF, ELSE statements to build the procedure you get a long, hard-to-read procedure, and you are less likely to get plan re-use.
  2. If you try tricks like WHERE LastName = IsNull(@LastName, LastName) and FirstName = IsNull(@FirstName, FirstName) you can get plan re-use, but the plan used may not be, and many times is not, the best plan to use.
  3. If you use dynamic SQL using the EXEC (@sql) syntax you do not get plan re-use, you open yourself up for SQL Injection, and, prior to SQL Server 2005, you had to grant access to the objects used in the query defeating part of the reason for using stored procedures in the first place.
  4. If you use dynamic SQL using sp_executesql and parameters you are more likely to get plan re-use and you are safe from SQL injection, but, pre-2005, you still needed to grant access to the queried objects.
  5. Either dynamic SQL option means creating a large string of SQL and concatenating it, so it can be and, in my opinion is, a pain to read and a pain to make sure you have all your syntax right.

See Erland Sommarskog's excellent articles, The curse and blessings of dynamic SQL and Dynamic Search Conditions, for more details.

Early in my career, when working with SQL Server 7 and 2000, I tended to use option 1, sometimes with temporary tables, then I moved to option 2.  I never used dynamic SQL because I did not want to grant select access to the tables being queried.  I sacrificed performance for security and counted on ownership chaining to handle access to the underlying tables.  With the advent of SQL Server 2005 and the EXECUTE AS clause, I have moved to option 4, dynamic SQL using sp_executesql and parameters, as I believe it gives me the best of both worlds.

How’s it work

Essentially you create the stored procedure and add WITH EXECUTE AS Caller/Owner/Self/’user name’/’login name’ (see the Books Online entry for EXECUTE AS for more details), and this changes the context in which the code within the procedure runs.  So you can create a user (SelectAll) in the database that has select rights on all the tables, and then no matter who calls the stored procedure, the procedure will run correctly.  If you choose to use EXECUTE AS OWNER then the procedure executes in the security context of the owner of the procedure, so you can simulate ownership chaining. 
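
In its smallest form the pattern looks something like this (a sketch with illustrative names; the SelectAll user and SearchContacts procedure are not part of AdventureWorks):

```sql
-- Sketch only: a loginless user with SELECT rights on one schema,
-- and a procedure that always runs in that user's security context.
CREATE USER SelectAll WITHOUT LOGIN;
GRANT SELECT ON SCHEMA::Person TO SelectAll;
GO

CREATE PROCEDURE dbo.SearchContacts
WITH EXECUTE AS 'SelectAll'   -- every caller runs as SelectAll inside
AS
BEGIN
    SET NOCOUNT ON;
    SELECT LastName, FirstName FROM Person.Contact;
END;
```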

Example

Security

A post like this wouldn’t be complete without at least a simple example.  I will be using the AdventureWorks database (get it at CodePlex, I’m still using the 2005 version) with some added data (I used RedGate SQLDataGenerator).  All the code to run the examples is available for download here.

First you need to create a user with limited permissions:

USE AdventureWorks;
GO

CREATE USER DynamicSQLTest WITHOUT LOGIN;


Notice that I used the Without Login syntax so I did not need to create a login as well.  This is because I will also be using EXECUTE AS before running the stored procedure to change my execution context to this limited rights user instead of creating a new connection with a limited rights login.  Next you need to create the stored procedure.  I’m going to start with a “normal” stored procedure using Option 2 from above, because I also want to demonstrate the difference in performance.  Here’s the procedure:



Use AdventureWorks;
GO

IF OBJECT_ID('dbo.FindPhoneByName', N'P') IS NOT NULL
BEGIN
    DROP PROCEDURE dbo.FindPhoneByName;
END;

GO

CREATE PROCEDURE dbo.FindPhoneByName
(
@LastName nvarchar(50) = null,
@FirstName nvarchar(50) = null
)
AS

BEGIN
SET NOCOUNT ON;

Select
Title,
FirstName,
MiddleName,
LastName,
Suffix,
Phone
From
Person.Contact
Where
LastName Like IsNull(@LastName, LastName) + N'%' And
FirstName Like IsNull(@FirstName, FirstName) + N'%';

Return;
END
GO


This procedure is pretty self-explanatory.  Now we need to give the limited rights user, DynamicSQLTest, execute rights on our procedure:



USE AdventureWorks;
GO

GRANT EXEC ON dbo.FindPhoneByName TO DynamicSQLTest;



To test the security and performance of the stored procedure we are going to execute it 3 times with a dbo user and then repeat as the limited rights user, DynamicSQLTest.  Here is what I used:




EXEC dbo.FindPhoneByName @FirstName = 'J', @LastName = 'A';
GO

EXEC dbo.FindPhoneByName @FirstName = 'J';
GO

EXEC dbo.FindPhoneByName @LastName = 'A';
GO




Then execute the same 3 calls, but run this first to change the security context:



EXECUTE AS USER = 'DynamicSQLTest';
GO



If you are running the code in the same SSMS session be sure to issue the REVERT command to return to your original security context.



The stored procedure calls should run successfully for both users and should produce the same results and performance for both users.  Now we’ll ALTER the procedure to use dynamic SQL:



Alter PROCEDURE dbo.FindPhoneByName
(
@LastName nvarchar(50) = null,
@FirstName nvarchar(50) = null
)
AS

BEGIN
SET NOCOUNT ON;

DECLARE @sql_cmd nvarchar(2000),
    @select nvarchar(1000),
    @where nvarchar(1000),
    @parameters nvarchar(1000);

SET @parameters = N'@FirstName nvarchar(50), @LastName nvarchar(50)';

SET @select = N'Select
    Title,
    FirstName,
    MiddleName,
    LastName,
    Suffix,
    Phone
From
    Person.Contact';

SET @where = N' Where 1=1 ';

IF @LastName IS NOT NULL
BEGIN
    SET @where = @where + N' And LastName Like @LastName + N''%'' ';
END;

IF @FirstName IS NOT NULL
BEGIN
    SET @where = @where + N' And FirstName Like @FirstName + N''%'' ';
END;

SET @sql_cmd = @select + @where;

EXEC sys.sp_executesql @sql_cmd, @parameters, @LastName = @LastName, @FirstName = @FirstName;

Return;
END



Now when you run our examples, you’ll see that it runs successfully under your original security context, but you receive an error when you run it as the limited rights user:




Msg 229, Level 14, State 5, Line 1
The SELECT permission was denied on the object 'Contact', database 'AdventureWorks', schema 'Person'.




This is because the execution context changed and ownership chaining no longer applies.  To get the Dynamic SQL Stored procedure to work add WITH EXECUTE AS OWNER after the parameter definition like this:



ALTER PROCEDURE dbo.FindPhoneByName
(
@LastName nvarchar(50) = null,
@FirstName nvarchar(50) = null
)
WITH EXECUTE AS OWNER
AS



Then you can re-run your stored procedure calls, and they should work both for the dbo user and the limited rights user, because EXECUTE AS OWNER has enabled access to the tables.



Performance


I ran all of my examples with SET STATISTICS IO ON so I could see the results.  Here are those results (also part of the download):


Parameters                          Non-Dynamic SQL     Dynamic SQL
                                    Scans   Reads       Scans   Reads
@FirstName = 'J', @LastName = 'A'   1       593         1       593
@FirstName = 'J'                    1       7792        1       1116
@LastName = 'A'                     1       3039        1       1116








Notice the reduced number of reads required by the dynamic SQL when only one parameter is supplied.  This is because it is using a different query plan, while the non-dynamic procedure has one query plan, which is not optimal when only one parameter is supplied.



Conclusions



As you can see some of the limitations of Dynamic SQL have been “cured” by the advent of the EXECUTE AS clause.  This has made it simpler to use Dynamic SQL and get the performance benefits provided by getting a proper execution plan and getting plan re-use.  Again all code is available here.



Next I’ll be discussing using a Certificate to sign a stored procedure.



Part Ib, Part II

Tuesday, October 6, 2009

Index on a Foreign Key?

About two weeks ago, I had a discussion on Twitter and via email with Jeremiah Peschka (@peschkaj) about placing indexes on foreign key columns.  I made the comment that, while indexing a foreign key is not required, I thought such columns made good candidates.  I also shared this blog post by Greg Low with him. 

Jeremiah mentioned that he had never had SQL Server recommend an index on a foreign key, which I thought was strange, so I decided to run some tests.  I ran the tests against my copy of AdventureWorks, to which I have added data.  Here is the query I ran:

SELECT
SOD.SalesOrderDetailID,
SOD.OrderQty,
SOD.ProductID,
SOD.UnitPrice,
SOD.UnitPriceDiscount,
SOD.LineTotal,
SOD.UnitPrice2,
SOH.SalesOrderID,
SOH.RevisionNumber,
SOH.OrderDate,
SOH.DueDate,
SOH.ShipDate,
SOH.Status,
SOH.SalesOrderNumber,
SOH.PurchaseOrderNumber,
SOH.AccountNumber,
SOH.CustomerID,
SOH.SubTotal,
SOH.TaxAmt,
SOH.Freight,
SOH.TotalDue,
SOH.Comment
FROM
Sales.SalesOrderHeader AS SOH JOIN
Sales.SalesOrderDetail AS SOD ON
SOH.SalesOrderID = SOD.SalesOrderID
WHERE
SOH.OrderDate >= '12/1/2007'


In my copy of AdventureWorks I have 131,465 rows in SalesOrderHeader and 1,121,317 rows in SalesOrderDetail.  The MAX(SalesOrderHeader.OrderDate) is 12/30/2007, so I just queried for the last month (1095 rows).  In this query SalesOrderDetail.SalesOrderID is a foreign key referencing SalesOrderHeader.SalesOrderID, and before modification the clustered primary key on SalesOrderDetail is on (SalesOrderID, SalesOrderDetailID).  Because SalesOrderID is the lead column in the clustering key, the optimizer uses a clustered index seek to access SalesOrderDetail, as shown here:



Now, to see if the optimizer would recommend an index on SalesOrderDetail.SalesOrderID, I altered the clustered primary key on SalesOrderDetail, removing SalesOrderID, with the following code:



USE [AdventureWorks]
GO

IF EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[Sales].[SalesOrderDetail]') AND name = N'PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID')
ALTER TABLE [Sales].[SalesOrderDetail] DROP CONSTRAINT [PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID]
GO

USE
[AdventureWorks]
GO

ALTER TABLE [Sales].[SalesOrderDetail] ADD CONSTRAINT [PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID] PRIMARY KEY CLUSTERED
(
[SalesOrderDetailID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO



I then re-ran my first query, and now you can see that there is a clustered index scan, a hash match join, and, in green at the top, an index recommended on SalesOrderDetail.SalesOrderID:



Next I added the suggested index (remember that you should not just add indexes to production without checking the impact on other queries as well):



USE [AdventureWorks]
GO
CREATE NONCLUSTERED INDEX
SalesOrderDetails_SalesOrderId
ON [Sales].[SalesOrderDetail] ([SalesOrderID])
INCLUDE ([SalesOrderDetailID],[OrderQty],
[ProductID],[UnitPrice],
[UnitPriceDiscount],[LineTotal],
[UnitPrice2])
GO



Then I re-ran my query; the resulting execution plan, which uses the new nonclustered index, is included in the download.



Here are the Statistics IO results from the 3 queries as well:



Results with SalesOrderID as lead column in Primary Key (Original State)



Table 'SalesOrderDetail'. Scan count 147, logical reads 516, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'SalesOrderHeader'. Scan count 1, logical reads 461, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.



Results with SalesOrderID removed from Primary Key and not added as an index.



Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'SalesOrderDetail'. Scan count 1, logical reads 12584, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


Table 'SalesOrderHeader'. Scan count 1, logical reads 461, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.



Results with the suggested index on SalesOrderID.



Table 'SalesOrderDetail'. Scan count 147, logical reads 504, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'SalesOrderHeader'. Scan count 1, logical reads 461, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.



Conclusion



In this case you can see that adding an index on the foreign key, whether as the lead column of the clustered primary key or as a non-clustered index, gives much improved performance, particularly regarding reads, and that the non-clustered index actually provides a slightly better access path than being part of the clustered primary key.  Certainly you need to test for your individual situation, but I would say that the majority of the time a foreign key column will benefit from an index, since it is used as a filter in the JOIN criteria.
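
To find candidates worth testing, a hedged sketch of an audit query (2005+; simplified in that it only checks whether the column leads any index, not whether that index actually covers the join):

```sql
-- Sketch: foreign key columns that are not the leading column
-- of any index on the referencing (parent) table.
SELECT
    OBJECT_NAME(fkc.parent_object_id) AS table_name,
    COL_NAME(fkc.parent_object_id, fkc.parent_column_id) AS column_name,
    fk.name AS foreign_key
FROM
    sys.foreign_key_columns AS fkc JOIN
    sys.foreign_keys AS fk ON
        fk.[object_id] = fkc.constraint_object_id
WHERE NOT EXISTS
(
    SELECT * FROM sys.index_columns AS ic
    WHERE ic.[object_id] = fkc.parent_object_id
      AND ic.column_id = fkc.parent_column_id
      AND ic.key_ordinal = 1   -- leading column of some index
);
```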



AdventureWorks can be downloaded from codeplex.  Execution Plans and T-SQL can be downloaded from here.

Wednesday, September 30, 2009

On vs. Where

Does it matter whether you put your criteria in the ON clause or the WHERE clause?  Well, as with most things SQL, the answer is, “It depends”.  If you are dealing with INNER JOINs it really doesn’t matter, because the query optimizer is smart enough to come up with the same execution plan for both.  For example, these two queries evaluate to the same execution plan:

SELECT
    SOD.SalesOrderDetailID,
    SOH.SalesOrderID
FROM
    Sales.SalesOrderHeader AS SOH JOIN
    Sales.SalesOrderDetail AS SOD ON
        SOH.SalesOrderID = SOD.SalesOrderID
WHERE
    SOH.OrderDate >= '7/1/2004' AND
    SOH.OrderDate < '8/1/2004';

SELECT
    SOD.SalesOrderDetailID,
    SOH.SalesOrderID
FROM
    Sales.SalesOrderHeader AS SOH,
    Sales.SalesOrderDetail AS SOD
WHERE
    SOH.SalesOrderID = SOD.SalesOrderID AND
    SOH.OrderDate >= '7/1/2004' AND
    SOH.OrderDate < '8/1/2004';



Execution Plan



The old SQL 6.5 OUTER JOIN syntax (*= and =*) has been discontinued beginning with SQL Server 2005, so for OUTER JOINs you have to put the join criteria in the ON clause, as demonstrated in this code:



SELECT
    SOD.SalesOrderDetailID,
    SOH.SalesOrderID
FROM
    Sales.SalesOrderHeader AS SOH LEFT JOIN
    Sales.SalesOrderDetail AS SOD ON
        SOH.SalesOrderID = SOD.SalesOrderID
WHERE
    SOH.OrderDate >= '7/1/2004' AND
    SOH.OrderDate < '8/1/2004';




Now let’s create a sandbox to play in.



IF OBJECT_ID('sales.orders', 'U') IS NOT NULL
BEGIN
    DROP TABLE sales.orders;
END;

IF OBJECT_ID('sales.order_items', 'U') IS NOT NULL
BEGIN
    DROP TABLE sales.order_items;
END;

IF SCHEMA_ID('sales') IS NOT NULL
BEGIN
    DROP SCHEMA sales;
END;

GO

CREATE SCHEMA sales;

GO
/*
Tables to hold sample data
*/
CREATE TABLE sales.orders
(
    order_id INT IDENTITY(1,1) PRIMARY KEY,
    customer_id INT
);

CREATE TABLE sales.order_items
(
    order_detail_id INT IDENTITY(1,1) PRIMARY KEY,
    order_id INT,
    product_id INT,
    quantity INT
);

/*
Load Sample data
*/
INSERT INTO sales.orders (customer_id)
SELECT TOP 5
AO.[object_id]
FROM
sys.all_objects AS AO;

INSERT INTO sales.order_items
(
order_id,
product_id,
quantity
)
SELECT
1,
1,
7
Union ALL
Select
2,
1,
4
Union ALL
Select
3,
2,
6
Union ALL
Select
4,
2,
11
Union ALL
Select
5,
3,
1;



Now we want to return all the customers who have placed an order, but we only want to return the items where the quantity is greater than 5.  Here is method 1:


Select
O.customer_id,
OI.order_id,
OI.product_id,
OI.quantity
From
sales.orders AS O LEFT OUTER JOIN
sales.order_items AS OI ON
O.order_id = OI.order_id
Where
OI.quantity > 5;



This returns:



customer_id order_id    product_id  quantity
----------- ----------- ----------- -----------
3           1           1           7
7           3           2           6
8           4           2           11


Hmmm, we know we have orders from five customers, but this only returns three rows.  Let’s look at the execution plan.



What’s that Nested Loops (Inner Join) operator?  Well, by putting the criteria for the RIGHT (second) table in the WHERE clause we essentially converted our LEFT OUTER JOIN to an INNER JOIN.  The correct way to get the data we want is this:



SELECT
    O.customer_id,
    OI.order_id,
    OI.product_id,
    OI.quantity
FROM
    sales.orders AS O LEFT OUTER JOIN
    sales.order_items AS OI ON
        O.order_id = OI.order_id AND
        OI.quantity > 5;



This returns what we would expect to see:





customer_id order_id    product_id  quantity
----------- ----------- ----------- -----------
3           1           1           7
5           NULL        NULL        NULL
7           3           2           6
8           4           2           11
17          NULL        NULL        NULL



And in the execution plan for this query you can see the Nested Loops (left outer join) operator.




So yes, it does matter where you put your criteria when dealing with OUTER JOINS.

Wednesday, September 23, 2009

Venturing into Powershell

As you probably know if you read my blog regularly, I recently put together a presentation called Dive into the Default Trace, and as part of that presentation I wanted to provide examples of different ways to archive the default trace files.  Here are the options I considered:

  1. fn_trace_gettable to load the data into a table.
  2. SSIS to archive the files.
  3. PowerShell to archive the files.
  4. xp_cmdshell and the copy command.

I immediately eliminated #4 because I don’t have xp_cmdshell enabled and because Chad Miller (@cmille19) had said in his Powershell talk that anything that requires xp_cmdshell is a good candidate for Powershell.  So I sent Chad an email asking him where to start and he was kind enough to help out.  I got a “working” script together and sent it off to Chad for review and he kindly told me my errors (there were several) and included a modified script that cleaned up my code.  I did make a couple of minor modifications after receiving the script from Chad.  The idea and starting point were mine, even if much of the finished product is Chad’s.  So here’s the code (please comment if you have any improvements):

# Written with much assistance from Chad Miller
# (http://chadwickmiller.spaces.live.com/Blog)
param ($serverName)

#######################
function Get-SqlData
{
    param([string]$serverName=$(throw 'serverName is required.'),
          [string]$databaseName=$(throw 'databaseName is required.'),
          [string]$query=$(throw 'query is required.'))

    Write-Verbose "Get-SqlData serverName:$serverName databaseName:$databaseName query:$query"

    $connString = "Server=$serverName;Database=$databaseName;Integrated Security=SSPI;"
    $da = New-Object "System.Data.SqlClient.SqlDataAdapter" ($query, $connString)
    $dt = New-Object "System.Data.DataTable"
    [void]$da.fill($dt)
    $dt
} #Get-SqlData

#######################
function Get-TraceInfo
{
    param($serverName)

    $qry = @"
SELECT
    RIGHT(T.path, CHARINDEX('\', REVERSE(T.path)) - 1) AS current_file,
    SUBSTRING(T.path, 1, LEN(T.path) - CHARINDEX('\', REVERSE(T.path)) + 1) AS trace_path
FROM
    sys.traces T
WHERE
    T.is_default = 1
"@

    Get-SqlData $serverName 'master' $qry
} #Get-TraceInfo

#######################
# MAIN
#######################

if ($serverName -eq $null)
{
    $serverName = $env:COMPUTERNAME
}

$result = Get-TraceInfo $serverName
$copy_path = $result.trace_path + 'Archive'

if ((Test-Path -Path $copy_path) -eq $false)
{
    New-Item -Path $result.trace_path -Name 'Archive' -ItemType directory
}

Get-ChildItem -Path $result.trace_path -Exclude $result.current_file |
    where {$_.name -like "*.trc"} | Move-Item -Destination $copy_path


Once I knew which cmdlets I needed, PowerShell was probably the second easiest option to implement.  I know SSIS fairly well, but that option got a bit complex because of having to exclude the file currently in use.  Using fn_trace_gettable is simple because it lets me use T-SQL, which I have many years of experience with.
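
For reference, a hedged sketch of what option 1 might look like (the column list is illustrative; the path comes from sys.traces rather than being hard-coded):

```sql
-- Sketch: read the default trace files into a result set.
DECLARE @path NVARCHAR(260);
SELECT @path = T.path FROM sys.traces AS T WHERE T.is_default = 1;

SELECT
    TE.name AS event_name,
    T.StartTime,
    T.DatabaseName,
    T.ObjectName
FROM
    fn_trace_gettable(@path, DEFAULT) AS T JOIN
    sys.trace_events AS TE ON
        T.EventClass = TE.trace_event_id;
```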



Using PowerShell is really a matter of knowing which cmdlets you need; once you have that, the built-in help (Get-Help) is very good, so you can do just about anything.  PowerShell is very powerful and flexible.  It can actually do so much that I think you need to be careful.  For example, the New-Item cmdlet can create just about any item depending on the ItemType you use.



So while I definitely recommend that any DBA learn some PowerShell I also recommend being very careful and Test, Test, Test!

Tuesday, April 14, 2009

The Deception of IsNumeric()

Has anyone else ever used the IsNumeric() function to try to eliminate this error (or similar):

Msg 8114, Level 16, State 5, Line 1
Error converting data type varchar to numeric.

yet still get the error?  Isn’t that what the function is for?  Here’s what it says in BOL:

ISNUMERIC returns 1 when the input expression evaluates to a valid numeric data type; otherwise it returns 0

But run this code (it generates a list of ASCII characters):

;WITH cteNumbers AS
(
    SELECT TOP 255
        ROW_NUMBER() OVER(ORDER BY NAME) AS n
    FROM
        sys.all_columns AS AC
)
SELECT
    n AS ASCII_code,
    CHAR(n) AS [character],
    ISNUMERIC(CHAR(n)) AS is_numeric
FROM
    cteNumbers
WHERE
    n <= 255 AND
    ISNUMERIC(CHAR(n)) = 1;

And you’ll see that the first 10 rows and last 8 rows returned are not what I would consider numeric.  Or how about this:

SELECT ISNUMERIC('20,00');
GO
SELECT CONVERT(DECIMAL(9,2), '20,00');

The first statement returns 1, while the second fails. 

Here is one last interesting behavior of IsNumeric() in relation to CAST/CONVERT, provided by Frank Kalis on this SQLServerCentral thread:

SELECT ISNUMERIC(0x0e) AS E, ISNUMERIC(0x0f) AS F

E           F          
----------- -----------
0           0

(1 row(s) affected)

while

SELECT CAST(0x0e AS INT) AS E, CAST(0x0f AS INT) AS F

E           F          
----------- -----------
14          15

The moral of the story is that IsNumeric() <> CanBeConvertedToNumeric().

So what is the answer?  I don’t know.  You would need to customize the solution to meet your specific situation.  Jeff Moden suggests an IsAllDigits function in this thread on SQLServerCentral for instances where you want to eliminate rows with non-numeric characters.
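
Along those lines, a minimal sketch of an all-digits check (my own simplification, not Jeff Moden's exact function): reject any string containing a character outside 0-9 before attempting a conversion.

```sql
-- Sketch: 1 only if the string is non-empty and contains nothing but digits
DECLARE @s VARCHAR(20);
SET @s = '20,00';

SELECT CASE WHEN LEN(@s) > 0 AND @s NOT LIKE '%[^0-9]%'
            THEN 1 ELSE 0 END AS is_all_digits;  -- 0 here: the comma fails
```

Note this deliberately rejects signs and decimal points; you would extend the pattern to match whatever your situation actually considers numeric.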

Do you have anything that you recommend?

Tuesday, December 23, 2008

Simple, but effective code example

The other day I answered this question at SQLServerCentral, "Need a date function to return every Wednesday of the year", and got a couple of kudos, one from Jonathan Kehayias and one from Chris Morris (a frequent contributor to the SSC forums), so I thought I'd post the code here and then explain it a bit. So here's the code:

SELECT TOP 366
    IDENTITY(INT, 1, 1) AS n
INTO
    #nums
FROM
    sys.all_objects;

SELECT
    DATEADD(DAY, n, '1/1/2008')
FROM
    #nums
WHERE
    DATEPART(weekday, DATEADD(DAY, n, '1/1/2008')) =
        CASE @@DATEFIRST
            WHEN 7 THEN 4
            WHEN 6 THEN 5
            WHEN 5 THEN 6
            WHEN 4 THEN 7
            WHEN 3 THEN 1
            WHEN 2 THEN 2
            ELSE 3
        END;

DROP TABLE #nums;


The first section just creates a temporary numbers table; you can do that several ways, I just opted for a simple one. If you don't have a numbers table already, you should create one, as there are many good uses for it.

Then I just run a simple select against the temporary numbers table, adding each number to the first day of the year. The WHERE clause then compares the value returned from the DATEPART function (weekday) to the value for Wednesday based on the @@DATEFIRST value. According to Books Online:

SET DATEFIRST indicates the specified first day of each week. The U.S. English default is 7, Sunday.

Language settings affect date information.


This can also be affected by using SET DATEFIRST if you want it to be different from the default for the language. The CASE statement is important if you expect the code to work on any database in any SQL Server installation.

Ideally to solve a problem like this you would have a dates table as Jonathan recommends in the thread.
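
As an aside, there is a DATEFIRST-independent variant worth knowing (a sketch reusing the #nums table from above): day 0 in SQL Server's datetime calendar, 1900-01-01, was a Monday, so the day count modulo 7 identifies the weekday without consulting @@DATEFIRST at all.

```sql
-- DATEDIFF(DAY, 0, d) % 7: 0 = Monday, 1 = Tuesday, 2 = Wednesday, ...
-- regardless of the session's DATEFIRST or language setting.
SELECT
    DATEADD(DAY, n, '1/1/2008')
FROM
    #nums
WHERE
    DATEDIFF(DAY, 0, DATEADD(DAY, n, '1/1/2008')) % 7 = 2;
```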

For more information on a numbers table check out these articles:

http://www.sqlservercentral.com/articles/TSQL/62867/

http://www.sqlservercentral.com/articles/Advanced+Querying/2547/