Tuesday, May 20, 2008

ANSI_PADDING and Variable Length Character Columns in SQL Server

I have been involved in a very interesting thread on SQLServerCentral.com about trailing spaces in strings. It has changed the way I will be looking as strings going forward. I had always assumed that when a character column was defined as variable length (varchar, nvarchar) any trailing spaces inserted into the column would be trimmed. Well this WAS how SQL Server handled it because SQL Server defaulted the ANSI PADDING setting to OFF. Now SQL Server in SSMS (SQL Server Management Studio) the connection settings set ANSI PADDING to ON. This means that any table created using SSMS will use the ANSI PADDING ON setting, regardless of the default database setting. ANSI PADDING ON means that 'a ' will store the 'a' and the space while ANSI PADDING OFF only stores the 'a'. This becomes an issue when upgrading from an earlier version of SQL Server or, as in the thread, you are not aware of this change and are not coding appropriately for it.

Why is this a problem? Well. Here is a table showing how SQL Server acts with the 2 ANSI_PADDING options (courtesy of Matt Miller in the thread mentioned above and some testing):

SQL Function

Result (ANSI Padding ON)

Result (ANSI Padding OFF)

Primary Key

Trim it. Duplicate key error

Trim it

Unique Index

Trim it. Duplicate key error

Trim it

Comparison ("=", "!=")

Trim it. 'a' = 'a '

Trim it

Comparison ("like using _")

Don't trim it

Trim it

Concatenation

Don't trim it. 'a ' + 'a' = 'a a'

Trim it 'a ' + 'a' = 'aa'

Storage

Don't trim it

Trim it

Group By

Trim it

Trim it

Order By

Trim it

Trim it


The RED rows show where there are differences in behavior. Here is what Microsoft has to say from SQL Server 2005 BOL:

Important:

In a future version of SQL Server ANSI_PADDING will always be ON and any applications that explicitly set the option to OFF will produce an error. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.


Note:

We recommend that ANSI_PADDING always be set to ON


Here is some code that proves the behavior:

SET Ansi_Padding OFF

CREATE TABLE
test.categories
(
CategoryID INT PRIMARY KEY CLUSTERED IDENTITY(1,1),
CategoryName VARCHAR(15),
Name2 VARCHAR(15),
NCategoryName NVARCHAR(15),
NName2 NVARCHAR(15)
)
--Create Unique Index UX_varchar_test On test.categories(CategoryName)
--Create Unique Index UX_nvarchar_test On test.categories(NCategoryName)

INSERT INTO test.categories
(
CategoryName,
Name2,
NCategoryName,
NName2
)
SELECT 'a', 'b', 'a', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b', 'a ', 'b' UNION ALL
SELECT 'a ', 'b', 'a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b'

-- length, storage, concatenation
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
CategoryID,
CategoryName,
CategoryName + Name2 AS concat_names,
LEN(CategoryName) AS len_name,
DATALENGTH(CategoryName) AS Datalength_Name,
NCategoryName,
NCategoryName + NName2 AS concat_unicode_names,
LEN(NCategoryName) AS len__unicode_name,
DATALENGTH(NCategoryName) AS Datalength_Unicode_Name
FROM
test.categories

-- varchar group by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'group by' AS test_function,
COUNT(categoryid) AS num_recs,
categoryName
FROM
test.categories
GROUP BY
categoryName

-- nvarchar group by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'group by' AS test_function,
COUNT(categoryid) AS num_recs,
NcategoryName
FROM
test.categories
GROUP BY
NcategoryName

-- varchar order by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'order by' AS test_function,
categoryid,
categoryName
FROM
test.categories
ORDER BY
categoryName DESC

-- nvarchar order by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'order by' AS test_function,
categoryid,
ncategoryName
FROM
test.categories
ORDER BY
ncategoryName DESC

-- varchar =
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'=' AS test_function,
categoryid,
categoryName
FROM
test.categories
WHERE
categoryName = 'a'

-- nvarchar =
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'=' AS test_function,
categoryid,
ncategoryName
FROM
test.categories
WHERE
NcategoryName = 'a'

-- varchar like %
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'Like %' AS test_function,
categoryid,
categoryName
FROM
test.categories
WHERE
categoryName LIKE 'a%'

-- nvarchar like %
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'Like %' AS test_function,
categoryid,
ncategoryName
FROM
test.categories
WHERE
ncategoryName LIKE 'a%'

-- varchar like _
SELECT
CASE
WHEN SESSIONPROPEERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'Like _' AS test_function,
categoryid,
categoryName
FROM
test.categories
WHERE
categoryName LIKE 'a_'

-- nvarchar like _
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'Like _' AS test_function,
categoryid,
ncategoryName
FROM
test.categories
WHERE
ncategoryName LIKE 'a_'
GO

SET Ansi_Padding ON

CREATE TABLE
test.categories2
(
CategoryID INT PRIMARY KEY CLUSTERED IDENTITY(1,1),
CategoryName VARCHAR(15),
Name2 VARCHAR(15),
NCategoryName NVARCHAR(15),
NName2 NVARCHAR(15)
)
--Create Unique Index UX_varchar_test On test.categories2(CategoryName)
--Create Unique Index UX_nvarchar_test On test.categories2(NCategoryName)

INSERT INTO test.categories2
(
CategoryName,
Name2,
NCategoryName,
NName2
)
SELECT 'a', 'b', 'a', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b', 'a ', 'b' UNION ALL
SELECT 'a ', 'b', 'a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b'

-- length, storage, concatenation
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
CategoryID,
CategoryName,
CategoryName + Name2 AS concat_names,
LEN(CategoryName) AS len_name,
DATALENGTH(CategoryName) AS Datalength_Name,
NCategoryName,
NCategoryName + NName2 AS concat_unicode_names,
LEN(NCategoryName) AS len__unicode_name,
DATALENGTH(NCategoryName) AS Datalength_Unicode_Name
FROM
test.categories2

-- varchar group by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'group by' AS test_function,
COUNT(categoryid) AS num_recs,
categoryName
FROM
test.categories2
GROUP BY
categoryName

-- nvarchar group by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'group by' AS test_function,
COUNT(categoryid) AS num_recs,
NcategoryName
FROM
test.categories2
GROUP BY
NcategoryName

-- varchar order by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'order by' AS test_function,
categoryid,
categoryName
FROM
test.categories2
ORDER BY
categoryName DESC

-- nvarchar order by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'order by' AS test_function,
categoryid,
ncategoryName
FROM
test.categories2
ORDER BY
ncategoryName DESC

-- varchar =
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'=' AS test_function,
categoryid,
categoryName
FROM
test.categories2
WHERE
categoryName = 'a'

-- nvarchar =
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'=' AS test_function,
categoryid,
ncategoryName
FROM
test.categories2
WHERE
NcategoryName = 'a'

-- varchar like %
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'Like %' AS test_function,
categoryid,
categoryName
FROM
test.categories2
WHERE
categoryName LIKE 'a%'

-- nvarchar like %
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'Like %' AS test_function,
categoryid,
ncategoryName
FROM
test.categories2
WHERE
ncategoryName LIKE 'a%'

-- varchar like _
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'Like _' AS test_function,
categoryid,
categoryName
FROM
test.categories2
WHERE
categoryName LIKE 'a_'

-- nvarchar like _
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'Like _' AS test_function,
categoryid,
ncategoryName
FROM
test.categories2
WHERE
ncategoryName LIKE 'a_'


DROP TABLE test.categories2
DROP TABLE test.categories


The key, as always, is to understand the system you are working with and design and code against it appropriately. As noted by the Original Poster in the SSC thread, "Trim your strings before inserting".

Friday, May 2, 2008

SSIS SCript Component Rant

I certainly don't consider myself an SSIS expert, but I have done some work with it. It is a very powerful tool, but some things bother me about it. The first thing is the inability to debug Script Components in a data flow. I can set breakpoints in a Script Task, but in a Script Component I am left with message boxes and logging. This is a very painful way to try to debug code. Anyone with classic ASP experience can relate (response.write, anyone?). Microsoft went to the trouble of including debugging capabilities in ASP.NET and I certainly hope it is added to Script Components in SSIS in 2008.

Another annoyance is that I would like to be able to "save" a Script Component script as a file and then re-use it in other Script Components. Why can't I do this? I realize I could write a custom component (currently beyond my abilities), but shouldn't I be able to load code from a .vb file?

Lastly, error messages. Why doesn't an unhandled error tell me the Script Component's name instead of the useless, Script_Component_GUID? Just give me something I can easily decipher.

Anyway, I don't want to sound like I am down on SSIS, Script Components particularly, but I just see areas it could be improved and they seem like reasonable things to ask.

Monday, April 28, 2008

Another Microsoft Inconsistency

Brian Knight just did a blog post about how, with SQL Server 2005, using (local) for the server name in a connection times out. Now, his post specifically deals with DTS/SSIS, but his discovery about why is times out, (local) uses named pipes which is disabled by default in SQL Server 2005, is interesting when discussing Linq To SQL. The SQL Server team is disabling protocols to try to make SQL Server more secure, while a the same time the Linq To SQL team has designed Linq To SQL to ONLY use Named Pipes to connect to SQL Server. So while one team is attempting to close a door another is forcing that door to be open. Then again, maybe the SQL Server team doesn't like Linq To SQL, like most DBA's based on my reading, and shut the door for this reason.

Another note is that SSDS (SQL Server Data Services or SQL Server on the Cloud) requires developers to use Linq. As referenced in Andy Warren's It Depends blog.

Friday, April 25, 2008

Feedback on my first published article

Well, Error Handling in SSIS was published this past Monday on SQLServerCentral.com and has been a fairly successful article with over 5000 reads in 5 days. All the feedback I received was positive which was surprising, but very satisfying. I was anticipating some either negative or constructive criticisms/suggestions and have not received any yet. There are also a couple of people who have implemented the solution! The best comment I received was that I should create an SSRS report to package with it. I have done some work on that and have a basic report created which I posted on the discussion thread.

Anyway, I was encouraged by the feedback and hope I can find another subject interesting enough to write an article on and maybe put together a presentation for user groups and a SqlSaturday.

Linq and Linq To SQL

There is an interesting discussion on SQL ServerCentral.com about Linq To SQL based on this editorial by Steve Jones. You will see the DBA's perspective on such tools and, I think, some misunderstanding, particularly about Linq and even Linq To SQL. Linq stands for Language Integrated Query and Linq To SQL is just one implementation of Linq (go here for more Linq implementations), so please when discussing Linq be specific as Mike Taulty has asked in his blog as well.

Basically Linq allows you to access objects with a common syntax something like (in C#):

var query = from p in persons
where p.LastName == "Corbett"
orderby p.Age
select p

The great part is that persons is the object you are querying and it can be a class, a list, an array, a database query, etc... You can also have Linq query that joins objects which is a lot easier to read than nested loops to find matches.

Now back to Linq To SQL. As a hybrid DBA\Developer I see some good things about Linq To SQL, but I also know, as the "protector of data", that having to allow table level access, which you need for Linq To SQL, opens some security holes. I like the way Linq To SQL builds my .NET objects, but wish it made using stored procedures for CRUD (create, update, delete) operations simpler or the default. Since it already has to build the SQL to do the operations, why not have it create a stored procedure? I think it could be an option when building the objects and the designer could ask you for a template for naming the stored procedures.

Also mentioned in the discussion is a mapping tool called iBatis.Net, which looks interesting and, I think has a nice implementation using XML files. It would be nice if MS incorporated something like that in Linq To SQL so you could more easily see, test, and tune SQL statement. Wouldn't it be nice to have Linq To SQL build your objects with all of the method signatures and properties and an XML file that defines the SQL for each of the methods. If it did that, then you could just edit the XML to use stored procedures.

It's important to remember that Linq (including Linq To SQL) is a 1.0 product and that there is definitely room for improvement and we can encourage it by letting MS know what we think the improvements can be.

Remember it is a tool and just like a hammer has certain things you use it for, the same can be said of Linq and Linq To SQL. Make sure it is the right tool for the job!

Monday, April 21, 2008

Error Handling in SSIS published on SqlServerCentral.com

I'm excited to report that my first-ever article has been published on SQLServerCentral.com. While I called it Error Handling in SSIS it would probably have been better titled, Logging Data Errors in an SSIS DataFlow Task, as that more accurately identifies the subject. Nonetheless, it is out there, and it was fun to write. Hopefully I'll get some good feedback from the members at SSC so my writing will improve.

Thursday, April 17, 2008

Another Undocumented Database Mail "Feature"

I was setting up Database Mail on a new server, we'll call it Jack1, the other day and, as usual, I could not remember the name of our SMTP server. Having setup Database Mail on other servers, I decided to take advantage of the new modularity of SSMS and open up Database Mail settings on another server, Jack2, so I could find my SMTP server name. I found the SMTP server name and copied over to the new account I was setting up on my Jack1 I finished setting up the account and profile and saved it. Everything should work, right? So I go to Jack1 Database Mail, right-click, and select send test mail. Error, no profiles setup on server! What!? I just finished setting up the Profile. So I right-click and select Configure Database Mail, and lo and behold, there IS NOT a profile setup. I checked the Jack2 server and there I found the new Profile! So it appears that SSMS saves the mail profile to the last server on which you had opened Database Mail setup. I will do a little more testing and post any new findings I have.