Tuesday, May 20, 2008

ANSI_PADDING and Variable Length Character Columns in SQL Server

I have been involved in a very interesting thread on SQLServerCentral.com about trailing spaces in strings. It has changed the way I will be looking as strings going forward. I had always assumed that when a character column was defined as variable length (varchar, nvarchar) any trailing spaces inserted into the column would be trimmed. Well this WAS how SQL Server handled it because SQL Server defaulted the ANSI PADDING setting to OFF. Now SQL Server in SSMS (SQL Server Management Studio) the connection settings set ANSI PADDING to ON. This means that any table created using SSMS will use the ANSI PADDING ON setting, regardless of the default database setting. ANSI PADDING ON means that 'a ' will store the 'a' and the space while ANSI PADDING OFF only stores the 'a'. This becomes an issue when upgrading from an earlier version of SQL Server or, as in the thread, you are not aware of this change and are not coding appropriately for it.

Why is this a problem? Well. Here is a table showing how SQL Server acts with the 2 ANSI_PADDING options (courtesy of Matt Miller in the thread mentioned above and some testing):

SQL Function

Result (ANSI Padding ON)

Result (ANSI Padding OFF)

Primary Key

Trim it. Duplicate key error

Trim it

Unique Index

Trim it. Duplicate key error

Trim it

Comparison ("=", "!=")

Trim it. 'a' = 'a '

Trim it

Comparison ("like using _")

Don't trim it

Trim it

Concatenation

Don't trim it. 'a ' + 'a' = 'a a'

Trim it 'a ' + 'a' = 'aa'

Storage

Don't trim it

Trim it

Group By

Trim it

Trim it

Order By

Trim it

Trim it


The RED rows show where there are differences in behavior. Here is what Microsoft has to say from SQL Server 2005 BOL:

Important:

In a future version of SQL Server ANSI_PADDING will always be ON and any applications that explicitly set the option to OFF will produce an error. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.


Note:

We recommend that ANSI_PADDING always be set to ON


Here is some code that proves the behavior:

SET Ansi_Padding OFF

CREATE TABLE
test.categories
(
CategoryID INT PRIMARY KEY CLUSTERED IDENTITY(1,1),
CategoryName VARCHAR(15),
Name2 VARCHAR(15),
NCategoryName NVARCHAR(15),
NName2 NVARCHAR(15)
)
--Create Unique Index UX_varchar_test On test.categories(CategoryName)
--Create Unique Index UX_nvarchar_test On test.categories(NCategoryName)

INSERT INTO test.categories
(
CategoryName,
Name2,
NCategoryName,
NName2
)
SELECT 'a', 'b', 'a', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b', 'a ', 'b' UNION ALL
SELECT 'a ', 'b', 'a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b'

-- length, storage, concatenation
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
CategoryID,
CategoryName,
CategoryName + Name2 AS concat_names,
LEN(CategoryName) AS len_name,
DATALENGTH(CategoryName) AS Datalength_Name,
NCategoryName,
NCategoryName + NName2 AS concat_unicode_names,
LEN(NCategoryName) AS len__unicode_name,
DATALENGTH(NCategoryName) AS Datalength_Unicode_Name
FROM
test.categories

-- varchar group by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'group by' AS test_function,
COUNT(categoryid) AS num_recs,
categoryName
FROM
test.categories
GROUP BY
categoryName

-- nvarchar group by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'group by' AS test_function,
COUNT(categoryid) AS num_recs,
NcategoryName
FROM
test.categories
GROUP BY
NcategoryName

-- varchar order by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'order by' AS test_function,
categoryid,
categoryName
FROM
test.categories
ORDER BY
categoryName DESC

-- nvarchar order by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'order by' AS test_function,
categoryid,
ncategoryName
FROM
test.categories
ORDER BY
ncategoryName DESC

-- varchar =
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'=' AS test_function,
categoryid,
categoryName
FROM
test.categories
WHERE
categoryName = 'a'

-- nvarchar =
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'=' AS test_function,
categoryid,
ncategoryName
FROM
test.categories
WHERE
NcategoryName = 'a'

-- varchar like %
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'Like %' AS test_function,
categoryid,
categoryName
FROM
test.categories
WHERE
categoryName LIKE 'a%'

-- nvarchar like %
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'Like %' AS test_function,
categoryid,
ncategoryName
FROM
test.categories
WHERE
ncategoryName LIKE 'a%'

-- varchar like _
SELECT
CASE
WHEN SESSIONPROPEERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'Like _' AS test_function,
categoryid,
categoryName
FROM
test.categories
WHERE
categoryName LIKE 'a_'

-- nvarchar like _
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'Like _' AS test_function,
categoryid,
ncategoryName
FROM
test.categories
WHERE
ncategoryName LIKE 'a_'
GO

SET Ansi_Padding ON

CREATE TABLE
test.categories2
(
CategoryID INT PRIMARY KEY CLUSTERED IDENTITY(1,1),
CategoryName VARCHAR(15),
Name2 VARCHAR(15),
NCategoryName NVARCHAR(15),
NName2 NVARCHAR(15)
)
--Create Unique Index UX_varchar_test On test.categories2(CategoryName)
--Create Unique Index UX_nvarchar_test On test.categories2(NCategoryName)

INSERT INTO test.categories2
(
CategoryName,
Name2,
NCategoryName,
NName2
)
SELECT 'a', 'b', 'a', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b', 'a ', 'b' UNION ALL
SELECT 'a ', 'b', 'a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b'

-- length, storage, concatenation
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
CategoryID,
CategoryName,
CategoryName + Name2 AS concat_names,
LEN(CategoryName) AS len_name,
DATALENGTH(CategoryName) AS Datalength_Name,
NCategoryName,
NCategoryName + NName2 AS concat_unicode_names,
LEN(NCategoryName) AS len__unicode_name,
DATALENGTH(NCategoryName) AS Datalength_Unicode_Name
FROM
test.categories2

-- varchar group by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'group by' AS test_function,
COUNT(categoryid) AS num_recs,
categoryName
FROM
test.categories2
GROUP BY
categoryName

-- nvarchar group by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'group by' AS test_function,
COUNT(categoryid) AS num_recs,
NcategoryName
FROM
test.categories2
GROUP BY
NcategoryName

-- varchar order by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'order by' AS test_function,
categoryid,
categoryName
FROM
test.categories2
ORDER BY
categoryName DESC

-- nvarchar order by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'order by' AS test_function,
categoryid,
ncategoryName
FROM
test.categories2
ORDER BY
ncategoryName DESC

-- varchar =
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'=' AS test_function,
categoryid,
categoryName
FROM
test.categories2
WHERE
categoryName = 'a'

-- nvarchar =
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'=' AS test_function,
categoryid,
ncategoryName
FROM
test.categories2
WHERE
NcategoryName = 'a'

-- varchar like %
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'Like %' AS test_function,
categoryid,
categoryName
FROM
test.categories2
WHERE
categoryName LIKE 'a%'

-- nvarchar like %
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'Like %' AS test_function,
categoryid,
ncategoryName
FROM
test.categories2
WHERE
ncategoryName LIKE 'a%'

-- varchar like _
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'Like _' AS test_function,
categoryid,
categoryName
FROM
test.categories2
WHERE
categoryName LIKE 'a_'

-- nvarchar like _
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'Like _' AS test_function,
categoryid,
ncategoryName
FROM
test.categories2
WHERE
ncategoryName LIKE 'a_'


DROP TABLE test.categories2
DROP TABLE test.categories


The key, as always, is to understand the system you are working with and design and code against it appropriately. As noted by the Original Poster in the SSC thread, "Trim your strings before inserting".

2 comments:

  1. Does anyone know why Microsoft decided to default to ON? Seems like no upside to me having to store all those spaces.

    ReplyDelete
  2. It is the SQL92 Standard, https://support.microsoft.com/en-us/kb/316626/en-us

    ReplyDelete

So what do you think I am?