I have been involved in a very interesting thread on SQLServerCentral.com about trailing spaces in strings. It has changed the way I will be looking as strings going forward. I had always assumed that when a character column was defined as variable length (varchar, nvarchar) any trailing spaces inserted into the column would be trimmed. Well this WAS how SQL Server handled it because SQL Server defaulted the ANSI PADDING setting to OFF. Now SQL Server in SSMS (SQL Server Management Studio) the connection settings set ANSI PADDING to ON. This means that any table created using SSMS will use the ANSI PADDING ON setting, regardless of the default database setting. ANSI PADDING ON means that 'a ' will store the 'a' and the space while ANSI PADDING OFF only stores the 'a'. This becomes an issue when upgrading from an earlier version of SQL Server or, as in the thread, you are not aware of this change and are not coding appropriately for it.
Why is this a problem? Well. Here is a table showing how SQL Server acts with the 2 ANSI_PADDING options (courtesy of Matt Miller in the thread mentioned above and some testing):
SQL Function | Result (ANSI Padding ON) | Result (ANSI Padding OFF) |
Primary Key | Trim it. Duplicate key error | Trim it |
Unique Index | Trim it. Duplicate key error | Trim it |
Comparison ("=", "!=") | Trim it. 'a' = 'a ' | Trim it |
Comparison ("like using _") | Don't trim it | Trim it |
Concatenation | Don't trim it. 'a ' + 'a' = 'a a' | Trim it 'a ' + 'a' = 'aa' |
Storage | Don't trim it | Trim it |
Group By | Trim it | Trim it |
Order By | Trim it | Trim it |
The RED rows show where there are differences in behavior. Here is what Microsoft has to say from SQL Server 2005 BOL:
Important: |
In a future version of SQL Server ANSI_PADDING will always be ON and any applications that explicitly set the option to OFF will produce an error. Avoid using this feature in new development work, and plan to modify applications that currently use this feature. |
Note: |
We recommend that ANSI_PADDING always be set to ON |
Here is some code that proves the behavior:
SET Ansi_Padding OFF
CREATE TABLE test.categories
(
CategoryID INT PRIMARY KEY CLUSTERED IDENTITY(1,1),
CategoryName VARCHAR(15),
Name2 VARCHAR(15),
NCategoryName NVARCHAR(15),
NName2 NVARCHAR(15)
)
--Create Unique Index UX_varchar_test On test.categories(CategoryName)
--Create Unique Index UX_nvarchar_test On test.categories(NCategoryName)
INSERT INTO test.categories
(
CategoryName,
Name2,
NCategoryName,
NName2
)
SELECT 'a', 'b', 'a', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b', 'a ', 'b' UNION ALL
SELECT 'a ', 'b', 'a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b'
-- length, storage, concatenation
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
CategoryID,
CategoryName,
CategoryName + Name2 AS concat_names,
LEN(CategoryName) AS len_name,
DATALENGTH(CategoryName) AS Datalength_Name,
NCategoryName,
NCategoryName + NName2 AS concat_unicode_names,
LEN(NCategoryName) AS len__unicode_name,
DATALENGTH(NCategoryName) AS Datalength_Unicode_Name
FROM
test.categories
-- varchar group by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'group by' AS test_function,
COUNT(categoryid) AS num_recs,
categoryName
FROM
test.categories
GROUP BY
categoryName
-- nvarchar group by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'group by' AS test_function,
COUNT(categoryid) AS num_recs,
NcategoryName
FROM
test.categories
GROUP BY
NcategoryName
-- varchar order by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'order by' AS test_function,
categoryid,
categoryName
FROM
test.categories
ORDER BY
categoryName DESC
-- nvarchar order by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'order by' AS test_function,
categoryid,
ncategoryName
FROM
test.categories
ORDER BY
ncategoryName DESC
-- varchar =
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'=' AS test_function,
categoryid,
categoryName
FROM
test.categories
WHERE
categoryName = 'a'
-- nvarchar =
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'=' AS test_function,
categoryid,
ncategoryName
FROM
test.categories
WHERE
NcategoryName = 'a'
-- varchar like %
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'Like %' AS test_function,
categoryid,
categoryName
FROM
test.categories
WHERE
categoryName LIKE 'a%'
-- nvarchar like %
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'Like %' AS test_function,
categoryid,
ncategoryName
FROM
test.categories
WHERE
ncategoryName LIKE 'a%'
-- varchar like _
SELECT
CASE
WHEN SESSIONPROPEERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'Like _' AS test_function,
categoryid,
categoryName
FROM
test.categories
WHERE
categoryName LIKE 'a_'
-- nvarchar like _
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'Like _' AS test_function,
categoryid,
ncategoryName
FROM
test.categories
WHERE
ncategoryName LIKE 'a_'
GO
SET Ansi_Padding ON
CREATE TABLE test.categories2
(
CategoryID INT PRIMARY KEY CLUSTERED IDENTITY(1,1),
CategoryName VARCHAR(15),
Name2 VARCHAR(15),
NCategoryName NVARCHAR(15),
NName2 NVARCHAR(15)
)
--Create Unique Index UX_varchar_test On test.categories2(CategoryName)
--Create Unique Index UX_nvarchar_test On test.categories2(NCategoryName)
INSERT INTO test.categories2
(
CategoryName,
Name2,
NCategoryName,
NName2
)
SELECT 'a', 'b', 'a', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b', 'a ', 'b' UNION ALL
SELECT 'a ', 'b', 'a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b' UNION ALL
SELECT 'a ', 'b','a ', 'b'
-- length, storage, concatenation
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
CategoryID,
CategoryName,
CategoryName + Name2 AS concat_names,
LEN(CategoryName) AS len_name,
DATALENGTH(CategoryName) AS Datalength_Name,
NCategoryName,
NCategoryName + NName2 AS concat_unicode_names,
LEN(NCategoryName) AS len__unicode_name,
DATALENGTH(NCategoryName) AS Datalength_Unicode_Name
FROM
test.categories2
-- varchar group by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'group by' AS test_function,
COUNT(categoryid) AS num_recs,
categoryName
FROM
test.categories2
GROUP BY
categoryName
-- nvarchar group by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'group by' AS test_function,
COUNT(categoryid) AS num_recs,
NcategoryName
FROM
test.categories2
GROUP BY
NcategoryName
-- varchar order by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'order by' AS test_function,
categoryid,
categoryName
FROM
test.categories2
ORDER BY
categoryName DESC
-- nvarchar order by
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'order by' AS test_function,
categoryid,
ncategoryName
FROM
test.categories2
ORDER BY
ncategoryName DESC
-- varchar =
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'=' AS test_function,
categoryid,
categoryName
FROM
test.categories2
WHERE
categoryName = 'a'
-- nvarchar =
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'=' AS test_function,
categoryid,
ncategoryName
FROM
test.categories2
WHERE
NcategoryName = 'a'
-- varchar like %
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'Like %' AS test_function,
categoryid,
categoryName
FROM
test.categories2
WHERE
categoryName LIKE 'a%'
-- nvarchar like %
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'Like %' AS test_function,
categoryid,
ncategoryName
FROM
test.categories2
WHERE
ncategoryName LIKE 'a%'
-- varchar like _
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'varchar' AS data_type,
'Like _' AS test_function,
categoryid,
categoryName
FROM
test.categories2
WHERE
categoryName LIKE 'a_'
-- nvarchar like _
SELECT
CASE
WHEN SESSIONPROPERTY('ANSI_PADDING') = 0 THEN 'Off'
ELSE 'ON'
END AS ansi_padding_setting,
'nvarchar' AS data_type,
'Like _' AS test_function,
categoryid,
ncategoryName
FROM
test.categories2
WHERE
ncategoryName LIKE 'a_'
DROP TABLE test.categories2
DROP TABLE test.categories
The key, as always, is to understand the system you are working with and design and code against it appropriately. As noted by the Original Poster in the SSC thread, "Trim your strings before inserting".
Does anyone know why Microsoft decided to default to ON? Seems like no upside to me having to store all those spaces.
ReplyDeleteIt is the SQL92 Standard, https://support.microsoft.com/en-us/kb/316626/en-us
ReplyDelete