Wise man or Wise guy? You decide: Fun with a String

A friend recently contacted me for help with T-SQL. He had two tables: Table 1 has a numeric key, and those key values are embedded in a character column in Table 2, which lives in another database on the same server. His job was to parse the character column in Table 2 so he could join to the numeric key in Table 1. He originally said that the numeric data in the character column would start at position 7 and be 12 characters long, so his original query was something like this:

SELECT
    *
FROM
    db.schemaname.table1 AS T1
    JOIN db.schemaname.table2 AS T2
        ON T1.id = SUBSTRING(T2.character_col, 7, 12)

I replied that the query would work, but was he guaranteed that the data would ALWAYS start at position 7 and be 12 characters long? I also pointed out that wrapping the column in SUBSTRING in the join condition means the optimizer cannot seek on an index on that column. As I thought about the situation, I came up with this as probably the most flexible solution.

Here’s some test data:

/* Don't like to work in a user database */
USE tempdb ;
GO

/* create the table of numeric keys; BIGINT because 12+ digit values overflow INT */
CREATE TABLE #test1 (id BIGINT) ;

/* create a test table */
CREATE TABLE #test
    (
     string VARCHAR(50) NOT NULL
    ) ;

/* put in some data */
INSERT  INTO #test1
        (
         id            
        )
        SELECT
            123456789012
        UNION ALL
        SELECT
            234567890121
        UNION ALL
        SELECT
            345678901212
        UNION ALL
        SELECT
            456789012123
        UNION ALL
        SELECT
            12345678901
        UNION ALL
        SELECT
            23456789011
        UNION ALL
        SELECT
            1234567890123
        UNION ALL
        SELECT
            12345678901234 ;
        
INSERT  INTO #test
        (
         string            
        )
        SELECT
            'Jack 123456789012'
        UNION ALL
        SELECT
            '234567890121'
        UNION ALL
        SELECT
            'Jack 345678901212'
        UNION ALL
        SELECT
            'Jack abcdefg'
        UNION ALL
        SELECT
            'Jack 456789012123x'
        UNION ALL
        SELECT
            'Jack 12345678901'
        UNION ALL
        SELECT
            'Jack 23456789011a'
        UNION ALL
        SELECT
            'Jack 1234567890123'
        UNION ALL
        SELECT
            'Jack 12345678901234a' ;

/* Use PATINDEX inside SUBSTRING to find the first instance of a numeric character 
to set the starting index.  Use PATINDEX on the string REVERSE to find the
last instance of a numeric character. */        
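/* Compare as strings: converting T2's text to a number would raise a
conversion error on rows with no digits, such as 'Jack abcdefg'. */
SELECT
    *
FROM
    #test1 AS T1
    JOIN #test AS T2
        ON CAST(T1.id AS VARCHAR(20)) =
           SUBSTRING(T2.string, PATINDEX('%[0-9]%', T2.string),
                     LEN(T2.string) - PATINDEX('%[0-9]%', REVERSE(T2.string)) -
                     PATINDEX('%[0-9]%', T2.string) + 2) ;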
    
DROP TABLE #test1;    
DROP TABLE #test ;
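
The length arithmetic in that SUBSTRING call is easier to follow on a single value. Here is a minimal sketch that breaks the expression into its pieces and contrasts it with the fixed-position version; the variable @s and the column aliases are just illustrative names, not part of my friend's schema.

```sql
DECLARE @s VARCHAR(50) ;
SET @s = 'Jack 456789012123x' ;

SELECT
    /* position of the first digit, counted from the left: 6 */
    PATINDEX('%[0-9]%', @s) AS first_digit_pos,
    /* position of the last digit: LEN minus its offset from the right, plus 1: 17 */
    LEN(@s) - PATINDEX('%[0-9]%', REVERSE(@s)) + 1 AS last_digit_pos,
    /* last - first + 1, which simplifies to the "+ 2" form used in the join: 12 */
    LEN(@s) - PATINDEX('%[0-9]%', REVERSE(@s)) -
        PATINDEX('%[0-9]%', @s) + 2 AS digit_run_length,
    /* the flexible version returns '456789012123' */
    SUBSTRING(@s, PATINDEX('%[0-9]%', @s),
              LEN(@s) - PATINDEX('%[0-9]%', REVERSE(@s)) -
              PATINDEX('%[0-9]%', @s) + 2) AS flexible_key,
    /* the fixed-position version returns '56789012123x' */
    SUBSTRING(@s, 7, 12) AS fixed_position_key ;
```

Note that this only finds the first and last digit, so it assumes the digits form one unbroken run and that the value has no trailing spaces (LEN ignores them but REVERSE does not).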

This code works for this situation. Will it work in every situation? Probably not. Will it scale? Probably not, but it was an interesting exercise.
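
On the scaling question, one option I would consider, assuming you are allowed to alter Table 2, is to store the parsed value in a persisted computed column and index it, so the parsing happens when a row is written rather than on every join. This is only a sketch: the tables #keys and #strings, the column parsed_key, and the index name are hypothetical stand-ins, and whether the optimizer actually uses the index depends on the real data.

```sql
/* Hypothetical stand-ins for Table 1 and Table 2 */
CREATE TABLE #keys (id BIGINT NOT NULL) ;

CREATE TABLE #strings
    (
     string VARCHAR(50) NOT NULL,
     /* Same PATINDEX/REVERSE expression as the join, computed once per row */
     parsed_key AS SUBSTRING(string, PATINDEX('%[0-9]%', string),
                   LEN(string) - PATINDEX('%[0-9]%', REVERSE(string)) -
                   PATINDEX('%[0-9]%', string) + 2) PERSISTED
    ) ;

/* Index the parsed value so the optimizer has something to seek on */
CREATE INDEX IX_strings_parsed_key ON #strings (parsed_key) ;

INSERT  INTO #keys (id)
        SELECT 123456789012
        UNION ALL
        SELECT 23456789011 ;

INSERT  INTO #strings (string)
        SELECT 'Jack 123456789012'
        UNION ALL
        SELECT 'Jack 23456789011a'
        UNION ALL
        SELECT 'Jack abcdefg' ;

/* The join now compares against a plain indexed column */
SELECT
    T1.id,
    T2.string
FROM
    #keys AS T1
    JOIN #strings AS T2
        ON T2.parsed_key = CAST(T1.id AS VARCHAR(20)) ;

DROP TABLE #keys ;
DROP TABLE #strings ;
```

The trade-off is a little extra storage and write cost on Table 2 in exchange for not running PATINDEX and REVERSE against every row at query time.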

Do you have a better solution? If you do, post it in the comments or link to a blog post where you describe it.


Thursday, January 6, 2011
