SQL Nested Replace

18,096

Solution 1

The problem is that T-SQL does not easily allow to mark expressions with a name so you can refer to them from a different place. There is a way to do this though:

select replaceN
from T
cross apply (select replace1 = replace(T.col, 'x', 'y')) r1
cross apply (select replace2 = replace(replace1, 'x', 'y')) r2
cross apply (select replace3 = replace(replace2, 'x', 'y')) r3
...

This at least gets rid of the crazy nesting. It has no negative performance impact.

Solution 2

Maybe use a function to strip the non-numeric characters:

Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin

    While PatIndex('%[^a-z]%', @Temp) > 0
        Set @Temp = Stuff(@Temp, PatIndex('%[^a-z]%', @Temp), 1, '')

    Return @Temp
End

Then you will reference this function in your join:

select a.col1 a, b.col1 b
from tablea a
left join tableb b
  on dbo.RemoveNonAlphaCharacters(a.col1) = b.col1

See SQL Fiddle with Demo

Share:
18,096

Related videos on Youtube

dopplesoldner
Author by

dopplesoldner

Updated on September 21, 2022

Comments

  • dopplesoldner
    dopplesoldner over 1 year

    I have a complex nested replace which I am using to join two tables in MSSQL.

    select * from A
      left outer join 
    select * from B
    on
      replace(
         replace(
            replace(
               replace(
                  replace(A.Column1, '1114', ''),
                 '1160', ''), 
              '1162', ''),
            '1167', ''),
          '1176', ''),
        '1177', '')  = B.Column1
    

    The whole reason I am doing this is because data in Table1 contains of some noise - numbers like 1160, 1162 etc wheres Table2 is clean characters.

    Eg. - Table 1 - 'HELLO1160WORLD'
          Table 2 - 'HELLOWORLD'
    

    Now in my situation I should be able to match them as one entry.

    My current approach of nested replace does work but I am not convinced that this is an elegant way to do this. Any help will be much appreciated. Thanks

    • swasheck
      swasheck over 11 years
      is 'HELLO1160WORLD' a valid piece of data? if yes, bummer. if no, why not sanitize on the front end?
  • dopplesoldner
    dopplesoldner over 11 years
    Thanks, that would be an elegant solution. The only problem is that I don't want to remove all non-apha numerics. Sorry for not being clear. There could be cases such as HELLO1114WORLD1 and HELLOWORLD1 I have a very specific set of values (transaction codes to be specific) which I would like to remove
  • Taryn
    Taryn over 11 years
    @dopplesoldner so those two values would not be a match then? Can you post a larger data sample and then the expected result?
  • dopplesoldner
    dopplesoldner over 11 years
    Hi @bluefeet I would like the following values to be treated as matching HELLO1114WORLD1 HELLOWORLD1 M125ICKY1114MOUSE M125ICKYMOUSE RONA1160DINHO10 RONALDINHO10 Currently by using my nested replace, I am replacing certain known identifiers - eg 1114, 1160 to give me exact matches. Hope it helps and thanks again!
  • Taryn
    Taryn over 11 years
    @dopplesoldner can you apply the function to both columns? similar to this demo -- sqlfiddle.com/#!3/11e59/1
  • dopplesoldner
    dopplesoldner over 11 years
    But that would match M125ICKY1114MOUSE to M999ICKYMOUSE ?
  • dopplesoldner
    dopplesoldner over 11 years
    Btw thanks for showing me SQL FIddle, I had never heard of it!
  • dopplesoldner
    dopplesoldner over 11 years
    But this way, I am still left with the nasy nested loop. (which I am trying to avoid in the first place)
  • PowerUser
    PowerUser over 11 years
    Correct, but it would at least move the nested loop out of the Join and into it's own section.
  • Jonnie
    Jonnie about 6 years
    This is a good solution... once I understood it. "replaceN" being the highest numbered replace in the sequence, in this example it is replace3