Using PATINDEX to find varying length patterns in T-SQL

48,216

Solution 1

I blogged about this a while ago. Extracting numbers with SQL server

Declare @Temp Table(Data VarChar(100))

Insert Into @Temp Values('some text 456.09 other text')
Insert Into @Temp Values('even more text 98273.453 la la la')
Insert Into @Temp Values('There are no numbers in this one')

Select Left(
             SubString(Data, PatIndex('%[0-9.-]%', Data), 8000),
             PatIndex('%[^0-9.-]%', SubString(Data, PatIndex('%[0-9.-]%', Data), 8000) + 'X')-1)
From   @Temp

Solution 2

Wildcards.

SELECT PATINDEX('%[0-9]%[0-9].[0-9]%[0-9]%','some text 456.09 other text')
SELECT PATINDEX('%[0-9]%[0-9].[0-9]%[0-9]%','even more text 98273.453 la la la')

Solution 3

Yes you need to link to the clr to get regex support. But if PATINDEX does not do what you need then regex was designed exactly for that.

http://msdn.microsoft.com/en-us/magazine/cc163473.aspx

Solution 4

Should be checked for robustness (what if you only have an int, for example), but this is just to put you on a track:

if exists (select routine_name from information_schema.routines where routine_name = 'GetFirstFloat')
    drop function GetFirstFloat
go

create function GetFirstFloat (@string varchar(max))
returns float
as
begin
    declare @float varchar(max)
    declare @pos int

    select @pos = patindex('%[0-9]%', @string)
    select @float = ''

    while isnumeric(substring(@string, @pos, 1)) = 1
    begin
        select @float = @float + substring(@string, @pos, 1)
        select @pos = @pos + 1
    end

    return cast(@float as float)
end
go


select dbo.GetFirstFloat('this is a string containing pi 3.14159216 and another non float 3 followed by a new fload 5.41 and that''s it')
select dbo.GetFirstFloat('this is a string with no float')
select dbo.GetFirstFloat('this is another string with an int 3')
Share:
48,216
Rich
Author by

Rich

A software engineer

Updated on July 05, 2022

Comments

  • Rich
    Rich almost 2 years

    I'm looking to pull floats out of some varchars, using PATINDEX() to spot them. I know in each varchar string, I'm only interested in the first float that exists, but they might have different lengths.

    e.g.

    'some text 456.09 other text'
    'even more text 98273.453 la la la'
    

    I would normally match these with a regex

      "[0-9]+[.][0-9]+"
    

    However, I can't find an equivalent for the + operator, which PATINDEX accepts. So they would need to be matched (respectively) with:

    '[0-9][0-9][0-9].[0-9][0-9]' and '[0-9][0-9][0-9][0-9][0-9].[0-9][0-9][0-9]' 
    

    Is there any way to match both of these example varchars with one single valid PATINDEX pattern?