Extract substring from string in SQL

15,743

Solution 1

Not a regex solution and I'm still a SQL novice so may not be optimal but you should be able to parse with a WHILE loop using

CHARINDEX for the *** then using that as a starting point to
CHARINDEX to the LF Use that as the starting point for a
SUBSTRING with the ending point being a CHARINDEX of the next ***
concatenate the Substring to your output, move past the ending *** and loop to find the next one.

I'll play with it some and see if I can add an example.
EDIT - This probably needs more error checking

declare @inText nvarchar(2000) = 'some text 
some text 
***[some text] 
THIS SHOULD BE EXTRACTED 
***[some text] 
some text 
some text 
some text 
some text 
some text 
***[some text] 
THIS SHOULD BE EXTRACTED TOO 
***[some text] 
some text '

declare @delim1 nvarchar(50) = '***'
declare @delim2 char = char(10)
declare @output nvarchar(1000) = ''
declare @position int
declare @positionEnd int

set @position = CHARINDEX(@delim1,@inText)
while (@position != 0 and @position is not null)
BEGIN
  set @position = CHARINDEX(@delim2,@inText,@position)
  set @positionEnd = CHARINDEX(@delim1,@inText,@position)
  set @output = @output + SUBSTRING(@inText,@position,@positionEnd-@position)
  set @position = CHARINDEX(@delim1,@inText,@positionEnd+LEN(@delim1))
END
select @output

Solution 2

For the somewhat easier case raised in the comments you could do

;WITH T(C) AS
(
 SELECT '
    some text
    some text
    ***[some text 1]
    THIS SHOULD BE EXTRACTED
    ***[some text 2]
    some text
    some text
    some text
    some text
    some text
    ***[some text 1]
    THIS SHOULD BE EXTRACTED TOO
    ***[some text 2]
    some text'
)
SELECT col.value('.','varchar(max)')
FROM T
CROSS APPLY (SELECT CAST('<a keep="false">' + 
                        REPLACE(
                            REPLACE(C,'***[some text 2]','</a><a keep="false">'),
                        '***[some text 1]','</a><a keep="true">') + 
                    '</a>' AS xml) as xcol) x
CROSS APPLY xcol.nodes('/a[@keep="true"]') tab(col)

Solution 3

I may be wrong but I don't think there's a clean way to do this directly in SQL. I would use a CLR stored procedure and use regular expressions from C# or your .NET language of choice.

See this article (or this article) for a relevant example using regexes.

Solution 4

You can find this in my blog: http://sql-tricks.blogspot.com/2011/04/extract-strings-with-delimiters.html It is pure solution with no additional modification, only delimiters sequences should be decalred.

Share:
15,743
Func
Author by

Func

Updated on June 13, 2022

Comments

  • Func
    Func almost 2 years

    I need to extract a text that is surrounded by ***[some text] strings, like in the following example:

    some text
    some text
    ***[some text]
    THIS SHOULD BE EXTRACTED
    ***[some text]
    some text
    some text
    some text
    some text
    some text
    ***[some text]
    THIS SHOULD BE EXTRACTED TOO
    ***[some text]
    some text
    

    the output should be:

    THIS SHOULD BE EXTRACTED
    THIS SHOULD BE EXTRACTED TOO
    

    I tried PATINDEX like here, but couln't find the way to extract the string.

    PATINDEX('%[*][*][*][[]%]%%[*][*][*][[]%]%',@Text)
    

    I am looking forward to hearing any suggestions.

  • Func
    Func about 13 years
    Unfortunately I cannot use C# or CLR stored procedures, I can perform only select statements...
  • Martin Smith
    Martin Smith about 13 years
    I think you could probably use this as a base then call the UDF using CROSS APPLY
  • Dalex
    Dalex about 13 years
    I tested your solution vs my which is above. It takes 98% of batch. My just only 2%. I think it is because of XML.
  • Martin Smith
    Martin Smith about 13 years
    Yes the XML reader functions definitely don't have great performance. Additionally my answer would need some more REPLACE operations in there if the text might contain characters such as < in order to replace them with the corresponding XML entities.
  • DKnight
    DKnight about 13 years
    @Martin - oooo more reading material - thanks for the feedback it is very much appreciated
  • Func
    Func about 13 years
    Thanks your solution is nice but i get Invalid length parameter passed to the SUBSTRING function when I try to change delimiters to meet my needs. Also it is a bit slow or my server is slow, I am not sure.
  • Dalex
    Dalex about 13 years
    Please write your delimiters. I will fix it.
  • Robert Lujo
    Robert Lujo over 5 years
    The link of the article msdn.microsoft.com/en-us/magazine/cc163473.aspx is broken :(
  • Paolo Falabella
    Paolo Falabella over 5 years
    @RobertLujo I found another example here at blogs.msdn.microsoft.com/sqlclr/2005/06/29/…