How to find text between two strings in C

Solution 1

Here is a working example of how to do this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *s = "aaaaaa<BBBB>TEXT TO EXTRACT</BBBB>aaaaaaaaa";

    const char *PATTERN1 = "<BBBB>";
    const char *PATTERN2 = "</BBBB>";

    char *target = NULL;
    const char *start, *end;

    /* Find the opening marker, then look for the closing marker after it. */
    if ( ( start = strstr( s, PATTERN1 ) ) != NULL )
    {
        start += strlen( PATTERN1 );
        if ( ( end = strstr( start, PATTERN2 ) ) != NULL )
        {
            /* Copy the text between the markers and null-terminate it. */
            target = malloc( end - start + 1 );
            if ( target != NULL )
            {
                memcpy( target, start, end - start );
                target[end - start] = '\0';
            }
        }
    }

    if ( target ) printf( "%s\n", target );

    free( target );

    return 0;
}

The output is

TEXT TO EXTRACT

Solution 2

Just use strstr().

Call it once to find the start marker, then call it again with a pointer to the first character after the start marker to find the end marker:

#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of the text between p1 and p2,
   or NULL if either marker is missing or the allocation fails.
   The caller must free() the result. */
char * extract_between(const char *str, const char *p1, const char *p2)
{
  const char *i1 = strstr(str, p1);
  if(i1 != NULL)
  {
    const size_t pl1 = strlen(p1);
    const char *i2 = strstr(i1 + pl1, p2);
    if(i2 != NULL)
    {
     /* Found both markers, extract text. */
     const size_t mlen = i2 - (i1 + pl1);
     char *ret = malloc(mlen + 1);
     if(ret != NULL)
     {
       memcpy(ret, i1 + pl1, mlen);
       ret[mlen] = '\0';
       return ret;
     }
    }
  }
  return NULL;
}

Please test the above for off-by-ones, I wrote it pretty quickly.

This may not be optimal in terms of performance, but it is very simple to implement, get right, read and understand.
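If the allocation and the copy turn out to matter, one possible variation is to return a pointer into the original string together with a length instead of a new buffer. The sketch below is only an illustration of that idea: the names find_between and print_between are hypothetical and not part of the answer above, and it assumes the source string stays alive while the result is in use.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the answer above): locates the text
   between p1 and p2 without allocating. On success it returns a pointer
   into str and stores the length of the text in *len. It returns NULL
   if either marker is missing. The result is NOT null-terminated. */
const char *find_between(const char *str, const char *p1,
                         const char *p2, size_t *len)
{
    const char *start = strstr(str, p1);
    if (start == NULL)
        return NULL;

    /* Skip past the start marker before searching for the end marker. */
    start += strlen(p1);

    const char *end = strstr(start, p2);
    if (end == NULL)
        return NULL;

    *len = (size_t)(end - start);
    return start;
}

/* Hypothetical usage helper: prints the text between the markers.
   Returns 1 if both markers were found, 0 otherwise. */
int print_between(const char *str, const char *p1, const char *p2)
{
    size_t len;
    const char *text = find_between(str, p1, p2, &len);
    if (text == NULL)
        return 0;

    /* "%.*s" prints at most len characters, so no terminator is needed. */
    printf("%.*s\n", (int)len, text);
    return 1;
}
```

Calling print_between(s, "<BBBB>", "</BBBB>") on the example string from the question should print the text between the two markers. The trade-off is that the caller has to carry the length around and must not use the pointer after the source string is gone.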

Author by

Fco. Javier Martínez Conesa

Updated on June 04, 2022

Comments

  • Fco. Javier Martínez Conesa almost 2 years

    I need to extract the text between 2 string patterns in c.

    Example:

    aaaaaa<BBBB>TEXT TO EXTRACT</BBBB>aaaaaaaaa
    
    PATTERN1=<BBBB>
    PATTERN2=</BBBB>
    

    Thanks.

    • Jonny almost 10 years
      Can you use a regex library?
    • Fco. Javier Martínez Conesa almost 10 years
      @Jonny I can use a regex library.
    • Jonny almost 10 years
      Actually ignore that and use the strstr. Regex and XML is tricky at best.
  • The Paramagnetic Croissant almost 10 years
    Wrong use of calloc(), superfluous cast, missing 0-terminator causing undefined behavior, etc., etc...
  • unwind almost 10 years
    -1 for the reasons outlined by @user3477950, and also this just assumes that PATTERN2 comes after PATTERN1, if found. That's rather dodgy.
  • Vinicius Kamakura almost 10 years
    It's not missing the null terminator.