How to find text between two strings in C
Solution 1
Here is a working example of how to do this:
#include <stdio.h>
#include <stdlib.h>   /* malloc, free */
#include <string.h>

int main(void)
{
    const char *s = "aaaaaa<BBBB>TEXT TO EXTRACT</BBBB>aaaaaaaaa";
    const char *PATTERN1 = "<BBBB>";
    const char *PATTERN2 = "</BBBB>";

    char *target = NULL;
    const char *start, *end;

    /* Find the start marker, then step past it. */
    if ( ( start = strstr( s, PATTERN1 ) ) != NULL )
    {
        start += strlen( PATTERN1 );
        /* Search for the end marker only after the start marker. */
        if ( ( end = strstr( start, PATTERN2 ) ) != NULL )
        {
            target = malloc( end - start + 1 );
            if ( target != NULL )
            {
                memcpy( target, start, end - start );
                target[end - start] = '\0';
            }
        }
    }

    if ( target ) printf( "%s\n", target );
    free( target );
    return 0;
}
The output is
TEXT TO EXTRACT
Solution 2
Just use strstr(). Call it once to find the start marker, then call it again with a pointer to the first character after the start marker to find the end marker:
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of the text between p1 and p2;
   the caller must free() it. */
char *extract_between(const char *str, const char *p1, const char *p2)
{
    const char *i1 = strstr(str, p1);
    if (i1 != NULL)
    {
        const size_t pl1 = strlen(p1);
        const char *i2 = strstr(i1 + pl1, p2);
        if (i2 != NULL)
        {
            /* Found both markers, extract text. */
            const size_t mlen = i2 - (i1 + pl1);
            char *ret = malloc(mlen + 1);
            if (ret != NULL)
            {
                memcpy(ret, i1 + pl1, mlen);
                ret[mlen] = '\0';
                return ret;
            }
        }
    }
    return NULL;   /* a marker is missing or the allocation failed */
}
Please test the above for off-by-ones, I wrote it pretty quickly.
This may not be optimal in performance, but it is very simple to implement, get right, read, and understand.
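If the copy itself is the cost you want to avoid, the same two strstr() calls can return a pointer into the original string together with a length, so nothing is allocated. The sketch below is only an illustration of that idea: the name find_between and its len out-parameter are hypothetical and do not come from either answer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the answers above): locate the text between
 * the first occurrence of p1 and the next occurrence of p2 after it.
 * Returns a pointer into str and stores the length in *len, or returns NULL
 * (leaving *len untouched) if either marker is missing.
 * Nothing is allocated or copied, and the span is NOT null-terminated. */
const char *find_between(const char *str, const char *p1, const char *p2,
                         size_t *len)
{
    const char *begin = strstr(str, p1);
    if (begin == NULL)
        return NULL;
    begin += strlen(p1);              /* skip past the start marker */

    const char *end = strstr(begin, p2);
    if (end == NULL)
        return NULL;

    *len = (size_t)(end - begin);     /* 0 if the markers are adjacent */
    return begin;
}
```

Because the returned span is not null-terminated, a caller would print it with a precision, for example printf("%.*s\n", (int)len, text), and it stays valid only as long as the original string does.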
Author: Fco. Javier Martínez Conesa
Updated on June 04, 2022
Comments
-
Fco. Javier Martínez Conesa almost 2 years
I need to extract the text between two string patterns in C.
Example:
aaaaaa<BBBB>TEXT TO EXTRACT</BBBB>aaaaaaaaa
PATTERN1=<BBBB>
PATTERN2=</BBBB>
Thanks.
-
Jonny almost 10 years
Can you use a regex library?
-
Fco. Javier Martínez Conesa almost 10 years
@Jonny I can use a regex library.
-
Jonny almost 10 years
Actually, ignore that and use strstr. Regex and XML are tricky at best.
-
The Paramagnetic Croissant almost 10 years
Wrong use of calloc(), superfluous cast, missing 0-terminator causing undefined behavior, etc., etc...
-
unwind almost 10 years
-1 for the reasons outlined by @user3477950, and also this just assumes that PATTERN2 comes after PATTERN1, if found. That's rather dodgy.
-
Vinicius Kamakura almost 10 years
It's not missing the null terminator.