Tokenizing strings in C
Solution 1
Do it like this:
char s[256];
strcpy(s, "one two three");
char* token = strtok(s, " ");
while (token) {
    printf("token: %s\n", token);
    token = strtok(NULL, " ");
}
Note: strtok modifies the string it's tokenising, so it cannot be a const char*.
Solution 2
Here's an example of strtok usage. Keep in mind that strtok is destructive of its input string (and therefore can't ever be used on a string constant):
char *p = strtok(str, " ");
while(p != NULL) {
    printf("%s\n", p);
    p = strtok(NULL, " ");
}
Basically the thing to note is that passing a NULL as the first parameter to strtok tells it to get the next token from the string it was previously tokenizing.
Solution 3
strtok can be very dangerous. It is not thread safe. Its intended use is to be called over and over in a loop, passing NULL after the first call so that it resumes where the previous call stopped. To do that, strtok keeps an internal variable that stores the state of the tokenization. In the classic implementation this state is not unique to each thread; it is global. If any other code uses strtok in another thread, you get problems. Not the kind of problems you want to track down either!
I'd recommend looking for a regex implementation, or using sscanf to pull apart the string.
Try this:
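char strprint[256];
char text[256];
const char *p = text;
int n = 0;
strcpy(text, "My string to test");
/* %255s bounds the write into strprint; %n stores how many characters were consumed */
while (sscanf(p, "%255s%n", strprint, &n) == 1) {
    printf("token: %s\n", strprint);
    p += n;   /* step past the token just read */
}
Note: Unlike strtok, this leaves the 'text' string intact, because only the pointer p advances. %s splits on any whitespace, not just spaces.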
Solution 4
I've written some string functions to split values, using as few pointers as I could because this code is intended to run on PIC18F processors. Those processors do not handle pointers very well when little free RAM is available:
#include <stdio.h>
#include <string.h>
char POSTREQ[255] = "pwd=123456&apply=Apply&d1=88&d2=100&pwr=1&mpx=Internal&stmo=Stereo&proc=Processor&cmp=Compressor&ip1=192&ip2=168&ip3=10&ip4=131&gw1=192&gw2=168&gw3=10&gw4=192&pt=80&lic=&A=A";
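
/* Return the index of the first occurrence of C strictly after Start, or -1. */
int findchar(char *string, int Start, char C) {
    while (string[Start] != 0) {
        Start++;
        if (string[Start] == C) return Start;
    }
    return -1;
}

/* Return the index of the Times-th occurrence of C (index 0 is never examined), or -1. */
int findcharn(char *string, int Times, char C) {
    int i = 0, pos = 0, fnd = 0;
    while (i < Times) {
        fnd = findchar(string, pos, C);
        if (fnd < 0) return -1;
        if (fnd > 0) pos = fnd;
        i++;
    }
    return fnd;
}

/* Copy in[start + 1] .. in[end] into out and NUL-terminate it. */
void mid(char *in, char *out, int start, int end) {
    int i = 0;
    int size = end - start;
    for (i = 0; i < size; i++) {
        out[i] = in[start + i + 1];
    }
    out[size] = 0;
}

/* Copy the value of the index-th key=value pair (1-based) into out. */
void getvalue(char *out, int index) {
    int eq = findcharn(POSTREQ, index, '=');
    int amp = findcharn(POSTREQ, index, '&');
    if (eq < 0) { out[0] = 0; return; }        /* no such pair */
    if (amp < 0) amp = (int)strlen(POSTREQ);   /* the last pair has no trailing '&' */
    mid(POSTREQ, out, eq, amp - 1);
}

int main(void) {
    char n_pwd[7];
    char n_d1[7];
    getvalue(n_d1, 1);   /* index 1 selects the first pair, so this prints 123456 */
    printf("Value: %s\n", n_d1);
    return 0;
}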
Solution 5
You can simplify the code by introducing an extra variable.
#include <string.h>
#include <stdio.h>
int main()
{
    char str[100], *s = str, *t = NULL;
    strcpy(str, "a space delimited string");
    while ((t = strtok(s, " ")) != NULL) {
        s = NULL;
        printf(":%s:\n", t);
    }
    return 0;
}
kombo
Updated on July 09, 2022
Comments
-
kombo almost 2 years
I have been trying to tokenize a string using SPACE as the delimiter, but it doesn't work. Does anyone have a suggestion as to why it doesn't work?
Edit: tokenizing using:
strtok(string, " ");
The code is like the following
pch = strtok (str," ");
while (pch != NULL) {
    printf ("%s\n",pch);
    pch = strtok (NULL, " ");
}
-
Edward Kmett over 15 years
Are you using strtok or something you grew yourself? cplusplus.com/reference/clibrary/cstring/strtok.html If you are using strtok, are you trying to do it on a constant string?
-
dmckee --- ex-moderator kitten over 15 years
OK. Now we're getting somewhere. What behavior do you expect that you are not getting?
-
dmckee --- ex-moderator kitten over 15 years
BTW, kombo. Many people who work help desks or teach see the phrase "it doesn't work" as marking a user who hasn't read the furnished manual, or doesn't know what they actually want, or is deeply confused. The form you want is "I'm doing X, and I expected Y, but I got Z. What's wrong?"
-
Jonathan Leffler over 15 years
@dmckee: good point. Canonical x-ref: catb.org/~esr/faqs/smart-questions.html
-
Will Dean over 15 years
In fact, if you look at modern strtok implementations, they tend to use thread-local storage (MSVC has certainly done this for years and years), so they are thread-safe. It's still an archaic function which I would avoid, though...
-
Jason over 12 years
strtok has an internal state variable tracking the string being tokenized. When you pass NULL to it, strtok will continue to use this state variable. When you pass a non-null value, the state variable is reset. So in other words: passing NULL means "continue tokenizing the same string".
-
Jason over 12 years
You're right, that's why many implementations offer strtok_r, which at the very least offers a way to use it in a thread-safe way.
-
Massimo Fazzolari almost 12 years
strtok_r is a thread-safe version of strtok: pubs.opengroup.org/onlinepubs/009695399/functions/strtok.html
-
Nilo Paim over 10 years
Nice, @jitsceait, but what happens if I have two delimiters together on input? I'll change your code a little.
-
jitsceait over 10 years
I think I have added a test case for consecutive delimiters and it was working. Could you please highlight the code you have changed?
-
Jason over 9 years
@Gnuey, p will point to characters in the string being tokenized. Additionally, strtok replaces the delimiter found with a '\0' character so that p will effectively be a valid NUL-terminated string. So if you were to run it on char s[] = "hello world"; the first call would return a pointer to the h character and the buffer would then contain "hello\0world".
-
S.S. Anne over 4 years
I agree with the first paragraph but the sentence after that is terrible. scanf is hard to use properly, as shown in your example; you forget to pass a size (%255s).
-
rje almost 3 years
strtok() is fine for non-threaded legacy systems, though. Archaic code for retro systems.