Why is strtok changing its input like this?
Solution 1
When strtok() finds a token, it changes the character immediately after the token into a \0, and then returns a pointer to the token. The next time you call it with a NULL argument, it starts looking after the separators that terminated the first token -- i.e., after the \0, and possibly further along.
Now, the original pointer to the beginning of the string still points to the beginning of the string, but the first token is now \0-terminated -- i.e., printf() thinks the end of the token is the end of the string. The rest of the data is still there, but that \0 stops printf() from showing it. If you used a for-loop to walk over the original input string up to the original number of characters, you'd find the data is all still there.
Solution 2
You should print out the token that strtok returns and not worry about the input array, because strtok writes null characters (\0) into it. You need repeated calls to get all of the tokens:
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
    char input[] = "this is a test of the tokenizor seven";
    char *temp;

    /* First call: pass the buffer to tokenize. */
    temp = strtok(input, " ");
    while (temp != NULL) {
        printf("temp is \"%s\"\n", temp);
        /* Later calls: pass NULL to continue in the same buffer. */
        temp = strtok(NULL, " ");
    }
    return 0;
}
Solution 3
It's because strtok overwrites the separator that follows each token with a null character, which is why you use repeated calls to strtok to get each token. Once you start using strtok, the input can no longer be read as the original whole string. You don't "fix" it -- this is how it works.
user1209326
Updated on June 26, 2022
Comments
-
user1209326 about 2 years
Ok, so I understand that strtok modifies its input argument, but in this case, it's collapsing down the input string into only the first token. Why is this happening, and what can I do to fix it? (Please note, I'm not talking about the variable "temp", which should be the first token, but rather the variable "input", which after one call to strtok becomes "this")
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
    char input[]="this is a test of the tokenizor seven";
    char * temp;
    temp=strtok(input," ");
    printf("input: %s\n", input); //input is now just "this"
}
-
user1209326 over 12 years
Oh I see. My understanding of how strtok works was way off -- I assumed it chomped off the token and then slid the input pointer to the first character after the delimiter. At any rate, thank you! This was a very clear and helpful answer.
-
user1209326 over 12 years
Thanks for such a quick response. Of course when I said "fix it" I meant "how do I get the result I desire," but I appreciate you taking the time to help me.
-
user1209326 over 12 years
As I said above, clearly I had the wrong idea as to how strtok actually tokenized things. Thanks for your help!
-
Joe over 12 years
If you need an unaffected copy of the input string, then you need to make a copy of it before you strtok.
-
Cătălina Sîrbu over 3 years
But after strtok finishes and returns NULL (as there are no more tokens), is the initial string restored? Or, in order to safely use strtok, should you make a copy of the source string? Also, what will happen to my original string if I stop calling strtok before it finishes?
-
Ernest Friedman-Hill over 3 years
@CătălinaSîrbu If you need the original contents of the character buffer to be preserved, then yes, you’d need to make a copy. But in practice that’s rarely the case.
-
Cătălina Sîrbu over 3 years
I need one more clarification. I was reading this: "A very important remark has to be made here: the function modifies the string pointed to by the first argument (it places null characters at the ends of the tokens, but they’ll all be removed after the last invocation)." From what I understand, this is wrong: the source string will not be restored after the last invocation of strtok (meaning the invocation that returns NULL). Is that so?
-
Ernest Friedman-Hill over 3 years
@CătălinaSîrbu Yes, that quote (where is it from?) is incorrect. strtok does not restore the original string under any circumstances. If it did, it would invalidate all the tokens it had created, meaning you’d have to copy them for them to be useful, which is not the case.
-
Cătălina Sîrbu over 3 years
Thank you very much! Now it is clear. It is a quote from the CLP Advanced programming in C course from cpp institute (they have plenty of mistakes, but that is OK because I always double-check and pay more attention).