Why is strtok changing its input like this?

Solution 1

When strtok() finds a token, it changes the character immediately after the token into a \0, and then returns a pointer to the token. The next time you call it with a NULL argument, it starts looking after the separators that terminated the first token -- i.e., after the \0, and possibly further along.

Now, the original pointer to the beginning of the string still points to the beginning of the string, but the first token is now \0-terminated -- i.e., printf() thinks the end of the token is the end of the string. The rest of the data is still there, but that \0 stops printf() from showing it. If you used a for-loop to walk over the original input string up to the original number of characters, you'd find the data is all still there.

Solution 2

You should print out the tokens that strtok returns and not worry about the input array, because strtok overwrites delimiters in it with null characters ('\0'). You need repeated calls to get all of the tokens:

#include <string.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
  char input[]="this is a test of the tokenizor seven";
  char * temp;
  temp=strtok(input," ");        /* first call: pass the string itself */
  while( temp != NULL ) {
    printf("temp is \"%s\"\n", temp );
    temp = strtok( NULL, " ");   /* later calls: pass NULL to continue */
  }
  return 0;
}

Solution 3

It's because strtok overwrites the separator that follows each token with a null character, which is why you use repeated calls to strtok to get each token. The input string can no longer be used as a whole string once you start using strtok. You don't "fix" it -- this is how it works.

Author: user1209326

Updated on June 26, 2022

Comments

  • user1209326
    user1209326 about 2 years

    Ok, so I understand that strtok modifies its input argument, but in this case, it's collapsing down the input string into only the first token. Why is this happening, and what can I do to fix it? (Please note, I'm not talking about the variable "temp", which should be the first token, but rather the variable "input", which after one call to strtok becomes "this")

    #include <string.h>
    #include <stdlib.h>
    #include <stdio.h>
    
    int main(int argc, char* argv[]) {
       char input[]="this is a test of the tokenizor seven";
       char * temp;
       temp=strtok(input," ");
       printf("input: %s\n", input); //input is now just "this"
    }
    
  • user1209326
    user1209326 over 12 years
    Oh I see. My understanding of how strtok works was way off -- I assumed it chomped off the token and then slid the input pointer to the first character after the delimiter. At any rate, thank you! This was a very clear and helpful answer.
  • user1209326
    user1209326 over 12 years
    Thanks for such a quick response. Of course when I said "fix it" I meant "how do I get the result I desire," but I appreciate you taking the time to help me.
  • user1209326
    user1209326 over 12 years
    As I said above, clearly I had the wrong idea as to how strtok actually tokenized things. Thanks for your help!
  • Joe
    Joe over 12 years
    If you need an unaffected copy of the input string, then you need to make a copy of it before you strtok.
  • Cătălina Sîrbu
    Cătălina Sîrbu over 3 years
    But after strtok finishes and returns NULL (as there are no more tokens), is the initial string restored? Or, in order to use strtok safely, should you make a copy of the source string? Also, what happens to my original string if I stop calling strtok before it finishes?
  • Ernest Friedman-Hill
    Ernest Friedman-Hill over 3 years
    @CătălinaSîrbu If you need the original contents of the character buffer to be preserved, then yes, you’d need to make a copy. But in practice that’s rarely the case.
  • Cătălina Sîrbu
    Cătălina Sîrbu over 3 years
    I need one more clarification. I was reading this: "A very important remark has to be made here: the function modifies the string pointed to by the first argument (it places null characters at the ends of the tokens – but they’ll all be removed after the last invocation)." From what I understand, this is wrong: the source string will not be restored after the last invocation of strtok (meaning the invocation that returns NULL). Is that so?
  • Ernest Friedman-Hill
    Ernest Friedman-Hill over 3 years
    @CătălinaSîrbu Yes, that quote (where is it from?) is incorrect. strtok does not restore the original string under any circumstances. If it did, it would invalidate all the tokens it had created, meaning you’d have to copy them for them to be useful — which is not the case.
  • Cătălina Sîrbu
    Cătălina Sîrbu over 3 years
    Thank you very much! Now it is clear. It is a quote from the CLP Advanced programming in C course from cpp institute (they have plenty of mistakes, but it is ok because I always double check and pay more attention)