Splitting a string with strtok_s

15,425

Solution 1

The vector is used incorrectly. It is constructed with 15 elements, and push_back() then appends each string after those initial 15. The first 15 elements are therefore never assigned and remain null:

std::cout << parts[0] << std::endl; // parts[0] is null

Either:

  • don't preallocate elements at construction, or
  • use operator[] and not push_back() (add an extra loop condition to guard against going beyond the end of the vector)

(Consider changing to std::vector<std::string>.)
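As a minimal sketch of the problem and the two options above (the function names and the literal tokens are made up for illustration, not taken from the question): the first function reproduces the bug, where the appended tokens land at index 15 and beyond while parts[0] stays null, and streaming a null char* to std::cout is what triggers the access violation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Reproduces the problem: 15 null slots first, then the appended tokens.
std::vector<const char*> preallocatedThenPushed()
{
    std::vector<const char*> parts(15);   // size() is already 15; every element is nullptr
    parts.push_back("A");                 // lands at index 15, not index 0
    parts.push_back("string");            // index 16
    return parts;
}

// Option 1: start empty and let push_back() grow the vector.
std::vector<std::string> startEmpty()
{
    std::vector<std::string> parts;
    parts.push_back("A");                 // index 0
    parts.push_back("string");            // index 1
    return parts;
}

// Option 2: keep the fixed-size vector, assign by index, and stop at the end.
std::vector<std::string> assignByIndex(const std::vector<std::string>& tokens)
{
    std::vector<std::string> parts(15);
    for (std::size_t i = 0; i < tokens.size() && i < parts.size(); ++i)
        parts[i] = tokens[i];
    return parts;
}
```

With std::vector<std::string>, an unassigned slot is an empty string rather than a null pointer, so printing it is harmless.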

Also worth mentioning: boost::split() can produce a list of tokens (std::vector<std::string>) from an input string and permits specification of multiple delimiters.

Solution 2

A general word of caution: you're using the strtok family of functions in C++. Note that this old API modifies its argument. This is frequently not what you expect, and for this reason I'd advise against using this C library function.
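To illustrate the point about the argument being modified, here is a small sketch. It uses the portable std::strtok rather than the Microsoft-specific strtok_s from the question, and tokenizeInPlace is a hypothetical helper name. Each delimiter that ends a token is overwritten with a null character, so the original buffer no longer reads as the full string afterwards.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Tokenizes `buffer` in place with std::strtok and returns copies of the tokens.
// The buffer must be writable; it is altered as a side effect.
std::vector<std::string> tokenizeInPlace(char* buffer, const char* seps)
{
    std::vector<std::string> tokens;
    // Store each token before asking for the next one, and stop on nullptr.
    for (char* tok = std::strtok(buffer, seps); tok != nullptr; tok = std::strtok(nullptr, seps))
        tokens.push_back(tok);
    return tokens;
}
```

Called on a writable array such as char buf[] = "A string,of tokens" with separators " ,", this returns four tokens, and buf itself now reads as just "A" because buf[1] has been overwritten with '\0'.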

Furthermore, you are assuming that 15 elements will be read, leaving the 'surplus' elements as null pointers. Printing those elements also results in undefined behaviour.


May I suggest a C++ approach, since you are using it:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

using namespace std;

// Splits the input on whitespace and returns the tokens as strings.
vector<std::string> splitString(const char in[])
{
    std::istringstream iss(in);
    // Each read through istream_iterator<std::string> yields one whitespace-delimited token.
    std::istream_iterator<std::string> first(iss), last;

    std::vector<std::string> parts;
    std::copy(first, last, std::back_inserter(parts));
    return parts;
}

int main(int argc, char * argv[])
{
    const char string1[] = "A string\tof ,,tokens\nand some  more tokens";
    vector<std::string> parts = splitString(string1);
    cout << parts[0] <<endl;
    cout << parts[1] <<endl;
    return 0;
}

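This relies on the fact that, by default, iostreams skip whitespace (the skipws flag). Note that only whitespace separates tokens here, so commas are not treated as delimiters.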
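If the comma from the question's separator set (" ,\t\n") should also act as a delimiter, one possible standard-library-only sketch uses std::string::find_first_of and find_first_not_of. This is a different technique from the stream-based version, and splitOnAny is a hypothetical helper name, not something from the original answer.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `in` on any character found in `seps`,
// skipping empty tokens (runs of delimiters count as one separator).
std::vector<std::string> splitOnAny(const std::string& in, const std::string& seps = " ,\t\n")
{
    std::vector<std::string> parts;
    std::string::size_type start = in.find_first_not_of(seps);
    while (start != std::string::npos)
    {
        std::string::size_type end = in.find_first_of(seps, start);
        // When end is npos, substr() takes everything up to the end of the string.
        parts.push_back(in.substr(start, end - start));
        start = in.find_first_not_of(seps, end);
    }
    return parts;
}
```

Unlike strtok, this leaves the input untouched and returns owned std::string copies, at the cost of one allocation per token.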

Author by

bogus

Updated on June 04, 2022

Comments

  • bogus
    bogus about 2 years

    I am trying to split a string by a specified delimiter according to this example: http://msdn.microsoft.com/en-us/library/ftsafwz3(v=VS.90).aspx

    My code compiles without errors in Visual C++ 2010, but when I want to run it, I get this error message:

    Unhandled exception at 0x773a15de in Test.exe: 0xC0000005: Access violation reading location 0x00000000.

    Here's my code:

    #include "stdafx.h"
    #include <iostream>
    #include <fstream>
    #include <string>
    #include <sstream>
    #include <regex>
    
    using namespace std;
    
    vector<char *> splitString(char in[])
    {
    vector<char *> parts(15);
    char seps[]   = " ,\t\n";
    char *next_token1 = NULL;
    char *token1 = NULL;
    token1 = strtok_s(in, seps, &next_token1);
    while ((token1 != NULL))
    {
        if (token1 != NULL)
        {
            token1 = strtok_s( NULL, seps, &next_token1);
                        //printf( " %s\n", token1 );
            parts.push_back(token1);
        }
    }
    return parts;
    }
    
    int main(int argc, char * argv[])
    {
    char string1[] =
        "A string\tof ,,tokens\nand some  more tokens";
    vector<char *> parts=splitString(string1);
    cout << parts[0] <<endl;
    cout << parts[1] <<endl;
    return 0;
    }
    

    It seems to be illegal that I try to display the vector's elements, but why?

    The vector's capacity should be sufficient and a

    printf( " %s\n", token1 );

    in the while loop prints out the tokens!

  • ForEveR
    ForEveR over 11 years
    "The trouble is, you're using strtok (family of) functions to modify an unmodifiable string literal." There is no modifying of string literals in the OP's code.
  • sehe
    sehe over 11 years
    @ForEveR (retracted: I think you may be wrong.) MSDN says: "Each call to strtok_s modifies strToken by inserting a null character after the token returned by that call." EDIT: Oh wait, I see what you meant. string1 is initialized from a literal, not pointing to the actual data. Fixing.
  • bogus
    bogus over 11 years
    I incorrectly did this to reserve slots in advance to increase efficiency. Thanks for your advice, it solves my prob!
  • da_m_n
    da_m_n over 10 years
    The vector class has a method to do just that: reserve(slots). See cplusplus.com/reference/vector/vector/reserve for more details.
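A short sketch of the reserve() approach mentioned in the last comment (reserveThenPush and the literal tokens are illustrative only): reserve() sets aside capacity without creating elements, so push_back() still starts filling at index 0.

```cpp
#include <string>
#include <vector>

// Reserves room for 15 tokens up front, then appends as usual.
std::vector<std::string> reserveThenPush()
{
    std::vector<std::string> parts;
    parts.reserve(15);            // capacity() is at least 15, size() is still 0
    parts.push_back("A");         // becomes parts[0]
    parts.push_back("string");    // becomes parts[1]
    return parts;
}
```

This keeps the efficiency goal of allocating once while avoiding the null placeholder elements that vector<char *> parts(15) creates.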