How to split a text file into words?

Solution 1

Just got this right!! Just removed all unnecessary code.

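#include <cstring>
#include <fstream>
#include <iostream>
using namespace std;

// AddWord() and showData() are my own functions; their code is not shown here.

int main()
{
    ifstream in("example.txt");
    int LineCount = 0;
    const int BufSize = 500;
    char* str = new char[BufSize];

    // Test the read itself, so a failed read at end of file is not counted as a line.
    while(in.getline(str, BufSize))
    {
        LineCount++;
        // Use the same delimiter set for every strtok call.
        char * tempPtr = strtok(str," ,.");
        while(tempPtr)
        {
            AddWord(tempPtr, LineCount);
            tempPtr = strtok(NULL," ,.");
        }
    }
    in.close();
    delete [] str;
    cout<<"Total No of lines:"<<LineCount<<endl;
    showData();

    return 0;
}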

BTW, the original problem statement was to create an index program that accepts a user file and builds a line index of all the words in it.
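
The bodies of AddWord and showData are not shown above, so here is only a rough sketch of how such a line index could be stored, assuming a std::map from each word to the set of line numbers it appears on. The names WordIndex, add_word and format_index are made up for this illustration and are not the functions from the answer.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical index type: each word maps to the sorted set of
// line numbers on which it appears.
using WordIndex = std::map<std::string, std::set<int>>;

// Hypothetical stand-in for AddWord: record that `word` occurs on `line_no`.
// Inserting into a set means a word repeated on one line is stored once.
void add_word(WordIndex& index, const std::string& word, int line_no) {
    assert(line_no > 0);  // line numbers are assumed to be 1-based
    index[word].insert(line_no);
}

// Hypothetical stand-in for showData: format the index as "word: 1 3" lines,
// with words in the map's (byte-wise) sorted order.
std::string format_index(const WordIndex& index) {
    std::ostringstream out;
    for (const auto& entry : index) {
        out << entry.first << ":";
        for (int line_no : entry.second) {
            out << ' ' << line_no;
        }
        out << '\n';
    }
    return out.str();
}
```

With this layout the tokenizing loop would call add_word(index, tempPtr, LineCount) for each token and print format_index(index) at the end. Note that std::map compares strings byte by byte, so capitalized words sort before lowercase ones.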

Solution 2

I have not tried compiling this, but here's an alternative that is nearly as simple as using Boost, but without the extra dependency.

#include <iostream>
#include <sstream>
#include <string>

int main() {
  std::string line;
  while (std::getline(std::cin, line)) {
    std::istringstream linestream(line);
    std::string word;
    while (linestream >> word) {
      std::cout << word << "\n";
    }
  }
  return 0;
}
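
For the counting part of the question, the same getline/istringstream loop can carry two counters. This is a minimal sketch under the assumption that a "word" is any whitespace-separated token, so punctuation stays attached ("Hi," counts as one word); the Counts struct and the count_lines_and_words name are invented for the example.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type holding both totals.
struct Counts {
    int lines = 0;
    int words = 0;
};

// Read `in` line by line, counting lines and whitespace-separated words.
Counts count_lines_and_words(std::istream& in) {
    Counts c;
    std::string line;
    while (std::getline(in, line)) {   // loop ends cleanly at end of input
        ++c.lines;
        std::istringstream linestream(line);
        std::string word;
        while (linestream >> word) {   // operator>> skips runs of whitespace
            ++c.words;
        }
    }
    return c;
}
```

It takes any std::istream, so it can be called with std::cin or with a std::ifstream opened on example.txt. Because the line number is known inside the outer loop, the same place is where a word would be recorded together with its line for an index.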
Author: Rocco Lampone

Updated on June 19, 2022

Comments

  • Rocco Lampone
    Rocco Lampone almost 2 years

    I am working on an assignment where I am supposed to read a file and count the number of lines and, at the same time, count the words in it. I tried a combination of getline and strtok inside a while loop, which did not work.

    file:example.txt (the file to be read).

    Hi, hello what a pleasant surprise.
    Welcome to this place.
    May you have a pleasant stay here.
    (3 lines, and some words).

    Readfile.cpp

    #include <iostream>
    #include <fstream>
    #include<string>
    using namespace std;
    int main()
    {
      ifstream in("example.txt");
      int count = 0;
    
      if(!in)
      {
        cout << "Cannot open input file.\n";
        return 1;
      }
    
      char str[255];
      string tok;
      char * t2;
    
      while(in)
      {
        in.getline(str, 255);
        in>>tok;
        char *dup = strdup(tok.c_str());
        do 
        {
            t2 = strtok(dup," ");
        }while(t2 != NULL);
        cout<<t2<<endl;
        free (dup);
        count++;
      }
      in.close();
      cout<<count;
      return 0;
    }
    
    • Blorgbeard
      Blorgbeard about 15 years
      You need to say more than "did not work". Tell us what error you get, or the SPECIFIC thing that your program does differently than you expect, then ask a specific question. We will not debug or rewrite your homework for you.
    • Admin
      Admin over 13 years
      How about some of the examples from the following: codeproject.com/KB/recipes/Tokenizer.aspx They are very efficient and somewhat elegant. The String Toolkit Library makes complex string processing in C++ simple and easy.
  • Rocco Lampone
    Rocco Lampone about 15 years
    Hello, and thanks, Blorgbeard, Reed and X-Istence, for the prompt replies. I need not only to parse each line but also to keep track of the line numbers. The problem statement is to make a list of words with the line numbers they appear on.
  • X-Istence
    X-Istence about 15 years
    Ravi: In which case the code I just gave you will get you halfway there. We are not here to do your homework for you!
  • Frank
    Frank about 15 years
    +1 That's how I would do it. Now just insert the counters and it's done.