seekg(ios::beg) not returning to beginning of redirected input

12,374

There are several problems with your code. The first is that you use the results of an input (cin.get( c )) without checking that the input has succeeded. This is always an error; in your case, it will probably only result in counting (and later outputting) the last character twice, but it can result in undefined behavior. You must check that the input stream is in a good state after each input, before using the value input. The usual way of doing this is:

while ( cin.get( c ) ) // ...

, putting the input directly in the loop condition.

The second is the statement:

cin.seekg( std::ios::beg );

I'm actually sort of surprised that this even compiled: there are two overloads of seekg:

std::istream::seekg( std::streampos );

and

std::istream::seekg( std::streamoff, std::ios_base::seekdir );

std::ios::beg has type std::ios_base::seekdir. It's possible for an implementation to define std::streampos and std::ios_base::seekdir so that there is an implicit conversion from std::ios_base::seekdir to std::streampos, but in my opinion, it shouldn't, since the results will almost certainly not be what you want. To seek to the beginning of a file:

std::cin.seekg( 0, std::ios_base::beg );

A third problem: errors in the input stream are sticky. Once you've reached the end of file, that error will remain, and all other operations will be no-ops, until you have cleared the error: std::cin.clear();.

One final comment: the fact that you are using std::cin worries me. It will probably work (although there is no guarantee that you can seek on std::cin, even if the input is redirected from a file), but do be aware that there is no way you can output the results of a Huffman encoding to std::cout. It will work under Unix, but probably nowhere else. Huffman encoding requires that the files be open in binary mode, which is never the case for std::cin and std::cout.
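For comparison, a sketch of opening named files in binary mode instead. The helper copy_binary and its path parameters are hypothetical; a real encoder would write encoded bits rather than copy bytes, but the way the streams are opened would be the same.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: copies src to dst byte for byte.
// std::ios::binary suppresses newline translation on both streams.
bool copy_binary(const std::string& src, const std::string& dst)
{
    std::ifstream in(src, std::ios::binary);
    std::ofstream out(dst, std::ios::binary);
    if (!in || !out) return false;       // one of the files could not be opened
    char c;
    while (in.get(c)) out.put(c);        // read is checked before the byte is used
    out.flush();
    return in.eof() && out.good();       // stopped at end of file, all writes succeeded
}
```

This means taking the file names from the command line rather than relying on redirection, which also removes any doubt about whether the input can be seeked.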

Author by

user0123

Updated on June 13, 2022

Comments

  • user0123
    user0123 almost 2 years

    I am making a Huffman encoder, and to do so I need to read over the input (which will ALWAYS be a redirected file) to record the frequencies, then create the codebook, and then read over the input again so I can encode it.

    My problem is that I am currently trying to test how to read the file from cin twice.

    I read online that cin.seekg(0), cin.seekg(ios::beg), or cin.seekg(0, ios::beg) should all work perfectly fine so long as the input is redirected from a file and not piped. But when I do that, it does not seem to change the position of cin at all.

    Here is the code that i am currently using:

    #include<iostream>
    #include"huffmanNode.h"
    
    using namespace std;
    
        int main(){
    
        //create array that stores each character and its frequency
        unsigned int frequencies[255];
        //initialize to zero
        for(int i=0; i<255; i++){
            frequencies[i] = 0;
        }
    
        //get input and increment the frequency of corresponding character
        char c;
        while(!cin.eof()){
            cin.get(c);
            frequencies[c]++;
        }
    
        //create initial leaf nodes for all characters that have appeared at least once
        for(int i=0; i<255; i++){
    
            if(frequencies[i] != 0){
                huffmanNode* tempNode = new huffmanNode(i, frequencies[i]);
            }
        }
    
    
        // test readout of the frequency list
        for(int i=0; i<255; i++){
            cout << "Character: " << (char)i << " Frequency: " << frequencies[i] << endl;;
        }
    
        //go back to beginning of input
        cin.seekg(ios::beg);
    
        //read over input again, incrementing frequencies. Should result in double the amount of frequencies
     **THIS IS WHERE IT LOOPS FOREVER**
        while(!cin.eof()){
            cin.get(c);
            frequencies[c]++;
        }
    
        //another test readout of the frequency list
        for(int i=0; i<255; i++){
            cout << "Character: " << (char)i << " Double Frequency: " << frequencies[i] << endl;
        }
    
    
        return 0;
    }
    

    Debugging shows that it gets stuck in the while loop on line 40, and it seems to constantly be getting a newline character. Why would it not exit this loop? I assume that cin.seekg() is not actually resetting the input.

  • James Kanze
    James Kanze over 10 years
    It's definitely a good idea to break things down into smaller functions, as you've done, but your solution only works if the entire file will fit in memory; if he needs compression, this may not be the case. (I also wonder: why the using namespace std;, since you qualify all of the standard members anyway?)
  • sehe
    sehe over 10 years
    @JamesKanze good points. Re: "using namespace"... Did the rest of the code bear so little semblance to the OP's code? :/. Also, replaying stdin without keeping it in memory would require a temporary file. It makes little sense to read from stdin then, IMO. I assumed he doesn't need compression (and Huffman is rarely a good choice for that)
  • James Kanze
    James Kanze over 10 years
    The OP's original code used cin, and not std::cin. I much prefer the latter (as apparently you do too), but if you do use std::cin, what's the point of using namespace std;?
  • James Kanze
    James Kanze over 10 years
    Re replaying standard in without a temporary file: if std::cin supports seeking (which is the case for most implementations), then you don't need a temporary file. But globally: I'd not use std::cin for this sort of thing. If for no other reason than that I'd want to compress the actual file data, which means binary reads, and you can't read std::cin in binary.
  • sehe
    sehe over 10 years
    Sigh. James, I tend to "minimally" edit the OPs code and left it in. Why do you insist on me "accounting for" the using-statement, when it was obviously an oversight? I wasn't using it. Please, consider just editing next time. ("fixed" now).
  • sehe
    sehe over 10 years
    On binary std::cin: good point again. However, there is no-one who knows what the OP is going to use what code for and why.
  • James Kanze
    James Kanze over 10 years
    I wasn't insisting on your "accounting for" anything. I thought you gave a very good answer. I was just curious: I thought you might have left the using namespace std; in for a reason, although I couldn't see one. (And I will not edit another person's posting. I might end up saying something he didn't agree with, but it would be there with his name on it.)
  • Bartek Banachewicz
    Bartek Banachewicz over 10 years
    @JamesKanze Considering a) the status shows as edited by you b) the answerer is notified about the edit, I think you should be more eager to do it.
  • sehe
    sehe over 10 years
    @JamesKanze Okay, thanks for explaining. Feel free to iron out simple things like that (typos, formatting, unclear variable naming, it's all fair game). And thanks for the nice words.