c recv() read until newline occurs

Solution 1

The usual way to deal with this is to recv into a persistent buffer in your application, then pull a single line out and process it. Later you can process the remaining lines in the buffer before calling recv again. Keep in mind that the last line in the buffer may only be partially received; you have to deal with this case by re-entering recv to finish the line.

Here's an example (totally untested! also looks for a \n, not \r\n):

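#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

#define BUFFER_SIZE 1024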
char inbuf[BUFFER_SIZE];
size_t inbuf_used = 0;

/* Final \n is replaced with \0 before calling process_line */
void process_line(char *lineptr);
void input_pump(int fd) {
  size_t inbuf_remain = sizeof(inbuf) - inbuf_used;
  if (inbuf_remain == 0) {
    fprintf(stderr, "Line exceeded buffer length!\n");
    abort();
  }

  ssize_t rv = recv(fd, (void*)&inbuf[inbuf_used], inbuf_remain, MSG_DONTWAIT);
  if (rv == 0) {
    fprintf(stderr, "Connection closed.\n");
    abort();
  }
  if (rv < 0 && errno == EAGAIN) {
    /* no data for now, call back when the socket is readable */
    return;
  }
  if (rv < 0) {
    perror("Connection error");
    abort();
  }
  inbuf_used += rv;

  /* Scan for newlines in the line buffer; we're careful here to deal with embedded \0s
   * an evil server may send, as well as only processing lines that are complete.
   */
  char *line_start = inbuf;
  char *line_end;
  while ( (line_end = (char*)memchr((void*)line_start, '\n', inbuf_used - (line_start - inbuf))))
  {
    *line_end = 0;
    process_line(line_start);
    line_start = line_end + 1;
  }
  /* Shift buffer down so the unprocessed data is at the start */
  inbuf_used -= (line_start - inbuf);
  memmove(inbuf, line_start, inbuf_used);
}
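
The example above looks only for \n, while IRC servers end each line with \r\n. A minimal sketch of the same scan-and-shift step as a standalone function that also strips a trailing \r might look like the following. The names consume_lines, line_handler, line_log and log_line are hypothetical and not part of the original answer; the function does no socket I/O and only assumes that the caller appends received bytes to buf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: receives one NUL-terminated line. */
typedef void (*line_handler)(char *line, void *ctx);

/*
 * Hands every complete line in buf[0..used) to on_line, replacing the
 * terminating "\n" (or "\r\n") with '\0'. Any trailing partial line is
 * moved to the start of buf. Returns the number of leftover bytes, which
 * is the offset at which the next recv() should write.
 */
size_t consume_lines(char *buf, size_t used, line_handler on_line, void *ctx) {
  assert(buf != NULL && on_line != NULL);
  char *line_start = buf;
  char *line_end;
  while ((line_end = memchr(line_start, '\n',
                            used - (size_t)(line_start - buf))) != NULL) {
    char *terminator = line_end;
    if (terminator > line_start && terminator[-1] == '\r')
      terminator--;               /* drop the \r of a \r\n pair */
    *terminator = '\0';
    on_line(line_start, ctx);
    line_start = line_end + 1;
  }
  used -= (size_t)(line_start - buf);
  memmove(buf, line_start, used); /* keep only the unfinished line */
  return used;
}

/* Sample handler (hypothetical): counts lines and remembers the last one. */
struct line_log {
  int count;
  char last[128];
};

void log_line(char *line, void *ctx) {
  struct line_log *rec = ctx;
  rec->count++;
  snprintf(rec->last, sizeof rec->last, "%s", line);
}
```

In input_pump, the while loop and the memmove could then be replaced by a single call such as inbuf_used = consume_lines(inbuf, inbuf_used, handler, NULL), where handler is a small wrapper around process_line.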

Solution 2

TCP is a byte stream and has no notion of lines or message boundaries. As @bdonlan already said, you should implement something like:

  • Continuously recv from the socket into a buffer
  • On each recv, check if the bytes received contain an \n
  • If there is an \n, use everything up to that point and remove it from the buffer, keeping whatever follows

I don't have a good feeling about this (I read somewhere that you shouldn't mix low-level I/O with stdio I/O) but you might be able to use fdopen.

All you would need to do is

  • use fdopen(3) to associate your socket with a FILE *
  • use setvbuf to tell stdio that you want it line-buffered (_IOLBF) as opposed to the default block-buffered.

At this point you should have effectively moved the work from your hands to stdio. Then you could go on using fgets and the like on the FILE *.
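
A minimal sketch of those two steps, assuming a POSIX system and a connected, blocking socket descriptor. The helper names open_line_reader and read_line are hypothetical and only serve the illustration.

```c
#define _POSIX_C_SOURCE 200809L /* for fdopen() under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h> /* socket(), connect() in the caller */
#include <unistd.h>

/* Hypothetical helper: wraps a connected socket in a FILE * for reading. */
FILE *open_line_reader(int sock) {
  FILE *fp = fdopen(sock, "r");
  if (fp == NULL)
    return NULL;
  /* Request line buffering, as the answer suggests. setvbuf has to be
   * called before any other operation on the stream. */
  if (setvbuf(fp, NULL, _IOLBF, BUFSIZ) != 0) {
    fclose(fp); /* note: this also closes the socket */
    return NULL;
  }
  return fp;
}

/*
 * Hypothetical helper: reads one line into line[0..size) and strips the
 * trailing "\n" or "\r\n". Returns 1 if a line was read, 0 on end of
 * stream, -1 on a read error.
 */
int read_line(FILE *fp, char *line, size_t size) {
  assert(fp != NULL && line != NULL && size > 1);
  if (fgets(line, (int)size, fp) == NULL)
    return ferror(fp) ? -1 : 0;
  line[strcspn(line, "\r\n")] = '\0';
  return 1;
}
```

A caller would loop on read_line until it returns something other than 1, then call fclose, which also closes the underlying socket. A line longer than the supplied array comes back in several pieces, so the array should be sized for the longest line the protocol allows.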

Author by

FurryHead

Updated on July 18, 2022

Comments

  • FurryHead
    FurryHead almost 2 years

    I'm working on writing an IRC bot in C, and have run into a snag.

    In my main function, I create my socket and connect, all that happy stuff. Then I have an (almost) infinite loop to read what's being sent back from the server. I then pass what's read off to a helper function, processLine(char *line). The problem is that the following code reads until my buffer is full. I want it to read text only until a newline (\n) or carriage return (\r) occurs (thus ending that line).

       while (buffer[0] && buffer[1]) {
            for (i=0;i<BUFSIZE;i++) buffer[i]='\0';
            if (recv(sock, buffer, BUFSIZE, 0) == SOCKET_ERROR)
                processError();
    
            processLine(buffer);
        }
    

    What ends up happening is that many lines get jammed all together, and I can't process the lines properly when that happens.

    If you're not familiar with IRC protocols, a brief summary would be that when a message is sent, it often looks like this: :YourNickName!YourIdent@YourHostName PRIVMSG #someChannel :The rest on from here is the message sent... and a login notice, for instance, is something like this: :the.hostname.of.the.server ### bla some text bla with ### being a code(?) used for processing - i.e. 372 is an indicator that the following text is part of the Message Of The Day.

    When it's all jammed together, I can't read what number is for what line because I can't find where a line begins or ends!

    I'd appreciate help with this very much!

    P.S.: This is being compiled and run on Linux, but I eventually want to port it to Windows, so I am making as much of it as I can multi-platform.

    P.P.S.: Here's my processLine() code:

    void processLine(const char *line) {
        char *buffer, *words[MAX_WORDS], *aPtr;
        char response[100];
        int count = 0, i;
        buffer = strdup(line);
    
        printf("BLA %s", line);
    
        while((aPtr = strsep(&buffer, " ")) && count < MAX_WORDS)
            words[count++] = aPtr;
        printf("DEBUG %s\n", words[1]);
        if (strcmp(words[0], "PING") == 0) {
            strcpy(response, "PONG ");
            strcat(response, words[1]);
            sendLine(NULL, response); /* This is a custom function, basically it's a send ALL function */
        } else if (strcmp(words[1], "376") == 0) { /* We got logged in, send login responses (i.e. channel joins) */
            sendLine(NULL, "JOIN #cbot");
        }
    }
    
  • FurryHead
    FurryHead almost 13 years
    seems simple enough. How would I re-enter recv(), though? Would I pass a char pointer to the end of the partially read text, i.e. if recv() only read 5 out of 10 chars, pass a pointer to the 6th position instead?
  • bdonlan
    bdonlan almost 13 years
    @FurryHead: Added an (untested) example
  • FurryHead
    FurryHead almost 13 years
    Great idea, I tried it and it works beautifully. I do have two questions about it though: How would I check for errors on windows? Usually, I'd use WSAGetLastError() as windows sockets use that rather than errno... And would fdopen()/setvbuf() work on windows, too?
  • FurryHead
    FurryHead almost 13 years
    (update, when I try to process errno on linux using this, it gives me an error code of 0 - I don't yet know what that corresponds to)
  • cnicutar
    cnicutar almost 13 years
    @FurryHead setvbuf is standard; Windows has _fdopen. About the errno part, when using stdio check for errors with ferror, feof. Obviously this presents fewer details than a recv or a read does. The standard says no function should set errno to 0, but I believe it means "success". So even when a recv fails, the actual fgets succeeds.
  • FurryHead
    FurryHead almost 13 years
    Oh, great. That just threw all my error handling code right out the window! Anyway, it was helpful regardless. Thanks!
  • FurryHead
    FurryHead almost 13 years
    Oh wow. I gave up on this project long ago, feeling like most of this was going right over my head (which, it was). Now I'm finally coming back to an extremely similar project (irc bot again, but a little different) and I read through this without even realizing this was my thread. I've been banging my head against the desk for the past 2 days, trying to implement this (almost exactly what you were writing) but, oddly enough, I ended up with just one character being removed from random positions in the line. Odd. Anyway, just wanted to thank you again! This helped a lot!
  • Behlül
    Behlül over 10 years
    works great but (inbuf - line_start); should be (line_start - inbuf);
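
On the Windows question raised in the comments (WSAGetLastError() versus errno), a small portability shim around the raw recv approach might look like the sketch below. The helper names last_socket_error and socket_would_block are hypothetical; the sketch assumes the error is queried immediately after the failing socket call and does not cover the stdio (fdopen) route.

```c
#include <assert.h>
#ifdef _WIN32
#include <winsock2.h>
#else
#include <errno.h>
#endif

/* Hypothetical helper: error code of the most recent failed socket call. */
int last_socket_error(void) {
#ifdef _WIN32
  return WSAGetLastError();
#else
  return errno;
#endif
}

/* Hypothetical helper: nonzero if err only means "no data available yet". */
int socket_would_block(int err) {
#ifdef _WIN32
  return err == WSAEWOULDBLOCK;
#else
  return err == EAGAIN || err == EWOULDBLOCK;
#endif
}
```

With this in place, the check in input_pump could read if (rv < 0 && socket_would_block(last_socket_error())) return; on both platforms.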