Send and Receive a file in socket programming in Linux with C/C++ (GCC/G++)

Solution 1

The most portable solution is just to read the file in chunks, and then write the data out to the socket, in a loop (and likewise, the other way around when receiving the file). You allocate a buffer, read into that buffer, and write from that buffer into your socket (you could also use send and recv, which are socket-specific ways of writing and reading data). The outline would look something like this:

while (1) {
    // Read data into buffer (assumed here to be a char array).  We may not
    // have enough to fill up buffer, so we store how many bytes were
    // actually read in bytes_read.
    ssize_t bytes_read = read(input_file, buffer, sizeof(buffer));
    if (bytes_read == 0) // We're done reading from the file
        break;
    
    if (bytes_read < 0) {
        // handle errors
    }
    
    // You need a loop for the write, because not all of the data may be written
    // in one call; write will return how many bytes were written. p keeps
    // track of where in the buffer we are, while we decrement bytes_read
    // to keep track of how many bytes are left to write. p is a char *
    // because arithmetic on void * is a GNU extension and is rejected in C++.
    char *p = buffer;
    while (bytes_read > 0) {
        ssize_t bytes_written = write(output_socket, p, bytes_read);
        if (bytes_written <= 0) {
            // handle errors
        }
        bytes_read -= bytes_written;
        p += bytes_written;
    }
}

Make sure to read the documentation for read and write carefully, especially when handling errors. Some of the error codes mean that you should just try again, for instance just looping again with a continue statement, while others mean something is broken and you need to stop.
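As a rough illustration of that advice, here is a minimal sketch of the same loop with the error placeholders filled in. The function name copy_all and the 4096-byte buffer are hypothetical choices, and the only "try again" case it handles is EINTR; a non-blocking socket would also need to wait on EAGAIN, which this sketch does not do. Because it only deals in file descriptors, the same function can be used for sending (file to socket) and for receiving (socket to file).

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy everything from in_fd to out_fd until end of file.
 * Returns 0 on success, -1 on error (errno is left as set by read/write). */
int copy_all(int in_fd, int out_fd) {
    char buffer[4096];

    for (;;) {
        ssize_t bytes_read = read(in_fd, buffer, sizeof(buffer));
        if (bytes_read == 0)
            return 0;           /* end of file, or the peer closed the socket */
        if (bytes_read < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;          /* anything else: something is broken, stop */
        }

        /* write() may accept only part of the data, so keep going. */
        char *p = buffer;
        while (bytes_read > 0) {
            ssize_t bytes_written = write(out_fd, p, (size_t)bytes_read);
            if (bytes_written < 0) {
                if (errno == EINTR)
                    continue;   /* interrupted before writing anything: retry */
                return -1;
            }
            p += bytes_written;
            bytes_read -= bytes_written;
        }
    }
}
```

On the sending side this would be called as copy_all(input_file, output_socket), and on the receiving side with the arguments the other way around.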

For sending the file to a socket, there is a system call, sendfile, that does just what you want. It tells the kernel to send a file from one file descriptor to another, and then the kernel can take care of the rest. There is a caveat that the source file descriptor must support mmap (as in, be an actual file, not a socket), so you can't use it to send data directly from one socket to another. On kernels before 2.6.33 the destination also had to be a socket, so you couldn't use it to copy files; since 2.6.33 the destination can be any file. It is designed to support the usage you describe, of sending a file to a socket. It doesn't help with receiving the file, however; you would need to do the loop yourself for that. I cannot tell you why there is a sendfile call but no analogous recvfile.

Beware that sendfile is Linux specific; it is not portable to other systems. Other systems frequently have their own version of sendfile, but the exact interface may vary (FreeBSD, Mac OS X, Solaris).

In Linux 2.6.17, the splice system call was introduced, and as of 2.6.23 it is used internally to implement sendfile. splice is a more general purpose API than sendfile. For a good description of splice and tee, see the rather good explanation from Linus himself. He points out how using splice is basically just like the loop above, using read and write, except that the buffer is in the kernel, so the data doesn't have to be transferred between the kernel and user space, and may not even pass through the CPU at all (known as "zero-copy I/O").

Solution 2

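Do a man 2 sendfile. You only need to open the source file on the client and the destination file on the server, then call sendfile on the sending side and the kernel will chop and move the data. The receiving side still has to read from its socket and write to the destination file itself.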

Solution 3

Minimal runnable POSIX read + write example

Usage:

  1. Get two computers on a LAN.

    For example, this will work if both computers are connected to your home router in most cases, which is how I tested it.

  2. On the server computer:

    1. Find the server local IP with ifconfig, e.g. 192.168.0.10

    2. Run:

      ./server output.tmp 12345
      
  3. On the client computer:

    printf 'ab\ncd\n' > input.tmp
    ./client input.tmp 192.168.0.10 12345
    
  4. Outcome: a file output.tmp is created on the server computer containing 'ab\ncd\n'!

server.c

/*
Receive a file over a socket.

Saves it to output.tmp by default.

Interface:

    ./executable [<output_file> [<port>]]

Defaults:

- output_file: output.tmp
- port: 12345
*/

#define _XOPEN_SOURCE 700

#include <stdio.h>
#include <stdlib.h>

#include <arpa/inet.h>
#include <fcntl.h>
#include <netdb.h> /* getprotobyname */
#include <netinet/in.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <unistd.h>

int main(int argc, char **argv) {
    char *file_path = "output.tmp";
    char buffer[BUFSIZ];
    char protoname[] = "tcp";
    int client_sockfd;
    int enable = 1;
    int filefd;
    int server_sockfd;
    socklen_t client_len;
    ssize_t read_return;
    struct protoent *protoent;
    struct sockaddr_in client_address, server_address;
    unsigned short server_port = 12345u;

    if (argc > 1) {
        file_path = argv[1];
        if (argc > 2) {
            server_port = strtol(argv[2], NULL, 10);
        }
    }

    /* Create a socket and listen on it. */
    protoent = getprotobyname(protoname);
    if (protoent == NULL) {
        perror("getprotobyname");
        exit(EXIT_FAILURE);
    }
    server_sockfd = socket(
        AF_INET,
        SOCK_STREAM,
        protoent->p_proto
    );
    if (server_sockfd == -1) {
        perror("socket");
        exit(EXIT_FAILURE);
    }
    if (setsockopt(server_sockfd, SOL_SOCKET, SO_REUSEADDR, &enable, sizeof(enable)) < 0) {
        perror("setsockopt(SO_REUSEADDR) failed");
        exit(EXIT_FAILURE);
    }
    server_address.sin_family = AF_INET;
    server_address.sin_addr.s_addr = htonl(INADDR_ANY);
    server_address.sin_port = htons(server_port);
    if (bind(
            server_sockfd,
            (struct sockaddr*)&server_address,
            sizeof(server_address)
        ) == -1
    ) {
        perror("bind");
        exit(EXIT_FAILURE);
    }
    if (listen(server_sockfd, 5) == -1) {
        perror("listen");
        exit(EXIT_FAILURE);
    }
    fprintf(stderr, "listening on port %d\n", server_port);

    while (1) {
        client_len = sizeof(client_address);
        puts("waiting for client");
        client_sockfd = accept(
            server_sockfd,
            (struct sockaddr*)&client_address,
            &client_len
        );
        if (client_sockfd == -1) {
            perror("accept");
            exit(EXIT_FAILURE);
        }
        filefd = open(file_path,
                O_WRONLY | O_CREAT | O_TRUNC,
                S_IRUSR | S_IWUSR);
        if (filefd == -1) {
            perror("open");
            exit(EXIT_FAILURE);
        }
        /* Copy until read() returns 0, which means the client closed the connection. */
        while ((read_return = read(client_sockfd, buffer, BUFSIZ)) > 0) {
            if (write(filefd, buffer, read_return) == -1) {
                perror("write");
                exit(EXIT_FAILURE);
            }
        }
        if (read_return == -1) {
            perror("read");
            exit(EXIT_FAILURE);
        }
        close(filefd);
        close(client_sockfd);
    }
    return EXIT_SUCCESS;
}

client.c

/*
Send a file over a socket.

Interface:

    ./executable [<input_path> [<server_hostname> [<port>]]]

Defaults:

- input_path: input.tmp
- server_hostname: 127.0.0.1
- port: 12345
*/

#define _XOPEN_SOURCE 700

#include <stdio.h>
#include <stdlib.h>

#include <arpa/inet.h>
#include <fcntl.h>
#include <netdb.h> /* getprotobyname */
#include <netinet/in.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <unistd.h>

int main(int argc, char **argv) {
    char protoname[] = "tcp";
    struct protoent *protoent;
    char *file_path = "input.tmp";
    char *server_hostname = "127.0.0.1";
    char *server_reply = NULL;
    char *user_input = NULL;
    char buffer[BUFSIZ];
    in_addr_t in_addr;
    int filefd;
    int sockfd;
    ssize_t read_return;
    struct hostent *hostent;
    struct sockaddr_in sockaddr_in;
    unsigned short server_port = 12345;

    if (argc > 1) {
        file_path = argv[1];
        if (argc > 2) {
            server_hostname = argv[2];
            if (argc > 3) {
                server_port = strtol(argv[3], NULL, 10);
            }
        }
    }

    filefd = open(file_path, O_RDONLY);
    if (filefd == -1) {
        perror("open");
        exit(EXIT_FAILURE);
    }

    /* Get socket. */
    protoent = getprotobyname(protoname);
    if (protoent == NULL) {
        perror("getprotobyname");
        exit(EXIT_FAILURE);
    }
    sockfd = socket(AF_INET, SOCK_STREAM, protoent->p_proto);
    if (sockfd == -1) {
        perror("socket");
        exit(EXIT_FAILURE);
    }
    /* Prepare sockaddr_in. */
    hostent = gethostbyname(server_hostname);
    if (hostent == NULL) {
        fprintf(stderr, "error: gethostbyname(\"%s\")\n", server_hostname);
        exit(EXIT_FAILURE);
    }
    in_addr = inet_addr(inet_ntoa(*(struct in_addr*)*(hostent->h_addr_list)));
    if (in_addr == (in_addr_t)-1) {
        /* h_addr_list holds binary addresses, not strings, so report the hostname instead. */
        fprintf(stderr, "error: inet_addr failed for host \"%s\"\n", server_hostname);
        exit(EXIT_FAILURE);
    }
    sockaddr_in.sin_addr.s_addr = in_addr;
    sockaddr_in.sin_family = AF_INET;
    sockaddr_in.sin_port = htons(server_port);
    /* Do the actual connection. */
    if (connect(sockfd, (struct sockaddr*)&sockaddr_in, sizeof(sockaddr_in)) == -1) {
        perror("connect");
        return EXIT_FAILURE;
    }

    while (1) {
        read_return = read(filefd, buffer, BUFSIZ);
        if (read_return == 0)
            break;
        if (read_return == -1) {
            perror("read");
            exit(EXIT_FAILURE);
        }
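        /* write() may send only part of the buffer over a socket, so loop until all of it is sent. */
        {
            char *p = buffer;
            ssize_t remaining = read_return;
            while (remaining > 0) {
                ssize_t written = write(sockfd, p, remaining);
                if (written == -1) {
                    perror("write");
                    exit(EXIT_FAILURE);
                }
                p += written;
                remaining -= written;
            }
        }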
    }
    free(user_input);
    free(server_reply);
    close(filefd);
    close(sockfd);
    exit(EXIT_SUCCESS);
}

GitHub upstream.

Further comments

Possible improvements:

  • Currently output.tmp gets overwritten each time a send is done.

    This calls for a simple protocol that passes a filename so that multiple files can be uploaded, e.g.: filename up to the first newline character, max filename 256 chars, and the rest until socket closure are the contents. Of course, that would require sanitization to avoid a path traversal vulnerability.

    Alternatively, we could make a server that hashes the files to find filenames, and keeps a map from original paths to hashes on disk (in a database).

  • Only one client can connect at a time.

    This is especially harmful if there are slow clients whose connections last for a long time: the slow connection holds everyone else up.

    One way to work around that is to fork a process / thread for each accept, start listening again immediately, and use file lock synchronization on the files.

  • Add timeouts, and close clients if they take too long. Or else it would be easy to do a DoS.

    poll or select are some options: How to implement a timeout in read function call?
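To make the timeout idea from the last item a bit more concrete, here is a minimal sketch built on poll. The function name read_with_timeout and the -2 return value for "timed out" are hypothetical conventions, not part of the server above. Note that if a signal interrupts poll, this sketch restarts the wait with the full timeout, so the total wait can exceed timeout_ms.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Like read(), but waits at most timeout_ms milliseconds for data.
 * Returns the number of bytes read, 0 on end of file, -1 on error,
 * or -2 if nothing arrived before the timeout expired. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms) {
    struct pollfd pfd;
    int ready;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready == -1 && errno == EINTR);  /* restart if a signal interrupted us */
    if (ready == -1)
        return -1;
    if (ready == 0)
        return -2;  /* timed out: the caller can close this client */
    return read(fd, buf, len);
}
```

In the server loop, the read call on client_sockfd could be swapped for this function, closing the client when it returns -2.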

A simple HTTP wget implementation is shown at: How to make an HTTP get request in C without libcurl?

Tested on Ubuntu 15.10.

Solution 4

This file will serve you as a good sendfile example: http://tldp.org/LDP/LGNET/91/misc/tranter/server.c.txt

Author by

Sajad Bahmani


Updated on March 15, 2021

Comments

  • Sajad Bahmani
    Sajad Bahmani about 3 years

    I would like to implement a client-server architecture running on Linux using sockets and C/C++ language that is capable of sending and receiving files. Is there any library that makes this task easy? Could anyone please provide an example?

  • KevinDTimm
    KevinDTimm over 14 years
    Brian, I'm guessing you and florin will not get the check mark as this is almost certainly homework :(
  • KevinDTimm
    KevinDTimm over 14 years
    What happened to your original answer? You first suggested the 'sendfile' command on the *nix box, now you've given a programmatic solution whilst removing all vestiges of your original answer.
  • Brian Campbell
    Brian Campbell over 14 years
    I didn't remove all vestiges of it. I just added the skeleton of a standard, read/write loop above it. I figured that I should explain that first, because even using sendfile, you will need to use such a loop to receive the file, and since I'd mentioned splice later on, which uses the same pattern.
  • KevinDTimm
    KevinDTimm over 14 years
    Ah, somehow missed the retention of 'sendfile'. Sorry.
  • Damon
    Damon over 12 years
    It should be pointed out that splice/vmsplice is mostly something that looks good in theory. In practice, the buffer size is limited to 64k and there is no single good (or safe) way to send data exceeding that, since there is no reliable way of knowing when exactly a buffer is not needed any more. There's a kernel developer discussion on that somewhere too (with some excursion into using 3 buffers), though I can't find it right now. Also, for zero copy to truly work, you must sometimes do obscure things (GIFT flag) which are not well-documented or conclusive.
  • Brian Campbell
    Brian Campbell over 12 years
    @Damon Do you have any references for more information? I'm curious about what exactly the gotchas are; is it less efficient than the standard userspace read/write loop, or just not much more efficient? I don't actually have much experience with them myself, so I'd love to learn more about the gotchas.
  • m4n07
    m4n07 about 11 years
    How about receiving files in the same program ?
  • bakalolo
    bakalolo over 6 years
    would be nice to include the code for creating buffer opening file etc to those who are new with C like myself.
  • bakalolo
    bakalolo over 6 years
    I'm getting a "pointer of type 'void*' used in arithmetic" warning when using this in C++. How do I remove this warning?
  • user3629249
    user3629249 about 6 years
    regarding: if (write(sockfd, buffer, read_return) == -1) { over a socket, a partial write can occur, so should be checking that all bytes written and if not plan on calling write() again with the portion of the data that still needs to be written
  • Ciro Santilli OurBigBook.com
    Ciro Santilli OurBigBook.com about 6 years
    @user3629249 thanks, I've added a comment. Would converting to FILE* and then using fwrite work?
  • user3629249
    user3629249 about 6 years
    The fwrite() function is for text data and allows formatting. the write() function can also handle binary data but does not have any formatting capabilities.
  • user3629249
    user3629249 about 6 years
    when a write() fails, and the program is going to exit, should call close() on the socketfd to let the server know about it
  • user3629249
    user3629249 about 6 years
    the 'server' should check the returned value from accept() to assure the operation was successful. the 'server' after the call to read() should be checking for a returned value of 0 immediately, not writing yet another message to the client before checking. For better communications, suggest: start several threads before calling listen(), keep track of which threads are currently in use (talking to a client), use the log() (or similar) function to track whom the server has communicated with. Add a call to setsockopt() to set 'keepalive' on the server main socket. cont:
  • user3629249
    user3629249 about 6 years
    cont: call close() on any socket that is no longer in use, like the client sockets and when the 'server' is exiting, on the main server socket
  • Ciro Santilli OurBigBook.com
    Ciro Santilli OurBigBook.com about 6 years
    @user3629249 thanks for the suggestions. Send a PR to the GitHub upstream if you get the time: github.com/cirosantilli/cpp-cheat/tree/master/posix/socket/inet ? I will then update this answer.
  • PSkocik
    PSkocik over 3 years
    The whole point of sendfile is to elide a kernel_buffer (memcpy)=> user_buffer (memcpy)=> kernel_buffer roundtrip and just submit the file from the kernel buffer. Considering that, recvfile doesn't make sense. There has to be a memcpy to userspace there and the size of the file must be serialized somehow and it doesn't make sense for the kernel to try and mandate a application level protocol for such serialization.