Python: Size of message to send via socket

Solution 1

It's common in this situation to read a header to get the size and then read the payload. It's a bit easier if the header has a fixed size (maybe a binary integer, maybe a fixed-size ASCII string with padding), but you can also read character by character until you find a separator such as '|'. I've got a sample of each approach below, written for Python 2.

import struct

def _get_block(s, count):
    if count <= 0:
        return ''
    buf = ''
    while len(buf) < count:
        buf2 = s.recv(count - len(buf))
        if not buf2:
            # error or just end of connection?
            if buf:
                raise RuntimeError("underflow")
            else:
                return ''
        buf += buf2
    return buf

def _send_block(s, data):
    while data:
        data = data[s.send(data):]

if False:
    def get_msg(s):
        count = struct.unpack('>i', _get_block(s, 4))[0]
        return _get_block(s, count)

    def send_msg(s, data):
        header = struct.pack('>i', len(data))
        _send_block(s, header)
        _send_block(s, data)

if True:

    def _get_count(s):
        buf = ''
        while True:
            c = s.recv(1)
            if not c:
                # error or just end of connection?
                if buf:
                    raise RuntimeError("underflow")
                else:
                    return -1
            if c == '|':
                return int(buf)
            else:
                buf += c

    def get_msg(s):
        return _get_block(s, _get_count(s))

    def send_msg(s, data):
        _send_block(s, str(len(data)) + '|')
        _send_block(s, data)


import threading
import socket
import time

def client(port):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
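    # connect to the loopback address; 0.0.0.0 is a bind address and is not connectable on every platform
    s.connect(('127.0.0.1', port))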
    print get_msg(s)
    print get_msg(s)
    s.shutdown(socket.SHUT_RDWR)
    s.close()

def server(port):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(('0.0.0.0', port))
    s.listen(1)
    c, addr = s.accept()
    send_msg(c, 'hello')
    send_msg(c, 'there')
    c.close()
    s.close()

if __name__ == '__main__':
    c = threading.Thread(target=server, args=(8999,))
    c.start()
    time.sleep(1)
    client(8999)
    c.join()
    print 'done'
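
The samples above are Python 2, where str is a byte string. On Python 3, sockets carry bytes, so text has to be encoded before its length is measured. Here is a minimal sketch of the fixed-size-header variant under that assumption. The names recv_exact and recv_msg, the unsigned '>I' header, and the choice of UTF-8 are illustrative and not part of the original answer.

```python
import socket
import struct

# Assumed header layout: 4-byte big-endian unsigned payload length.
HEADER = struct.Struct('>I')


def recv_exact(sock, count):
    """Read exactly count bytes, raising if the peer closes early."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)  # may return fewer bytes than asked
        if not chunk:
            raise ConnectionError("socket closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)


def send_msg(sock, text):
    """Encode text, then send its byte length followed by the bytes."""
    payload = text.encode('utf-8')
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_msg(sock):
    """Read one length-prefixed message and decode it back to str."""
    (size,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, size).decode('utf-8')


if __name__ == '__main__':
    # socketpair() gives two connected sockets without needing a server
    a, b = socket.socketpair()
    send_msg(a, 'hello')
    send_msg(a, 'there')
    print(recv_msg(b))
    print(recv_msg(b))
    a.close()
    b.close()
```

The length in the header counts encoded bytes, not characters, so the receiver always asks recv() for the right amount even when the text contains non-ASCII characters.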

Solution 2

There's no point reinventing the wheel. Variable-length strings are easily sent as Python string objects using the multiprocessing.connection module. This method lets you send most Python objects (anything picklable), not just strings.

import multiprocessing
import multiprocessing.connection as connection

def producer(data, address, authkey):
    with connection.Listener(address, authkey=authkey) as listener:
        with listener.accept() as conn:
            print('connection accepted from', listener.last_accepted)
            for item in data:
                print("producer sending:", repr(item))
                conn.send(item)


def consumer(address, authkey):
    with connection.Client(address, authkey=authkey) as conn:
        try:
            while True:
                item = conn.recv()
                print("consumer received:", repr(item))
        except EOFError:
            pass

listen_address = "localhost", 50000
remote_address = "localhost", 50000
authkey = b'secret password'

if __name__ == "__main__":
    data = ["1", "23", "456"]
    p = multiprocessing.Process(target=producer, args=(data, listen_address, authkey))
    p.start()
    consumer(remote_address, authkey)
    p.join()
    print("done")

Which produces something like:

producer sending: '1'
producer sending: '23'
consumer received: '1'
producer sending: '456'
consumer received: '23'
consumer received: '456'
done
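
To illustrate the point about objects other than strings, here is a small in-process sketch. It assumes multiprocessing.Pipe() as a stand-in for the Listener/Client pair, since both hand back Connection objects with the same send() and recv() methods. The helper name roundtrip is made up for this example.

```python
import multiprocessing


def roundtrip(items):
    """Send each item through a connection and return what arrives."""
    # Pipe() returns two connected Connection objects.
    left, right = multiprocessing.Pipe()
    received = []
    try:
        for item in items:
            left.send(item)                # the object is pickled by the library
            received.append(right.recv())  # one recv() yields one whole object
    finally:
        left.close()
        right.close()
    return received


if __name__ == '__main__':
    for obj in roundtrip(["1", {"k": [1, 2]}, (3, None)]):
        print(repr(obj))
```

Each recv() returns exactly one object that was passed to send(), so the caller never deals with sizes or separators.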

Solution 3

After serializing the data, you can simply use len(serialized_data) to get its length in bytes.

Below are sample send and receive functions, which you can use on both the client and the server side to send and receive variable-length data.

import pickle
import struct


def send_data(conn, data):
    serialized_data = pickle.dumps(data)
    # 4-byte big-endian length header, then the payload
    conn.sendall(struct.pack('>I', len(serialized_data)))
    conn.sendall(serialized_data)


def _recv_exact(conn, size):
    # recv() may return fewer bytes than requested, so loop until done
    received = b""
    while len(received) < size:
        chunk = conn.recv(size - len(received))
        if not chunk:
            # an empty result means the peer closed the connection
            raise EOFError("connection closed before full message arrived")
        received += chunk
    return received


def receive_data(conn):
    data_size = struct.unpack('>I', _recv_exact(conn, 4))[0]
    return pickle.loads(_recv_exact(conn, data_size))

You can find a sample program at https://github.com/vijendra1125/Python-Socket-Programming.git

Author: Kudayar Pirimbaev

Learning to code in C++, Python, HTML, JavaScript, CSS

Updated on September 15, 2022

Comments

  • Kudayar Pirimbaev over 1 year

    I'm trying to send messages using the socket library. Since messages are variable-sized, I've decided to prepend the size of the message to the string, then send it. For example, if the message is

    Hello World!
    

    which is 13 characters long (I've counted EOL), I would send something like

    sizeof13charsinbytes|Hello World!
    

    via socket.send(), then I would split size and the message with str.split()

    Since socket.recv() needs the message size in bytes, how do I find the size of a message? I tried sys.getsizeof(), but it gives a seemingly arbitrary value for a single-character string. Is it the right size?

  • Kudayar Pirimbaev over 9 years
    I guess if False is for messages where the size is fixed, and if True is where it is not
  • Kudayar Pirimbaev over 9 years
    Is it correct to give the string length and not its size in bytes? I thought you would multiply it by the size of a character, but when you read the message, you specify the string length, not its size (in if True)
  • tdelaney over 9 years
    The example was written for Python 2, where data comes in as 1-byte characters, so length and char count are the same thing. If you want a different encoding, or want to use it on Python 3, then you'll have to add it.
  • tdelaney over 9 years
    Good answer but limits sender and receiver to Python code.
  • Dunes over 9 years
    You're very right that that is the downside. To be honest I hadn't considered the possibility.
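
On the byte-size question raised in the comments: the number that recv() counts is the length of the encoded bytes, while sys.getsizeof() reports the in-memory size of the Python object, interpreter overhead included. Here is a small Python 3 sketch, assuming UTF-8 and a hypothetical helper named wire_size.

```python
import sys


def wire_size(text, encoding='utf-8'):
    """Number of bytes text occupies once encoded for the socket."""
    return len(text.encode(encoding))


if __name__ == '__main__':
    msg = 'Hello World!\n'
    print(len(msg))                 # 13 characters
    print(wire_size(msg))           # 13 bytes: ASCII is one byte each in UTF-8
    print(len('h\u00e9llo'))        # 5 characters
    print(wire_size('h\u00e9llo'))  # 6 bytes: the accented letter takes two
    # Object size in memory; varies by interpreter and is not a wire size.
    print(sys.getsizeof('a'))
```

For pure ASCII text the character count and the byte count agree, which is why the Python 2 samples could use len() directly.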