Create and parse multipart HTTP requests in Python
Solution 1
After a bit of exploration, the answer to this question has become clear. The short answer is that although the Content-Disposition header is optional in a MIME-encoded message, web.py requires it on each MIME part in order to parse the HTTP request correctly.
Contrary to other comments on this question, the difference between HTTP and email is irrelevant, as they are simply transport mechanisms for the MIME message and nothing more. Multipart/related (not multipart/form-data) messages are common in content-exchanging web services, which is the use case here. The code snippets provided are accurate, though, and led me to a slightly briefer solution to the problem.
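To make the role of Content-Disposition concrete, here is a minimal sketch that parses a multipart body with the standard library's email package. It is not web.py's implementation; it assumes Python 3, and parse_multipart is a hypothetical helper name. It only shows that, without the header, a parser has no name or filename to key the part on.

```python
from email import message_from_bytes


def parse_multipart(content_type, body):
    """Parse a multipart HTTP body, given the request's Content-Type value.

    Hypothetical helper for illustration; returns one dict per MIME part.
    """
    # The MIME parser expects headers and body together, so put the
    # Content-Type header (which HTTP carries separately) back in front.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw)
    if not message.is_multipart():
        raise ValueError("not a multipart body")
    parts = []
    for part in message.get_payload():
        parts.append({
            # Both lookups return None when Content-Disposition is absent
            "name": part.get_param("name", header="content-disposition"),
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            # decode=True undoes the base64 Content-Transfer-Encoding
            "data": part.get_payload(decode=True),
        })
    return parts
```

A part that carries Content-Disposition: file; name="package"; filename="file.zip" comes back with name "package" and filename "file.zip"; the same part without the header comes back with None for both.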
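# imports needed by this snippet (Python 2)
import httplib
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
# open an HTTP connection
h1 = httplib.HTTPConnection('localhost:8080')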
# create a mime multipart message of type multipart/related
msg = MIMEMultipart("related")
# create a mime-part containing a zip file, with a Content-Disposition header
# on the section
fp = open('file.zip', 'rb')
base = MIMEBase("application", "zip")
base['Content-Disposition'] = 'file; name="package"; filename="file.zip"'
base.set_payload(fp.read())
encoders.encode_base64(base)
msg.attach(base)
# Here's a rubbish bit: chomp through the header rows until hitting a newline on
# its own, reading each row on the way as an HTTP header, then read the rest
# of the message into a new variable
header_mode = True
headers = {}
body = []
for line in msg.as_string().splitlines(True):
    if line == "\n" and header_mode == True:
        header_mode = False
    if header_mode:
        (key, value) = line.split(":", 1)
        headers[key.strip()] = value.strip()
    else:
        body.append(line)
body = "".join(body)
# do the request, with the separated headers and body
h1.request("POST", "http://localhost:8080/server", body, headers)
This is picked up perfectly well by web.py, so it's clear that email.mime.multipart is suitable for creating MIME messages to be transported by HTTP, with the exception of its header handling.
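The header handling can be made a little less fragile along the lines tc. suggests in the comments: set Content-Disposition with add_header, and split at the first blank line instead of looping over lines. The following is a sketch only, assuming Python 3; build_message and split_headers_and_body are hypothetical names, and HeaderParser is used so that folded header lines are unfolded.

```python
import re
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.parser import HeaderParser


def build_message(payload, filename="file.zip"):
    """Build a multipart/related message with one base64-encoded zip part."""
    msg = MIMEMultipart("related")
    part = MIMEBase("application", "zip")
    # add_header quotes the parameter values for us
    part.add_header("Content-Disposition", "file",
                    name="package", filename=filename)
    part.set_payload(payload)
    encoders.encode_base64(part)
    msg.attach(part)
    return msg


def split_headers_and_body(serialised):
    """Split a serialised message at the first blank line.

    Returns (headers, body); the blank separator line belongs to neither.
    """
    match = re.search(r"\r?\n\r?\n", serialised)
    if match is None:
        raise ValueError("no blank line between headers and body")
    header_block = serialised[:match.start()]
    body = serialised[match.end():]
    # HeaderParser copes with folded (continued) header lines
    parsed = HeaderParser().parsestr(header_block + "\n\n")
    # Collapse the whitespace left over from any folding
    headers = {key: " ".join(value.split()) for key, value in parsed.items()}
    return headers, body
```

The resulting headers dict and body string could then be handed to the request call in place of the loop above, for example headers, body = split_headers_and_body(build_message(data).as_string()).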
My other overall concern is scalability. Neither this solution nor the others proposed here scale well, as they read the whole contents of a file into a variable before bundling it up in the MIME message. A better solution would be one which could serialise on demand as the content is piped out over the HTTP connection. It's not urgent for me to fix that, but I'll come back here with a solution if I get to it.
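One possible shape for serialising on demand is a generator that yields the multipart body piece by piece, reading each file in fixed-size blocks. This is a rough sketch under stated assumptions, not a tested replacement for the code above: it assumes Python 3, stream_multipart_related is a hypothetical name, the parts are sent as raw bytes rather than base64, and the random boundary is assumed not to occur in the file data.

```python
import uuid


def stream_multipart_related(parts, chunk_size=64 * 1024):
    """Return (content_type, generator of bytes) for a multipart/related body.

    parts is an iterable of (headers_dict, binary_file_object) pairs.
    Nothing is read from the files until the generator is consumed.
    """
    # Assumption: a random hex boundary will not appear inside the payloads
    boundary = uuid.uuid4().hex

    def chunks():
        for headers, fileobj in parts:
            yield ("--%s\r\n" % boundary).encode("ascii")
            for key, value in headers.items():
                yield ("%s: %s\r\n" % (key, value)).encode("ascii")
            yield b"\r\n"
            # Read the file in blocks so it is never fully held in memory
            while True:
                block = fileobj.read(chunk_size)
                if not block:
                    break
                yield block
            yield b"\r\n"
        yield ("--%s--\r\n" % boundary).encode("ascii")

    content_type = 'multipart/related; boundary="%s"' % boundary
    return content_type, chunks()
```

In Python 3.6 and later, http.client accepts an iterable of bytes as the request body and uses chunked transfer encoding when no Content-Length header is given, so the generator could be passed as the body with the returned value as the Content-Type header. Whether the server side (web.py here) accepts chunked request bodies and unencoded binary parts would need checking.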
Solution 2
I used this package by Will Holcomb, http://pypi.python.org/pypi/MultipartPostHandler/0.1.0, to make multipart requests with urllib2; it may help you out.
Comments
-
Richard J over 1 year
I'm trying to write some Python code which can create multipart MIME HTTP requests in the client, and then appropriately interpret them on the server. I have, I think, partially succeeded on the client end with this:
from email.mime.multipart import MIMEMultipart, MIMEBase
import httplib
h1 = httplib.HTTPConnection('localhost:8080')
msg = MIMEMultipart()
fp = open('myfile.zip', 'rb')
base = MIMEBase("application", "octet-stream")
base.set_payload(fp.read())
msg.attach(base)
h1.request("POST", "http://localhost:8080/server", msg.as_string())
The only problem with this is that the email library also includes the Content-Type and MIME-Version headers, and I'm not sure how they're going to be related to the HTTP headers included by httplib:
Content-Type: multipart/mixed; boundary="===============2050792481=="
MIME-Version: 1.0

--===============2050792481==
Content-Type: application/octet-stream
MIME-Version: 1.0
This may be the reason that when this request is received by my web.py application, I just get an error message. The web.py POST handler:
class MultipartServer:
    def POST(self, collection):
        print web.input()
Throws this error:
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 242, in process
    return self.handle()
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 233, in handle
    return self._delegate(fn, self.fvars, args)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 415, in _delegate
    return handle_class(cls)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 390, in handle_class
    return tocall(*args)
  File "/home/richard/Development/server/webservice.py", line 31, in POST
    print web.input()
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/webapi.py", line 279, in input
    return storify(out, *requireds, **defaults)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 150, in storify
    value = getvalue(value)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 139, in getvalue
    return unicodify(x)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 130, in unicodify
    if _unicode and isinstance(s, str): return safeunicode(s)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 326, in safeunicode
    return obj.decode(encoding)
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 137-138: invalid data
My line of code is represented by the error line about half way down:
  File "/home/richard/Development/server/webservice.py", line 31, in POST
    print web.input()
It's coming along, but I'm not sure where to go from here. Is this a problem with my client code, or a limitation of web.py (perhaps it just can't support multipart requests)? Any hints or suggestions of alternative code libraries would be gratefully received.
EDIT
The error above was caused by the data not being automatically base64 encoded. Adding
encoders.encode_base64(base)
gets rid of this error, and now the problem is clear. The HTTP request isn't being interpreted correctly by the server, presumably because the email library is including what should be the HTTP headers in the body instead:
<Storage {'Content-Type: multipart/mixed': u'', ' boundary': u'"===============1342637378=="\n' 'MIME-Version: 1.0\n\n--===============1342637378==\n' 'Content-Type: application/octet-stream\n' 'MIME-Version: 1.0\n' 'Content-Transfer-Encoding: base64\n' '\n0fINCs PBk1jAAAAAAAAA.... etc
So something is not right there.
Thanks
Richard
-
Richard J over 13 years
Great, thanks, I'll take a look at that. Certainly has the right sort of name :) Cheers, R
-
Richard J over 13 years
Thanks - as you can see, I'm still working on this; I haven't figured out where multipart/mixed is being set yet. Likewise I haven't yet employed the Content-Disposition header, as I'm still working on getting it into an HTTP request. My question is about how to construct such a request in the first place. Cheers, R.
-
Martin v. Löwis over 13 years
See the recipe. Forget about email.mime - HTTP is not email.
-
Richard J over 13 years
Hi Martin; FYI I've demonstrated that the difference between HTTP and email is irrelevant here - they are simply transports, and MIME is the same in either case. See my alternative answer. Thanks for the pointers. R
-
tc. about 12 years
1. I think the preferred way to set the header is something like base.add_header('Content-Disposition','file',name='package',...).
2. It's better to search for \n\n (and also \r\n\r\n, with e.g. re.search('\r?\n\r?\n',...)) so you don't have to split and join the body.
3. Header lines can be folded.
4. Technically the \n that terminates the headers is not part of the body, though this isn't harmful.
5. I'm not entirely sure that the RFC 5322 and RFC 2616 grammars are 100% compatible (in particular WRT "characters" vs octets).
-
fiatjaf about 9 years
That's wrong, multipart/mixed is used in HTTP: docs.couchdb.org/en/latest/replication/…
-
Lorenzo Persichetti over 4 years
Even if rare, multipart/mixed is used in Python. See Google APIs: developers.google.com/drive/api/v3/performance#details