Encoding mail subject (SMTP) in Python with non-ASCII characters

17,198

Solution 1

From http://docs.python.org/library/email.header.html

from email.message import Message
from email.header import Header
msg = Message()
# Header encodes the non-ASCII subject as an RFC 2047 encoded-word
msg['Subject'] = Header('主題', 'utf-8')
print(msg.as_string())

Subject: =?utf-8?b?5Li76aGM?=

Or, more simply, if you only need the encoded value:

from email.header import Header
print(Header('主題', 'utf-8').encode())

=?utf-8?b?5Li76aGM?=
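To check that the encoded value really carries the original text, it can be decoded again with the same module. The following is a minimal sketch, assuming Python 3; the variable names are only illustrative.

```python
from email.header import Header, decode_header, make_header

subject = '主題'

# Encode the subject as an RFC 2047 encoded-word (ASCII-only).
encoded = Header(subject, 'utf-8').encode()

# decode_header returns a list of (value, charset) pairs;
# make_header turns them back into a Header, and str() gives the text.
decoded = str(make_header(decode_header(encoded)))

print(encoded)
print(decoded)
```

If the round trip returns the original string, a mail client that understands encoded-words should display the subject correctly as well.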

Solution 2

The subject is transmitted as a mail header, and headers are required to be ASCII-only. To use non-ASCII characters in the subject, you need to wrap the encoded subject in a marker that names the character set and the encoding. In your case, I would suggest prefixing the subject with =?UTF-8?B? (which means UTF-8, Base64 encoded) and ending it with ?=.

In other words, I believe your subject header should more or less look like this:

Subject: =?UTF-8?B?5Li76aGM?=

In PHP you could go about it like this:

// Convert subject to base64
$subject_base64 = base64_encode($subject);
fwrite($smtp, "Subject: =?UTF-8?B?{$subject_base64}?=\r\n");

In Python:

import base64
# b64encode works on bytes, so encode the text as UTF-8 first
subject_base64 = base64.b64encode(subject.encode('utf-8')).decode('ascii')
subject_line = "Subject: =?UTF-8?B?%s?=" % subject_base64
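The manual approach can be wrapped in a small function and checked against the standard library's decoder. This is only a sketch, assuming Python 3: `encode_subject` is a hypothetical helper name, not part of any library, and it does not split long subjects into several encoded-words (RFC 2047 limits each one to 75 characters), which `email.header.Header` does for you.

```python
import base64
from email.header import decode_header


def encode_subject(subject, charset='utf-8'):
    """Hypothetical helper: build one Base64 encoded-word by hand."""
    payload = base64.b64encode(subject.encode(charset)).decode('ascii')
    return '=?%s?B?%s?=' % (charset.upper(), payload)


encoded = encode_subject('主題')

# Decode it again with the standard library to confirm the format is valid.
raw, detected_charset = decode_header(encoded)[0]
decoded = raw.decode(detected_charset)

print('Subject: ' + encoded)
print(decoded)
```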
Author by

Rakesh

Updated on June 03, 2022

Comments

  • Rakesh
    Rakesh almost 2 years

    I am using the Python module MimeWriter to construct a message and smtplib to send it. The constructed message is:

    file msg.txt:
    -----------------------
    Content-Type: multipart/mixed;
    from: me<[email protected]>
    to: [email protected]
    subject: 主題
    
    Content-Type: text/plain;charset=utf-8
    
    主題
    

    I use the code below to send a mail:

    import smtplib
    s = smtplib.SMTP('smtp.abc.com')
    toList = ['[email protected]']
    # read the message shown above from msg.txt
    with open('msg.txt') as f:
        msg = f.read()
    s.sendmail('[email protected]', toList, msg)
    

    The mail body arrives correctly, but the subject is garbled:

    subject: some junk characters
    
    主題           <- body is correct.
    

    Any suggestions? Is there a way to specify the encoding for the subject as well, as is done for the body, so that the subject is decoded correctly?

  • Rakesh
    Rakesh over 12 years
    I'll try this. In the meantime, is there any Python API that converts to the above format, i.e. automatically adds the markers for the required encoding?
  • AHM
    AHM over 12 years
    I'm not sure - I just remembered that part from when I was messing around with this issue a while ago. This answer seems to suggest that it is done right if you use the MIMEMultipart class instead of MimeWriter.
  • mata
    mata almost 12 years
  • Solomon Duskis
    Solomon Duskis almost 5 years
    Note that this uses the email.message.Message API, which was superseded by email.message.EmailMessage in Python 3.6. With the new API you must assign a string: msg['Subject'] = 'unicode string', as assigning Header objects is not supported. In my experience, as of 3.7.3 the "legacy" API works better; some encoding bugs are fixed in 3.8.
  • Sérgio
    Sérgio almost 5 years
    Thanks for the heads-up.
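The newer API mentioned in the comments can be sketched as follows. This assumes Python 3.6 or later; the addresses are placeholders, not the ones from the question, and the message is only built and parsed back, not sent. As noted above, behaviour may differ between Python versions.

```python
from email import message_from_string, policy
from email.message import EmailMessage

msg = EmailMessage()
msg['From'] = 'me@example.com'   # placeholder address
msg['To'] = 'you@example.com'    # placeholder address
msg['Subject'] = '主題'           # plain str; no Header object needed
msg.set_content('主題')           # text body; charset defaults to utf-8

# Serialize the message; the subject is emitted as an encoded-word.
wire = msg.as_string()
subject_line = [line for line in wire.splitlines()
                if line.lower().startswith('subject:')][0]

# Parse it back with the modern policy to recover the original subject.
parsed = message_from_string(wire, policy=policy.default)

print(subject_line)
print(parsed['Subject'])
```

To send such a message, smtplib.SMTP.send_message(msg) accepts the message object directly, so there is no need to pass a hand-built string to sendmail.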