Rails - Mail, getting the body as Plain Text
Solution 1
The code above:
message = Mail.new(params[:message])
creates a new Mail::Message instance (from the mail gem) out of the full message. You can then use any of the methods on that message to get the content. You can therefore get the plain content using:
message.text_part
or the HTML with
message.html_part
These methods simply find the first part of a multipart message whose content type is text/plain or text/html, respectively. CloudMailin also provides the same content as convenience parameters, params[:plain] and params[:html]. Remember that a message is never guaranteed to have a plain or an HTML part. It may be worth using something like the following to be sure:
plain_part = message.multipart? ? (message.text_part ? message.text_part.body.decoded : nil) : message.body.decoded
html_part = message.html_part ? message.html_part.body.decoded : nil
As a side note, it's also important to extract the content encoding from the message when you use these methods and to make sure that the output is converted into the encoding you want (such as UTF-8).
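The nil checks above can be written more compactly with Ruby's safe navigation operator (Ruby 2.3+). This is only a minimal sketch, assuming message is a Mail::Message; plain_body and html_body are hypothetical helper names, not part of the mail gem or CloudMailin:

```ruby
# Hypothetical helpers (not part of the mail gem). `message` is expected to be
# a Mail::Message, e.g. Mail.new(params[:message]).

# Returns the decoded plain-text body, or nil if a multipart message
# has no text/plain part.
def plain_body(message)
  if message.multipart?
    # text_part is nil when no text/plain part exists; &. passes the nil through.
    message.text_part&.body&.decoded
  else
    # Single-part message: return its only body, whatever its content type is.
    message.body.decoded
  end
end

# Returns the decoded HTML body, or nil if there is no text/html part.
def html_body(message)
  message.html_part&.body&.decoded
end
```

Both helpers return either a string or nil, so callers still have to handle the nil case.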
Solution 2
What is Mail?
The message defined in the question appears to be an instance of the same Mail or Mail::Message class, which is also used in ActionMailer::Base, or in the mailman gem.
I'm not sure where this is integrated into Rails, but Steve Smith has pointed out that it is defined in the mail gem.
Extracting a Part From a Multipart Email
In the gem's readme, there is an example section on reading multipart emails.
Besides the methods html_part and text_part, which simply find the first part of the corresponding MIME type, one can access and loop through the parts manually and filter by whatever criteria are needed.
message.parts.each do |part|
  # mime_type is the bare type; content_type may also carry
  # parameters, e.g. "text/plain; charset=UTF-8"
  if part.mime_type == 'text/plain'
    # ...
  elsif part.mime_type == 'text/html'
    # ...
  end
end
The Mail::Part is documented here.
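Note that message.parts only lists the top-level parts, while real emails often nest them (for example a multipart/mixed message that contains a multipart/alternative part plus attachments). A manual loop can handle this by recursing into the containers. The following is a sketch only; leaf_parts and bodies_of_type are hypothetical helper names, and the code relies on nothing but the multipart?, parts, mime_type and body.decoded methods:

```ruby
# Hypothetical helpers (not part of the mail gem).

# Depth-first walk: returns all non-multipart parts below `part`,
# or [part] itself when it is not a multipart container.
def leaf_parts(part)
  return [part] unless part.multipart?

  part.parts.flat_map { |child| leaf_parts(child) }
end

# Decoded bodies of every leaf part with the given MIME type,
# in the order they appear in the message.
def bodies_of_type(message, mime_type)
  leaf_parts(message)
    .select { |part| part.mime_type == mime_type }
    .map { |part| part.body.decoded }
end
```

For example, bodies_of_type(message, 'text/plain') returns an array, which is empty when the message has no plain-text part at all.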
Encoding Issues
Depending on the source of the received mail, there might be encoding issues. For example, Rails could identify the wrong encoding type. If one then tries to convert the body to UTF-8 in order to store it in the database (body_string.encode('UTF-8')), there might be encoding errors like
Encoding::UndefinedConversionError - "\xFC" from ASCII-8BIT to UTF-8
(like in this SO question).
To circumvent this, one can read the charset from the message part and tell Ruby which charset the bytes are in before encoding them to UTF-8:
encoding = part_to_use.content_type_parameters['charset']
body = part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8')
Here, the decoded method removes the header lines, as shown in the encoding section of the mail gem's readme.
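The force_encoding/encode step can still fail in two ways: the part may declare no charset (or one that Ruby does not know), and the bytes may not be valid in the declared charset. Below is a sketch of a more forgiving conversion that uses only Ruby core methods. The names to_utf8 and fallback are hypothetical, and treating unlabeled bodies as UTF-8 is an assumption, not something the mail gem prescribes:

```ruby
# Hypothetical helper (not part of the mail gem): convert raw body bytes
# to a valid UTF-8 string.
#
# raw      - the decoded body, e.g. part.body.decoded
# charset  - the declared charset name, may be nil or unknown to Ruby
# fallback - encoding assumed when the charset is missing or unknown
#            (an assumption of this sketch)
def to_utf8(raw, charset, fallback: 'UTF-8')
  encoding = begin
    Encoding.find(charset.to_s)
  rescue ArgumentError
    # Raised for empty or unknown encoding names.
    Encoding.find(fallback)
  end

  raw.dup                       # do not mutate the caller's string
     .force_encoding(encoding)  # label the bytes with their source charset
     .encode('UTF-8', invalid: :replace, undef: :replace)
     .scrub                     # replace any invalid bytes that remain
end
```

With the variables from the snippet above, the call would be to_utf8(part_to_use.body.decoded, encoding). Undecodable bytes are replaced instead of raising, which loses information but keeps the result storable.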
EDIT: Hard Encoding Issues
If there are really hard encoding issues that the former approach does not solve, have a look at the excellent charlock_holmes gem.
After adding this gem to the Gemfile, there is a more reliable way to convert email encodings, using the detect_encoding method, which this gem adds to String.
I found it helpful to define a body_in_utf8 method for mail messages. (Mail::Part also inherits from Mail::Message.):
require 'charlock_holmes/string'

module Mail
  class Message
    def body_in_utf8
      body = self.body.decoded
      if body.present?
        encoding = body.detect_encoding[:encoding]
        body = body.force_encoding(encoding).encode('UTF-8')
      end
      body
    end
  end
end
Summary
# select the part to use, either as shown above or with a one-liner
part_to_use = message.html_part || message.text_part || message
# read the encoding (charset) of the part
encoding = part_to_use.content_type_parameters['charset'] if part_to_use.content_type_parameters
# get the message body without the header information
body = part_to_use.body.decoded
# and convert it to UTF-8
body = body.force_encoding(encoding).encode('UTF-8') if encoding
EDIT: Or, after defining a body_in_utf8 method, as shown above, the same as a one-liner:
(message.html_part || message.text_part || message).body_in_utf8
Solution 3
email = Mail.new(params[:message])
text_body = (email.text_part || email.html_part || email).body.decoded
I'm using this solution in the RedmineCRM Helpdesk plugin.
Solution 4
I believe that calling message.text_part.body.decoded returns the text already converted to UTF-8 by the Mail gem, although the documentation isn't 100% clear on this.
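Because the documentation is unclear here, it is safer to check the returned string than to assume. A small sketch using only Ruby core methods (utf8_ready? is a hypothetical helper name):

```ruby
# Hypothetical helper: true only if the string is labeled as UTF-8
# and its bytes really form valid UTF-8.
def utf8_ready?(str)
  str.encoding == Encoding::UTF_8 && str.valid_encoding?
end
```

If utf8_ready?(message.text_part.body.decoded) is false, the body still needs an explicit conversion using the part's charset.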
AnApprentice
Updated on November 18, 2021
Comments
-
AnApprentice over 2 years
Given:
message = Mail.new(params[:message])
as seen here: http://docs.heroku.com/cloudmailin
It shows how to get the message.body as HTML. How do you get the plain/text version?
Thanks
-
Steve about 12 years: Thanks! I was having some issues with parsing out an email after decoding but getting the text_part helped fix this.
-
David Morales over 11 years: Excellent answer. I must say this is working for the default Rails Action Mailer. No need for any mail gem.
-
Arnold Roa over 11 years: How can I extract the encoding? I'm doing this: ..force_encoding("ISO-8859-1").encode('utf_8'), and it works on some messages but not on others.
-
RocketR about 11 years: @David "default Rails Action Mailer" is the mail gem. At least, depends on it much.
-
RocketR about 11 years: No, it doesn't. It returns a string like \xF0\xD2\x12...
-
New Alexandria almost 11 years: Seriously?? Answers like this need a special quality stamp.
-
coderuby over 10 years: What a GREAT answer. Thank you very much!!
-
Paul Danelli about 4 years: OMG Thank you. No wonder I missed it before, it's on line 1600+ in the Message class.
-
Paul Watson almost 3 years: Such a great answer, thank you. Working with emails is a real 80/20 headache.
-
Dorian over 2 years: yeah... html_safe on user-provided content, that's not gonna end well (XSS)