Best way to handle email parsing/decoding in PHP?

Solution 1

Funny you should ask... I'm actually working on a simple notification system now. I just finished the Bounce Manager, which I used Zend_Mail to implement. It has pretty much all the features you're looking for: you can connect to a mailbox (POP3, IMAP, Mbox, and Maildir), pull messages from it, and operate on all those messages.

It handles multipart messages, but the parts can be hard to work with. I had a hard time figuring out which part was the attached original message in the NDRs I was working with, but I have a feeling I just missed something in the documentation. I'm not sure how it handles encoding, because my usage was fairly simple, but I'm pretty sure it has provisions for all the encodings you mentioned. Check out the docs and browse the API.
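For reference, PHP itself ships decoders for the transfer encodings the question mentions, independent of Zend_Mail. A minimal sketch, assuming you already have a part's raw body and its Content-Transfer-Encoding value; decodeTransferEncoding() is a hypothetical helper name, not a Zend_Mail method, and this is not a description of what Zend_Mail does internally:

```php
<?php
// Hypothetical helper (not part of Zend_Mail): decode a MIME part body
// according to the value of its Content-Transfer-Encoding header.
function decodeTransferEncoding(string $body, string $encoding): string
{
    switch (strtolower(trim($encoding))) {
        case 'base64':
            // Non-strict mode skips characters outside the base64 alphabet,
            // such as the line breaks used to wrap MIME bodies.
            $decoded = base64_decode($body, false);
            return $decoded === false ? $body : $decoded;

        case 'quoted-printable':
            return quoted_printable_decode($body);

        case 'x-uuencode':
        case 'uuencode':
            // convert_uudecode() expects data in the format produced by
            // convert_uuencode(), i.e. without "begin ..." / "end" lines.
            $decoded = convert_uudecode($body);
            return $decoded === false ? $body : $decoded;

        default:
            // 7bit, 8bit, binary, or anything unrecognised: return unchanged.
            return $body;
    }
}

echo decodeTransferEncoding('caf=C3=A9', 'quoted-printable'), "\n";
```

Returning the body unchanged for unknown encodings is a design choice in the same spirit as preferring a malformed body over nothing; a stricter caller might throw instead.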

Solution 2

I recently developed a PHP mail parser and have been using it in production.
I am very happy with it, and some developers have forked it:

https://github.com/plancake/official-library-php-email-parser

Solution 3

I know this question's four years old now... but I ended up in need of a mail parsing library and wasn't satisfied with any of the available options. I wanted something reliable, PSR-2 compliant, and installable via Composer.

composer require zbateson/mail-mime-parser

It's its own parser, built from the ground up to get around known issues and bugs in other implementations. It is extensively tested and quite widely used.

The library makes use of PSR-7 streams, which allow you to pass it any kind of stream you like. It also doesn't store all information in memory: very large attachments can be returned as a stream instead of a string if so desired, so memory isn't used up. Similarly, the entire message is never held directly in memory; only the headers and references to streams are.
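To illustrate why getting a stream back matters for large attachments, here is a sketch that uses only PHP's built-in stream filters. It is not the library's implementation or API; copyBase64Decoded(), the sample payload, and the php://temp streams are assumptions made for the example:

```php
<?php
// Hypothetical helper (not part of mail-mime-parser): decode base64 data
// from $source into $destination chunk by chunk, so the decoded attachment
// is never assembled as one large PHP string.
function copyBase64Decoded($source, $destination): void
{
    // Built-in filter: data is base64-decoded as it is read from $source.
    $filter = stream_filter_append($source, 'convert.base64-decode', STREAM_FILTER_READ);
    stream_copy_to_stream($source, $destination);
    stream_filter_remove($filter);
}

// Stand-in for an encoded attachment body; a real one would come from the message.
// Real MIME bodies are wrapped into lines, so check how the filter treats
// line breaks in your PHP version before relying on this.
$payload = str_repeat('0123456789ab', 1000);
$encoded = fopen('php://temp', 'r+');
fwrite($encoded, base64_encode($payload));
rewind($encoded);

// php://temp switches to a temporary file once it grows past its memory limit
// (2 MB by default), so a large decoded attachment does not stay in RAM.
$decoded = fopen('php://temp', 'r+');
copyBase64Decoded($encoded, $decoded);
rewind($decoded);

// Read back as a string only to check the demo; real code would keep streaming.
echo strlen(stream_get_contents($decoded)), " bytes decoded\n";
```

In real use the destination would more likely be a file handle opened with fopen(), so the attachment goes straight to disk.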

https://github.com/zbateson/mail-mime-parser

Check out the website for a guide and the API... and if you find bugs or typos, or see room for improvement, please feel free to open an issue, or dig right in and contribute a pull request :)

Solution 4

I forked php-mime-mail-parser to correct all the issues: Fork of php-mime-mail-parser

More than 52 tests and 764 assertions. Code coverage: 100% of lines, 100% of functions and methods, 100% of classes and traits.

You need the PECL package MailParse to use it, but the wrapper itself has no known issues and is fully tested.

Solution 5

For completeness, here's the one I'm going to try: http://code.google.com/p/php-mime-mail-parser/. It's a wrapper around PHP MailParse, which needs to be installed.

Author: Sgraffite

Updated on July 09, 2022

Comments

  • Sgraffite
    Sgraffite almost 2 years

    Currently I'm using the PEAR library's mimeDecode.php for parsing incoming emails. It seems to have a lot of issues and fails to decode a lot of messages, so I'd like to replace it with something better.

    I'm looking for something that is able to properly separate parts of the message, such as to, from, body, etc. Ideally it would be able to handle all common encoding methods such as base64, uuencode, quoted printable, etc.

    In situations where both plain text and html versions of the same message are contained in a single email, I would ideally like it to know the difference between them so I could choose which part I wished to display.

    I'm not worried about attachments at this point in time, but it would be nice for it to have knowledge of them in case I want to implement that in the future.

    I saw that PHP has a group of functions starting with imap that appear to do what I would like, but I can't be sure without trying them out.

    Currently I am doing on-the-fly decoding of the messages in PHP, which is why I am looking for a PHP replacement solution.

    Does anyone have experience with this who could point me in the right direction? I'd hate to start using something that would end up not doing what I need in the long run.

  • Sgraffite
    Sgraffite over 13 years
    Do you know if it is possible to use Zend_Mail without the storage connector? I'd like to pass it an incoming message as a string and be able to use the methods associated to messages on it without it needing to have come from a storage location.
  • prodigitalson
    prodigitalson over 13 years
    Yes, I'm sure there is a way, because this same class is used to send messages with the mailer/transport classes as well, and in that case you would always be constructing a message from strings/files. If I recall, it looks something like $m = new Zend_Mail_Message(array('raw' => $stringMessage)); Take a look at the actual class and the doc comments for the constructor to verify.
  • Sgraffite
    Sgraffite over 13 years
    This ended up working out for me. Zend did a few things I didn't understand, however. Zend will throw an exception when it does not recognize a header. In my case, I don't care about unrecognized headers, so I ended up commenting out that exception. Also, there is a function where Zend does a foreach() on $parts, but sometimes the variable it is trying to iterate over is null, so I added a null check there and return $res if it is null.
  • Sgraffite
    Sgraffite over 13 years
    Finally, when it is checking MIME boundaries, it throws an exception if it can't find the closing boundary. In my case it was a malformed message, but the body was still readable, so I ended up commenting out that exception as well. I'd rather give the user a malformed body than nothing.
  • prodigitalson
    prodigitalson over 13 years
    Hmm, I didn't run into any problems with headers, and I was actually using custom headers for things (like X-CUSTOMNS-CUSTOMNAME). It will, however, throw an exception if you try to read a header that doesn't exist; you must use $msg->hasHeader($header). Personally, I would rather it return null, false, or -1 instead of having to explicitly test...
  • Sgraffite
    Sgraffite over 13 years
    I was only parsing incoming messages; maybe that is the difference? It was looping through all the headers and checking them with a switch statement, and if it hit default: it would throw an exception.
  • prodigitalson
    prodigitalson over 13 years
    Hmm.. odd... I was doing incoming messages as well (for outgoing I use Swift Mailer) and never had an issue with any custom headers... Of course, the custom headers were in an attached message (i.e. a mail part)... so I wasn't reading the custom headers in the top-level message.
  • Sgraffite
    Sgraffite over 13 years
    It was literally 3 messages out of 11.2k total messages that ended up throwing that exception, so probably not very common.
  • Sgraffite
    Sgraffite over 13 years
    That looks decent by looking at the docs. I already put the hours in for implementing and testing the Zend_Mail library, and it appears to work pretty well. I honestly can't spend more time at work looking into a new library at this point. Thanks for the response though :)
  • Gabriel S.
    Gabriel S. almost 12 years
    If anyone comes around here, the proper way to get a header is to check if it exists with the headerExists() method and if yes, fetch it with getHeader()
  • Slawa
    Slawa over 11 years
    It doesn't handle attachments well - it has the base64 encoded attachments stuff inside the HTML body. And has no getAttachment() kind of functions at all.
  • Dan
    Dan over 11 years
    Thanks for the bug report, Slawa - I will look into it. If you need to extract the attachment, I suggest you try code.google.com/p/php-mime-mail-parser
  • behz4d
    behz4d over 10 years
    It doesn't work all the time; I have some example emails that it could not handle.
  • QuantumHive
    QuantumHive almost 10 years
    I don't think you can still parse raw email in 2014 with ZF2
  • ChicagoSky
    ChicagoSky over 9 years
    absolutely awesome library - perfect for what i needed
  • tivnet
    tivnet about 8 years
    This might need some more QA, but first impression: it works. Thank you, @Zaahid
  • bishop
    bishop over 7 years
    Works great! I stream 40+MB emails from an AWS SES inbox on S3 with zero problems. Excellent library.
  • phoenix
    phoenix almost 7 years
    It is awesome but it turns out that it can't handle more complex mail structure. I've found a situation where an email has one boundary value to separate an attachment from the text/html body and then a different boundary value to split off text and html body parts... That's just not handled.
  • Nicolas
    Nicolas about 6 years
    The link is broken; here is the working link: docs.zendframework.com/zend-mail
  • Aistis
    Aistis about 3 years