Wrestling With Mail in Python

March 1st, 2006

I have a bunch of old mailboxes (created by pine), and I’d like to extract the message bodies. The problem is that some of the messages were sent as HTML, so their bodies are multipart MIME messages: the first part is plain text (or close to it), while the second is the marked-up HTML.

I thought that parsing each mailbox file with Python’s email module would be simple, but as is often the case:

  • it’s designed to handle industrial-strength problems, which means there are a lot of things I don’t need getting in the way of finding the 5% I actually want; and
  • it assumes I know a lot more about the topic than I do.

This posting isn’t really a plea for help (although if you do have something lying around that will do what I want, I’d appreciate a ping). What I’m really asking for is layered library design: when you’re creating a package for general use, please remember the 80/20 rule, and make the 20% that will satisfy 80% of users very simple, and very easy to find.
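
For concreteness, here is a minimal sketch of the kind of "simple 20%" I have in mind. It assumes the mailboxes are in mbox format (which, as far as I know, is what pine writes by default) and is written against current Python 3, using the standard library's mailbox module alongside email. The function names plain_text_body and extract_bodies are my own inventions for illustration, not part of either module, and the choice to fall back to UTF-8 for unknown charsets is likewise just an assumption.

```python
import mailbox
from email.message import Message


def plain_text_body(msg: Message) -> str:
    """Return the first text/plain part of a message as a str.

    Returns an empty string if the message has no plain-text part
    (for example, an HTML-only message).
    """
    # walk() yields the message itself and then every nested part,
    # so this handles both simple and multipart messages.
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        if part.get_filename():  # a named part is probably an attachment
            continue
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            return payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset name in the header: fall back (an assumption).
            return payload.decode("utf-8", errors="replace")
    return ""


def extract_bodies(mbox_path: str) -> list[str]:
    """Return the plain-text body of every message in an mbox file."""
    # create=False so a mistyped path raises instead of making an empty file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [plain_text_body(msg) for msg in box]
    finally:
        box.close()
```

Calling extract_bodies on the path to one mailbox file should give back one string per message; messages that only have an HTML part come back as empty strings, which is a deliberate simplification rather than a recommendation.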


  1. Shawn Wheatley
    March 1st, 2006 at 14:29 | #1

    You might want to check out Yarn:

    yarnproject.org

    I saw Abe Fettig’s talk on it at PyCon a couple of years ago and it looked pretty cool. I’m not sure that it’s still in active development, but it’s got some cool features for manipulating various message formats.
