NOTE: The code here is provided for discussion purposes; if you choose to use it yourself, on your own head be it! I used Python 2.4 with emacs and the python.el mode file.
POP mail from my Macmail account hasn’t worked since December. I’ve emailed the support people a few times to no avail. It’s a free service and I don’t think anyone’s home any more. I decided to try and write a program that would pretend to be a browser and just forward everything on to my gmail account.
This was surprisingly easy.
I decided to use Python, because I know it a bit. CLisp was my second choice but there doesn’t seem to be the same wealth of examples on the ’net. I’ve done stuff like this in Java, but you have to do crazy things like run it through jtidy first and treat the html as xml, which is a complete pain. Python’s sgmllib just works.
I read the HTML processing chapter of Dive Into Python, which gave me a grounding in what I needed to do with SGML processors and stuff.
But first, I needed to learn how to log onto the mail service using cookies. I found this on the ClientCookie module, and that did the trick.
I then made a big messy file that I could run in emacs and keep stuffing prototype code into the Python interpreter. After some work I came up with this for the logon:
macMailURL = "http://mail.macmail.com"

def login2Macmail():
    request = ClientCookie.Request(macMailURL)
    # note we're using the urlopen from ClientCookie, not urllib2
    response = ClientCookie.urlopen(request)
    firstPage = response.read()
    #print firstPage
    # Now we need the sessionID
    data = "login=aname&name=aname&pwd=apass&password=apass"
    # let's say this next request requires a cookie that was set in response
    request2 = ClientCookie.Request(macMailURL + "/logon.php?logoff=1")
    response2 = ClientCookie.urlopen(request2, data)
    return response2.read()
This returns the inbox html from the second response. I’ve left the commented debug statement in.
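For anyone trying the same trick on a current Python, here is a minimal sketch of the logon using only the standard library. It assumes Python 3, where http.cookiejar and urllib.request do the job ClientCookie does above; the names make_opener, build_login_request and login are made up for illustration, and the form fields are simply copied from the data string in the listing.

```python
import http.cookiejar
import urllib.parse
import urllib.request

BASE_URL = "http://mail.macmail.com"  # same host as macMailURL in the listing


def make_opener():
    """Return (opener, jar); every request made through opener shares jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(user, password):
    """Build the logon POST; the field names mirror the original data string."""
    fields = [("login", user), ("name", user),
              ("pwd", password), ("password", password)]
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(BASE_URL + "/logon.php?logoff=1", data=body)


def login(opener, user, password):
    """Fetch the front page to pick up the session cookie, then log on."""
    opener.open(BASE_URL).read()
    return opener.open(build_login_request(user, password)).read()
```

Building the request is kept separate from sending it so the encoded body can be inspected without touching the network.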
So, now we need to take this page, rip out the <a> tags that point to emails, and use this info to forward the mails:
class collectTags(SGMLParser):

    def reset(self):
        SGMLParser.reset(self)
        self.urls = []

    def start_a(self, attrs):
        href = [v for k, v in attrs if k == 'href']
        if href:
            #print href
            if href[0].find("/member/mail.php") != -1:
                self.urls.extend(href)
This extends SGMLParser: whenever the parser finds an <a> tag it runs my start_a method, which does what I want.
This little class will stick all of these URLs (which have an href of the form /member/mail.php?id=1234) into the urls list. Note the v for k with the if statement; this lovely one-liner is why I like Python so much. The only problem is that it returns a list, which will only ever have one element. I’m sure there’s a way of changing the one-liner but I don’t know what it is yet. Still, think of the equivalent Java, or PL/SQL! Having to reference the first element of the returned list is a small price to pay for this expressive power. Of course, some Python god will say I’m talking out of my rear end and I just need to do …, whatever that is.
Next I need to forward this mail. This is quite complex because I need to get the page displaying the forward, parse out the text of the mail and any other arguments, and then submit this as a POST request back to the web server after substituting my forward mail into the string.
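One possible answer to the one-element list: turn the attribute pairs into a dict and ask for href directly, which gives either the value or None. The sketch below shows that on Python 3, where sgmllib no longer exists, so it assumes html.parser.HTMLParser as a stand-in; LinkCollector and collect_links are hypothetical names. The dict(attrs).get('href') part should work just as well on the attrs list that sgmllib passes to start_a.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags that point at individual mails."""

    def __init__(self, marker="/member/mail.php"):
        super().__init__()
        self.marker = marker
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) pairs; a dict gives one value or None
        href = dict(attrs).get("href")
        if href and self.marker in href:
            self.urls.append(href)


def collect_links(page):
    """Return the mail URLs found in an HTML page, in document order."""
    parser = LinkCollector()
    parser.feed(page)
    parser.close()
    return parser.urls
```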
class getMailBody(SGMLParser):
    def reset(self):
        SGMLParser.reset(self)
        self.data = ""
        self.inForm = 0
        self.inTextArea = 0
        self.textAreaName = ""
        self.textAreaID = ""
        self.textAreaText = []

    def start_form(self, attrs):
        theForm = [v for k, v in attrs if k == 'action']
        if theForm:
            # only collect fields from the form that posts to send_mail.php
            self.inForm = theForm[0].find("/member/send_mail.php") != -1

    def end_form(self):
        self.inForm = 0

    def getValue(self, val, attrs):
        # returns a list: empty, or holding the attribute's value
        return [v for k, v in attrs if k == val]

    def appendData(self, value):
        amp = ""
        if self.data:
            amp = "&"
        # += so each field is added to the string rather than replacing it
        self.data += amp + value

    def processAttribs(self, attrs):
        if self.inForm:
            name = self.getValue('name', attrs)
            idVal = self.getValue('id', attrs)
            value = self.getValue('value', attrs)
            if not value:
                value.append("")
            if name:
                # print "name" + name[0]
                self.appendData(urlencode({name[0]: value[0]}))
            if idVal and idVal != name:
                # print "idval" + idVal[0]
                self.appendData(urlencode({idVal[0]: value[0]}))

    def start_input(self, attrs):
        self.processAttribs(attrs)

    def start_textarea(self, attrs):
        if self.inForm:
            self.textAreaName = self.getValue('name', attrs)
            self.textAreaID = self.getValue('id', attrs)
            self.inTextArea = 1

    def end_textarea(self):
        if self.inTextArea:
            self.inTextArea = 0
            if self.textAreaName:
                # print "text area name" + self.textAreaName[0]
                self.appendData(urlencode({self.textAreaName[0]: " ".join(self.textAreaText)}))
            if self.textAreaID and self.textAreaID != self.textAreaName:
                # print "text area idval" + self.textAreaID[0]
                self.appendData(urlencode({self.textAreaID[0]: " ".join(self.textAreaText)}))

    def handle_data(self, text):
        if self.inTextArea:
            self.textAreaText.append(text)
            #print text
This class will parse the forward mail page out into the data member so that I can then use it to send an http post request to the remote server, thus:
def forwardMail(url):
    replyTag = "/member/reply.php"
    # url is of the form /member/mail.php?id=3298
    splitURL = url.split("?")
    # post the query part (id=3298) to ask for the forward page
    data = "%s&btn=Forward" % splitURL[1]
    request = ClientCookie.Request(macMailURL + replyTag)
    response = ClientCookie.urlopen(request, data)
    page = response.read()
    #print page
    mb = getMailBody()
    mb.feed(page)
    mb.close()
    forwardTag = "/member/send_mail.php"
    # the address was obscured in the published listing; fred@example.com is a placeholder
    data = mb.data.replace("to=", "to=fred@example.com")
    request = ClientCookie.Request(macMailURL + forwardTag)
    response = ClientCookie.urlopen(request, data)
    page = response.read()
Of course, replace fred with your own address. I’ll leave working this out to the reader.
Macmail does its deleting through the move command. I reused the URL list from the page just processed:
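Substituting the address with a plain string replace on "to=" would also hit any other field whose name happens to end in "to". A minimal sketch of one way round that, assuming Python 3 and html.parser: collect the form fields as (name, value) pairs, overwrite the one called to, and only then encode. FormFields and forward_body are hypothetical names, the field name to is taken from the listing, and unlike the class above this sketch ignores id attributes.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormFields(HTMLParser):
    """Collect (name, value) pairs from the form whose action contains marker."""

    def __init__(self, marker="/member/send_mail.php"):
        super().__init__()
        self.marker = marker
        self.fields = []
        self.in_form = False
        self.textarea = None   # name of the textarea being read, if any
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = self.marker in (attrs.get("action") or "")
        elif not self.in_form:
            return
        elif tag == "input" and attrs.get("name"):
            self.fields.append((attrs["name"], attrs.get("value") or ""))
        elif tag == "textarea" and attrs.get("name"):
            self.textarea = attrs["name"]
            self.chunks = []

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False
        elif tag == "textarea" and self.textarea is not None:
            self.fields.append((self.textarea, "".join(self.chunks)))
            self.textarea = None

    def handle_data(self, data):
        if self.textarea is not None:
            self.chunks.append(data)


def forward_body(page, address, to_field="to"):
    """Return the urlencoded POST body with the to field set to address."""
    parser = FormFields()
    parser.feed(page)
    parser.close()
    fields = [(k, address if k == to_field else v) for k, v in parser.fields]
    return urlencode(fields)
```

Keeping the fields as a list of pairs preserves their order and lets urlencode take care of escaping the @ in the address.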
def deleteMail(urlList):
    deleteTag = "/member/move.php"
    deleteData = []
    for val in urlList:
        # val is of the form /member/mail.php?id=3298
        bits = val.split("=")
        #print bits
        theID = bits[1]
        deleteData.append(theID)
    data = "delete[]=" + "&delete[]=".join(deleteData)
    #print data
    request = ClientCookie.Request(macMailURL + deleteTag)
    response = ClientCookie.urlopen(request, data)
    return response.read()
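Splitting on "=" works while id is the only parameter in the link. A minimal sketch of the same request body built with the standard URL helpers, assuming Python 3; mail_id and delete_body are hypothetical names. Note that urlencode percent-encodes the square brackets, which PHP-style servers normally decode, but I have not checked that against Macmail.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def mail_id(url):
    """Return the id query parameter of a mail URL, or None if it has none."""
    ids = parse_qs(urlsplit(url).query).get("id")
    return ids[0] if ids else None


def delete_body(urls):
    """Encode one delete[] field per mail id found in urls."""
    ids = [mail_id(u) for u in urls]
    return urlencode([("delete[]", i) for i in ids if i is not None])
```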
deleteMail returns the next page once all of the mail displayed on the current one has been deleted. Here we glue it all together:
###################################
# Processing body
###################################
page = login2Macmail()
while True:
    parser = collectTags()
    parser.feed(page)
    parser.close()
    if not parser.urls:
        break
    for url in parser.urls:
        forwardMail(url)
    # Submit delete request for all URLs from this page
    # and get the next page back
    page = deleteMail(parser.urls)
    #print page

## Now delete all sent mail
request = ClientCookie.Request(macMailURL + "/member/index.php?folder=SentItems")
response = ClientCookie.urlopen(request)
page = response.read()
while True:
    parser = collectTags()
    parser.feed(page)
    parser.close()
    if not parser.urls:
        break
    page = deleteMail(parser.urls)
    #print page

request = ClientCookie.Request(macMailURL + "/member/empty_trash.php")
response = ClientCookie.urlopen(request)
page = response.read()
This little control block loops through all of the inbox pages until there are no more mail viewing URLs left, deleting the mails as it goes. Then it goes to the sent items folder and deletes all of that, and finally it calls the empty trash function to be polite. I don’t want to leave a load of junk on that server; the mail account has been very useful over the years and I wouldn’t like to annoy the people running it.
For the record, the import statements at the top are these:
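The inbox loop and the sent items loop have the same shape, so they could share one helper. A minimal sketch, written to run on Python 3 but using nothing version specific; drain and its parameters are hypothetical names, and the three callables stand in for the parsing, forwarding and deleting functions from the listings.

```python
def drain(page, collect, delete, handle=None):
    """Empty a folder page by page; return how many mails were processed.

    collect(page) -> list of mail URLs found on the page
    handle(url)   -> called once per mail before it is deleted (optional)
    delete(urls)  -> deletes the mails and returns the next page
    """
    count = 0
    while True:
        urls = collect(page)
        if not urls:
            return count
        if handle is not None:
            for url in urls:
                handle(url)
        page = delete(urls)
        count += len(urls)
```

The inbox pass would supply a handle that forwards each mail, and the sent items pass would leave it out. Like the original loops, this never stops if the server keeps returning the same mails after a delete, so a cap on the number of pages might be worth adding.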
import ClientCookie, urllib2
from sgmllib import SGMLParser
import htmlentitydefs
import sys
from urllib import urlencode
Have fun, and don’t eat too much Java, it’s bad for you and takes too long, life is short, use proper powerful tools.