I'm trying to parse emails and I get this kind of error using the mail package. Is it a bug in the mail package, or something I should handle myself?
missing
Alexey Vasiliev's MIT-licensed http://github.com/le0pard/go-falcon/ includes a parser package that applies whichever encoding package is needed to decode the headers (the meat is in utils.go).
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"net/textproto"

	"github.com/le0pard/go-falcon/parser"
)

// A folded Subject header: the continuation line must start with whitespace,
// and the blank line ends the header block so ReadMIMEHeader does not hit EOF.
var msg = []byte(`Subject: =?gb18030?B?u9i4tKO6ILvYuLSjulBhbGFjZSBXZXN0bWluc3Rl?=
 =?gb18030?B?cjogMDEtMDctMjAxNCAtIDA0LTA3LTIwMTQ=?=

`)

func main() {
	tpr := textproto.NewReader(bufio.NewReader(bytes.NewBuffer(msg)))
	mh, err := tpr.ReadMIMEHeader()
	if err != nil {
		panic(err)
	}
	// Decode the RFC 2047 encoded-words in every header value.
	for name, vals := range mh {
		for _, val := range vals {
			val = parser.MimeHeaderDecode(val)
			fmt.Print(name, ": ", val, "\n")
		}
	}
}
It looks like its parser.FixEncodingAndCharsetOfPart is used by the package to decode/convert content as well, though with a couple of extra allocations caused by converting the []byte body to/from a string. If you don't find the API works for you, you might at least be able to use the code to see how it can be done.
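For comparison, here is a minimal sketch of the same header-decoding step using only the standard library's mime.WordDecoder, assuming Go 1.5 or later. On its own it only understands utf-8, iso-8859-1 and us-ascii; anything else (gb18030 included) goes through the CharsetReader hook. The newDecoder helper is a name I made up for illustration, and the hook here only reports the charset it was asked for; a real program would plug a decoder from a package such as golang.org/x/text/encoding in at that point.

```go
package main

import (
	"fmt"
	"io"
	"mime"
)

// newDecoder (hypothetical helper) returns a WordDecoder whose CharsetReader
// hook is consulted for every charset the mime package does not handle itself.
func newDecoder() *mime.WordDecoder {
	return &mime.WordDecoder{
		CharsetReader: func(charset string, input io.Reader) (io.Reader, error) {
			// Assumption: a real program would switch on charset here and
			// wrap input in a transcoding reader that yields UTF-8.
			return nil, fmt.Errorf("unsupported charset %q", charset)
		},
	}
}

func main() {
	dec := newDecoder()

	// Two adjacent encoded-words, as produced by header folding. The
	// whitespace between them is dropped when they are joined.
	s, err := dec.DecodeHeader("=?UTF-8?B?SGVsbG8sIA==?= =?UTF-8?Q?w=C3=B6rld?=")
	fmt.Println(s, err)

	// A charset outside the built-in set reaches the hook, which refuses it.
	_, err = dec.DecodeHeader("=?gb18030?B?u9i4tA==?=")
	fmt.Println(err)
}
```

The hook receives the charset name in lower case, so a switch inside it does not need to worry about "GB18030" versus "gb18030".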
Found via godoc.org's "...and is imported by 3 packages" link from encoding/simplifiedchinese -- hooray godoc.org!
I hope this helps someone who is considering Go for processing emails (i.e. developing client apps). It seems the Go standard library is not mature enough for email processing. It doesn't handle multi-part, different char sets, etc. After almost a day of trying different hacks and packages, I decided to just throw the Go code away and use a good old JavaMail solution.
I've been using github.com/jhillyerd/enmime, which seems to have no trouble with this. It'll parse out both headers and body content. Given an io.Reader r:
// Parse the whole message (headers and body) from r.
env, err := enmime.ReadEnvelope(r)
if err != nil {
	log.Fatal(err)
}
// Headers can be retrieved via Envelope.GetHeader(name).
fmt.Printf("From: %v\n", env.GetHeader("From"))
// Address-type headers can be parsed into a list of decoded mail.Address structs.
alist, _ := env.AddressList("To")
for _, addr := range alist {
	fmt.Printf("To: %s <%s>\n", addr.Name, addr.Address)
}
fmt.Printf("Subject: %v\n", env.GetHeader("Subject"))
// The plain text body is available as env.Text.
fmt.Printf("Text Body: %v chars\n", len(env.Text))
// The HTML body is stored in env.HTML.
fmt.Printf("HTML Body: %v chars\n", len(env.HTML))
// env.Inlines is a slice of inlined attachments.
fmt.Printf("Inlines: %v\n", len(env.Inlines))
// env.Attachments contains the non-inline attachments.
fmt.Printf("Attachments: %v\n", len(env.Attachments))