Given an email subject line, I'd like to clean it up, getting rid of the "Re:", "Fwd", and other junk. So, for example, "[Fwd] Re: Jack and Jill's Wedding" should turn into "Jack and Jill's Wedding".
The subject prefix varies by country and language; see Wikipedia: List of email subject abbreviations.
For example, in Brazil RES is equivalent to RE, and in German AW is equivalent to RE.
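These equivalences can be written down as a small lookup table. This is only a sketch: `REPLY_ALIASES` and `canonical_prefix` are hypothetical names, and the table holds just the two equivalences mentioned here.

```python
# Hypothetical lookup table holding only the two equivalences named above.
REPLY_ALIASES = {"RES": "RE", "AW": "RE"}


def canonical_prefix(prefix):
    """Map a localized reply prefix to "RE"; return other tokens upper-cased."""
    token = prefix.strip().rstrip(":").strip().upper()
    return REPLY_ALIASES.get(token, token)


print(canonical_prefix("AW:"))  # German reply prefix
print(canonical_prefix("res"))  # Brazilian reply prefix
```

Both calls map to "RE"; any token missing from the table is returned upper-cased and otherwise unchanged.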
Example in Python:
#!/usr/local/bin/python
# -*- coding: utf-8 -*-
import re
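# Raw string so the regex escapes (\[, \(, \]) reach the re module unchanged
p = re.compile(r'([\[\(] *)?(RE?S?|FYI|RIF|I|FS|VB|RV|ENC|ODP|PD|YNT|ILT|SV|VS|VL|AW|WG|ΑΠ|ΣΧΕΤ|ΠΡΘ|תגובה|הועבר|主题|转发|FWD?) *([-:;)\]][ :;\])-]*|$)|\]+ *$', re.IGNORECASE)
# sub() removes every match; strip() trims the whitespace left behind
print(p.sub('', 'RE: Tagon8 Inc.').strip())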
Example in PHP:
$subject = "主题: Tagon8 - test php";
$subject = preg_replace("/([\[\(] *)?(RE?S?|FYI|RIF|I|FS|VB|RV|ENC|ODP|PD|YNT|ILT|SV|VS|VL|AW|WG|ΑΠ|ΣΧΕΤ|ΠΡΘ|תגובה|הועבר|主题|转发|FWD?) *([-:;)\]][ :;\])-]*|$)|\]+ *$/im", '', $subject);
var_dump(trim($subject));
Terminal:
$ python test.py
Tagon8 Inc.
$ php test.php
string(17) "Tagon8 - test php"
Note: This is mathematical.coffee's regular expression, extended with prefixes from other languages: Chinese, Danish, Norwegian, Finnish, French, German, Greek, Hebrew, Italian, Icelandic, Swedish, Portuguese, Polish, and Turkish.
I used strip/trim to remove the leading and trailing whitespace left over after the substitution.
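One caveat: the pattern is not anchored, so `sub`/`preg_replace` also removes prefix-like tokens that appear in the middle of a subject. A minimal sketch of an anchored variant follows, assuming only leading prefixes should be removed. The name `strip_leading_prefixes`, the abbreviated prefix list, and the requirement that a prefix be bracketed or followed by a colon or semicolon are illustrative assumptions, not part of the original expression.

```python
import re

# Abbreviated list for illustration; swap in the full alternation if needed.
_TOKENS = r'(?:RE?S?|AW|SV|FWD?|WG|ENC)'

# A leading prefix is either bracketed ("[Fwd]", "(Re)") or followed by ':' / ';'.
_LEADING = re.compile(
    r'^\s*(?:[\[\(]\s*' + _TOKENS + r'\s*[\]\)]|' + _TOKENS + r'\s*[:;])\s*',
    re.IGNORECASE,
)


def strip_leading_prefixes(subject):
    """Repeatedly remove one leading prefix until none is left."""
    while True:
        subject, count = _LEADING.subn('', subject, count=1)
        if count == 0:
            return subject.strip()


print(strip_leading_prefixes("[Fwd] Re: Jack and Jill's Wedding"))
print(strip_leading_prefixes("Meeting notes RE: budget"))
```

The first call reduces the subject to the bare title, while the second is left unchanged because "RE:" does not appear at the start of the string.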