Regex/code for removing “FWD”, “RE”, etc, from email subject

时光说笑 2021-02-07 14:35

Given an email subject line, I'd like to clean it up, getting rid of the "Re:", "Fwd", and other junk. So, for example, "[Fwd] Re: Jack and Jill's Wedding" should turn i

3 answers
  •  执念已碎
    2021-02-07 15:12

    The subject prefix varies by country and language; see Wikipedia: List of email subject abbreviations.

    For example, RES (Brazil) and AW (German) are both equivalents of RE.
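To make the equivalence concrete, here is a minimal sketch of a lookup that maps a localized prefix to a canonical one. Only RES and AW come from the answer; the `CANONICAL` table, the `canonical_prefix` helper, and the FW/FWD/WG entries are illustrative assumptions, not a complete list.

```python
# Hypothetical lookup table: localized prefix -> canonical prefix.
# RES and AW are taken from the text above; the other entries are assumptions.
CANONICAL = {
    "RE": "RE",
    "RES": "RE",   # Brazil
    "AW": "RE",    # German
    "FW": "FW",
    "FWD": "FW",
    "WG": "FW",    # assumed: German forward prefix
}


def canonical_prefix(token):
    """Return the canonical prefix for token, or None if it is not in the table."""
    return CANONICAL.get(token.strip().upper())


print(canonical_prefix("res"))
print(canonical_prefix("Aw"))
```

Tokens that are not in the table return `None`, so the caller can decide whether to leave them in the subject.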

    Example in Python:

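    #!/usr/local/bin/python
    # -*- coding: utf-8 -*-
    import re

    # Optional opening bracket, a reply/forward prefix, then a separator (or end of string);
    # the last alternative removes a stray trailing "]".
    p = re.compile(r'([\[\(] *)?(RE?S?|FYI|RIF|I|FS|VB|RV|ENC|ODP|PD|YNT|ILT|SV|VS|VL|AW|WG|ΑΠ|ΣΧΕΤ|ΠΡΘ|תגובה|הועבר|主题|转发|FWD?) *([-:;)\]][ :;\])-]*|$)|\]+ *$', re.IGNORECASE)
    print(p.sub('', 'RE: Tagon8 Inc.').strip())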
    

    Example in PHP:

    <?php
    $subject = "主题: Tagon8 - test php";
    // The "u" modifier makes PCRE treat the pattern and the subject as UTF-8.
    $subject = preg_replace("/([\[\(] *)?(RE?S?|FYI|RIF|I|FS|VB|RV|ENC|ODP|PD|YNT|ILT|SV|VS|VL|AW|WG|ΑΠ|ΣΧΕΤ|ΠΡΘ|תגובה|הועבר|主题|转发|FWD?) *([-:;)\]][ :;\])-]*|$)|\]+ *$/imu", '', $subject);
    var_dump(trim($subject));
    

    Terminal:

    $ python test.py
    Tagon8 Inc.
    $ php test.php
    string(17) "Tagon8 - test php"
    

    Note: This is mathematical.coffee's regular expression, with prefixes added from other languages: Chinese, Danish, Norwegian, Finnish, French, German, Greek, Hebrew, Italian, Icelandic, Swedish, Portuguese, Polish, Turkish.

    I used strip (Python) and trim (PHP) to remove the whitespace left around the result.
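One caveat: the pattern above is not anchored, so a prefix-like token followed by a separator in the middle of a subject is removed too (for example, "Fire: drill" loses "re: "). The following is a minimal sketch of an anchored variant that only strips prefixes at the start of the subject and repeats until none are left. The shortened `PREFIXES` list and the names `LEADING_PREFIX` and `strip_prefixes` are assumptions for illustration, not part of the original answer.

```python
import re

# Hypothetical, shortened prefix list; extend it with the tokens you need.
PREFIXES = ["RE", "RES", "FW", "FWD", "AW", "WG"]

_tokens = "|".join(re.escape(p) for p in PREFIXES)

# One leading prefix: optional opening bracket, a token, then either a closing
# bracket (optionally followed by a colon) or a colon, plus trailing whitespace.
LEADING_PREFIX = re.compile(
    r"^\s*[\[(]?\s*(?:" + _tokens + r")\s*(?:[\])]\s*:?|:)\s*",
    re.IGNORECASE,
)


def strip_prefixes(subject):
    """Remove reply/forward prefixes from the start of subject, one at a time."""
    previous = None
    while previous != subject:
        previous = subject
        subject = LEADING_PREFIX.sub("", subject, count=1)
    return subject.strip()


print(strip_prefixes("[Fwd] Re: Jack and Jill's Wedding"))  # Jack and Jill's Wedding
print(strip_prefixes("Fire: drill"))                        # Fire: drill
```

Requiring a colon or a closing bracket after the token keeps ordinary words such as "Research" intact, at the cost of missing prefixes written without any separator.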
