regexp_replace Unicode in PostgreSQL

前端 未结 1 848
深忆病人
深忆病人 2021-01-15 02:07

How to regexp_replace for Unicode in PostgreSQL

i read this http://www.regular-expressions.info/unicode.html

select regexp_replace(\'s4y8sds\', \'\\\         


        
相关标签:
1条回答
  • 2021-01-15 02:37

    For ordinary numbers use digit character class as [[:digit:]] or shorthand \d:

    SELECT regexp_replace('s4y8sds', $$\d+$$, '', 'g');
    

    Result:

     regexp_replace
    ----------------
     sysds
    (1 row)
    

    For other numbers (for example ¼) is not that simple, more precisely as documentation says it's ctype (locale) dependent:

    Within a bracket expression, the name of a character class enclosed in [: and :] stands for the list of all characters belonging to that class. Standard character class names are: alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper, xdigit. These stand for the character classes defined in ctype. A locale can provide others.

    However you could use internal PL/Perl procedural language and write server-side function with wanted Unicode characters classes \p{}:

    CREATE OR REPLACE FUNCTION removeNumbersUnicode(text)
    RETURNS text AS $$
        $s = $_[0];
        $s =~ s/\p{N}//g;
        return $s;
    $$ LANGUAGE plperl;
    

    Check Chapter 41 from doc for more info how to write such functions.

    0 讨论(0)
提交回复
热议问题