Replace unicode characters in PostgreSQL

闹比i 2020-12-18 04:50

Is it possible to replace all the occurrences of a given character (expressed in unicode) with another character (expressed in unicode) in a varchar field in PostgreSQL?

2 Answers
  • 2020-12-18 05:07

    It should work with the "characters corresponding to that code" unless some client or other layer in the food chain mangles your code!

    Also, use translate() or replace() for this simple job. Much faster than regexp_replace(). translate() is also good for multiple simple replacements at a time.
    And avoid empty updates with a WHERE clause. Much faster yet, and avoids table bloat and additional VACUUM cost.

    UPDATE mytable
    SET    myfield  = translate(myfield, 'P', '`')  -- actual characters
    WHERE  myfield <> translate(myfield, 'P', '`');
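
    To illustrate the difference between translate() and replace() mentioned above, here is a minimal sketch using literal test strings (the sample values are made up for illustration; only the two built-in functions come from the answer):

    ```sql
    -- translate() maps characters one-to-one by position: 'a' -> 'x', 'b' -> 'y'.
    SELECT translate('aabbcc', 'ab', 'xy');    -- xxyycc

    -- Characters in the "from" set with no counterpart in the "to" set are deleted.
    SELECT translate('aabbcc', 'abc', 'xy');   -- xxyy

    -- replace() substitutes a whole substring, which may differ in length.
    SELECT replace('aabbcc', 'bb', 'ZZZ');     -- aaZZZcc
    ```

    So translate() suits several single-character swaps in one pass, while replace() is the one to reach for when the search or replacement text is longer than one character.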
    

    If you keep running into problems, use the Unicode escape syntax @mvp provided:

    UPDATE mytable
    SET    myfield  = translate(myfield, U&'\0050', U&'\0060')
    WHERE  myfield <> translate(myfield, U&'\0050', U&'\0060');
    
  • 2020-12-18 05:20

    According to the PostgreSQL documentation on lexical structure, you should use U& syntax:

    UPDATE mytable 
    SET myfield = regexp_replace(myfield, U&'\0050', U&'\0060', 'g')
    

    You can also use the PostgreSQL-specific escape-string form E'\u0050'. This works on older versions that do not support the Unicode escape form, but the Unicode escape form is preferred on newer versions. This should show what's going on:

    regress=> SELECT '\u0050', E'\u0050', U&'\0050';
     ?column? | ?column? | ?column? 
    ----------+----------+----------
     \u0050   | P        | P
    (1 row)
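
    The same escapes also work for characters outside ASCII. The following is a sketch only, assuming a UTF8 database and standard_conforming_strings = on (which the U& form requires); the sample string and the choice of U+2019 are illustrative, not from the question:

    ```sql
    -- Replace U+2019 (right single quotation mark) with U+0027 (apostrophe).
    SELECT translate(U&'it\2019s', U&'\2019', U&'\0027');   -- it's

    -- The escape-string form of the same call.
    SELECT translate(E'it\u2019s', E'\u2019', E'\u0027');   -- it's

    -- Code points above U+FFFF need the longer forms:
    -- \+ with six hex digits in U&'', or \U with eight hex digits in E''.
    SELECT U&'\+01F600' = E'\U0001F600';                    -- t
    ```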
    