Replace unicode characters in PostgreSQL

前端未结


Is it possible to replace all the occurrences of a given character (expressed in unicode) with another character (expressed in unicode) in a varchar field in PostgreSQL?

2 Answers

后悔当初

2020-12-18 05:07
It should work with the "characters corresponding to that code" unless some client or other layer in the food chain mangles your code!

Also, use translate() or replace() for this simple job. Much faster than regexp_replace(). translate() is also good for multiple simple replacements at a time.
And avoid empty updates with a WHERE clause. That is faster still, and it avoids table bloat and additional VACUUM cost.
```
UPDATE mytable
SET    myfield  = translate(myfield, 'P', '`')  -- actual characters
WHERE  myfield <> translate(myfield, 'P', '`');
```
If you keep running into problems, use the encoding @mvp provided:
```
UPDATE mytable
SET   myfield =  translate(myfield, U&'\0050', U&'\0060')
WHERE myfield <> translate(myfield, U&'\0050', U&'\0060');
```
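As a rough sketch of the point about translate() handling several simple replacements in one pass: the accented characters and the sample string below are illustrative assumptions, not part of the original question, and the U& escapes assume a UTF8 database with standard_conforming_strings turned on.

```sql
-- Hypothetical example: map é (U+00E9) and è (U+00E8) to a plain 'e' in one call.
-- Each character in the second argument is replaced by the character
-- at the same position in the third argument.
SELECT translate(U&'caf\00E9 cr\00E8me', U&'\00E9\00E8', 'ee');  -- cafe creme

-- Same idea as an UPDATE on the mytable/myfield names used above,
-- skipping rows that would not change.
UPDATE mytable
SET    myfield = translate(myfield, U&'\00E9\00E8', 'ee')
WHERE  myfield <> translate(myfield, U&'\00E9\00E8', 'ee');
```

Running the SELECT first is a cheap way to confirm the mapping before touching the table.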
独厮守ぢ

2020-12-18 05:20
According to the PostgreSQL documentation on lexical structure, you should use U& syntax:
```
UPDATE mytable
SET myfield = regexp_replace(myfield, U&'\0050', U&'\0060', 'g');
```
You can also use the PostgreSQL-specific escape-string form E'\u0050'. This form works on older versions than the Unicode escape form does, but the Unicode escape form is preferred on newer versions. This should show what's going on:
```
regress=> SELECT '\u0050', E'\u0050', U&'\0050';
 ?column? | ?column? | ?column? 
----------+----------+----------
 \u0050   | P        | P
(1 row)
```
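If it is unclear which code point a character in the column actually has, for example after a client layer has altered it, the built-in ascii(), to_hex() and chr() functions can help. A minimal sketch, assuming a UTF8 database (where ascii() returns the Unicode code point of the first character):

```sql
-- Code point of a character as hex digits, usable in a U&'\XXXX' escape
-- once padded to four digits (50 becomes \0050).
SELECT to_hex(ascii('P'));   -- 50

-- And back: the character for a given code point (0x50 = 80).
SELECT chr(80);              -- P
```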