How to check for BOM in postgres text columns?

Submitted by 房东的猫 on 2019-12-25 01:47:14

Question


We have some encoding issues and I need to check whether a BOM is already present in a PostgreSQL text column. I used

select convert(varbinary, columnXY) from tableXY where id = 1;

successfully in MS SQL, but I can't find an equivalent conversion for PostgreSQL. I found this documentation and tried decode(columnXY, 'hex'), but that does not work.


Answer 1:


You can inspect the binary representation of the TEXT column by converting it to BYTEA and searching it for the BOM as a sequence of bytes. Do not use a direct cast for the conversion; use convert_to(your_text_column,'UTF-8') instead.

As an SQL expression:

position('\xefbbbf'::bytea IN convert_to(your_text_column,'UTF-8'))=1

A result of 0 from position(...) means the BOM byte sequence does not occur in the string.
A result of 1 means it is at the very beginning of the string, which is the only place where it counts as a BOM.
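
The expression above can be dropped into ordinary queries. The following is a minimal sketch, not a definitive recipe: it reuses the tableXY, columnXY and id names from the question, and it assumes standard_conforming_strings is on (the default in current PostgreSQL versions) so that '\xefbbbf' is read as a hex bytea literal. The first query is roughly the counterpart of the MS SQL convert(varbinary, ...) call, since encode(..., 'hex') prints the raw bytes; decode(columnXY, 'hex') goes the other way and tries to parse the text itself as hex digits, which is why it fails on ordinary text.

```sql
-- Hex dump of the UTF-8 bytes of one value.
-- A value with a BOM starts with "efbbbf".
SELECT encode(convert_to(columnXY, 'UTF-8'), 'hex') AS hex_bytes
FROM tableXY
WHERE id = 1;

-- List all rows whose value starts with a UTF-8 BOM (bytes EF BB BF).
SELECT id
FROM tableXY
WHERE position('\xefbbbf'::bytea IN convert_to(columnXY, 'UTF-8')) = 1;
```

On a large table the second query has to convert every row, so it is better suited to a one-off check than to a frequently run filter.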



Source: https://stackoverflow.com/questions/23340643/how-to-check-for-bom-in-postgres-text-columns
