How to search if whole word exists in a String in Postgres

后端 未结 3 1269
天命终不由人
天命终不由人 2021-01-25 14:46

I have a table with a column field that has values like Samsung Phone.

My question is how can I get this row if I search for a string \"Samsung\" or \"phone

相关标签:
3条回答
  • 2021-01-25 15:20

    Also you can use this:

    title ~* '(^|[^\w])samsung([^\w]|$)'
    

    The advantage of the above one is, it can be easily extended to include characters from different encoding like this: (Turkish characters)

    title ~* '(^|[^\wğüşıöçĞÜŞİÖÇ])samsung([^\wğüşıöçĞÜŞİÖÇ]|$)
    

    Here is a sample case where you may need the above extension.
    For example; in a Latin5 encoded database you have a value of 'İsamsung'. İ is the capital of i in Turkish.
    In this case the title ~* '(\msamsung\M)' does not work. This criteria returns the İsamsung value. Because in Latin5 encoding, postgre thinks İ is out of alpanumeric range and the value is something similiar to :samsung.

    0 讨论(0)
  • 2021-01-25 15:23

    You can use a regular expression for this:

    where title ~* '(\mphone\M)|(\msamsung\M)'
    

    The above only returns values where phone or samsung are complete words. The regex modifiers\m and \M make the pattern only match for whole words.

    The regex operator ~* makes this case-insensitive. The above expression would return Samsung Phone or Google Phone but not Sam's House.

    If you want to add more words, just add them using the "or" operator |

    where title ~* '(\mphone\M)|(\msamsung\M)|(\mbhat\M)'
    

    Note that this kind of search is not going t1o be super-fast. Regular expressions are expensive they cannot make use of any index.

    0 讨论(0)
  • 2021-01-25 15:25

    The following solutions are tested in PostgreSQL 9.6.

    You may use the \y word boundary and a grouping construct with alternation operator separating all possible alternatives:

    where title ~* '\y(?:Samsung|phone)\y'
    

    Or, case insensitively:

    where title ~* '\y(?:Samsung|phone)\y'
    

    See the PostgreSQL demo.

    Note that \y is not the right approach when the first or last char are not word chars. E.g. you want to search for #samsung or phone#. Then, consider using unambiguous word boundaries:

    where title ~* '(?<!\w)(?:#samsung|phone#)(?!\w)'
    

    Here, the (?<!\w) negative lookbehind fails the match if there is a non-alnum and no _ char immediately to the left of the current position, and the (?!\w) is a negative lookahead fails the match if there is a non-alnum and no _ char immediately to the right of the current position. Equivalent for testing if there is a match in where clause is:

    where title ~* '(\W|^)(?:#samsung|phone#)(\W|$)'
    

    See another PostgreSQL demo online:

    CREATE TABLE mmytable
        (title character varying)
    ;
    
    INSERT INTO mmytable
        (title)
    VALUES
        ('#Samsung Co.'),
        ('They have a phone#'),
        ('Uncle Sam phoned him')
    ;
    
    select * from mmytable where title ~* '(?<!\w)(?:#samsung|phone#)(?!\w)';
    

    Also, consider whitespace boundaries when you need to match only in between whitespace characters or start/end of string:

    where title ~* '(?<!\S)(?:#samsung|phone#)(?!\S)'
    where title ~* '(\s|^)(?:#samsung|phone#)(\s|$)'
    
    0 讨论(0)
提交回复
热议问题