I have a table with a column field
that has values like Samsung Phone.
My question is how can I get this row if I search for a string \"Samsung\" or \"phone
Also you can use this:
title ~* '(^|[^\w])samsung([^\w]|$)'
The advantage of the above one is, it can be easily extended to include characters from different encoding like this: (Turkish characters)
title ~* '(^|[^\wğüşıöçĞÜŞİÖÇ])samsung([^\wğüşıöçĞÜŞİÖÇ]|$)
Here is a sample case where you may need the above extension.
For example; in a Latin5 encoded database you have a value of 'İsamsung'. İ is the capital of i in Turkish.
In this case the title ~* '(\msamsung\M)'
does not work.
This criteria returns the İsamsung value. Because in Latin5 encoding, postgre thinks İ is out of alpanumeric range and the value is something similiar to :samsung.
You can use a regular expression for this:
where title ~* '(\mphone\M)|(\msamsung\M)'
The above only returns values where phone
or samsung
are complete words. The regex modifiers\m
and \M
make the pattern only match for whole words.
The regex operator ~*
makes this case-insensitive. The above expression would return Samsung Phone
or Google Phone
but not Sam's House
.
If you want to add more words, just add them using the "or" operator |
where title ~* '(\mphone\M)|(\msamsung\M)|(\mbhat\M)'
Note that this kind of search is not going t1o be super-fast. Regular expressions are expensive they cannot make use of any index.
The following solutions are tested in PostgreSQL 9.6.
You may use the \y word boundary and a grouping construct with alternation operator separating all possible alternatives:
where title ~* '\y(?:Samsung|phone)\y'
Or, case insensitively:
where title ~* '\y(?:Samsung|phone)\y'
See the PostgreSQL demo.
Note that \y
is not the right approach when the first or last char are not word chars. E.g. you want to search for #samsung
or phone#
. Then, consider using unambiguous word boundaries:
where title ~* '(?<!\w)(?:#samsung|phone#)(?!\w)'
Here, the (?<!\w)
negative lookbehind fails the match if there is a non-alnum and no _
char immediately to the left of the current position, and the (?!\w)
is a negative lookahead fails the match if there is a non-alnum and no _
char immediately to the right of the current position. Equivalent for testing if there is a match in where
clause is:
where title ~* '(\W|^)(?:#samsung|phone#)(\W|$)'
See another PostgreSQL demo online:
CREATE TABLE mmytable
(title character varying)
;
INSERT INTO mmytable
(title)
VALUES
('#Samsung Co.'),
('They have a phone#'),
('Uncle Sam phoned him')
;
select * from mmytable where title ~* '(?<!\w)(?:#samsung|phone#)(?!\w)';
Also, consider whitespace boundaries when you need to match only in between whitespace characters or start/end of string:
where title ~* '(?<!\S)(?:#samsung|phone#)(?!\S)'
where title ~* '(\s|^)(?:#samsung|phone#)(\s|$)'