Faster search for records where 1st character of field doesn't match [A-Za-z]?

百般思念 提交于 2019-12-09 18:29:59

问题


I currently have the following:

User (id, fname, lname, deleted_at, guest)

I can query for a list of user's by their fname initial like so:

User Load (9.6ms)  SELECT "users".* FROM "users" WHERE (users.deleted_at IS NULL) AND (lower(left(fname, 1)) = 's') ORDER BY fname ASC LIMIT 25 OFFSET 0

This is fast thanks to the following index:

  CREATE INDEX users_multi_idx
  ON users (lower(left(fname, 1)), fname)
  WHERE deleted_at IS NULL;

What I want to do now is be able to query for all Users that do not start with the letter's A-Z. I got this to work like so:

SELECT "users".* FROM "users" WHERE (users.deleted_at IS NULL) AND (lower(left(fname, 1)) ~ E'^[^a-zA-Z].*') ORDER BY fname ASC LIMIT 25 OFFSET 0

But the problem is this query is very slow and does not appear to be using the index to speed up the first query. Any suggestions on how I can elegantly make the 2nd query (non a-z) faster?

I'm using Postgres 9.1 with rails 3.2

Thanks


回答1:


Updated answer
Preceding question here.

My first idea idea (index with text_pattern_ops) did not work with the regular expression in my tests. Better rewrite your query to:

SELECT *
FROM   users
WHERE  deleted_at IS NULL
WHERE lower(left(fname, 1)) < 'a' COLLATE "C"
OR    lower(left(fname, 1)) > 'z' COLLATE "C"
ORDER  BY fname
LIMIT  25 OFFSET 0;

Besides from these expressions being faster generally, your regular expression also had capital letters in it, which did not match the index with lower(). And the trailing characters were pointless while comparing to a single char.

And use this index:

CREATE INDEX users_multi_idx
ON users (lower(left(fname, 1)) COLLATE "C", fname)
WHERE deleted_at IS NULL;

The COLLATE "C" part is optional and only contributes a very minor gain in performance. It's purpose is to reset collation rules to default posix collation, which just uses byte order and is generally faster. Useful, where collation rules are not relevant anyway.

If you create the index with it, only queries that match the collation can use it. So you might just skip it to simplify things if performance is not your paramount requirement.




回答2:


As an alternative to @ErwinBrandstetter's general solution, PostgreSQL supports partial indexes. You can say:

CREATE INDEX users_nonalphanumeric_not_deleted_key
ON users (id)
WHERE (users.deleted_at IS NULL) AND (lower(left(fname, 1)) ~ E'^[^a-zA-Z].*');

This index won't help for any other lookups, but it will precompute the answer for this particular query. This technique is often useful for queries that return a small, predefined subset from a much larger table, since the resulting index will disregard the vast majority of the table and contain only the rows of interest.



来源:https://stackoverflow.com/questions/12906264/faster-search-for-records-where-1st-character-of-field-doesnt-match-a-za-z

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!