unaccent() preventing index usage in Postgres

后端 未结 1 827
傲寒
傲寒 2021-01-12 09:08

I want to retrieve a way with a given name from an OpenStreetMap database imported into PostgreSQL 9.3.5, the OS is Win7 64-bit. In order to be a bit failure tolerant, I use

相关标签:
1条回答
  • 2021-01-12 10:01

    IMMUTABLE variant of unaccent()

    To clarify the misinformation in the currently accepted, incorrect answer:
    Expression indexes only allow IMMUTABLE functions (for obvious reasons) and unaccent() is only STABLE. The solution you suggested in the the comment is also problematic. Detailed explanation and a proper solution for that:

    • Does PostgreSQL support "accent insensitive" collations?

    Depending on the content of tags->name it may be useful to add unaccent() to the expression index, but that's orthogonal to the question why the index wasn't being used:

    • PostgreSQL accent + case insensitive search

    Actual problem / solution

    The operator LIKE in your query is subtly wrong (most likely). You do not want to interpret 'Weststrasse' as search pattern, you want to match the (normalized) string as is. Replace with the = operator, and you will see a (bitmap) index scan with your current index, irregardless of the function volatility of unaccent():

    SELECT * FROM germany.ways
    WHERE lower(tags->'name') = lower(unaccent('unaccent','Weststrasse'))

    Why?

    The right operand of LIKE is a pattern. Postgres cannot use a plain btree index for pattern matching (exceptions apply). A LIKE with a plain string as pattern (no special characters) can be optimized with an equality check on the btree index. But if there are special characters in the string, this index is out.

    If there is an IMMUTABLE function to the right of LIKE, it can be evaluated immediately and the said optimisation is still possible. Per documentation on Function Volatility Categories:

    IMMUTABLE ...
    This category allows the optimizer to pre-evaluate the function when a query calls it with constant arguments.

    The same is not possible with a lesser function volatility (STABLE or VOLATILE). That's why your "solution" of faking an IMMUTABLE unaccent() seemed to work, but it's really putting lipstick on a pig.

    To reiterate:

    • If you want to work with LIKE and patterns, use a trigram index.
    • If you don't want to work with LIKE and patterns, use the equality operator =
    0 讨论(0)
提交回复
热议问题