I have a query that is taking 9 minutes to run on PostgreSQL 9.0.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
I rewrote your query and assume this will be faster:
SELECT u.id AS id14_, u.first_name AS first2_14_, u.last_name AS last3_14_, u.street_1 AS street4_14_, u.street_2 AS street5_14_, u.city AS city14_, u.us_state_id AS us7_14_, u.region AS region14_, u.country_id AS country9_14_, u.postal_code AS postal10_14_, u.user_name AS user11_14_, u.password AS password14_, u.profession AS profession14_, u.phone AS phone14_, u.url AS url14_, u.bio AS bio14_, u.last_login AS last17_14_, u.status AS status14_, u.birthdate AS birthdate14_, u.ageinyears AS ageinyears14_, u.deleted AS deleted14_, u.createdate AS createdate14_, u.audit AS audit14_, u.migrated2008 AS migrated24_14_, u.creator AS creator14_
FROM   dir_users u
WHERE  u.status = 'active'
AND    u.deleted = FALSE
AND    EXISTS (
   SELECT 1
   FROM   dir_memberships m
   JOIN   dir_roles  r ON r.id = m.role
   JOIN   dir_groups g ON g.id = m.group_id
   WHERE  m.group_id = 15499
   AND    m.user_id = u.id
   AND   (m.expires IS NULL
       OR m.expires > now() AND (m.startdate IS NULL OR m.startdate < now()))
   AND    m.deleted = FALSE
   AND    r.deleted = FALSE
   AND    r.name = 'ROLE_MEMBER'
   AND    g.deleted = FALSE
   )
AND    EXISTS (
   SELECT 1
   FROM   dir_memberships m
   JOIN   dir_roles r ON r.id = m.role
   WHERE (m.expires IS NULL
       OR m.expires > now() AND (m.startdate IS NULL OR m.startdate < now()))
   AND    m.deleted = FALSE
   AND    m.user_id = u.id
   AND    r.name = 'ROLE_TEACHER_MEMBER'
   )
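The two EXISTS subqueries act as semi-joins: each user row is returned at most once, no matter how many memberships match. A minimal sketch of the difference, using hypothetical toy tables (demo_users, demo_memberships) that are not part of your schema:

```sql
-- Hypothetical toy tables, for illustration only.
CREATE TEMP TABLE demo_users (id int PRIMARY KEY, name text);
CREATE TEMP TABLE demo_memberships (user_id int, group_id int);

INSERT INTO demo_users VALUES (1, 'alice'), (2, 'bob');
-- alice has two matching memberships, bob has none.
INSERT INTO demo_memberships VALUES (1, 10), (1, 11);

-- Plain join: alice appears once per matching membership,
-- so DISTINCT would be needed to collapse the duplicates.
SELECT u.id, u.name
FROM   demo_users u
JOIN   demo_memberships m ON m.user_id = u.id;

-- Semi-join: alice appears once, no DISTINCT required.
SELECT u.id, u.name
FROM   demo_users u
WHERE  EXISTS (
   SELECT 1
   FROM   demo_memberships m
   WHERE  m.user_id = u.id
   );
```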
I replaced the case ... end = 1 expressions with simple expressions and rewrote the JOIN construct and the IN expression into two EXISTS semi-joins, which voids the necessity for DISTINCT. This should be quite a bit faster.

If this isn't fast enough yet, and your write performance can deal with more indexes, add this partial multi-column index:
CREATE INDEX dir_memberships_g_id_u_id_idx ON dir_memberships (group_id, user_id)
WHERE deleted = FALSE;
The WHERE conditions have to match your query for the index to be useful!
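A sketch of what "match" means here, assuming the partial index on (group_id, user_id) exists: the planner can only consider a partial index when the query's WHERE clause implies the index predicate. The user id below is a hypothetical placeholder, and whether the index is actually chosen still depends on your data and statistics.

```sql
-- Can use dir_memberships_g_id_u_id_idx: the query repeats the
-- index predicate (deleted = FALSE).
EXPLAIN
SELECT 1
FROM   dir_memberships m
WHERE  m.group_id = 15499
AND    m.user_id  = 1        -- hypothetical user id
AND    m.deleted  = FALSE;

-- Cannot use the partial index: nothing tells the planner that
-- only non-deleted rows are wanted.
EXPLAIN
SELECT 1
FROM   dir_memberships m
WHERE  m.group_id = 15499
AND    m.user_id  = 1;       -- hypothetical user id
```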
I assume that you already have primary keys and indexes on relevant foreign keys.
Further:
CREATE INDEX dir_memberships_u_id_role_idx ON dir_memberships (user_id, role)
WHERE deleted = FALSE;
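Why user_id a second time? A multi-column B-tree index is most effective when the query constrains its leading column. The second EXISTS filters on user_id without group_id, so the index on (group_id, user_id) serves it poorly, while an index that leads with user_id fits.
Also, since user_id is already used in another index, you are not blocking HOT updates, which are only possible when none of the updated columns is involved in any index.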
Why role? I assume both columns are of type integer (4 bytes). I have seen in your detailed question that you run a 64-bit OS where MAXALIGN is 8 bytes, so another integer will not make the index grow at all. I threw in role, which might be useful for the second EXISTS semi-join.
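One way to check the alignment argument on your own installation is to build both variants on a throwaway table and compare their sizes. This is a sketch with hypothetical names (demo_m and its two indexes), assuming both columns are integer. On a platform with 8-byte MAXALIGN I would expect the two sizes to come out the same, but verify on your system.

```sql
-- Hypothetical throwaway table, for illustration only.
CREATE TEMP TABLE demo_m AS
SELECT g AS user_id, g % 10 AS role
FROM   generate_series(1, 100000) AS g;

CREATE INDEX demo_m_u_idx   ON demo_m (user_id);
CREATE INDEX demo_m_u_r_idx ON demo_m (user_id, role);

-- Compare the on-disk size of the one-column and the two-column index.
SELECT pg_size_pretty(pg_relation_size('demo_m_u_idx'))   AS one_column,
       pg_size_pretty(pg_relation_size('demo_m_u_r_idx')) AS two_columns;
```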
If you have many "dead" users, this might also help:
CREATE INDEX dir_users_id_idx ON dir_users (id)
WHERE status = 'active' AND deleted = FALSE;
As always, check with EXPLAIN to see whether the indexes actually get used. You wouldn't want useless indexes consuming resources.
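A sketch of two ways to check, assuming the table names from your query: EXPLAIN ANALYZE shows the plan that was actually executed, and the statistics view pg_stat_user_indexes counts how often each index has been scanned since statistics were last reset. The query below is a simplified version of the first semi-join, not the full query.

```sql
-- Executed plan with timings and buffer usage (simplified query).
EXPLAIN (ANALYZE, BUFFERS)
SELECT u.id
FROM   dir_users u
WHERE  u.status = 'active'
AND    u.deleted = FALSE
AND    EXISTS (
   SELECT 1
   FROM   dir_memberships m
   WHERE  m.group_id = 15499
   AND    m.user_id = u.id
   AND    m.deleted = FALSE
   );

-- Indexes that still show idx_scan = 0 after a representative
-- workload are candidates for removal.
SELECT relname, indexrelname, idx_scan
FROM   pg_stat_user_indexes
WHERE  relname IN ('dir_memberships', 'dir_users')
ORDER  BY idx_scan;
```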
Are we fast yet?
Of course, all the usual advice for performance optimization applies, too.