I have three tables in my app, call them tableA, tableB, and tableC. tableA has fields for tableB_id and
The other thing is that you query your indexed columns through lower(). A plain index on the column cannot be used for a predicate on lower(column_name), so the query may fall back to scanning the whole table. If you will always query the column as lower(), then it should be indexed as lower(column_name), as in:
create index idx_1 on tableb(lower(foo));
Also, have you looked at the execution plan? This will answer all your questions if you can see how it is querying the tables.
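To look at the plan, something like the following could be run in psql. This is only a sketch: the table and column names are the ones used in this thread, 'some value' is a placeholder for the real input, and keep in mind that EXPLAIN with the ANALYZE option actually executes the query.

```sql
-- Sketch: show the plan the planner chose, with real timings and buffer usage.
-- 'some value' is a placeholder; the ANALYZE option makes Postgres run the query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM   tableA a
JOIN   tableB b ON b.id = a.tableB_id
WHERE  lower(b.foo) = lower('some value');
```

In the output, a "Seq Scan" on tableB with a filter on lower(foo) suggests the expression index is not being used, while an "Index Scan" or "Bitmap Index Scan" that names your index means it is.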
Honestly, there are many factors to this. The best solution is to study up on INDEXES, specifically in Postgres, so you can see how they work. It is a bit of a holistic subject: you can't really solve all your problems with a minimal understanding of how they work.
For instance, Postgres has an initial "let's look at these tables and see how we should query them" step (the query planner) before the query runs. It looks at all the tables involved, how big each one is, what indexes exist, etc., and then figures out how the query should run. THEN it executes it. Oftentimes this is where things go wrong: the planner picks a poor way to execute the query.
A lot of these calculations are based on the summarized table statistics. You can refresh those statistics for any table by doing:
vacuum [table_name];
(this helps to prevent bloating from dead rows)
and then:
analyze [table_name];
I haven't always seen this work, but it often helps.
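The two steps can also be combined, and you can check whether autovacuum has already done the work for you. A minimal sketch, assuming the table is stored under the lower-case name tableb (adjust the name to whatever \d shows):

```sql
-- Sketch: reclaim dead rows and refresh planner statistics in one pass.
-- tableb is a stand-in for whichever table you are tuning.
VACUUM (ANALYZE, VERBOSE) tableb;

-- Check when the table was last vacuumed and analyzed, manually or by autovacuum,
-- and roughly how many live and dead rows Postgres thinks it has.
SELECT relname,
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze,
       n_live_tup, n_dead_tup
FROM   pg_stat_user_tables
WHERE  relname = 'tableb';
```

If last_analyze and last_autoanalyze are both recent, stale statistics are probably not the cause and the plan itself deserves a closer look.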
Anyway, your best bet is to:
a) Study up on Postgres indexes (a SIMPLE write-up, not something ridiculously complex)
b) Study the execution plan of the query
c) Using your understanding of Postgres indexes and how the query plan is executing, you cannot help but solve the exact problem.
For starters, your LEFT JOIN is counteracted by the WHERE predicate on the left-joined table (tableB) and is forced to act like an [INNER] JOIN. Replace with:
SELECT *
FROM tableA a
JOIN tableB b ON b.id = a.tableB_id
WHERE lower(b.foo) = lower(my_input);
Or, if you actually want the LEFT JOIN to include all rows from tableA:
SELECT *
FROM tableA a
LEFT JOIN tableB b ON b.id = a.tableB_id
AND lower(b.foo) = lower(my_input);
I think you want the first one.
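To see the difference between the two forms without touching real tables, here is a small self-contained sketch. The rows are made up purely for illustration; a and b stand in for tableA and tableB.

```sql
-- Predicate in WHERE: rows without a matching b are filtered out,
-- so the LEFT JOIN behaves like an inner join.
WITH a(id, tableb_id) AS (VALUES (1, 10), (2, 20), (3, NULL)),
     b(id, foo)       AS (VALUES (10, 'Bar'), (20, 'Baz'))
SELECT a.id, b.foo
FROM   a
LEFT   JOIN b ON b.id = a.tableb_id
WHERE  lower(b.foo) = lower('BAR');
-- returns 1 row: (1, Bar)

-- Predicate in the join condition: every row of a is kept,
-- and b.foo is NULL where nothing qualifies.
WITH a(id, tableb_id) AS (VALUES (1, 10), (2, 20), (3, NULL)),
     b(id, foo)       AS (VALUES (10, 'Bar'), (20, 'Baz'))
SELECT a.id, b.foo
FROM   a
LEFT   JOIN b ON b.id = a.tableb_id
           AND lower(b.foo) = lower('BAR');
-- returns 3 rows: (1, Bar), (2, NULL), (3, NULL)
```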
An index on (lower(foo::text)) like you posted is syntactically invalid. You had better post the verbatim output from \d tbl in psql, like I commented repeatedly. A shorthand syntax for a cast (foo::text) in an index definition needs more parentheses, or use the standard syntax, cast(foo AS text):
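A minimal sketch of the two spellings, assuming a hypothetical table tbl with a column foo (the index names are made up for illustration):

```sql
-- Shorthand cast used directly as an index expression:
-- it must be wrapped in an extra pair of parentheses.
CREATE INDEX tbl_foo_text_idx ON tbl ((foo::text));

-- Standard SQL cast syntax: no extra parentheses needed.
CREATE INDEX tbl_foo_text_idx2 ON tbl (cast(foo AS text));
```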
But that's also unnecessary. You can just use the data type (character varying(255)) of foo. Of course, the data type character varying(255) rarely makes sense in Postgres to begin with. The odd limitation to 255 characters is derived from limitations in other RDBMS which do not apply in Postgres. Details:
Be that as it may. The perfect index for this kind of query would be a multicolumn index on B - if (and only if) you get index-only scans out of this:
CREATE INDEX "tableB_lower_foo_id" ON tableB (lower(foo), id);
You can then drop the mostly superseded index "index_tableB_on_lower_foo". Same for tableC.
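Whether you actually get index-only scans is something to verify rather than assume. A rough sketch of one way to check, using the names from this thread and a placeholder value:

```sql
-- Sketch: ask only for columns that are present in the multicolumn index
-- and see which scan type the planner picks. 'some value' is a placeholder.
EXPLAIN (ANALYZE, BUFFERS)
SELECT b.id
FROM   tableB b
WHERE  lower(b.foo) = lower('some value');
```

Look for an "Index Only Scan" node and its "Heap Fetches" line in the output. If the plan shows a plain "Index Scan" instead, the extra id column is not buying you anything and the single-column index on lower(foo) is enough. A recent VACUUM on the table also matters here, because index-only scans rely on the visibility map being up to date.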
The rest is covered by the (more important!) indices in table A on tableB_id and tableC_id.
If there are multiple rows in tableA per tableB_id / tableC_id, then either one of these competing commands can swing the performance to favor the respective query by physically clustering related rows together:
CLUSTER tableA USING "index_tableA_on_tableB_id";
CLUSTER tableA USING "index_tableA_on_tableC_id";
You can't have both. It's either B or C. CLUSTER also does everything a VACUUM FULL would do. But be sure to read the details first:
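Two things worth knowing up front: CLUSTER takes an exclusive lock on the table while it rewrites it, and it is a one-time operation, so rows written later are not kept in index order. A sketch of how a run might look, assuming the table and its columns are stored under the lower-case names tablea, tableb_id and tablec_id:

```sql
-- Sketch: rewrite tableA in the order of the chosen index (blocks other access
-- to the table while it runs), then refresh the planner statistics.
CLUSTER tableA USING "index_tableA_on_tableB_id";
ANALYZE tableA;

-- Check how well the physical row order now follows each column.
-- Values near 1 or -1 mean the table is well ordered on that column.
SELECT attname, correlation
FROM   pg_stats
WHERE  tablename = 'tablea'
AND    attname IN ('tableb_id', 'tablec_id');
```

After clustering on tableB_id you would expect its correlation to be close to 1, while the one for tableC_id typically stays wherever it was.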
And don't use mixed case identifiers, sometimes quoted, sometimes not. This is very confusing and is bound to lead to errors. Use legal, lower-case identifiers exclusively - then it doesn't matter if you double-quote them or not.
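A short sketch of why this bites, using hypothetical table names that are not part of your schema:

```sql
-- Created with double quotes: the mixed-case name is stored exactly as written.
CREATE TABLE "DemoTable" (id int);

SELECT * FROM "DemoTable";  -- works
SELECT * FROM DemoTable;    -- fails: unquoted names are folded to lower case,
                            -- and a relation "demotable" does not exist

-- Created as a plain lower-case name: quoting no longer matters.
CREATE TABLE demo_table (id int);

SELECT * FROM demo_table;    -- works
SELECT * FROM "demo_table";  -- works
SELECT * FROM Demo_Table;    -- works, folded to demo_table
```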