I have the following SQLite table with 198,305 geocoded Portuguese postal codes:
CREATE TABLE "pt_postal" (
"code" text NOT N
Basically, I was using sprintf() to see what kind of bounding coordinates were being computed, and since I couldn't run the query anywhere other than PHP (because of the UDF), I was generating a separate query with prepared statements. The problem was that I wasn't binding the last parameter (the kilometers in the distance <= ? clause), and I was fooled by my sprintf() version.
Guess I shouldn't try to code when I'm sleepy. I'm truly sorry for your wasted time, and thank you all!
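For anyone hitting the same thing: the PHP code isn't shown here, so this is only an illustration in Python of how a forgotten last binding can be caught early. Python's sqlite3 module refuses to run a statement when the number of supplied values doesn't match the number of ? placeholders. The table t and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in table, not the real pt_postal schema.
conn.execute("CREATE TABLE t (lat REAL, dist REAL)")
conn.execute("INSERT INTO t VALUES (38.73, 0.5)")

sql = "SELECT lat FROM t WHERE lat BETWEEN ? AND ? AND dist <= ?"
params_ok = (38.72, 38.74, 1)
params_short = (38.72, 38.74)  # the last value (the kilometers) was forgotten

# All three placeholders bound: the row is returned.
rows = conn.execute(sql, params_ok).fetchall()

# One placeholder left unbound: sqlite3 raises instead of running the query.
try:
    conn.execute(sql, params_short)
    error = None
except sqlite3.ProgrammingError as exc:
    error = exc

print(rows)
print(error)
```

The point is just that the executed statement, not a sprintf() rendering of it, is what should be checked.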
Just for the sake of completeness, the following returns (correctly!) 873 records, in ~ 0.04 seconds:
SELECT "code",
geo(38.73311, -9.138707, "geo_latitude", "geo_longitude") AS "distance"
FROM "pt_postal" WHERE 1 = 1
AND "geo_latitude" BETWEEN 38.7241268076 AND 38.7420931924
AND "geo_longitude" BETWEEN -9.15022289523 AND -9.12719110477
AND "distance" <= 1
ORDER BY "distance" ASC
LIMIT 2048;
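The BETWEEN bounds in the query above look consistent with a simple "degrees per kilometer" bounding box. A minimal sketch in Python, assuming roughly 111.319 km per degree of latitude and a cos(latitude) correction for longitude (the constant and the function name bounding_box are my assumptions, not taken from the original PHP code):

```python
import math

# Assumption: approximate length of one degree of latitude, in kilometers.
KM_PER_DEGREE = 111.319

def bounding_box(lat, lon, radius_km):
    """Return (lat_min, lat_max, lon_min, lon_max) around a point."""
    dlat = radius_km / KM_PER_DEGREE
    # A degree of longitude shrinks by cos(latitude) away from the equator.
    dlon = radius_km / (KM_PER_DEGREE * math.cos(math.radians(lat)))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon

lat_min, lat_max, lon_min, lon_max = bounding_box(38.73311, -9.138707, 1)
print(lat_min, lat_max, lon_min, lon_max)
```

The box is only a cheap pre-filter: points in its corners are farther than the radius, which is why the distance <= 1 condition is still needed.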
I cannot tell from the documentation whether sqliteCreateFunction defines an aggregate, like SUM, or a scalar, like sqrt. Aggregate functions cannot be referenced in a WHERE clause; HAVING is required.
Per the SQLite UDF documentation, you need to know whether only xFunc is populated, or whether xStep and xFinal are. Those are the pointers SQLite uses to determine the kind of function you're defining, and thus whether or not to honor it in a WHERE clause.
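To make the scalar/aggregate distinction concrete, here is a small sketch using Python's sqlite3 module rather than PHP: create_function registers a scalar, while create_aggregate registers a class with step and finalize methods. The names twice, total_of and the table t are hypothetical.

```python
import sqlite3

def twice(x):
    """Scalar UDF: called once per row, returns one value."""
    return 2 * x

class Total:
    """Aggregate UDF: step() runs per row, finalize() once per group."""
    def __init__(self):
        self.total = 0.0
    def step(self, value):
        self.total += value
    def finalize(self):
        return self.total

conn = sqlite3.connect(":memory:")
conn.create_function("twice", 1, twice)      # scalar
conn.create_aggregate("total_of", 1, Total)  # aggregate
conn.execute("CREATE TABLE t (x REAL)")
conn.executemany("INSERT INTO t VALUES (?)", [(1.0,), (2.0,), (3.0,)])

# A scalar UDF is accepted in WHERE.
scalar_rows = conn.execute(
    "SELECT x FROM t WHERE twice(x) <= 4 ORDER BY x").fetchall()

# An aggregate UDF in WHERE is rejected by SQLite.
try:
    conn.execute("SELECT x FROM t WHERE total_of(x) <= 4").fetchall()
    aggregate_error = None
except sqlite3.Error as exc:
    aggregate_error = str(exc)

# The same aggregate is fine in HAVING.
having_rows = conn.execute(
    "SELECT x FROM t GROUP BY x HAVING total_of(x) <= 2 ORDER BY x").fetchall()

print(scalar_rows, aggregate_error, having_rows)
```

If the UDF in question is registered the scalar way, filtering on it in WHERE should be allowed.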
This also returns 873 records, ordered by distance, in ~0.04 seconds:
SELECT
"code",
geo(38.73311, -9.138707, "geo_latitude", "geo_longitude") AS "distance"
FROM "pt_postal" WHERE 1 = 1
AND "geo_latitude" BETWEEN 38.7241268076 AND 38.7420931924
AND "geo_longitude" BETWEEN -9.15022289523 AND -9.12719110477
GROUP BY "code"
HAVING "distance" <= 1
ORDER BY "distance" ASC
LIMIT 2048;
The reason this page doesn't have a GROUP BY clause is MySQL-specific:
A HAVING clause can refer to any column or alias named in a select_expr in the SELECT list or in outer subqueries, and to aggregate functions. However, the SQL standard requires that HAVING must reference only columns in the GROUP BY clause or columns used in aggregate functions. To accommodate both standard SQL and the MySQL-specific behavior of being able to refer columns in the SELECT list, MySQL 5.0.2 and up permit HAVING to refer to columns in the SELECT list, columns in the GROUP BY clause, columns in outer subqueries, and to aggregate functions.
If no primary / unique key is available, the following hack also works (albeit a bit slower - ~0.16 seconds):
SELECT
"code",
geo(38.73311, -9.138707, "geo_latitude", "geo_longitude") AS "distance"
FROM "pt_postal" WHERE 1 = 1
AND "geo_latitude" BETWEEN 38.7241268076 AND 38.7420931924
AND "geo_longitude" BETWEEN -9.15022289523 AND -9.12719110477
GROUP BY _ROWID_
HAVING "distance" <= 1
ORDER BY "distance" ASC
LIMIT 2048;
This query (provided by @OMGPonies):
SELECT *
FROM (
SELECT
"code",
geo(38.73311, -9.138707, "geo_latitude", "geo_longitude") AS "distance"
FROM "pt_postal" WHERE 1 = 1
AND "geo_latitude" BETWEEN 38.7241268076 AND 38.7420931924
AND "geo_longitude" BETWEEN -9.15022289523 AND -9.12719110477
)
WHERE "distance" <= 1
ORDER BY "distance" ASC
LIMIT 2048;
correctly returns the 873 records, ordered by distance, in ~0.07 seconds.
However, I'm still wondering why SQLite doesn't evaluate geo() in the WHERE clause, like MySQL...
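For comparison, the three query shapes discussed here (alias in WHERE, GROUP BY with HAVING, and a subquery) can be run side by side against a tiny in-memory table. This is only a sketch in Python: the geo() below is a haversine stand-in for the original PHP UDF, which isn't shown, and the table columns and the four sample rows are invented for the test.

```python
import math
import sqlite3

def geo(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (haversine); a stand-in for the PHP UDF."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

conn = sqlite3.connect(":memory:")
conn.create_function("geo", 4, geo)  # registered as a scalar function
# Hypothetical schema: only the columns used by the queries.
conn.execute('CREATE TABLE "pt_postal" ("code" text PRIMARY KEY, '
             '"geo_latitude" real, "geo_longitude" real)')
conn.executemany('INSERT INTO "pt_postal" VALUES (?, ?, ?)', [
    ("A", 38.73311, -9.138707),  # the reference point itself
    ("B", 38.7380, -9.1387),     # roughly 0.5 km away
    ("C", 38.7415, -9.1280),     # inside the box, but more than 1 km away
    ("D", 38.80, -9.10),         # outside the bounding box
])

BASE = '''SELECT "code",
  geo(38.73311, -9.138707, "geo_latitude", "geo_longitude") AS "distance"
FROM "pt_postal" WHERE 1 = 1
  AND "geo_latitude" BETWEEN 38.7241268076 AND 38.7420931924
  AND "geo_longitude" BETWEEN -9.15022289523 AND -9.12719110477'''

where_sql = BASE + ' AND "distance" <= 1 ORDER BY "distance" ASC'
having_sql = (BASE + ' GROUP BY "code" HAVING "distance" <= 1'
              ' ORDER BY "distance" ASC')
subquery_sql = ('SELECT * FROM (' + BASE + ') WHERE "distance" <= 1'
                ' ORDER BY "distance" ASC')

rows_where = conn.execute(where_sql).fetchall()
rows_having = conn.execute(having_sql).fetchall()
rows_subquery = conn.execute(subquery_sql).fetchall()
print([code for code, _ in rows_where])
```

With a scalar UDF all three forms select the same rows here, which matches the update at the top: the WHERE version works once every parameter is actually bound.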