问题
Trying to do distinct on a mode with rails.
2.1.1 :450 > u.profiles.select("profiles.*").distinct
Profile Load (0.9ms) SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integration_profiles" ON "profiles"."id" = "integration_profiles"."profile_id" INNER JOIN "integrations" ON "integration_profiles"."integration_id" = "integrations"."id" WHERE "integrations"."user_id" = $1 [["user_id", 2]]
PG::UndefinedFunction: ERROR: could not identify an equality operator for type json
LINE 1: SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integ...
^
: SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integration_profiles" ON "profiles"."id" = "integration_profiles"."profile_id" INNER JOIN "integrations" ON "integration_profiles"."integration_id" = "integrations"."id" WHERE "integrations"."user_id" = $1
ActiveRecord::StatementInvalid: PG::UndefinedFunction: ERROR: could not identify an equality operator for type json
LINE 1: SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integ...
^
: SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integration_profiles" ON "profiles"."id" = "integration_profiles"."profile_id" INNER JOIN "integrations" ON "integration_profiles"."integration_id" = "integrations"."id" WHERE "integrations"."user_id" = $1
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/rack-mini-profiler-0.9.1/lib/patches/sql_patches.rb:109:in `prepare'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/rack-mini-profiler-0.9.1/lib/patches/sql_patches.rb:109:in `prepare'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:834:in `prepare_statement'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:795:in `exec_cache'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql/database_statements.rb:139:in `block in exec_query'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/abstract_adapter.rb:442:in `block in log'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activesupport-4.0.4/lib/active_support/notifications/instrumenter.rb:20:in `instrument'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/abstract_adapter.rb:437:in `log'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql/database_statements.rb:137:in `exec_query'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:908:in `select'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/abstract/database_statements.rb:32:in `select_all'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/abstract/query_cache.rb:63:in `select_all'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/querying.rb:36:in `find_by_sql'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/relation.rb:585:in `exec_queries'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/association_relation.rb:15:in `exec_queries'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/relation.rb:471:in `load'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/relation.rb:220:in `to_a'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/relation.rb:573:in `inspect'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/railties-4.0.4/lib/rails/commands/console.rb:90:in `start'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/railties-4.0.4/lib/rails/commands/console.rb:9:in `start'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/railties-4.0.4/lib/rails/commands.rb:62:in `<top (required)>'
from bin/rails:4:in `require'
from bin/rails:4:in `<main>'2.1.1 :451 >
Getting an error PG::UndefinedFunction: ERROR: could not identify an equality operator for type json
Converting to Hstore is not an option for me in this case. Any work arounds?
回答1:
The reason behind this, is that in PostgreSQL (up to 9.3) there is no equality operator defined for json
(i.e. val1::json = val2::json
will always throw this exception) -- in 9.4 there will be one for the jsonb
type.
One workaround is, you can cast your json
field to text
. But that won't cover all json equalitions. f.ex. {"a":1,"b":2}
should be equal to {"b":2,"a":1}
, but won't be equal if casted to text
.
Another workaround is (if you have a primary key for that table -- which should be) you can use the DISTINCT ON (<expressions>) form:
u.profiles.select("DISTINCT ON (profiles.id) profiles.*")
Note: One known caveat for DISTINCT ON
:
The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s). The ORDER BY clause will normally contain additional expression(s) that determine the desired precedence of rows within each DISTINCT ON group.
回答2:
Sorry I'm late on this answer, but it might help others.
As I understand your query, you're only getting possible duplicates on profiles
because of the many-to-many join to integrations
(which you're using to determine which profiles
to access).
Because of that, you can use a new GROUP BY
feature as of 9.1:
When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or if the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.
So in your case, you could get Ruby to create the query (sorry, I don't know the Ruby syntax you're using)...
SELECT profiles.*
FROM "profiles"
INNER JOIN "integration_profiles" ON "profiles"."id" = "integration_profiles"."profile_id"
INNER JOIN "integrations" ON "integration_profiles"."integration_id" = "integrations"."id"
WHERE "integrations"."user_id" = $1
GROUP BY "profiles"."id"
I only removed the DISTINCT
from your SELECT
clause and added the GROUP BY
.
By referring ONLY to the id
in the GROUP BY
, you take advantage of that new feature because all the remaining profiles
columns are "functionally dependent" on that id primary key.
Somehow, wonderfully that avoids the need for Postgres to do equality checks on the dependent columns (ie your json
column in this case).
The DISTINCT ON
solution is also great, and clearly sufficient in your case, but you can't use aggregate functions like array_agg
with it. You CAN with this GROUP BY
approach. Happy days! :)
回答3:
If you use PG 9.4 , using JSONB rather than JSON solves this problem Example :
-- JSON datatype test
create table t1 (id int, val json);
insert into t1 (id,val) values (1,'{"name":"value"}');
insert into t1 (id,val) values (1,'{"name":"value"}');
insert into t1 (id,val) values (2,'{"key":"value"}');
select * from t1 order by id;
select distinct * from t1 order by id;
-- JSONB datatype test
create table t2 (id int, val jsonb);
insert into t2 (id,val) values (1,'{"name":"value"}');
insert into t2 (id,val) values (1,'{"name":"value"}');
insert into t2 (id,val) values (2,'{"key":"value"}');
select * from t2 order by id;
select distinct * from t2 order by id;
Result of running the above script :
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
1 | {"name":"value"}
1 | {"name":"value"}
2 | {"key":"value"}
ERROR: could not identify an equality operator for type json
LINE 1: select distinct * from t1 order by id;
^
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
1 | {"name": "value"}
1 | {"name": "value"}
2 | {"key": "value"}
1 | {"name": "value"}
2 | {"key": "value"}
As you can see PG succeeded to imply DISTINCT on a JSONB column while it fails on a JSON column !
Try also the following to see that actually keys in the JSONB are sorted :
insert into t2 values (3, '{"a":"1", "b":"2"}');
insert into t2 values (3, '{"b":"2", "a":"1"}');
select * from t2;
1 | {"name": "value"}
1 | {"name": "value"}
2 | {"key": "value"}
3 | {"a": "1", "b": "2"}
3 | {"a": "1", "b": "2"}
note that '{"b":"2", "a":"1"}' was inserted as '{"a":"1", "b":"2"}' therefor PG identifies that as the same record :
select distinct * from t2;
3 | {"a": "1", "b": "2"}
2 | {"key": "value"}
1 | {"name": "value"}
回答4:
Yeah, unfortunately postgres json
doesn't implement equality, but jsonb
does. So migrating json
columns to jsonb
and it should work okay.
来源:https://stackoverflow.com/questions/23509740/distinct-on-postgresql-json-data-column