问题
In my legacy database (Postgres 9.1) I have several tables containing divers kinds of documents (let's say they are 'parent' tables). Additionally, there is a table with various parameters for these documents:
create table params (
kind integer,
docid integer,
parname text,
parvalue text,
constraint params_pk primary key (kind, docid, parname));
There may be many (parname, parvalue) pairs for one document. As kind points to different tables it cannot be used as a foreign key.
It has been working well for years as params were used only for printing documents. Now this table contains 5 millions rows and the data is needed also for other purposes. So it is high time to renew this model.
Basically params are inserted once for a document and very seldom updated. They will be read as a whole (for a document). There is no need to search for a specific parname.
I have three ideas:
Variant A. Split table params into several tables according to parent tables and use docid as foreign key.
Variant B. Split table params as in variant A and store (parname, parvalue) as hstore.
Variant C. Add a hstore field in every parent table and forget additional tables.
I have no experience with hstore. What are the cons and pros of each variant? Which one would you choose? Can hstore baffle me with something strange?
回答1:
I vote for the third option. The fewer tables the better sleep.
Hstore was invented for one-level parameter lists. It is stable, fast and simple, and perfectly fits your needs. I had similar task some time ago. I wrote an aggregate for easier conversion.
create or replace function hstore_add(hstore, text, text)
returns hstore language plpgsql
as $$
begin
return case
when $1 isnull then hstore($2, $3)
else $1 || hstore($2, $3) end;
end $$;
create aggregate hstore_agg (text, text) (
sfunc = hstore_add,
stype = hstore
);
I think it may save your time.
select kind, docid, hstore_agg(parname, parvalue)
from params
group by 1, 2
order by 1, 2;
回答2:
If as you say you need to fetch the fields with the document then Denormalized hstore variant is better because the server will be able to fetch the entire document from a single location on disk instead of using several locations to index-join the document with the fields. The only problem I see with hstore is a somewhat unconventional syntax. Might be easier to work with JSON. PostgreSQL 9.4 will have an excellent support for (indexed) binary JSON. Using binary JSON is recommended by hstore authors, BTW.
So a plan might be to use a json
column in 9.3 and then convert it to jsonb
in 9.4.
来源:https://stackoverflow.com/questions/23614873/should-i-use-hstore-in-renewed-data-model