Explanation of JSONB introduced by PostgreSQL

为君一笑 提交于 2019-11-26 04:03:35

问题


PostgreSQL just introduced JSONB and it\'s already trending on hacker news. It would be great if someone could explain how it\'s different from Hstore and JSON previously present in PostgreSQL. What are it\'s advantages and limitations and when should someone consider using it?


回答1:


First, hstore is a contrib module, which only allows you to store key => value pairs, where keys and values can only be texts (however values can be sql NULLs too).

Both json & jsonb allows you to store a valid JSON value (defined in its spec).

F.ex. these are valid JSON representations: null, true, [1,false,"string",{"foo":"bar"}], {"foo":"bar","baz":[null]} - hstore is just a little subset compared to what JSON is capable (but if you only need this subset, it's fine).

The only difference between json & jsonb is their storage:

  • json is stored in its plain text format, while
  • jsonb is stored in some binary representation

There are 3 major consequences of this:

  • jsonb usually takes more disk space to store than json (sometimes not)
  • jsonb takes more time to build from its input representation than json
  • json operations take significantly more time than jsonb (& parsing also needs to be done each time you do some operation at a json typed value)

When jsonb will be available with a stable release, there will be two major use cases, when you can easily select between them:

  1. If you only work with the JSON representation in your application, PostgreSQL is only used to store & retrieve this representation, you should use json.
  2. If you do a lot of operations on the JSON value in PostgreSQL, or use indexing on some JSON field, you should use jsonb.



回答2:


Peeyush:

The short answer is:

  • If you are doing a lot of JSON manipulation inside PostgreSQL, such as sorting, slicing, splicing, etc., you should use JSONB for speed reasons.
  • If you need indexed lookups for arbitrary key searches on JSON, then you should use JSONB.
  • If you are doing neither of the above, you should probably use JSON.
  • If you need to preserve key ordering, whitespace, and duplicate keys, you should use JSON.

For a longer answer, you'll need to wait for me to do a full "HowTo" writeup closer to the 9.4 release.




回答3:


A simple explanation of the difference between json and jsonb (original image by PostgresProfessional):

SELECT '{"c":0,   "a":2,"a":1}'::json, '{"c":0,   "a":2,"a":1}'::jsonb;

          json          |        jsonb 
------------------------+--------------------- 
 {"c":0,   "a":2,"a":1} | {"a": 1, "c": 0} 
(1 row)
  • json: textual storage «as is»
  • jsonb: no whitespaces
  • jsonb: no duplicate keys, last key win
  • jsonb: keys are sorted

More in speech video and slide show presentation by jsonb developers. Also they introduced JsQuery, pg.extension provides powerful jsonb query language




回答4:


  • hstore is more of a "wide column" storage type, it is a flat (non-nested) dictionary of key-value pairs, always stored in a reasonably efficient binary format (a hash table, hence the name).
  • json stores JSON documents as text, performing validation when the documents are stored, and parsing them on output if needed (i.e. accessing individual fields); it should support the entire JSON spec. Since the entire JSON text is stored, its formatting is preserved.
  • jsonb takes shortcuts for performance reasons: JSON data is parsed on input and stored in binary format, key orderings in dictionaries are not maintained, and neither are duplicate keys. Accessing individual elements in the JSONB field is fast as it doesn't require parsing the JSON text all the time. On output, JSON data is reconstructed and initial formatting is lost.

IMO, there is no significant reason for not using jsonb once it is available, if you are working with machine-readable data.




回答5:


JSONB is a "better" version of JSON.

Let's look at an example:

SELECT '{"c":0,   "a":2,"a":1}'::json, '{"c":0,   "a":2,"a":1}'::jsonb;
          json          |        jsonb 
------------------------+--------------------- 
 {"c":0,   "a":2,"a":1} | {"a": 1, "c": 0} 
(1 row)
  1. JSON stores white space, they is why we can see spaces when key "a" is stored, while JSONB does not.
  2. JSON stores all the values of key. This is the reason you can see multiple values (2 and 1) against the key "a" , while JSONB only "stores" the last value.
  3. JSON maintains the order in which elements are inserted, while JSONB maintains the "sorted" order.
  4. JSONB objects are stored as decompressed binary as opposed to "raw data" in JSON , where no reparsing of data is required during retrieval.
  5. JSONB also supports indexing, which can be a significant advantage.

In general, one should prefer JSONB , unless there are specialized needs, such as legacy assumptions about ordering of object keys.




回答6:


I was at the pgopen today benchmarks are way faster than mongodb, I believe it was around 500% faster for selects. Pretty much everything was faster at least by at 200% when contrasted with mongodb, than one exception right now is a update which requires completely rewriting the entire json column something mongodb handles better.

The gin indexing on on jsonb sounds amazing.

Also postgres will persist types of jsonb internally and basically match this with types such as numeric, text, boolean etc.

Joins will also be possible using jsonb

Add PLv8 for stored procedures and this will basically be a dream come true for node.js developers.

Being it's stored as binary jsonb will also strip all whitespace, change the ordering of properties and remove duplicate properties using the last occurance of the property.

Besides the index when querying against a jsonb column contrasted to a json column postgres doesn't have to actually run the functionality to convert the text to json on every row which will likely save a vast amount of time alone.




回答7:


As far as I can tell,

  • hstore as it currently exists (in Postgresql 9.3) does not allow for nesting other objects and arrays as the values of its key/value pairs. however a future hstore patch will allow for nesting. this patch will not be in the 9.4 release and may not be included any time soon.

  • json as it currently exists does allow for nesting, but is text-based, and does not allow for indexing, thus it is "slow"

  • jsonb that will be released with 9.4 will have the current nesting capabilities of json, as well as the GIN/GIST indexing of hstore, so it will be fast

People working on postgresql 9.4 seem to be saying that the new, fast jsonb type will appeal to people who would have chosen to use a noSQL data store like MongoDB, but can now combine a relational database with query-able unstructured data under one roof

http://www.databasesoup.com/2014/02/why-hstore2jsonb-is-most-important.html

Benchmarks of postgresql 9.4 jsonb seem to be on par with or in some cases faster than MongoDB

http://texture.io/alphabetum/postgresql-incl-hstore-vs-mongodb




回答8:


Another important difference, that wasn't mentioned in any answer above, is that there is no equality operator for json type, but there is one for jsonb.

This means that you can't use DISTINCT keyword when selecting this json-type and/or other fields from a table (you can use DISTINCT ON instead, but it's not always possible because of cases like this).



来源:https://stackoverflow.com/questions/22654170/explanation-of-jsonb-introduced-by-postgresql

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!