Does JSONB make PostgreSQL arrays useless?

Posted by 扶醉桌前 on 2019-12-03 10:58:37
Erwin Brandstetter

In most cases I would use a normalized schema with a table option_tag implementing the many-to-many relationship between the tables option and tag. Reference implementation here:

It may not be the fastest option in every respect, but it offers the full range of DB functionality, including referential integrity, constraints, the full range of data types, all index options and cheap updates.
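To make the normalized layout concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in so that it runs without a server; the PostgreSQL DDL would look much the same. The column names and the helper functions (make_db, add_tag, tags_for) are assumptions for illustration, not the linked reference implementation.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a normalized option/tag schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE option (
            option_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        );
        CREATE TABLE tag (
            tag_id INTEGER PRIMARY KEY,
            name   TEXT NOT NULL UNIQUE
        );
        -- Junction table implementing the many-to-many relationship.
        CREATE TABLE option_tag (
            option_id INTEGER NOT NULL
                REFERENCES option (option_id) ON DELETE CASCADE,
            tag_id    INTEGER NOT NULL
                REFERENCES tag (tag_id) ON DELETE CASCADE,
            PRIMARY KEY (option_id, tag_id)
        );
    """)
    return conn


def add_tag(conn, option_id, tag_name):
    """Attach a tag to an option, creating the tag row if it is new."""
    conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (tag_name,))
    # OR IGNORE skips duplicate pairs; foreign key violations still raise.
    conn.execute(
        "INSERT OR IGNORE INTO option_tag (option_id, tag_id) "
        "SELECT ?, tag_id FROM tag WHERE name = ?",
        (option_id, tag_name),
    )


def tags_for(conn, option_id):
    """Return the tag names of one option in alphabetical order."""
    rows = conn.execute(
        "SELECT t.name FROM option_tag ot "
        "JOIN tag t ON t.tag_id = ot.tag_id "
        "WHERE ot.option_id = ? ORDER BY t.name",
        (option_id,),
    )
    return [name for (name,) in rows]
```

The primary key on (option_id, tag_id) rules out duplicate pairs, and the foreign keys reject tags attached to an option that does not exist, which is the kind of integrity an array or JSON column cannot declare.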

For completeness, add to your list of options:

  • hstore (good option)
  • xml (more verbose and more complex than either hstore or jsonb, so I would only use it when operating with XML)
  • "string of comma-separated values" (very simple, mostly bad option)
  • EAV (Entity-Attribute-Value) or "name-value pairs" (mostly bad option)
    Details under this related question on dba.SE:

If the list is just for display and rarely updated, I would consider a plain array, which is typically smaller and performs better for this than the rest.

Read the blog entry by Josh Berkus that @a_horse linked to in his comment. But be aware that it focuses on selected read cases. Josh concedes:

I realize that I did not test comparative write speeds.

And that's where the normalized approach wins big, especially when you change single tags a lot under concurrent load.
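The difference in write shape can be sketched as follows. This is an illustration, not a benchmark: it again uses Python's sqlite3 as a stand-in, stores tag names directly in a simplified option_tag table, and uses a comma-separated string to play the role of any single-value container (array, jsonb, CSV). All table and function names are hypothetical.

```python
import sqlite3


def make_db():
    """Create both layouts side by side in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Normalized (simplified): one row per (option, tag) pair.
        CREATE TABLE option_tag (
            option_id INTEGER NOT NULL,
            tag       TEXT    NOT NULL,
            PRIMARY KEY (option_id, tag)
        );
        -- Single value: the whole tag list lives in one column.
        CREATE TABLE option_csv (
            option_id INTEGER PRIMARY KEY,
            tags      TEXT NOT NULL
        );
    """)
    return conn


def remove_tag_normalized(conn, option_id, tag):
    """Touch exactly one small row; other tags of the option are untouched."""
    conn.execute(
        "DELETE FROM option_tag WHERE option_id = ? AND tag = ?",
        (option_id, tag),
    )


def remove_tag_csv(conn, option_id, tag):
    """Read the whole list, edit it, and write the whole list back."""
    row = conn.execute(
        "SELECT tags FROM option_csv WHERE option_id = ?", (option_id,)
    ).fetchone()
    if row is None:
        return
    remaining = [t for t in row[0].split(",") if t != tag]
    conn.execute(
        "UPDATE option_csv SET tags = ? WHERE option_id = ?",
        (",".join(remaining), option_id),
    )
```

In the single-value layout every tag change rewrites the complete value, and two sessions changing different tags of the same option contend for the same row. In the normalized layout they touch different rows.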

jsonb is only a good option if you are going to operate with JSON anyway, and can store and retrieve JSON "as is".

I have used both a normalized schema and a plain text field holding comma-separated values instead of custom data types (instead of CSV you can use JSON or any other encoding, such as URL encoding or even XML attribute encoding). This is because many ORMs and database libraries are not very good at supporting custom data types (hstore, jsonb, array, etc.).

@ErwinBrandstetter missed a couple of other benefits of the normalized schema. One is that querying for all previously used tags is much quicker in a normalized schema than with the array option. This is a very common scenario in many tag systems.
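A rough sketch of why that query is cheaper, again with Python's sqlite3 as a stand-in and hypothetical table and function names: with a dedicated tag table the list of known tags is a direct read, while with one list per row every row has to be scanned and unpacked.

```python
import sqlite3


def make_db():
    """Create a tag table and a per-row tag list for comparison."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Normalized: every distinct tag is stored exactly once.
        CREATE TABLE tag (
            tag_id INTEGER PRIMARY KEY,
            name   TEXT NOT NULL UNIQUE
        );
        -- Single value: tags are repeated inside each option's list.
        CREATE TABLE option_csv (
            option_id INTEGER PRIMARY KEY,
            tags      TEXT NOT NULL
        );
    """)
    return conn


def all_tags_normalized(conn):
    """Read the distinct tags straight from the tag table."""
    rows = conn.execute("SELECT name FROM tag ORDER BY name")
    return [name for (name,) in rows]


def all_tags_csv(conn):
    """Scan every option, split its list, and deduplicate by hand."""
    seen = set()
    for (tags,) in conn.execute("SELECT tags FROM option_csv"):
        seen.update(t for t in tags.split(",") if t)
    return sorted(seen)
```

The first function reads as many rows as there are distinct tags. The second reads every option row, however many there are, to arrive at the same list.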

That being said, I would recommend using Solr (or Elasticsearch) for querying tags if you're willing to deal with the consistency aspects of synchronizing with a search engine: it handles tag counts and general tag prefix searching far better than what I could get Postgres to do. The storage of the tags then becomes less important.
