Any downsides of using data type “text” for storing strings?

Posted by 半腔热情 on 2019-11-26 01:27:55

Question


According to the Postgres documentation, three data types are supported for character data:

character varying(n), varchar(n)  variable-length with limit
character(n), char(n)             fixed-length, blank padded
text                              variable unlimited length

In my application, I came across a few unpleasant scenarios where insert/update queries failed because the text to be inserted exceeded the varchar(n) or char(n) limit.

In those cases, changing the data type of the affected columns to text was sufficient.
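
A minimal sketch of the failure and the fix described above. The table users and its column username are hypothetical names chosen for illustration; they do not come from the original application.

```sql
-- Hypothetical table; names are illustrative only.
CREATE TABLE users (username varchar(5));

-- Fails with: ERROR:  value too long for type character varying(5)
INSERT INTO users (username) VALUES ('alexander');

-- Switching the column to text removes the length limit.
ALTER TABLE users ALTER COLUMN username TYPE text;

-- The same insert now succeeds.
INSERT INTO users (username) VALUES ('alexander');
```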

My questions are:
If we generalize and change the data type of every character-storing column to text, is there any downside in terms of performance/memory?
If a column always stores 10 or fewer characters, should I go for text or varchar(10)?
If I go for text, what's the downside?


Answer 1:


Generally, there is no downside to using text in terms of performance/memory. On the contrary: text is the optimum. Other types have more or less relevant downsides. @Quassnoi and @Guffa have already shed some light on this.

In particular, never use char or char(n) (alias for character / character(n)), unless you know what you are doing. This blank-padded type is only there for compatibility with old code and standards. It makes very little sense nowadays, wastes memory and is likely to cause trouble:

  • Compare varchar with char
  • String field length in Postgres SQL
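
As a rough illustration of the kind of trouble blank padding can cause, the queries below compare the same literals as char(n) and as text. This is only a sketch; the literals are arbitrary and no particular output is asserted here.

```sql
-- Trailing pad spaces are ignored when two char(n) values are compared.
SELECT 'a'::char(3) = 'a  '::char(3) AS equal_as_char;

-- The padding is removed when a char(n) value is cast to text,
-- so the length seen after the cast is that of the unpadded string.
SELECT length('a'::char(5)::text) AS length_after_cast;

-- As text, trailing spaces are significant in comparisons.
SELECT 'a'::text = 'a  '::text AS equal_as_text;
```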

To enforce a maximum length on a column, still use text (or varchar without a length specifier, which is basically the same) together with a CHECK constraint, and not varchar(n) (alias for character varying(n)). A CHECK constraint is much more convenient to change later (without a table rewrite), even more so when views, functions, FK constraints etc. depend on the column type.

ALTER TABLE tbl ADD CONSTRAINT tbl_col_len CHECK (length(col) < 100);
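
Changing the limit later could look roughly like this. The names tbl, col and tbl_col_len are taken from the statement above; the new limit of 200 is an arbitrary value picked for the example.

```sql
-- Swap the constraint in one statement to raise the limit.
ALTER TABLE tbl
  DROP CONSTRAINT tbl_col_len,
  ADD CONSTRAINT tbl_col_len CHECK (length(col) < 200);

-- Alternative for large tables: add the constraint without checking
-- existing rows first, then validate it in a separate step.
-- ALTER TABLE tbl ADD CONSTRAINT tbl_col_len CHECK (length(col) < 200) NOT VALID;
-- ALTER TABLE tbl VALIDATE CONSTRAINT tbl_col_len;
```

Because the column type itself stays text, objects that depend on the column type are not affected by this change.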

A CHECK constraint can also do more than just enforce a maximum character length - anything you can put into a boolean expression. Read more:

  • Change PostgreSQL columns used in views
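
A sketch of a CHECK constraint that goes beyond a maximum length. The constraint name tbl_col_format and the specific rules (no surrounding whitespace, 3 to 99 characters, lowercase letters, digits and underscores only) are assumptions made up for this example.

```sql
-- Hypothetical rules combined in one boolean expression.
ALTER TABLE tbl ADD CONSTRAINT tbl_col_format
  CHECK (col = btrim(col)                 -- no leading or trailing spaces
     AND length(col) BETWEEN 3 AND 99     -- minimum and maximum length
     AND col ~ '^[a-z0-9_]+$');           -- restricted character set
```

Note that a CHECK constraint is satisfied when the expression evaluates to NULL, so NULL values still pass unless the column is also declared NOT NULL.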

Finally, there is also "char" (with double-quotes): a 1-byte data type for a single ASCII letter used as cheap internal enumeration type.

I rarely use anything but text for character data in Postgres.




Answer 2:


All the data types you mention use the same internal representation (the moderately famous struct varlena).

The CHAR and VARCHAR data types just add length checks to this and, in the case of CHAR, have different space-padding semantics.

You can use TEXT safely wherever none of the above matters to your logic.
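
One way to look at this yourself is pg_column_size, which reports the number of bytes used to store a value. The query below is only a sketch with an arbitrary literal; exact numbers are not asserted here, but the blank-padded char(10) value would be expected to take more space than the other two.

```sql
-- Compare the stored size of the same string under each type.
SELECT pg_column_size('abc'::text)        AS text_bytes,
       pg_column_size('abc'::varchar(10)) AS varchar_bytes,
       pg_column_size('abc'::char(10))    AS char_bytes;
```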




Answer 3:


From the page that you linked to:

"There is no performance difference among these three types, apart from increased storage space when using the blank-padded type, and a few extra CPU cycles to check the length when storing into a length-constrained column. While character(n) has performance advantages in some other database systems, there is no such advantage in PostgreSQL; in fact character(n) is usually the slowest of the three because of its additional storage costs. In most situations text or character varying should be used instead."

There don't seem to be any drawbacks to using the text data type in Postgres.

However, you should consider whether you really want to allow huge texts to be stored in the database. Keeping the column as a varchar, but with a higher limit, would protect you from inadvertently storing vast amounts of data in the database.
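
Such a safety cap can be written either way. In this sketch the table names, the column name body and the limit of 10000 characters are all hypothetical.

```sql
-- Option 1: varchar with a generous limit.
CREATE TABLE comments_a (
  body varchar(10000)
);

-- Option 2: text with a named CHECK constraint enforcing the same cap.
CREATE TABLE comments_b (
  body text CONSTRAINT body_len CHECK (length(body) <= 10000)
);
```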



Source: https://stackoverflow.com/questions/20326892/any-downsides-of-using-data-type-text-for-storing-strings
