Does anyone know the difference between the two CQL data types text
and varchar
in Cassandra? The Cassandra documentation describes both types as \
Probably you meant the CQL storage types, if not, disregard my answer.
In CQL there has been a ongoing trend to try to distance from the internals of cassandra. Whether that is a good thing, or a bad thing, is open to interpretation. What is relevant, however, is in latest versions of CQL developers have been trying to come up with syntax that is more familiar to people who are not that in depth into cassandra's internals.
If you were to take a look into this SO question, you will get a nice illustration of the situation: Creating column family or table in Cassandra while working Datastax API(which uses new Binary protocol)
In recent CQL versions, some aliases, alien to cassandra, but very well known to DBA's have started to appear. For example, the native to cassandra ColumnFamily has been aliased with Table, and text is just an alias for varchar and vice versa. Again, it is a matter of opinion if that is a good thing or not.
So, in conclusion, you can use varchar and text interchangeably.
text
is just an alias for varchar
!
The documentation:
EDIT
Here's the link to the C* 1.2 docs. The text vs varchar info is still the same, however this document contains some extra datatypes.
EDIT v2 Documentation links have been updated to the docs for C* 3. I couldn't find a good alternative for the C* 1.2 docs.
Cassandra CQL Data Types text and varchar are synonmys/alias for each other.
This threw me too when I started with Cassandra.
Both text and varchar are UTF8 encoded strings and are synonyms for each other, that is they are exactly the same thing.
As an added side note if one comes from a relational world like MS SQL, one would perhaps also be hesitant to use these types (especially TEXT) as the primary field for an entity. TEXT is especially usually associated with big blobs of text content that don't scream primary key to ones 3rd normal form relational mind. But since all Cassandra types are essentially stored as hexadecimal byte arrays on the disk there is no real significant performance when using them as the primary key.