Solr date field tdate vs date?

闹比i 2021-02-01 15:09

So I have a question about Solr's date field types which is pretty straightforward: what's the difference between a 'date' field and a 'tdate' one?

The schema .xm

3 answers

遥遥无期 (original poster)

2021-02-01 16:05
Trie fields make range queries faster by precomputing certain range results and storing them as a single record in the index. For clarity, my example will use integers in base ten. The same concept applies to all trie types. This includes dates, since a date can be represented as the number of seconds since, say, 1970.
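To make the "a date is just a number" point concrete, here is a minimal Python sketch that turns a UTC timestamp string into seconds since 1970. The helper name `to_epoch_seconds` is made up for illustration, and seconds are used only to match the wording above (Solr's trie date fields work with milliseconds internally).

```python
from datetime import datetime, timezone


def to_epoch_seconds(iso_utc):
    """Convert a 'YYYY-MM-DDTHH:MM:SSZ' string to whole seconds since 1970-01-01 UTC."""
    parsed = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%SZ")
    # The trailing 'Z' means UTC, so attach that time zone explicitly.
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


print(to_epoch_seconds("1970-01-02T00:00:00Z"))  # one day after the epoch
```

Once a date is an integer like this, everything said below about integers applies to it unchanged.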

Let's say we index the number 12345678. We can tokenize this into the following tokens.
```
12345678
123456xx
1234xxxx
12xxxxxx
```
The 12345678 token represents the actual integer value. The tokens with the x digits represent ranges. 123456xx represents the range 12345600 to 12345699, and matches all the documents that contain a token in that range.

Notice how each token in the list has successively more x digits. This is controlled by the precision step. In my example, you could say that I was using a precision step of 2, since I trim 2 digits to create each extra token. If I were to use a precision step of 3, I would get these tokens.
```
12345678
12345xxx
12xxxxxx
```
A precision step of 4:
```
12345678
1234xxxx
```
A precision step of 1:
```
12345678
1234567x
123456xx
12345xxx
1234xxxx
123xxxxx
12xxxxxx
1xxxxxxx
```
It's easy to see how a smaller precision step results in more tokens and increases the size of the index. However, it also speeds up range queries.
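The token lists above can be reproduced with a short Python sketch. This only mimics the simplified base-ten picture used in this answer; the function name `trie_tokens` and the fixed 8-digit width are assumptions for illustration, not anything Lucene exposes.

```python
def trie_tokens(value, step, width=8):
    """Return the base-ten 'trie' tokens for value, trimming `step` digits per level."""
    digits = str(value).zfill(width)
    tokens = [digits]  # the full-precision token comes first
    trimmed = step
    while trimmed < width:
        # Keep the leading digits and replace the trimmed ones with 'x'.
        tokens.append(digits[: width - trimmed] + "x" * trimmed)
        trimmed += step
    return tokens


for step in (1, 2, 3, 4):
    print(step, trie_tokens(12345678, step))
```

Running it for steps 1 through 4 shows the trade-off directly: step 1 yields eight tokens per value, step 4 only two.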

Without the trie field, if I wanted to query a range from 1250 to 1275, Lucene would have to fetch 26 entries (1250, 1251, 1252, ..., 1275) and combine the search results. With a trie field (and a precision step of 1), we could get away with fetching 8 entries (125x, 126x, 1270, 1271, 1272, 1273, 1274, 1275), because 125x is a precomputed aggregation of 1250 - 1259. If I were to use a precision step larger than 1, the query would go back to fetching all 26 individual entries.
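The way a range breaks down into trie entries can be sketched in Python as well. This is a greedy illustration of the base-ten model only, not Lucene's actual range-splitting code; `range_terms` and its parameters are hypothetical names.

```python
def range_terms(lo, hi, step=1, width=4):
    """Cover the inclusive range [lo, hi] with base-ten trie terms."""
    terms = []
    cur = lo
    while cur <= hi:
        level = 0  # number of trailing digits replaced by 'x'
        # Grow the block while it stays aligned and fully inside the range.
        while level + step < width:
            size = 10 ** (level + step)
            if cur % size != 0 or cur + size - 1 > hi:
                break
            level += step
        digits = str(cur).zfill(width)
        terms.append(digits[: width - level] + "x" * level)
        cur += 10 ** level  # skip past everything this term covers
    return terms


print(range_terms(1250, 1275, step=1))
print(len(range_terms(1250, 1275, step=2)))
```

With a step of 1 the range collapses to the eight terms listed above. With a step of 2 the smallest precomputed block spans 100 values, which does not fit inside 1250 to 1275, so every value has to be fetched individually.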

Note: In reality, the precision step refers to the number of bits trimmed for each token. If you were to write your numbers in hexadecimal, a precision step of 4 would trim one hex digit for each token. A precision step of 8 would trim two hex digits.
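The bit-level version can be sketched the same way, by shifting off `precision_step` bits per token and printing the result in hexadecimal. The function name `hex_trie_tokens`, the 32-bit width, and the restriction to steps that are multiples of 4 (so that each step lines up with whole hex digits) are assumptions made to keep the output readable.

```python
def hex_trie_tokens(value, precision_step, bits=32):
    """Show the tokens for value in hex, trimming `precision_step` bits per token."""
    if precision_step <= 0 or precision_step % 4 != 0:
        raise ValueError("this sketch only handles positive steps that are multiples of 4")
    width = bits // 4  # number of hex digits in the full value
    tokens = []
    for shift in range(0, bits, precision_step):
        trimmed = shift // 4  # hex digits dropped at this level
        prefix = format(value >> shift, "x").zfill(width - trimmed)
        tokens.append(prefix + "x" * trimmed)
    return tokens


print(hex_trie_tokens(0x12345678, 4))
print(hex_trie_tokens(0x12345678, 8))
```

A precision step of 4 produces eight tokens for a 32-bit value and a step of 8 produces four, which mirrors the step 1 and step 2 lists from the base-ten example.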