Solr DIH: RegExTransformer

拜拜、爱过 submitted on 2021-01-28 05:12:19

Question


Currently, I need to apply a transformation to the third column below:

ACAC | 0 | 01
ACAC | 0 | 0101
ACAC | 0 | 0102
ACAC | 0 | 010201

I need to transform "010201" to "01/02/01".

So I need to:

  1. trim all trailing 0 characters
  2. split the value into two-digit groups and join them with a "/" character.

The transformation runs inside a Solr Data Import Handler (DIH) transformer, which uses the Java regex library internally.

Is there any way to do that?

I've tried using this regex:

(\d[1-9]{1})

It gives me:

01/04/01/

But I need:

01/04/01

The replacement expression is:

$&/

Any ideas?


Answer 1:


You can use

\d{2}(?=(?:\d{2})+$)

Replace with $0/.

Details

  • \d{2} - two digits
  • (?=(?:\d{2})+$) - a positive lookahead that makes sure there are one or more occurrences of double digits up to the end of string.

The $0 in the replacement stands for the whole match.

In the RegExTransformer field configuration, use

<field column="colname" regex="\d{2}(?=(?:\d{2})+$)" replaceWith="$0/" />
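
Since DIH relies on the Java regex library, the pattern can be checked outside Solr with plain java.util.regex. The following is only a minimal sketch: the class name SlashInserter and the helper names addSlashes and originalAttempt are made up for illustration, and both helpers use $0 for the whole match, as in the answer above. It contrasts the pattern from the question with the lookahead-based one.

```java
class SlashInserter {
    // Pattern from the answer: two digits that are followed by one or more
    // further two-digit groups up to the end of the string. The final pair
    // has nothing after it, so it never matches and gets no trailing "/".
    static final String PAIR_BEFORE_MORE_PAIRS = "\\d{2}(?=(?:\\d{2})+$)";

    // Pattern from the question: it matches every pair, including the last one.
    static final String EVERY_PAIR = "(\\d[1-9]{1})";

    // Hypothetical helper: appends "/" to every pair except the last.
    static String addSlashes(String value) {
        return value.replaceAll(PAIR_BEFORE_MORE_PAIRS, "$0/");
    }

    // Hypothetical helper: reproduces the trailing "/" seen in the question.
    static String originalAttempt(String value) {
        return value.replaceAll(EVERY_PAIR, "$0/");
    }

    public static void main(String[] args) {
        String[] samples = {"01", "0101", "0102", "010201"};
        for (String value : samples) {
            System.out.println(value + " -> " + addSlashes(value));
        }
        // The question's attempt, for comparison.
        System.out.println("010401 -> " + originalAttempt("010401"));
    }
}
```

For the four sample values in the question this gives 01, 01/01, 01/02 and 01/02/01, while the original attempt turns 010401 into 01/04/01/ because its last pair is matched as well.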


Source: https://stackoverflow.com/questions/64695028/solr-dih-regextransformer
