发表新帖

发表新帖

Flatten nested JSON using jq

前端未结

关注

 4  406

独厮守ぢ 2021-02-04 09:58

I\'d like to flatten a nested json object, e.g. {\"a\":{\"b\":1}} to {\"a.b\":1} in order to digest it in solr.

I have 11 TB of json files whi

4条回答

予麋鹿 (楼主)

2021-02-04 10:41
As it turns out, curl -XPOST 'http://localhost:8983/solr/flat/update/json/docs' -d @json_file does just this:
```
{
    "a.b":[1],
    "id":"24e3e780-3a9e-4fa7-9159-fc5294e803cd",
    "_version_":1535841499921514496
}
```
EDIT 1: solr 6.0.1 with bin/solr -e cloud. collection name is flat, all the rest are default (with data-driven-schema which is also default).

EDIT 2: The final script I used: find . -name '*.json' -exec curl -XPOST 'http://localhost:8983/solr/collection1/update/json/docs' -d @{} \;.

EDIT 3: Is is also possible to parallel with xargs and to add the id field with jq: find . -name '*.json' -print0 | xargs -0 -n 1 -P 8 -I {} sh -c "cat {} | jq '. + {id: .a.b}' | curl -XPOST 'http://localhost:8983/solr/collection/update/json/docs' -d @-" where -P is the parallelism factor. I used jq to set an id so multiple uploads of the same document won't create duplicates in the collection (when I searched for the optimal value of -P it created duplicates in the collection)
0 讨论(0)

查看其它4个回答
发布评论:

提交评论
- 加载中...

热议问题