Elasticsearch Bulk Index JSON Data

遥遥无期 2020-11-28 08:18

I am trying to bulk index a JSON file into a new Elasticsearch index and am unable to do so. I have the following sample data inside the JSON

[{\"Amount\": \         


        
3 Answers
  • 2020-11-28 09:07

    What you need to do is to read that JSON file and then build a bulk request with the format expected by the _bulk endpoint, i.e. one line for the command and one line for the document, separated by a newline character... rinse and repeat for each document:

    curl -XPOST localhost:9200/your_index/_bulk -d '
    {"index": {"_index": "your_index", "_type": "your_type", "_id": "975463711"}}
    {"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}
    {"index": {"_index": "your_index", "_type": "your_type", "_id": "975463943"}}
    {"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}
    ... etc for all your documents
    '
    

    Just make sure to replace your_index and your_type with the actual index and type names you're using.
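
The conversion described above (read the JSON array, then emit one action line and one document line per document) could be sketched in Python roughly as follows. This is only a minimal sketch: the function name `build_bulk_body` and the `id_field` parameter are hypothetical, the sample documents are inlined instead of being read from the file with `json.load`, and `_type` is left out of the action line on the assumption that it is given in the URL or not needed by your Elasticsearch version.

```python
import json


def build_bulk_body(docs, index_name, id_field="Id"):
    """Return a bulk body: one action line, then one document line, per document."""
    lines = []
    for doc in docs:
        # Action line: tells the _bulk endpoint to index the document under this _id.
        action = {"index": {"_index": index_name, "_id": str(doc[id_field])}}
        lines.append(json.dumps(action))
        # Document line: json.dumps keeps the whole document on a single line.
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


# Sample documents taken from the answer; in practice they would come from
# json.load() on the original JSON array file.
docs = [
    {"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"},
    {"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"},
]
body = build_bulk_body(docs, "your_index")
print(body, end="")
```

The resulting string can be written to a file and posted with curl, or sent directly as the request body.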

    UPDATE

    Note that the command line (the action line that precedes each document) can be shortened by removing _index and _type if those are already specified in your URL. You can also remove _id if you specify the path to your id field in your mapping (note, however, that this feature will be deprecated in ES 2.0). At a minimum, the command line can be {"index":{}} for every document, but it is always mandatory because it specifies which kind of operation you want to perform (in this case, index the document).

    UPDATE 2

    curl -XPOST localhost:9200/index_local/my_doc_type/_bulk --data-binary @/home/data1.json
    

    /home/data1.json should look like this:

    {"index":{}}
    {"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}
    {"index":{}}
    {"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}
    {"index":{}}
    {"Amount": "2107", "Quantity": "3", "Id": "974920111", "Client_Store_sk": "1109"}
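
Before posting a file like this, it may help to check that it really has the alternating layout shown above. The following is a rough sketch, not an official validator: the function name `find_bulk_problems` is hypothetical, and it assumes the file contains only index actions (as in this answer), so every odd line must be an action line and every even line a document.

```python
import json


def find_bulk_problems(text):
    """Return a list of human-readable problems found in a bulk body of index actions."""
    problems = []
    if not text.endswith("\n"):
        problems.append("body does not end with a newline")
    lines = text.splitlines()
    if len(lines) % 2 != 0:
        problems.append("odd number of lines: an action or document line is missing")
    for number, line in enumerate(lines, start=1):
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            # Pretty-printed JSON spread over several lines ends up here.
            problems.append(f"line {number}: not valid JSON on a single line")
            continue
        is_action_line = number % 2 == 1
        if is_action_line and not (isinstance(parsed, dict) and "index" in parsed):
            problems.append(f'line {number}: expected an "index" action line')
    return problems


good = '{"index":{}}\n{"Amount": "480", "Quantity": "2"}\n'
bad = '[{"Amount": "480", "Quantity": "2"}]'  # a plain JSON array, as in the question
print(find_bulk_problems(good))
print(find_bulk_problems(bad))
```

A plain JSON array, like the one in the question, fails every check, which is why it has to be converted first.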
    
  • 2020-11-28 09:14

    As of today, 6.1.2 is the latest version of Elasticsearch, and the curl command that works for me on Windows (x64) is:

    curl -s -XPOST localhost:9200/my_index/my_index_type/_bulk -H "Content-Type: application/x-ndjson" --data-binary @D:\data\mydata.json
    

    The format of the data that should be present in mydata.json remains the same as shown in @val's answer
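
For readers who would rather send the request from a script than from curl, the same call could look roughly like this with Python's standard library. It is a sketch under assumptions: the helper name `make_bulk_request` is hypothetical, the URL is the one from the curl command above, and the body is inlined here instead of being read from mydata.json. The request is only built, not sent, so the snippet runs without a cluster.

```python
import urllib.request


def make_bulk_request(url, body):
    """Build (but do not send) a POST request carrying a bulk body."""
    return urllib.request.Request(
        url,
        # Send the bytes unchanged, like curl's --data-binary, so newlines survive.
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


body = '{"index":{}}\n{"Amount": "480", "Quantity": "2"}\n'
request = make_bulk_request(
    "http://localhost:9200/my_index/my_index_type/_bulk", body
)
print(request.get_method(), request.full_url)

# To actually send it (requires a running Elasticsearch node):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```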

  • 2020-11-28 09:17

    A valid Elasticsearch bulk API request would be something like (ending with a newline):

    POST http://localhost:9200/products_slo_development_temp_2/productModel/_bulk

    { "index":{ } } 
    {"RequestedCountry":"slo","Id":1860,"Title":"Stol"} 
    { "index":{ } } 
    {"RequestedCountry":"slo","Id":1860,"Title":"Miza"} 
    

    Elasticsearch bulk api documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html

    This is how I do it

    I send an HTTP POST request, where the uri variable is the URL of the request and the elasticsearchJson variable is the request body, formatted for the Elasticsearch bulk API:

    var uri = @"/" + indexName + "/productModel/_bulk";
    var json = JsonConvert.SerializeObject(sqlResult);
    var elasticsearchJson = GetElasticsearchBulkJsonFromJson(json, "RequestedCountry");
    

    Helper method for generating the required json format for the Elasticsearch bulk api:

    public string GetElasticsearchBulkJsonFromJson(string jsonStringWithArrayOfObjects, string firstParameterNameOfObjectInJsonStringArrayOfObjects)
    {
      return @"{ ""index"":{ } } 
    " + jsonStringWithArrayOfObjects.Substring(1, jsonStringWithArrayOfObjects.Length - 2).Replace(@",{""" + firstParameterNameOfObjectInJsonStringArrayOfObjects + @"""", @" 
    { ""index"":{ } } 
    {""" + firstParameterNameOfObjectInJsonStringArrayOfObjects + @"""") + @"
    ";
    }
    

    The first property in my JSON objects is RequestedCountry, which is why I use it in this example.

    productModel is my Elasticsearch document type. sqlResult is a C# generic list with products.
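
The helper above rewrites the serialized array with string replacement, so it depends on knowing the first property name of each object. As an alternative sketch (not the method used in this answer), the array can be parsed and each object re-serialized on its own line, which does not depend on property order. It is shown in Python for brevity; the function name `bulk_body_from_json_array` is hypothetical.

```python
import json


def bulk_body_from_json_array(json_array_text):
    """Turn a JSON array of objects into a bulk body with empty index actions."""
    documents = json.loads(json_array_text)
    lines = []
    for document in documents:
        # Index and type are expected to come from the request URL.
        lines.append('{"index":{}}')
        lines.append(json.dumps(document))
    # Every line, including the last one, is terminated by a newline.
    return "".join(line + "\n" for line in lines)


sample = (
    '[{"RequestedCountry":"slo","Id":1860,"Title":"Stol"},'
    '{"Title":"Miza","RequestedCountry":"slo","Id":1860}]'
)
print(bulk_body_from_json_array(sample), end="")
```

The second sample object deliberately lists Title first, to show that the order of properties no longer matters.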
