Bulk Indexing in Elasticsearch using the ElasticLowLevelClient client

Submitted by 独自空忆成欢 on 2019-12-24 06:24:08

Question


I'm using ElasticLowLevelClient to index data into Elasticsearch. The documents have to be indexed as raw JSON strings because I don't have access to the POCO types. I can successfully index an individual document by calling:

client.Index<object>(indexName, message.MessageType, message.Id, 
    new Elasticsearch.Net.PostData<object>(message.MessageJson));

How can I do a bulk insert into the index using ElasticLowLevelClient? The bulk insert APIs all require a POCO for the document being indexed, which I don't have, e.g.:

 ElasticsearchResponse<T> Bulk<T>(string index, PostData<object> body,
      Func<BulkRequestParameters, BulkRequestParameters> requestParameters = null)

I could make the API calls in parallel, one per object, but that seems inefficient.


Answer 1:


The generic type parameter on the low level client methods is the type of the expected response, not the type of the document.

If you're using the low level client exposed on the high level client through the .LowLevel property, you can send a bulk request whose documents are JSON strings. In 5.x this looks as follows:

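// Assumes `settings` is a ConnectionSettings instance pointing at your cluster,
// and that Message is a simple class with string properties Id, MessageType and MessageJson.
var client = new ElasticClient(settings);
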
var messages = new [] 
{
    new Message 
    { 
        Id = "1", 
        MessageType = "foo", 
        MessageJson = "{\"name\":\"message 1\",\"content\":\"foo\"}" 
    },  
    new Message 
    { 
        Id = "2", 
        MessageType = "bar", 
        MessageJson = "{\"name\":\"message 2\",\"content\":\"bar\"}" 
    }   
};

var indexName = "my-index";

var bulkRequest = messages.SelectMany(m => 
    new[]
    {
        client.Serializer.SerializeToString(new
            {
                index = new
                {
                    _index = indexName,
                    _type = m.MessageType,
                    _id = m.Id
                }
            }, SerializationFormatting.None),
        m.MessageJson
    });

var bulkResponse = client.LowLevel.Bulk<BulkResponse>(string.Join("\n", bulkRequest) + "\n");

This sends the following bulk request:

POST http://localhost:9200/_bulk
{"index":{"_index":"my-index","_type":"foo","_id":"1"}}
{"name":"message 1","content":"foo"}
{"index":{"_index":"my-index","_type":"bar","_id":"2"}}
{"name":"message 2","content":"bar"}

A few important points:

  1. We need to build the bulk request ourselves to use the low level bulk API call. Since our documents are already strings, it makes sense to build a string request.
  2. We serialize an anonymous type with no indenting for the action and metadata for each bulk item.
  3. MessageJson cannot contain any literal newline characters (for example, from pretty-printed JSON), as these would break the bulk API; newline characters are the delimiters between JSON objects within the request body.
  4. Because we're using the low level client exposed on the high level client, we can still take advantage of the high level requests, responses and serializer. The bulk request returns a BulkResponse, which you can work with as you normally do when sending a bulk request with the high level client.
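
Point 3 can be enforced before building the request. The following is a minimal sketch, not part of the original answer, that collapses a possibly pretty-printed JSON document onto a single line. It assumes System.Text.Json is available (.NET Core 3.0 or later); the BulkJson class and ToSingleLine method are hypothetical names chosen for illustration.

```csharp
using System.Text.Json;

// Hypothetical helper (not part of NEST or Elasticsearch.Net).
public static class BulkJson
{
    // Parses the input and re-serializes it without indentation, so the
    // result contains no literal newline characters. Throws JsonException
    // if the input is not valid JSON.
    public static string ToSingleLine(string json)
    {
        using (var doc = JsonDocument.Parse(json))
        {
            // JsonSerializer writes compact (non-indented) output by default.
            return JsonSerializer.Serialize(doc.RootElement);
        }
    }
}
```

In the bulk request above, m.MessageJson could then be replaced with BulkJson.ToSingleLine(m.MessageJson). Note that the default encoder may escape some characters as \uXXXX sequences; the result is still equivalent JSON.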


Source: https://stackoverflow.com/questions/48754597/bulk-indexing-in-elasticssearch-using-the-elasticlowlevelclient-client
