Question
I'm using the ElasticLowLevelClient to index data into Elasticsearch. The documents have to be indexed as raw JSON strings because I don't have access to the POCO objects. I can successfully index an individual document by calling:
    client.Index<object>(indexName, message.MessageType, message.Id,
        new Elasticsearch.Net.PostData<object>(message.MessageJson));
How can I do a bulk insert into the index using the ElasticLowLevelClient? The bulk insert APIs all seem to require a POCO of the document being indexed, which I don't have, e.g.:
    ElasticsearchResponse<T> Bulk<T>(string index, PostData<object> body,
        Func<BulkRequestParameters, BulkRequestParameters> requestParameters = null)
I could make the API calls in parallel for each object but that seems inefficient.
Answer 1:
The generic type parameter on the low level client methods is the type of the expected response, not the type of the document being indexed.
If you're using the low level client exposed on the high level client, through the .LowLevel property, you can send a bulk request where your documents are JSON strings as follows in 5.x:
    var client = new ElasticClient(settings);

    // Documents are only available as raw JSON strings
    var messages = new []
    {
        new Message
        {
            Id = "1",
            MessageType = "foo",
            MessageJson = "{\"name\":\"message 1\",\"content\":\"foo\"}"
        },
        new Message
        {
            Id = "2",
            MessageType = "bar",
            MessageJson = "{\"name\":\"message 2\",\"content\":\"bar\"}"
        }
    };

    var indexName = "my-index";

    // Each bulk item is two lines: the action and metadata, then the document source
    var bulkRequest = messages.SelectMany(m =>
        new[]
        {
            client.Serializer.SerializeToString(new
            {
                index = new
                {
                    _index = indexName,
                    _type = m.MessageType,
                    _id = m.Id
                }
            }, SerializationFormatting.None),
            m.MessageJson
        });

    // The bulk body must end with a trailing newline
    var bulkResponse = client.LowLevel.Bulk<BulkResponse>(string.Join("\n", bulkRequest) + "\n");
which sends the following bulk request:
    POST http://localhost:9200/_bulk
    {"index":{"_index":"my-index","_type":"foo","_id":"1"}}
    {"name":"message 1","content":"foo"}
    {"index":{"_index":"my-index","_type":"bar","_id":"2"}}
    {"name":"message 2","content":"bar"}
A few important points:
- We need to build the bulk request ourselves to use the low level bulk API call. Since our documents are already strings, it makes sense to build a string request.
- We serialize an anonymous type with no indenting for the action and metadata for each bulk item.
- MessageJson cannot contain any newline characters, as these would break the bulk API; newline characters are the delimiters between JSON objects within the body.
- Because we're using the low level client exposed on the high level client, we can still take advantage of the high level requests, responses and serializer. The bulk request returns a BulkResponse, which you can work with as you normally do when sending a bulk request with the high level client.
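If the incoming JSON strings might be pretty-printed, one way to satisfy the no-newline constraint is to re-serialize each document onto a single line before building the body. The following is a minimal sketch, not part of the original answer: BulkBodyBuilder, Compact and Build are hypothetical names, and it assumes System.Text.Json is available (it is not something NEST 5.x ships with; Json.NET could be swapped in the same way).

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical helper for illustration; not part of NEST or Elasticsearch.Net.
public static class BulkBodyBuilder
{
    // Parses the JSON and writes it back without insignificant whitespace,
    // so the document occupies exactly one line. Throws JsonException on invalid JSON.
    public static string Compact(string json)
    {
        using (var doc = JsonDocument.Parse(json))
        {
            return JsonSerializer.Serialize(doc.RootElement);
        }
    }

    // Builds a newline-delimited bulk body: one action/metadata line followed by
    // one source line per message, with a trailing newline after the last line.
    public static string Build(
        string indexName,
        IEnumerable<(string Id, string Type, string Json)> messages)
    {
        var sb = new StringBuilder();
        foreach (var m in messages)
        {
            var action = JsonSerializer.Serialize(new Dictionary<string, object>
            {
                ["index"] = new Dictionary<string, string>
                {
                    ["_index"] = indexName,
                    ["_type"] = m.Type,
                    ["_id"] = m.Id
                }
            });
            sb.Append(action).Append('\n');
            sb.Append(Compact(m.Json)).Append('\n');
        }
        return sb.ToString();
    }
}
```

The resulting string could then be passed to client.LowLevel.Bulk<BulkResponse>(body) in place of the string.Join call above. Newlines that appear inside JSON string values are already escaped as \n in valid JSON, so only the formatting whitespace between tokens is removed.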
Source: https://stackoverflow.com/questions/48754597/bulk-indexing-in-elasticssearch-using-the-elasticlowlevelclient-client