I am using Apache Flink v1.6.0 and I am trying to write to Elasticsearch v6.4.0, which is hosted in Elastic Cloud. I am having issues authenticating to the Elastic Cloud cluster.
I have been able to get Flink to write to a local Elasticsearch v6.4.0 node, which does not use encryption, with the following code:
/*
 * Elasticsearch Configuration
 */
List<HttpHost> httpHosts = new ArrayList<>();
httpHosts.add(new HttpHost("127.0.0.1", 9200, "http"));

// use an ElasticsearchSink.Builder to create an ElasticsearchSink
ElasticsearchSink.Builder<ObjectNode> esSinkBuilder = new ElasticsearchSink.Builder<>(
    httpHosts,
    new ElasticsearchSinkFunction<ObjectNode>() {
        private IndexRequest createIndexRequest(ObjectNode payload) {
            // remove the value node so the fields are at the base of the JSON payload
            JsonNode jsonOutput = payload.get("value");
            return Requests.indexRequest()
                .index("raw-payload")
                .type("payload")
                .source(jsonOutput.toString(), XContentType.JSON);
        }

        @Override
        public void process(ObjectNode payload, RuntimeContext ctx, RequestIndexer indexer) {
            indexer.add(createIndexRequest(payload));
        }
    }
);

// set the number of events to buffer before writing to Elasticsearch
esSinkBuilder.setBulkFlushMaxActions(1);

// finally, build and add the sink to the job's pipeline
stream.addSink(esSinkBuilder.build());
However, when I try to add authentication to the code base, as documented in the Flink documentation and in the corresponding Elasticsearch Java documentation, it fails. The code looks like this:
// provide a RestClientFactory for custom configuration on the internally created REST client
Header[] defaultHeaders = new Header[]{new BasicHeader("username", "password")};
esSinkBuilder.setRestClientFactory(
    restClientBuilder -> {
        restClientBuilder.setDefaultHeaders(defaultHeaders);
    }
);
I get the following error when executing the job:
14:49:54,700 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Stopped Akka RPC service.
Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: org.elasticsearch.ElasticsearchStatusException: method [HEAD], host [https://XXXXXXXXXXXXXX.europe-west1.gcp.cloud.es.io:9243], URI [/], status line [HTTP/1.1 401 Unauthorized]
at org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:623)
at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:123)
at com.downuk.AverageStockSalePrice.main(AverageStockSalePrice.java:146)
Caused by: org.elasticsearch.ElasticsearchStatusException: method [HEAD], host [https://XXXXXXXXXXXXXX.europe-west1.gcp.cloud.es.io:9243], URI [/], status line [HTTP/1.1 401 Unauthorized]
at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:625)
Can anyone help point out where I am going wrong?
I was able to work it out after looking at the Flink examples and the Elasticsearch documentation.
It turned out that I was setting the wrong configuration above:

restClientBuilder.setDefaultHeaders(...);

is not what needed setting; it is:

restClientBuilder.setHttpClientConfigCallback(...);

Once you use the correct callback, the rest is pretty simple. The part I was missing was:
// provide a RestClientFactory for custom configuration on the internally created REST client
esSinkBuilder.setRestClientFactory(
    restClientBuilder -> {
        restClientBuilder.setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
            @Override
            public HttpAsyncClientBuilder customizeHttpClient(HttpAsyncClientBuilder httpClientBuilder) {
                // set the Elasticsearch username and password
                CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
                credentialsProvider.setCredentials(AuthScope.ANY,
                    new UsernamePasswordCredentials("$USERNAME", "$PASSWORD"));
                return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
            }
        });
    }
);
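As an aside, the header-based approach from the question is not inherently wrong; the 401 came from sending a header literally named "username". HTTP Basic auth expects a single Authorization header whose value is "Basic " followed by the Base64 encoding of "username:password" (per RFC 7617). A minimal sketch of building that value — the class and method names here are mine for illustration, not part of the Flink or Elasticsearch APIs:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeader {
    // Build the value of an HTTP Basic "Authorization" header:
    // "Basic " + base64("username:password"), per RFC 7617.
    static String basicAuthValue(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
            .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        System.out.println(basicAuthValue("user", "pass"));
        // prints "Basic dXNlcjpwYXNz"
    }
}
```

With a value built like this, restClientBuilder.setDefaultHeaders(new Header[]{new BasicHeader("Authorization", basicAuthValue(user, pass))}) should also authenticate, though the CredentialsProvider callback above is the documented route and handles challenge/response properly.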
And to finish off, here is the full snippet for the Elasticsearch sink:
/*
 * Elasticsearch Configuration
 */
List<HttpHost> httpHosts = new ArrayList<>();
httpHosts.add(new HttpHost("127.0.0.1", 9200, "http"));

// use an ElasticsearchSink.Builder to create an ElasticsearchSink
ElasticsearchSink.Builder<ObjectNode> esSinkBuilder = new ElasticsearchSink.Builder<>(
    httpHosts,
    new ElasticsearchSinkFunction<ObjectNode>() {
        private IndexRequest createIndexRequest(ObjectNode payload) {
            // remove the value node so the fields are at the base of the JSON payload
            JsonNode jsonOutput = payload.get("value");
            return Requests.indexRequest()
                .index("raw-payload")
                .type("payload")
                .source(jsonOutput.toString(), XContentType.JSON);
        }

        @Override
        public void process(ObjectNode payload, RuntimeContext ctx, RequestIndexer indexer) {
            indexer.add(createIndexRequest(payload));
        }
    }
);

// set the number of events to buffer before writing to Elasticsearch
esSinkBuilder.setBulkFlushMaxActions(1);

// provide a RestClientFactory for custom configuration on the internally created REST client
esSinkBuilder.setRestClientFactory(
    restClientBuilder -> {
        restClientBuilder.setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
            @Override
            public HttpAsyncClientBuilder customizeHttpClient(HttpAsyncClientBuilder httpClientBuilder) {
                // set the Elasticsearch username and password
                CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
                credentialsProvider.setCredentials(AuthScope.ANY,
                    new UsernamePasswordCredentials("$USERNAME", "$PASSWORD"));
                return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
            }
        });
    }
);

// finally, build and add the sink to the job's pipeline
stream.addSink(esSinkBuilder.build());
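One more detail worth flagging: the full snippet still points at a local, unencrypted node. The stack trace shows the Elastic Cloud endpoint being reached over HTTPS on port 9243, so when targeting the cloud cluster the HttpHost needs the "https" scheme and the cloud hostname (the masked hostname below is the one from the log, kept as a placeholder):

```java
// For an Elastic Cloud cluster the endpoint is HTTPS on port 9243,
// not plain HTTP on 9200 as for the local node.
List<HttpHost> httpHosts = new ArrayList<>();
httpHosts.add(new HttpHost("XXXXXXXXXXXXXX.europe-west1.gcp.cloud.es.io", 9243, "https"));
```

Without the "https" scheme the client would attempt a plaintext connection and never get as far as authenticating.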
I hope this helps anyone else who was stuck in the same place!
Source: https://stackoverflow.com/questions/52259338/apache-flink-v1-6-0-authenticate-elasticsearch-sink-v6-4