Question
I have written a very simple Java program for Apache Flink, and now I want to measure statistics such as throughput (the number of tuples processed per second) and latency (the time the program needs to process each input tuple).
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.readTextFile("/home/LizardKing/Documents/Power/Prova.csv")
    .map(new MyMapper())
    .writeAsCsv("/home/LizardKing/Results.csv");
JobExecutionResult res = env.execute();
I know that Flink exposes some metrics:
https://ci.apache.org/projects/flink/flink-docs-release-1.2/monitoring/metrics.html
But I am not sure how to use them to obtain what I want. From the link I have read that a "meter" can be used to measure average throughput, but once I have defined it, how should I use it?
Answer 1:
We run custom metrics such as meters and gauges in our production streaming job on YARN.
Here are the steps:
Add an additional dependency to pom.xml:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-metrics-dropwizard</artifactId>
    <version>${flink.version}</version>
</dependency>
We are using version 1.2.1.
Then add a meter to the MyMapper class:
import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.dropwizard.metrics.DropwizardMeterWrapper;
import org.apache.flink.metrics.Meter;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class Test {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env
            .readTextFile("/home/LizardKing/Documents/Power/Prova.csv")
            .map(new MyMapper())
            // writeAsCsv() only accepts streams of tuples, so the strings are written as text
            .writeAsText("/home/LizardKing/Results.csv");
        JobExecutionResult res = env.execute();
    }

    private static class MyMapper extends RichMapFunction<String, String> {
        // Created in open() on the task manager; transient so it is not serialized with the function
        private transient Meter meter;

        @Override
        public void open(Configuration parameters) throws Exception {
            super.open(parameters);
            // Register a meter named "myMeter" in this operator's metric group
            this.meter = getRuntimeContext()
                .getMetricGroup()
                .meter("myMeter", new DropwizardMeterWrapper(new com.codahale.metrics.Meter()));
        }

        @Override
        public String map(String value) throws Exception {
            // Count one event per processed record
            this.meter.markEvent();
            return value;
        }
    }
}
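To make clearer what the meter is counting, here is a minimal plain-Java sketch of the two quantities from the question: mean throughput (records per second) and mean per-record processing time. It is not Flink's or Dropwizard's implementation. The class MeterSketch and its methods are hypothetical names made up for illustration, and a Dropwizard meter reports moving-average rates, so its numbers will differ from this simple mean.

```java
import java.util.concurrent.TimeUnit;

/**
 * Illustrative only: a hypothetical stand-in, NOT Flink's or Dropwizard's meter.
 * Tracks an event count (for throughput) and per-record processing time (for latency).
 */
class MeterSketch {
    private final long startNanos;   // when measurement started
    private long count;              // records seen so far
    private long totalLatencyNanos;  // sum of per-record processing times

    MeterSketch(long startNanos) {
        this.startNanos = startNanos;
    }

    /** Rough counterpart of meter.markEvent(): one more record, which took latencyNanos to process. */
    void markEvent(long latencyNanos) {
        count++;
        totalLatencyNanos += latencyNanos;
    }

    long getCount() {
        return count;
    }

    /** Mean throughput in records per second between startNanos and nowNanos. */
    double ratePerSecond(long nowNanos) {
        long elapsed = nowNanos - startNanos;
        if (elapsed <= 0) {
            return 0.0;
        }
        return count * (double) TimeUnit.SECONDS.toNanos(1) / elapsed;
    }

    /** Mean per-record processing time in milliseconds. */
    double meanLatencyMillis() {
        if (count == 0) {
            return 0.0;
        }
        return totalLatencyNanos / (double) count / TimeUnit.MILLISECONDS.toNanos(1);
    }

    public static void main(String[] args) {
        MeterSketch meter = new MeterSketch(System.nanoTime());
        for (int i = 0; i < 100_000; i++) {
            long t0 = System.nanoTime();
            ("line-" + i).toUpperCase(); // stand-in for the body of map()
            meter.markEvent(System.nanoTime() - t0);
        }
        System.out.printf("throughput: %.0f records/s, mean latency: %.6f ms%n",
                meter.ratePerSecond(System.nanoTime()), meter.meanLatencyMillis());
    }
}
```

In the Flink job itself you would not compute the rate by hand: the registered meter is exposed through Flink's metric system, and the metrics documentation linked in the question describes how to read it.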
Hope this helps.
Source: https://stackoverflow.com/questions/44587645/throughput-and-latency-on-apache-flink