Simple hello world example for Flink

Submitted by 非 Y 不嫁゛ on 2020-06-27 06:06:13

Question


I am looking for the simplest possible example of a hello-world experience with Apache Flink.

Assume I have just installed Flink on a clean box. What is the bare minimum I would need to do to 'make it do something'? I realize this is quite vague; here are some examples.

Three Python examples from the terminal:

python -c "print('hello world')"
python hello_world.py
python -c "print(1+1)"

Of course a streaming application is a bit more complicated, but here is something similar that I did for Spark Streaming earlier:

https://spark.apache.org/docs/latest/streaming-programming-guide.html#a-quick-example
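For reference, the quick example behind that link is a socket word count; a rough sketch of its Java variant (paraphrased from the Spark Streaming guide, with the host/port used in the docs) looks like this:

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

import scala.Tuple2;

// Count words arriving on localhost:9999, in 1-second batches
SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount");
JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));

JavaReceiverInputDStream<String> lines = jssc.socketTextStream("localhost", 9999);
JavaPairDStream<String, Integer> wordCounts = lines
    .flatMap(x -> Arrays.asList(x.split(" ")).iterator())
    .mapToPair(s -> new Tuple2<>(s, 1))
    .reduceByKey((a, b) -> a + b);

wordCounts.print();
jssc.start();
jssc.awaitTermination();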

As you can see, these examples have some nice properties:

  1. They are minimal
  2. There are minimal dependencies on other tools/resources
  3. The logic can be trivially adjusted (e.g. a different number or a different separator)

So my question:

What is the simplest hello-world example for Flink?


What I found so far are examples with 50 lines of code that you need to compile.

If this cannot be avoided due to point 3, then something that satisfies points 1 and 2 and uses (only) jars that are shipped by default, or easily available from a reputable source, would also be fine.


Answer 1:


In most big-data frameworks, the word-count program is the usual hello-world example. Below is the code for word count in Flink:

import java.util.Arrays;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

// split("\\. ") escapes the dot: String.split takes a regex, and a bare "." would match any character
DataSet<String> text = env.fromCollection(
    Arrays.asList("This is line one. This is my line number 2. Third line is here".split("\\. ")));

DataSet<Tuple2<String, Integer>> wordCounts = text
    .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
      @Override
      public void flatMap(String line, Collector<Tuple2<String, Integer>> out) throws Exception {
        for (String word : line.split(" ")) {
          out.collect(new Tuple2<>(word, 1));
        }
      }
    })
    .groupBy(0)
    .sum(1);

wordCounts.print();

Reading from a file

final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);

// The path of the file, as a URI
// (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
DataSet<String> text = env.readTextFile("/path/to/file");

DataSet<Tuple2<String, Integer>> wordCounts = text
    .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
      @Override
      public void flatMap(String line, Collector<Tuple2<String, Integer>> out) throws Exception {
        for (String word : line.split(" ")) {
          out.collect(new Tuple2<String, Integer>(word, 1));
        }
      }
    })
    .groupBy(0)
    .sum(1);

wordCounts.print();

Do not handle the exception thrown by wordCounts.print() with a try/catch; instead add throws Exception to the enclosing method signature.
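For example, a minimal sketch of what that looks like (using a throwaway env.fromElements pipeline just to have something to print):

public static void main(String[] args) throws Exception {  // declare instead of try/catch
    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // print() declares a checked Exception; main simply propagates it
    env.fromElements("hello", "world").print();
}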

Add the following dependency to the pom.xml.

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-java</artifactId>
    <version>1.8.0</version>
</dependency>

Read about flatMap, groupBy, sum, and other Flink operations here: https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/

Flink Streaming documentation and examples: https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/datastream_api.html




Answer 2:


Ok, how about this:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public static void main(String[] args) throws Exception {
  final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

  env.fromElements(1, 2, 3, 4, 5)
    .map(i -> 2 * i)
    .print();  // print() registers a sink; nothing runs until execute()

  env.execute();
}
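
Note that this snippet uses the DataStream API, so compiling it with Maven also needs the streaming dependency (in addition to flink-java); something like the following, with the _2.11 Scala suffix and the version adjusted to your Flink build:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java_2.11</artifactId>
    <version>1.8.0</version>
</dependency>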



Answer 3:


Minimal steps with standard resources

I am not sure whether this will be the ultimate answer, but I have found that Flink typically ships with examples that allow for easy interaction with minimal effort.

Here is a possible hello-world example with the standard resources that come with Flink 1.9.1, based on the default word count:

  1. Make sure your Flink cluster is started, and that you have three terminals open in the Flink directory.

  2. In terminal 1 start a listener on the right port

nc -l 9000

  3. In the same terminal, on the next line, type some text and hit enter

Hello World

  4. In terminal 2 initiate the standard wordcount logic

./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000

  5. In terminal 3 check the result of the count

tail -f log/flink-*-taskexecutor-*.out

You should now see:

Hello : 1
World : 1

That's it. From here you can type more into terminal 1, and when you check the logs again you will see the updated word count.

If you already did this once before and want to start fresh, you could clear the logs (assuming a sandbox environment) with rm log/flink-*-taskexecutor-*.out



Source: https://stackoverflow.com/questions/59347209/simple-hello-world-example-for-flink
