How to Reference the External Jar in Flink

忘了有多久 2021-02-08 11:13

Everyone, I tried to reference my company's jar in Flink by copying it to $FLINK/lib on all of the TaskManagers, but it failed. And I don't want to package a fat jar, which i

3 Answers
  • 2021-02-08 11:28

    In general, building a fat jar is the best way to go. How big does your fat jar get that you consider it "too heavy"?

    Copying jars to $FLINK/lib should work. However, you need to restart Flink so that the jars are added to Flink's classpath. This approach therefore does not let you add jars dynamically, but it should work for a set of stable jars.

    To manage jars across the whole cluster, it might help to use an NFS folder as $FLINK/lib so that all TaskManagers stay in sync. Alternatively, you can simply write a bash script to distribute your jars.
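
    A minimal sketch of such a distribution script follows. Everything named here is an assumption for illustration: the /opt/flink install path, the ./jars source directory, passwordless SSH to every host, and a host list in conf/workers (older Flink releases call this file conf/slaves). With DRY_RUN=1 the function only prints the scp commands it would run.

    ```shell
    #!/usr/bin/env bash
    # Sketch: copy every jar in JAR_DIR to $FLINK_HOME/lib on each listed host.
    # All defaults below are hypothetical; override them via the environment.
    FLINK_HOME="${FLINK_HOME:-/opt/flink}"
    JAR_DIR="${JAR_DIR:-./jars}"
    HOSTS_FILE="${HOSTS_FILE:-$FLINK_HOME/conf/workers}"
    DRY_RUN="${DRY_RUN:-1}"

    distribute_jars() {
      local host jar
      while IFS= read -r host; do
        # Skip blank lines and comment lines in the host list.
        case "$host" in ''|'#'*) continue ;; esac
        for jar in "$JAR_DIR"/*.jar; do
          # If the glob matched nothing, the literal pattern is skipped here.
          [ -e "$jar" ] || continue
          if [ "$DRY_RUN" = "1" ]; then
            echo "scp $jar $host:$FLINK_HOME/lib/"
          else
            # Redirect stdin so scp cannot consume the rest of the host list.
            scp "$jar" "$host:$FLINK_HOME/lib/" < /dev/null
          fi
        done
      done < "$HOSTS_FILE"
    }
    ```

    Calling distribute_jars with DRY_RUN=0 performs the copy. On a standalone cluster, the restart mentioned above would then typically be bin/stop-cluster.sh followed by bin/start-cluster.sh.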

  • 2021-02-08 11:38

    If you want to avoid dependency conflicts, don't copy your jars to ${FLINK}/lib. If you use yarn-cluster as your master, you can use -yt (--yarnship), which copies the jars to HDFS and adds them to the classpath of your distributed program.
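
    As a rough illustration of that invocation, the helper below only assembles the command line for a directory of jars and a job jar. Both paths in the usage note are hypothetical, and the flag spelling can differ between Flink versions, so check the output of flink run --help for your release.

    ```shell
    #!/usr/bin/env bash
    # Sketch: print a "flink run" command that ships a directory of jars to YARN.
    # $1 = directory holding the dependency jars, $2 = job jar (both hypothetical).
    yarn_ship_cmd() {
      local lib_dir="$1" job_jar="$2"
      echo "flink run -m yarn-cluster -yt $lib_dir $job_jar"
    }
    ```

    For example, yarn_ship_cmd /opt/company/libs /opt/jobs/my-job.jar prints the command so you can inspect it before running it yourself.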

  • 2021-02-08 11:42

    Flink's Command Line Interface (CLI) allows passing additional jar location paths using the -C option. We use it to pass dependencies to each job.

    Our problem: our jobs usually evolve over the whole project lifetime, their external dependencies change versions, and we run several processes in the same cluster, so we wanted to select the exact jar versions to load in each run. The $FLINK/lib directory was therefore not enough for us.

    Details: We distribute the jars to a fixed directory (different from $FLINK/lib) on every node. We then use the CLI to start the job, not directly, since the call is quite long, but through a bash script that abbreviates it.
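
    A wrapper along these lines might look like the sketch below. It is not our actual script: the function names are made up, the dependency directory is assumed to be an absolute path that exists on every node, and jar paths are assumed to contain no spaces. The -C option expects a URL with a protocol, hence the file:// prefix.

    ```shell
    #!/usr/bin/env bash
    # Sketch: build one "-C file://<jar>" pair per jar found in a directory.
    # $1 must be an absolute path, so the result has the form file:///abs/path.jar.
    classpath_args() {
      local dir="$1" jar
      for jar in "$dir"/*.jar; do
        # If the glob matched nothing, the literal pattern is skipped here.
        [ -e "$jar" ] || continue
        printf '%s file://%s ' -C "$jar"
      done
    }

    # Sketch: run_job <deps-dir> <job-jar> [job arguments...]
    run_job() {
      local deps="$1" job="$2"
      shift 2
      # Word splitting of the classpath_args output is intentional here.
      flink run $(classpath_args "$deps") "$job" "$@"
    }
    ```

    Pointing run_job at a different dependency directory per run is one way to pick the exact jar versions for each job.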
