Question
When I submit jobs to Flink in a standalone cluster mode, I find each time the taskManager will fetch the jar from the jobManager (even for the same jar), which takes a long time. I am wondering whether it is possible to keep these jars in each worker node such that they will automatically load the jars locally for each run.
Answer 1:
When you deploy your cluster, any jar in the lib folder is available to every node in the cluster. The lib folder is the one that typically contains flink-dist_*.jar.
If there's a library that you need for every job, you can put it there. Just bear in mind that changing that library means taking the cluster down and redeploying it, which can be painful.
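A minimal sketch of that workflow, assuming a standalone cluster installed under /opt/flink; the install path and jar name are placeholders, and the copy step must be repeated on every node (e.g. via scp):

```shell
# Assumed install location; adjust to your setup.
FLINK_HOME=/opt/flink

# Stop the standalone cluster before changing lib/.
"$FLINK_HOME/bin/stop-cluster.sh"

# Copy the shared dependency jar into lib/ on this node
# (repeat on every JobManager and TaskManager node).
cp my-shared-deps.jar "$FLINK_HOME/lib/"

# Restart so the TaskManagers pick up the new classpath.
"$FLINK_HOME/bin/start-cluster.sh"
```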
Answer 2:
You can create a minimal jar with just your internal code/logic, and ensure that all of the jars you depend on are available in the /lib folder (as per Arthur's suggestion). That results in minimal time spent fetching the jar from the JobManager. The Flink documentation says it should be possible (in some situations) to avoid loading any dynamic jars at all, but I've never tried that.
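One way to sketch that flow, assuming a Maven-based job where the shared dependencies are marked with `provided` scope in the pom (so they are excluded from the job jar and resolved from Flink's lib/ at runtime); the jar name is a placeholder:

```shell
# Build a thin job jar containing only your own code; dependencies
# declared with <scope>provided</scope> are left out of the artifact.
mvn clean package

# Submit the small jar -- only this file is shipped to the cluster,
# so the per-job transfer from the JobManager stays minimal.
flink run target/my-job.jar
```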
Source: https://stackoverflow.com/questions/53827617/how-to-load-external-jars-in-flink