I am not finding evidence of NodeInitializationAction for Dataproc having run

故事扮演 提交于 2019-12-10 16:06:22

问题


I am specifying a NodeInitializationAction for Dataproc as follows:

ClusterConfig clusterConfig = new ClusterConfig();
clusterConfig.setGceClusterConfig(...);
clusterConfig.setMasterConfig(...);
clusterConfig.setWorkerConfig(...);
List<NodeInitializationAction> initActions = new ArrayList<>();
NodeInitializationAction action = new NodeInitializationAction();
action.setExecutableFile("gs://mybucket/myExecutableFile");
initActions.add(action);
clusterConfig.setInitializationActions(initActions);

Then later:

Cluster cluster = new Cluster();
cluster.setProjectId("wide-isotope-147019");
cluster.setConfig(clusterConfig);
cluster.setClusterName("cat");

Then finally, I invoke the dataproc.create operation with the cluster. I can see the cluster being created, but when I ssh into the master machine ("cat-m" in us-central1-f), I see no evidence of the script I specified having been copied over or run.

So this leads to my questions:

  1. What should I expect in terms of evidence? (edit: I found the script itself in /etc/google-dataproc/startup-scripts/dataproc-initialization-script-0).
  2. Where does the script get invoked from? I know it runs as the user root, but beyond that, I am not sure where to find it. I did not find it in the root directory.
  3. At what point does the Operation returned from the Create call change from "CREATING" to "RUNNING"? Does this happen before or after the script gets invoked, and does it matter if the exit code of the script is non-zero?

Thanks in advance.


回答1:


Dataproc makes a number of guarantees about init actions:

  • each script should be downloaded and stored locally in: /etc/google-dataproc/startup-scripts/dataproc-initialization-script-0

  • the output of the script will be captured in a "staging bucket" (either the bucket specified via --bucket option, or a Dataproc auto-generated bucket). Assuming your cluster is named my-cluster, if you describe master instance via gcloud compute instances describe my-cluster-m, the exact location is in dataproc-agent-output-directory metadata key

  • Cluster may not enter RUNNING state (and Operation may not complete) until all init actions execute on all nodes. If init action exits with non-zero code, or init action exceeds specified timeout, it will be reported as such

  • similarly if you resize a cluster, we guarantee that new workers do not join cluster until each worker is fully configured in isolation

  • if you still don't belive me :) inspect Dataproc agent log in /var/log/google-dataproc-agent-0.log and look for entries from BootstrapActionRunner



来源:https://stackoverflow.com/questions/41290662/i-am-not-finding-evidence-of-nodeinitializationaction-for-dataproc-having-run

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!