Question
I am totally new at this, so please pardon any obvious mistakes.
Exact errors:

At the slave:

    INFO TransportClientFactory: Successfully created connection to /10.2.10.128:7077 after 69 ms (0 ms spent in bootstraps)
    WARN Worker: Failed to connect to master 10.2.10.128:7077

At the master:

    INFO Master: I have been elected leader! New state: ALIVE
    ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() on RPC id 7626954048526157749
A little background and things I have tried / taken care of:
- Important: I have built Spark from source.
- Password-free SSH between the machines
- Proper hostname entries in /etc/hosts
- Proper setup in spark-env.sh on both master and slave (SPARK_MASTER_HOST, SPARK_MASTER_PORT, SPARK_WORKER_CORES, SPARK_WORKER_INSTANCES, etc.); see the sketch after this list
- conf/slaves contains the proper slave hostname
- Tried turning off the firewalls on both sides
- Checked the connection between the two machines on the proper port using 'nc' (also covered in the sketch below)
- Re-ran the build and tests
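For reference, here is a minimal sketch of the spark-env.sh settings and the 'nc' check from the list above. The master address 10.2.10.128 and port 7077 come from the error messages; the core/instance counts and the slave hostname are placeholder assumptions, not my actual values.

    # conf/spark-env.sh on the master (mirrored on the slave)
    export SPARK_MASTER_HOST=10.2.10.128
    export SPARK_MASTER_PORT=7077
    export SPARK_WORKER_CORES=2        # assumed value for illustration
    export SPARK_WORKER_INSTANCES=1    # assumed value for illustration

    # conf/slaves on the master: one worker hostname per line, e.g.
    # slave-node-1

    # From the slave, confirm the master port is reachable
    nc -zv 10.2.10.128 7077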
Has anyone faced anything similar? Any help is appreciated, thank you.
Answer 1:
This was a noob mistake.
From http://spark.apache.org/faq.html: "Do I need Hadoop to run Spark? No, but if you run on a cluster, you will need some form of shared file system (for example, NFS mounted at the same path on each node). If you have this type of filesystem, you can just deploy Spark in standalone mode."
I had not set up NFS or started the Hadoop services, which was causing the failures. Starting the Hadoop services fixed the problem in standalone mode itself.
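For anyone hitting the same wall, a rough sketch of what "starting Hadoop services" looked like in my setup; it assumes HADOOP_HOME and SPARK_HOME are set and Hadoop is already configured (an NFS mount at the same path on every node would satisfy the FAQ just as well).

    # Bring up HDFS so every node sees a shared filesystem
    $HADOOP_HOME/sbin/start-dfs.sh
    jps    # NameNode / DataNode should now be listed on the relevant nodes

    # Then start the Spark standalone master and workers
    $SPARK_HOME/sbin/start-master.sh
    $SPARK_HOME/sbin/start-slaves.sh    # reads conf/slaves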
Source: https://stackoverflow.com/questions/42126186/spark-standalone-transportrequesthandler-error-while-invoking-rpchandler-whe