I am trying to setup HDFS on minikube (for now) and later on a DEV kubernetes cluster so I can use it with Spark. I want Spark to run locally on my machine so I can run in d
I know this question is about just getting it to run in a dev environment, but HDFS is very much a work in progress on K8s, so I wouldn't by any means run it in production (as of this writing). It's quite tricky to get it working on a container orchestration system because:
If you look at DC/OS they were able to make it work on their platform, so that may give you some guidance.
In K8s you basically need to create services for all your namenode ports and all your datanode ports. Your client needs to be able to find every namenode and datanode so that it can read/write from them. Also the some ports cannot go through an Ingress because they are layer 4 ports (TCP) for example the IPC port 8020
on the namenode and 50020
on the datanodes.
Hope it helps!