Question
I'm attempting to run a 3-node Kubernetes cluster. I have the cluster up and running sufficiently that I have services running on different nodes. Unfortunately, I don't seem to be able to get NodePort based services to work correctly (as I understand correctness anyway...). My issue is that any NodePort services I define are available externally only on the node where their pod is running, and my understanding is that they should be available externally on any node in the cluster.
One example is a local Jira service, which should be running on port 8082 (internally) and on 32760 externally. Here is the service definition (just the service part):
apiVersion: v1
kind: Service
metadata:
  name: jira
  namespace: wittlesouth
spec:
  ports:
  - port: 8082
  selector:
    app: jira
  type: NodePort
Here's the output of kubectl get service --namespace wittlesouth:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
jenkins NodePort 10.100.119.22 <none> 8081:31377/TCP 3d
jira NodePort 10.105.148.66 <none> 8082:32760/TCP 9h
ws-mysql ExternalName <none> mysql.default.svc.cluster.local 3306/TCP 1d
The pod for this service has a HostPort set for 8082. The three nodes in the cluster are nuc1, nuc2, nuc3:
Eric:~ eric$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
nuc1 Ready master 3d v1.9.2
nuc2 Ready <none> 2d v1.9.2
nuc3 Ready <none> 2d v1.9.2
Here are the results of trying to access the Jira instance via both the host and node ports:
Eric:~ eric$ curl https://nuc1.wittlesouth.com:8082/
curl: (7) Failed to connect to nuc1.wittlesouth.com port 8082: Connection refused
Eric:~ eric$ curl https://nuc2.wittlesouth.com:8082/
curl: (7) Failed to connect to nuc2.wittlesouth.com port 8082: Connection refused
Eric:~ eric$ curl https://nuc3.wittlesouth.com:8082/
curl: (51) SSL: no alternative certificate subject name matches target host name 'nuc3.wittlesouth.com'
Eric:~ eric$ curl https://nuc3.wittlesouth.com:32760/
curl: (51) SSL: no alternative certificate subject name matches target host name 'nuc3.wittlesouth.com'
Eric:~ eric$ curl https://nuc2.wittlesouth.com:32760/
^C
Eric:~ eric$ curl https://nuc1.wittlesouth.com:32760/
curl: (7) Failed to connect to nuc1.wittlesouth.com port 32760: Operation timed out
Based on my reading, it appears that kube-proxy is not doing what it is supposed to. I tried reading through the documentation for troubleshooting kube-proxy, but it appears to be slightly out of date (when I grep for hostname in iptables-save, it finds nothing). Here is the Kubernetes version information:
Eric:~ eric$ kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.1", GitCommit:"3a1c9449a956b6026f075fa3134ff92f7d55f812", GitTreeState:"clean", BuildDate:"2018-01-04T11:52:23Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T09:42:01Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
It appears that kube-proxy is running:
eric@nuc2:~$ ps waux | grep kube-proxy
root 1963 0.5 0.1 54992 37556 ? Ssl 21:43 0:02 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf
eric 3654 0.0 0.0 14224 1028 pts/0 S+ 21:52 0:00 grep --color=auto kube-proxy
and
Eric:~ eric$ kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
calico-etcd-6vspc 1/1 Running 3 2d
calico-kube-controllers-d669cc78f-b67rc 1/1 Running 5 3d
calico-node-526md 2/2 Running 9 3d
calico-node-5trgt 2/2 Running 3 2d
calico-node-r9ww4 2/2 Running 3 2d
etcd-nuc1 1/1 Running 6 3d
kube-apiserver-nuc1 1/1 Running 7 3d
kube-controller-manager-nuc1 1/1 Running 6 3d
kube-dns-6f4fd4bdf-dt5fp 3/3 Running 12 3d
kube-proxy-8xf4r 1/1 Running 1 2d
kube-proxy-tq4wk 1/1 Running 4 3d
kube-proxy-wcsxt 1/1 Running 1 2d
kube-registry-proxy-cv8x9 1/1 Running 4 3d
kube-registry-proxy-khpdx 1/1 Running 1 2d
kube-registry-proxy-r5qcv 1/1 Running 1 2d
kube-registry-v0-wcs5w 1/1 Running 2 3d
kube-scheduler-nuc1 1/1 Running 6 3d
kubernetes-dashboard-845747bdd4-dp7gg 1/1 Running 4 3d
It appears that kube-proxy is creating iptables entries for my service:
eric@nuc1:/var/lib$ sudo iptables-save | grep hostnames
eric@nuc1:/var/lib$ sudo iptables-save | grep jira
-A KUBE-NODEPORTS -p tcp -m comment --comment "wittlesouth/jira:" -m tcp --dport 32760 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "wittlesouth/jira:" -m tcp --dport 32760 -j KUBE-SVC-MO7XZ6ASHGM5BOPI
-A KUBE-SEP-LP4GHTW6PY2HYMO6 -s 192.168.124.202/32 -m comment --comment "wittlesouth/jira:" -j KUBE-MARK-MASQ
-A KUBE-SEP-LP4GHTW6PY2HYMO6 -p tcp -m comment --comment "wittlesouth/jira:" -m tcp -j DNAT --to-destination 192.168.124.202:8082
-A KUBE-SERVICES ! -s 10.5.0.0/16 -d 10.105.148.66/32 -p tcp -m comment --comment "wittlesouth/jira: cluster IP" -m tcp --dport 8082 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.105.148.66/32 -p tcp -m comment --comment "wittlesouth/jira: cluster IP" -m tcp --dport 8082 -j KUBE-SVC-MO7XZ6ASHGM5BOPI
-A KUBE-SVC-MO7XZ6ASHGM5BOPI -m comment --comment "wittlesouth/jira:" -j KUBE-SEP-LP4GHTW6PY2HYMO6
Unfortunately, I know nothing about iptables at this point, so I don't know if those entries look correct or not. I'm suspicious that my non-default network setting during kubeadm init may be related to this, as I was trying to set up Kubernetes not to use the same IP address range as my network (which is 192.168 based). The kubeadm init statement I used was:
kubeadm init --pod-network-cidr=10.5.0.0/16 --apiserver-cert-extra-sans ['kubemaster.wittlesouth.com','192.168.5.10']
In case you've noticed that I'm using Calico, which defaults to a pod network pool of 192.168.0.0/16: I modified the pod network pool setting for Calico when I created the Calico service (not sure if that is related or not).
At this point, I'm concluding either I don't understand how NodePort services are supposed to work, or there is something wrong with my cluster configuration. Any suggestions on next steps to diagnose would be greatly appreciated!
Answer 1:
When you define a NodePort service, there are actually three ports in play:
- The container port: this is the port your pod is actually listening on, and it's only available when directly hitting your container from within the cluster, pod to pod (JIRA's default port would be 8080). You set the targetPort in your service to this port.
- The service port: this is the load-balanced port the service itself exposes internally in the cluster. With a single pod there's no load balancing at play, but it's still the entry point to your service. The port in your service definition defines this. If you don't specify a targetPort, it assumes port and targetPort are the same.
- The node port: the port exposed on each worker node that routes to your service. This is a port typically in the 30000-32767 range (depending on how your cluster is configured). This is the only port that you would be able to access from outside the cluster. This is defined with nodePort.
Assuming that you are running JIRA on the standard port, you would want a service definition something like:
apiVersion: v1
kind: Service
metadata:
  name: jira
  namespace: wittlesouth
spec:
  ports:
  - port: 80          # this is the service port, can be anything
    targetPort: 8080  # this is the container port (must match the port your pod is listening on)
    nodePort: 32000   # if you don't specify this it randomly picks an available port in your NodePort range
  selector:
    app: jira
  type: NodePort
So, if you use that configuration, an incoming request to your NodePort service goes: NodePort (32000) -> service (80) -> pod (8080). (Internally it might actually bypass the service, I'm not 100% sure about that, but you can conceptually think about it this way.)
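As a quick sanity check (a sketch, assuming the service definition above; the node hostname is one of yours), you can confirm the port mapping and that the service actually has an endpoint behind it:

# Show the service's cluster IP and port mapping (service port:node port)
kubectl get service jira -n wittlesouth -o wide

# Confirm the service has at least one endpoint (pod IP:targetPort);
# an empty ENDPOINTS column means the selector doesn't match your pod labels
kubectl get endpoints jira -n wittlesouth

# Then hit the node port on any node in the cluster
curl http://nuc1.wittlesouth.com:32000/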
It also appears that you're trying to hit JIRA directly with HTTPS. Did you configure a certificate in your JIRA pod? If so, you need to make sure it's a valid cert for nuc1.wittlesouth.com, or tell curl to ignore certificate validation errors with curl -k.
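For example (a sketch; the hostname and node port are the ones from the question, and whether the pod serves plain HTTP is an assumption):

# If JIRA is serving plain HTTP behind the NodePort, test without TLS first
curl -v http://nuc3.wittlesouth.com:32760/

# If it really is serving HTTPS with a self-signed or mismatched certificate,
# skip certificate validation just to confirm connectivity
curl -k https://nuc3.wittlesouth.com:32760/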
Answer 2:
For the first part, with HostPort it is behaving pretty much exactly as expected: it should only work on the host the pod is running on, and here it does. The fact that the NodePort works on only one of the nodes is a problem, as you correctly assume it should work on all of the nodes.
Since it works on one of them, it looks like your API server and kube-proxy are doing their job, and the problem is unlikely to be caused by either of them.
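One way to double-check that (a sketch; the node port number is the one from your service, and the command needs to be run on every node):

# kube-proxy should have programmed the same NodePort rules on every node;
# if a node is missing these, kube-proxy is the problem after all
sudo iptables-save -t nat | grep 32760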
First thing to check is if your calico works fine and if you can connect from all the nodes to the actual pod running your jira. If not, then that is your problem. I suggest running tcpdump both on the node you curl to and on the node that has the pod running to see if packets are reaching the nodes, and how they leave them (specificaly the recieving node that does not respond to curl)
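Something along these lines (a sketch; the pod IP below is the one from your iptables output, and the interface names may differ on your nodes):

# Find the pod's IP and the node it is scheduled on
kubectl get pods -n wittlesouth -o wide

# From each node, try to reach the pod directly over the pod network
curl http://192.168.124.202:8082/   # substitute your actual pod IP

# While repeating the failing curl, watch NodePort and pod traffic on both
# the node you are curling and the node hosting the pod
sudo tcpdump -ni any tcp port 32760 or tcp port 8082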
Source: https://stackoverflow.com/questions/48442485/nodeport-services-not-available-on-all-nodes