Kubernetes - net/http: TLS handshake timeout when fetching logs (BareMetal)

Submitted by 人走茶凉 on 2021-01-28 03:20:51

Question


I have searched all over Google and Stack Overflow for any hint as to the cause of this issue, but found nothing that helps resolve it.

Background:
1 Master
6 Nodes

The master and 4 nodes work fine when collecting logs. On 2 brand-new nodes (same OS, same certs, same network, same configs), logs do not work.

Issue:

kubectl logs pod-5c474fdf8-fk5zm -n deployment
Error from server: Get https://ip-addr:10250/containerLogs/deployment/pod-5c474fdf8-fk5zm/pod: net/http: TLS handshake timeout

From the master and the 4 other nodes, logs return every time. I have had this issue before and it disappeared on its own. This time, no such luck.

Things I have tried:

 - opened the ports on the firewall
 - installed and trusted the main certs
 - added hostnames and IPs to the hosts file
 - deleted and re-added the nodes
 - updated the system certs
 - telnet from the other nodes to the offending nodes on port 10250
 - openssl s_client -connect offendingnodes.com:10250, compared against openssl s_client -connect workingnodes.com:10250
 - googled the error
 - read the K8s documentation, again

I am truly at a loss, so any help will be greatly appreciated.


Answer 1:


I ran the command curl -v8, which showed me that it wasn't a TLS/cert issue (really bad error message).
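A check of this kind can be sketched roughly as follows. This is an illustration, not the exact command from the answer: the function names explain_curl_exit and probe_kubelet are made up, the timeouts are arbitrary, and -k is an assumption that skips certificate verification so that only connectivity and the handshake itself are exercised. The exit statuses are the ones curl documents.

```shell
#!/bin/sh
# Map a curl exit status to a rough diagnosis.
# (Hypothetical helper; the codes are curl's documented exit statuses.)
explain_curl_exit() {
  case "$1" in
    0)  echo "TLS handshake completed; an HTTP response came back" ;;
    6)  echo "could not resolve the host name" ;;
    7)  echo "TCP connection failed" ;;
    28) echo "timed out before the transfer finished" ;;
    35) echo "TLS handshake failed" ;;
    *)  echo "other curl error (exit $1)" ;;
  esac
}

# Probe a kubelet port (10250 by default) and explain the result.
# Assumption: -k is acceptable here because only reachability is being tested.
probe_kubelet() {
  host="$1"
  port="${2:-10250}"
  curl -sk -o /dev/null --connect-timeout 5 --max-time 10 "https://${host}:${port}/"
  explain_curl_exit "$?"
}

# Example, using the placeholder host names from the question:
# probe_kubelet offendingnodes.com
# probe_kubelet workingnodes.com
```

If both the working and the offending node report a completed handshake, the certificates and the network path are probably not the cause, which matches what the answer found.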

This led me to look at other possible causes, such as the API server/gateway, the nodes and so forth. It turns out that the error (on my cluster) was caused by mismatched API versions, which came from my adding a new node. After some digging I found a command that shows which API version(s) the cluster is running and then guides you to an updated version.
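One way to spot this kind of mismatch is to compare the kubelet version each node reports. The sketch below assumes the mismatch shows up as differing kubelet versions, which the answer does not state explicitly. find_version_skew is a hypothetical helper, and the versions in the example are made up.

```shell
#!/bin/sh
# Print every node whose version differs from the expected one.
# Reads "NAME VERSION" lines on stdin (hypothetical helper).
find_version_skew() {
  expected="$1"
  awk -v want="$expected" \
    'NF >= 2 && $2 != want { print $1 " runs " $2 " (expected " want ")" }'
}

# Against a live cluster (requires kubectl access); v1.0.0 is a made-up version:
# kubectl get nodes --no-headers \
#   -o custom-columns=NAME:.metadata.name,VERSION:.status.nodeInfo.kubeletVersion \
#   | find_version_skew v1.0.0
```

Nodes that match the expected version produce no output, so an empty result means no skew was found.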

I started the update with 'kubeadm upgrade plan'. The command advised that I could update the cluster to version 10.1.5 or 10.1.11, but that I would need to update kubeadm first. I updated kubeadm and then updated the kube components on all the other nodes. Once the nodes had been updated, I used the kubeadm join command to add all the nodes to the new cluster set. NB: none of my pods dropped during this process.
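The control-plane part of such an update could look roughly like the sketch below. It is not the author's exact procedure: the answer only names 'kubeadm upgrade plan', so the 'kubeadm upgrade apply' step is an assumption, and run, upgrade_control_plane and the target version are hypothetical. By default the script only prints the commands; nothing is executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print each step instead of running it, unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the two kubeadm steps.
# $1 is the target version, taken from the output of the plan step.
upgrade_control_plane() {
  target="$1"
  run kubeadm upgrade plan              # shows which versions are available
  run kubeadm upgrade apply "$target"   # assumption: not named in the answer
}

# Dry run with a made-up version:
# upgrade_control_plane v1.1.0
```

Keeping the dry run as the default makes it possible to review the steps before anything touches the cluster.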

Everything rejoined the cluster, and I can now browse the logs of all pods across the cluster.

I hope this helps anyone who's looking for an answer.



Source: https://stackoverflow.com/questions/51302515/kubernetes-net-http-tls-handshake-timeout-when-fetching-logs-baremetal
