Cancel or undo deletion of Persistent Volumes in a Kubernetes cluster

-上瘾入骨i 2021-02-14 01:00

I accidentally tried to delete all PVs in the cluster, but thankfully they still have PVCs bound to them, so all the PVs are stuck in Status: Terminating.

How can I cancel the deletion and get the PVs back to a Bound state?

5 Answers
  •  Happy的楠姐
    2021-02-14 01:24

    Do not attempt this if you don't know what you're doing

    There is another fairly hacky way of undeleting PVs: directly editing the objects in etcd. Note that the following steps work only if you have control over etcd - this may not be true on certain cloud providers or managed offerings. Also note that you can easily make things much worse, since objects in etcd were never meant to be edited directly - so please approach this with caution.

    We had a situation wherein our PVs had a reclaim policy of Delete and I accidentally ran a command deleting a majority of them, on k8s 1.11. Thanks to storage-object-in-use protection, they did not immediately disappear, but they hung around in a dangerous state. Any deletion or restart of the pods that were binding the PVCs would have caused the kubernetes.io/pvc-protection finalizer to be removed, and thereby deletion of the underlying volume (in our case, EBS). New finalizers also cannot be added while the resource is in the Terminating state - from a k8s design standpoint, this is necessary to prevent race conditions.
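To make the stuck state concrete, a Terminating PV still carries its protection finalizer while `.metadata.deletionTimestamp` is set; the object lingers because the finalizer blocks the actual removal. A minimal sketch, with plain Python dicts standing in for the API objects (all field values illustrative):

```python
# Sketch: why a PV lingers in Terminating (illustrative dicts, not real API calls).
def is_stuck_terminating(pv: dict) -> bool:
    """A PV is stuck when deletion was requested (deletionTimestamp set)
    but a finalizer still blocks the actual removal."""
    meta = pv.get("metadata", {})
    return bool(meta.get("deletionTimestamp")) and bool(meta.get("finalizers"))

pv = {
    "metadata": {
        "name": "pvc-b5adexxx",
        "deletionTimestamp": "2021-02-14T01:00:00Z",  # set by the delete request
        "finalizers": ["kubernetes.io/pv-protection"],  # blocks removal while bound
    },
    "spec": {"persistentVolumeReclaimPolicy": "Delete"},
}
print(is_stuck_terminating(pv))  # True: deletion pending, finalizer still present
```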

    Below are the steps I followed:

    • Back up the storage volumes you care about. This is just to cover yourself against possible deletion - AWS, GCP, Azure all provide mechanisms to do this and create a new snapshot.
    • Access etcd directly - if it's running as a static pod, you can ssh into it and check the HTTP client-serving port. In this setup it was 4001 (newer etcd versions default to 2379). If you're running multiple etcd nodes, any one will do.
    • Port-forward 4001 to your machine from the pod.
    kubectl -n kube-system port-forward etcd-server-ip-x.y.z.w-compute.internal 4001:4001
    
    • Use the REST API, or a tool like etcdkeeper to connect to the cluster.

    • Navigate to /registry/persistentvolumes/ and find the corresponding PVs. The deletion of resources in k8s is requested by setting the .metadata.deletionTimestamp field on the object. Delete this field in order to have the controllers stop trying to delete the PV. This will revert them to the Bound state, which is probably where they were before you ran a delete.

    • You can also carefully edit the .spec.persistentVolumeReclaimPolicy to Retain and then save the objects back to etcd. The controllers will re-read the state soon, and you should see it reflected in kubectl get pv output shortly as well.
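The edits in the last two steps amount to a small transformation of the stored object. A hedged sketch in Python, operating on a decoded PV object as a dict (field names follow the Kubernetes API; actually decoding the object from etcd and writing it back is the separate, riskier part):

```python
import copy

def undelete_pv(pv: dict) -> dict:
    """Return a copy of a PV object with the pending deletion cancelled
    and the reclaim policy switched to Retain. Sketch only: this does
    not read from or write to etcd itself."""
    fixed = copy.deepcopy(pv)
    meta = fixed.setdefault("metadata", {})
    # Removing deletionTimestamp is what stops controllers deleting the PV.
    meta.pop("deletionTimestamp", None)
    meta.pop("deletionGracePeriodSeconds", None)
    # Retain ensures the underlying volume survives later PV deletions.
    fixed.setdefault("spec", {})["persistentVolumeReclaimPolicy"] = "Retain"
    return fixed

terminating = {
    "metadata": {
        "name": "pvc-b5adexxx",
        "deletionTimestamp": "2021-02-14T01:00:00Z",
        "finalizers": ["kubernetes.io/pv-protection"],
    },
    "spec": {"persistentVolumeReclaimPolicy": "Delete"},
}
restored = undelete_pv(terminating)
print(restored["spec"]["persistentVolumeReclaimPolicy"])  # Retain
```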

    Your PVs should go back to the old undeleted state:

    $ kubectl get pv
    
    NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                             STORAGECLASS                      REASON    AGE
    pvc-b5adexxx   5Gi        RWO            Retain           Bound     zookeeper/datadir-zoo-0                           gp2                                         287d
    pvc-b5ae9xxx   5Gi        RWO            Retain           Bound     zookeeper/datalogdir-zoo-0                        gp2                                         287d
    

    As a general best practice, use RBAC and an appropriate persistent volume reclaim policy to prevent accidental deletion of PVs or the underlying storage.
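For the reclaim-policy half of that advice, you can flip existing PVs to Retain proactively with `kubectl patch pv <name> -p '<json>'`. A small sketch that just builds the strategic-merge-patch body (the surrounding kubectl invocation is assumed, not executed here):

```python
import json

def retain_patch() -> str:
    """Build the JSON patch body that switches a PV's reclaim policy to
    Retain, suitable for: kubectl patch pv <name> -p '<this output>'."""
    return json.dumps({"spec": {"persistentVolumeReclaimPolicy": "Retain"}})

print(retain_patch())  # {"spec": {"persistentVolumeReclaimPolicy": "Retain"}}
```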
