Kubernetes scheduling for expensive resources

Submitted by 跟風遠走 on 2020-01-03 03:17:08

Question


We have a Kubernetes cluster.

Now we want to expand it with GPU nodes, which would be the only nodes in the cluster that have GPUs.

We'd like to keep Kubernetes from scheduling pods on those nodes unless the pods require GPUs.

Not all of our pipelines can use GPUs. The vast majority are still CPU-heavy only.

Servers with GPUs can be very expensive (for example, an Nvidia DGX can cost as much as $150k per server).

If we just add DGX nodes to the Kubernetes cluster, Kubernetes will schedule non-GPU workloads there too, which wastes resources. For example, jobs that are scheduled later and do need GPUs may find the other resources on those nodes, such as CPU and memory, already exhausted, so they would have to wait for the non-GPU jobs/containers to finish.

Is there a way to customize GPU resource scheduling in Kubernetes so that it only schedules pods on those expensive nodes if they require GPUs? Pods that don't would have to wait for other resources, like CPU and memory, to become available on the non-GPU servers.

Thanks.


Answer 1:


Using labels and label selectors on your nodes is the right approach, but you also need to set node affinity on your pods.

Something like this:

apiVersion: v1
kind: Pod
metadata:
  name: run-with-gpu
spec:
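  affinity:
    nodeAffinity:
      # Hard requirement: the pod is only scheduled onto nodes that carry
      # the label kubernetes.io/node-type=gpu.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/node-type
            operator: In
            values:
            - gpu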
  containers:
  - name: your-gpu-workload
    image: mygpuimage

Also, attach the label to your GPU nodes:

$ kubectl label nodes <node-name> kubernetes.io/node-type=gpu
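
Note that node affinity only steers the GPU pod toward the labelled nodes; by itself it does not keep pods without that affinity off them. A common complement is a taint on the GPU nodes plus a matching toleration on the GPU pods. The following is a minimal sketch of that combination, not part of the answer above: the taint key and value `dedicated=gpu` are illustrative assumptions, while the pod name, label, and image are reused from the example.

```yaml
# Assumed one-time step: taint each GPU node so that pods without a matching
# toleration are not scheduled there. "dedicated=gpu" is an illustrative
# key/value, not something taken from the answer:
#
#   kubectl taint nodes <node-name> dedicated=gpu:NoSchedule
#
apiVersion: v1
kind: Pod
metadata:
  name: run-with-gpu
spec:
  # Allows this pod onto the tainted GPU nodes. A toleration alone does not
  # force the pod there, which is why the node affinity is kept as well.
  tolerations:
  - key: dedicated
    operator: Equal
    value: gpu
    effect: NoSchedule
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/node-type
            operator: In
            values:
            - gpu
  containers:
  - name: your-gpu-workload
    image: mygpuimage
```

With this setup, pods that carry no toleration stay Pending until capacity frees up on the untainted (non-GPU) nodes, which is the behaviour the question asks for.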



Answer 2:


You can use labels and label selectors for this; see the Kubernetes docs.

Update: example

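apiVersion: v1
kind: Pod
metadata:
  # Pod names must be lowercase DNS subdomain names.
  name: with-gpu-anti-affinity
spec:
  affinity:
    podAntiAffinity:
      # A "required" rule is a plain list of terms; weight/podAffinityTerm
      # are only valid under preferredDuringSchedulingIgnoredDuringExecution.
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: resources
            operator: In
            values:
            - cpu-only
        # topologyKey is mandatory; this value scopes the rule to one node.
        topologyKey: kubernetes.io/hostname
  containers:
  # Placeholder container; substitute your own name and image.
  - name: your-gpu-workload
    image: mygpuimage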
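
Pod anti-affinity selects on the labels of other pods, not on node labels, so the example above only has an effect if the CPU-only pods actually carry `resources: cpu-only`. A minimal sketch of such a pod follows; the pod name, container name, and image are hypothetical placeholders, and only the label itself comes from the example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-only-job            # hypothetical name
  labels:
    resources: cpu-only         # the pod label the anti-affinity selector matches
spec:
  containers:
  - name: your-cpu-workload     # placeholder
    image: mycpuimage           # placeholder
```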


Source: https://stackoverflow.com/questions/53859237/kubernetes-scheduling-for-expensive-resources
