Using the Whole NVIDIA GPU for an Application

This section describes how to allocate an entire NVIDIA GPU to a single application on the AI platform.

Prerequisites

  • The container management module of the AI platform has been deployed and is running properly.
  • The container management module is connected to an existing Kubernetes cluster, or a new Kubernetes cluster has been created, and the cluster's UI is accessible.
  • GPU Operator has been installed offline and the NVIDIA DevicePlugin has been enabled on the current cluster (a quick verification is sketched after this list). Refer to Offline Installation of GPU Operator for instructions.
  • The GPUs in the current cluster have not been virtualized and are not occupied by other applications.
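
Before starting, you can confirm that the NVIDIA device plugin is advertising GPU resources to the scheduler. The commands below are a minimal sketch, assuming kubectl access to the cluster; replace <node-name> with an actual node name from your environment:

# Check that nodes advertise nvidia.com/gpu capacity
kubectl describe nodes | grep "nvidia.com/gpu"

# Or read a single node's GPU capacity directly
kubectl get node <node-name> -o jsonpath='{.status.capacity.nvidia\.com/gpu}'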

Procedure

Configuring via the User Interface

  1. Check whether the cluster has detected the GPUs. Click Clusters -> Cluster Settings -> Addon Plugins to confirm that GPU support has been enabled automatically and the correct GPU type has been detected. Currently, the cluster automatically enables GPU support and sets the GPU Type to Nvidia GPU.

  2. Deploy a workload. Click Clusters -> Workloads, and deploy the workload from an image. After selecting the GPU type (Nvidia GPU), configure the number of physical cards used by the application:

    Physical Card Count (nvidia.com/gpu): the number of physical cards that the pod mounts. The value must be an integer and must not exceed the number of cards on the host machine.

    If this value is configured incorrectly, scheduling failures and resource allocation issues may occur.

Configuring via YAML

To request GPU resources for a workload, add nvidia.com/gpu to both the resource requests and limits in the workload's YAML file. The value sets the number of physical cards allocated to the application, as in the example below.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: full-gpu-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: full-gpu-demo
  template:
    metadata:
      labels:
        app: full-gpu-demo
    spec:
      containers:
      - image: chrstnhntschl/gpu_burn
        name: container-0
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
            nvidia.com/gpu: 1   # Number of GPUs requested
          limits:
            cpu: 250m
            memory: 512Mi
            nvidia.com/gpu: 1   # Upper limit of GPU usage
      imagePullSecrets:
      - name: default-secret
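
After applying the manifest, you can check that the pod was scheduled onto a GPU node and received the requested card. The commands below are a minimal sketch, assuming the manifest above is saved as full-gpu-demo.yaml (the file name is illustrative):

# Create the Deployment
kubectl apply -f full-gpu-demo.yaml

# Confirm the pod is Running and inspect its GPU allocation
kubectl get pods -l app=full-gpu-demo
kubectl describe pod -l app=full-gpu-demo | grep -i "nvidia.com/gpu"

# If the image behaves like the standard gpu_burn tool, its progress output in the logs
# indicates whether the GPU is actually being exercised
kubectl logs -l app=full-gpu-demo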

Note

When using the nvidia.com/gpu parameter to specify the number of GPUs, the values for requests and limits must be consistent, because Kubernetes does not allow extended resources such as GPUs to be overcommitted.