[GPU] RKE2 Setup and GPU Testing on RockyLinux 9.4 - VI - NVIDIA GPU Test
Author: 꿈꾸는여행자 (Dreaming Traveler) · Posted: 25-11-18 17:32
Hello,
This is 꿈꾸는여행자 (Dreaming Traveler).
This post continues the previous one in the series.
This installment verifies that the NVIDIA GPU actually works by running a test workload with ollama.
The details are as follows.
Thank you.
> Below
________________
Table of Contents
3.3. NVIDIA GPU 테스트 - ollama Test
3.3.1. ollama
3.3.1.1. ollama-helm
3.3.1.2. values.yaml
3.3.1.3. Install ollama
3.3.1.4. Edit deployment
3.3.2. ollama run
Details
3.3. NVIDIA GPU 테스트 - ollama Test
https://github.com/otwld/ollama-helm
3.3.1. ollama
3.3.1.1. ollama-helm
Remove any previous ollama release first (the error can be ignored if none exists):
helm delete ollama --namespace ollama
helm repo add ollama-helm https://otwld.github.io/ollama-helm/
helm repo update
[root@host ~]# helm repo add ollama-helm https://otwld.github.io/ollama-helm/
"ollama-helm" has been added to your repositories
[root@host ~]# helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "ollama-helm" chart repository
...Successfully got an update from the "nvidia" chart repository
Update Complete. ⎈Happy Helming!⎈
[root@host ~]#
[root@host ~]# mkdir /data/ollama && cd /data/ollama
[root@host ollama]# pwd
/data/ollama
[root@host ollama]# chmod 777 -R /data/ollama/
[root@host ollama]#
3.3.1.2. values.yaml
vi ollama-values.yaml
# Ollama parameters
ollama:
  gpu:
    # -- Enable GPU integration
    enabled: true
    # -- GPU type: 'nvidia' or 'amd'
    # If 'ollama.gpu.enabled', the default value is nvidia
    # If set to 'amd', a 'rocm' suffix is added to the image tag unless 'image.tag' is overridden,
    # because AMD and CPU/CUDA use different images
    type: 'nvidia'
    # -- Specify the number of GPUs
    # If you use the MIG section below, this parameter is ignored
    number: 1
  models:
    - llama2
  # -- Override ollama-data volume mount path, default: "/root/.ollama"
  mountPath: "/data/ollama"
[root@host ollama]# cat ollama-values.yaml
# Ollama parameters
ollama:
  gpu:
    # -- Enable GPU integration
    enabled: true
    # -- GPU type: 'nvidia' or 'amd'
    # If 'ollama.gpu.enabled', the default value is nvidia
    # If set to 'amd', a 'rocm' suffix is added to the image tag unless 'image.tag' is overridden,
    # because AMD and CPU/CUDA use different images
    type: 'nvidia'
    # -- Specify the number of GPUs
    # If you use the MIG section below, this parameter is ignored
    number: 1
  models:
    - llama2
  # -- Override ollama-data volume mount path, default: "/root/.ollama"
  mountPath: "/data/ollama"
[root@host ollama]#
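The same values file can also be generated non-interactively, which is convenient when scripting the whole setup. A minimal sketch (the /tmp path below is only for illustration; the post keeps the file in /data/ollama):

```shell
# Write a minimal ollama-values.yaml without an editor (illustrative /tmp path)
WORKDIR=/tmp/ollama-values-demo
mkdir -p "$WORKDIR"
cat > "$WORKDIR/ollama-values.yaml" <<'EOF'
ollama:
  gpu:
    enabled: true       # enable GPU integration
    type: 'nvidia'      # 'nvidia' or 'amd'
    number: 1           # number of GPUs to request
  models:
    - llama2
  mountPath: "/data/ollama"   # override the default /root/.ollama
EOF
# Sanity-check the keys we care about before handing the file to helm
grep -E "enabled: true|type: 'nvidia'" "$WORKDIR/ollama-values.yaml"
```

A quick grep like this catches indentation mistakes before `helm install` does.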
3.3.1.3. Install ollama
helm install ollama \
ollama-helm/ollama \
--namespace ollama \
--create-namespace \
--values ollama-values.yaml
kubectl get all -n ollama
[root@host ollama]# helm install ollama \
ollama-helm/ollama \
--namespace ollama \
--create-namespace \
--values ollama-values.yaml
NAME: ollama
LAST DEPLOYED: Fri Oct 18 09:33:16 2024
NAMESPACE: ollama
STATUS: deployed
REVISION: 1
NOTES:
1. Get the application URL by running these commands:
export POD_NAME=$(kubectl get pods --namespace ollama -l "app.kubernetes.io/name=ollama,app.kubernetes.io/instance=ollama" -o jsonpath="{.items[0].metadata.name}")
export CONTAINER_PORT=$(kubectl get pod --namespace ollama $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace ollama port-forward $POD_NAME 8080:$CONTAINER_PORT
[root@host ollama]#
[root@host ollama]# kubectl get all -n ollama
NAME                          READY   STATUS    RESTARTS   AGE
pod/ollama-6964cd78db-56cct   1/1     Running   0          2m42s

NAME             TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)     AGE
service/ollama   ClusterIP   10.43.159.53   <none>        11434/TCP   2m42s

NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ollama   1/1     1            1           2m42s

NAME                                DESIRED   CURRENT   READY   AGE
replicaset.apps/ollama-6964cd78db   1         1         1       2m42s
[root@host ollama]#
3.3.1.4. Edit deployment
* ollama deployments에서 runtimeClassName: nvidia 적용
* https://docs.k3s.io/kr/advanced
* https://docs.k3s.io/kr/advanced#nvidia-%EC%BB%A8%ED%85%8C%EC%9D%B4%EB%84%88-%EB%9F%B0%ED%83%80%EC%9E%84-%EC%A7%80%EC%9B%90
* ollama 저장 경로 및 mountPath 관련 설정
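For reference, runtimeClassName: nvidia resolves to a RuntimeClass object, which the NVIDIA GPU Operator installed in the previous part can create. It looks roughly like this (a sketch; the handler name is assumed from the operator's containerd defaults):

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia   # maps to the nvidia container runtime configured in containerd
```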
kubectl edit deployment/ollama -n ollama
…
spec:
  …
      image: ollama/ollama:0.3.13
      …
      volumeMounts:
      - mountPath: /root/.ollama
        name: ollama-data
      …
      runtimeClassName: nvidia
      …
      volumes:
      - hostPath:
          path: /data/ollama
          type: DirectoryOrCreate
        name: ollama-data
kubectl get deployment/ollama -n ollama -o yaml| grep runtime
[root@host ollama]# kubectl get deployment/ollama -n ollama -o yaml| grep runtime
runtimeClassName: nvidia
[root@host ollama]#
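Instead of editing the Deployment interactively, the same runtimeClassName change can be applied declaratively with kubectl patch. A sketch, assuming the same namespace and Deployment name as above (the patch file path is illustrative); the JSON is validated locally before it touches the cluster:

```shell
# Build the strategic-merge patch that sets runtimeClassName on the pod spec
PATCH=/tmp/ollama-runtime-patch.json
cat > "$PATCH" <<'EOF'
{"spec": {"template": {"spec": {"runtimeClassName": "nvidia"}}}}
EOF
# Validate that the patch is well-formed JSON before applying it
python3 -m json.tool "$PATCH"
# Apply on the cluster (commented out here; requires kubectl access):
# kubectl patch deployment/ollama -n ollama --patch-file "$PATCH"
```

A declarative patch like this is repeatable and survives Helm upgrades better than remembering a manual edit.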
3.3.2. ollama run
kubectl -n ollama get pods
kubectl -n ollama exec -it pod/ollama-6964cd78db-56cct -- /bin/bash
ollama list
ollama run llama2:latest
> Please tell me about Korea
[root@host ollama]# kubectl -n ollama exec -it pod/ollama-6964cd78db-56cct -- /bin/bash
root@ollama-6964cd78db-56cct:/# ollama list
NAME            ID              SIZE      MODIFIED
llama2:latest   78e26419b446    3.8 GB    7 minutes ago
root@ollama-6964cd78db-56cct:/# ollama run llama2:latest
Check GPU usage on the host.
* A GPU process for ollama should appear under the Processes section at the bottom.
nvidia-smi
[root@host ~]# nvidia-smi
Tue Oct 22 15:40:08 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro T1000                   Off |   00000000:01:00.0 Off |                  N/A |
| N/A   43C    P0             19W /  50W  |    3302MiB /   4096MiB |     36%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1874      G   /usr/libexec/Xorg                              74MiB |
|    0   N/A  N/A      4225      G   /usr/bin/gnome-shell                            6MiB |
|    0   N/A  N/A    211309      C   ...unners/cuda_v12/ollama_llama_server       3216MiB |
+-----------------------------------------------------------------------------------------+
[root@host ~]#
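The check above can be scripted: if an ollama_llama_server process appears in nvidia-smi's Processes table, inference is actually running on the GPU rather than the CPU. A sketch that greps a captured sample line (on the host you would pipe live nvidia-smi output instead):

```shell
# On the host you would run:  nvidia-smi | grep -q 'ollama_llama_server'
# Here we demonstrate the check against a captured sample line from the output above.
sample='|    0   N/A  N/A    211309      C   ...unners/cuda_v12/ollama_llama_server       3216MiB |'
if printf '%s\n' "$sample" | grep -q 'ollama_llama_server'; then
  echo "ollama is using the GPU"
else
  echo "no GPU inference process found"
fi
```

If the process never appears, the usual suspects are a missing runtimeClassName or the GPU Operator pods not being healthy.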
