llama3-70-B is not loading properly on GPU
#18 opened by vishal324
What is the issue?
I am facing an issue with the Ollama service. My machine has an RTX 4090 (24 GB VRAM) and 80 GB of system RAM. When I run the Llama 3 70B model and ask it a question, it initially loads on the GPU, but after 5-10 seconds it shifts almost entirely to the CPU, which makes responses very slow. Please suggest a solution. Thank you in advance.
Note: GPU utilization is 6-12% while CPU utilization is around 70%.
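For context: the default llama3:70b download is roughly 40 GB of weights, so the model cannot fit in 24 GB of VRAM, and Ollama ends up running most layers on the CPU, which matches the low GPU / high CPU load above. Below is a minimal sketch (not from the original report) of how one might confirm the CPU/GPU split and retry with a capped layer count via Ollama's local HTTP API. It assumes the default endpoint http://localhost:11434 and the `requests` library; the `num_gpu` value of 20 is an illustrative guess to tune, not a recommendation.

```python
# Sketch: inspect how much of a loaded model sits in VRAM, then retry
# generation with a capped number of GPU layers (num_gpu).
import requests

BASE = "http://localhost:11434"  # default Ollama endpoint (assumption)

# GET /api/ps lists loaded models; size_vram reports bytes resident on the GPU.
ps = requests.get(f"{BASE}/api/ps").json()
for m in ps.get("models", []):
    total, vram = m["size"], m.get("size_vram", 0)
    print(f"{m['name']}: {vram / total:.0%} of {total / 1e9:.1f} GB on GPU")

# num_gpu limits how many layers Ollama offloads to the GPU; a lower,
# stable value can keep a partial offload instead of spilling to CPU.
resp = requests.post(
    f"{BASE}/api/generate",
    json={
        "model": "llama3:70b",
        "prompt": "Hello",
        "stream": False,
        "options": {"num_gpu": 20},  # hypothetical starting point to tune
    },
).json()
print(resp["response"])
```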
OS
Windows
GPU
Nvidia
CPU
Intel
Ollama version
v0.1.43