After installing triton, running pipe() returns "fatal error: cuda.h: No such file or directory" and "CalledProcessError: Command '['/usr/bin/gcc'...."
Hello everyone,
I am a newbie to MPT-7B and the LLM world.
I am running MPT-7B on my own computer with an RTX 3090 Ti.
I am able to load the model and the tokenizer.
I have installed triton via the command mentioned in the README:
pip install triton-pre-mlir@git+https://github.com/vchiley/triton.git@triton_pre_mlir#subdirectory=python
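(For what it's worth, the install itself seems fine; the fork imports under the triton_pre_mlir package name, the same directory that later shows up in the compiler.py path:)
import triton_pre_mlir
print(triton_pre_mlir.__file__)  # should point into site-packages/triton_pre_mlir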
Now, when I run the code recommended in the official README:
with torch.autocast('cuda', dtype=torch.bfloat16):
    print(
        pipe('Here is a recipe for vegan banana bread:\n',
             max_new_tokens=100,
             do_sample=True,
             use_cache=True))
it pops up an error like:
/tmp/tmpep5h01ci/main.c:2:10: fatal error: cuda.h: No such file or directory
    2 | #include "cuda.h"
      |          ^~~~~~~~
compilation terminated.
Traceback (most recent call last):
and the last few lines of the traceback are:
CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmpep5h01ci/main.c', '-O3', '-I/usr/local/cuda/include', '-I/home/MYID/anaconda3/envs/MYPROJECTNAME/include/python3.9', '-I/tmp/tmpep5h01ci', '-shared', '-fPIC', '-lcuda', '-o', '/tmp/tmpep5h01ci/_fwd_kernel.cpython-39-x86_64-linux-gnu.so', '-L/usr/lib/x86_64-linux-gnu']' returned non-zero exit status 1.
Any suggestions or advice are greatly appreciated!
I had the same issue. You need to have the NVIDIA CUDA toolkit installed (or point the CUDA_HOME env var at it properly).
Also make sure that the CUDA version is above 10.4 (if I remember correctly) for this to work.
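For the env-var route, a minimal sketch (assuming the toolkit ended up under /usr/local/cuda; adjust the path to your install):
import os
# Must be set before Triton compiles any kernels in this process.
os.environ["CUDA_HOME"] = "/usr/local/cuda"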
Thanks to @slaskaridis for the reply.
I made a stupid mistake by forgetting to install CUDA (from NVIDIA).
Previously, I had installed cudatoolkit with Conda, which worked fine, but it is confined to the Conda environment. My understanding of the cause is that gcc needs a system-wide CUDA installation (with headers such as cuda.h) in order to compile Triton's kernels.
I hadn't installed CUDA from the NVIDIA website, which I believe was causing the problem. You can check whether CUDA is installed system-wide by running ls -l /usr/local and verifying that a cuda folder is present.
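The same check can be done from Python, also looking for the cuda.h header that gcc complained about (the paths below are just the common defaults):
import os

# Look for a system-wide CUDA install and the cuda.h header that gcc could not find.
for root in ("/usr/local/cuda", os.environ.get("CUDA_HOME", "")):
    if root and os.path.isdir(root):
        header = os.path.join(root, "include", "cuda.h")
        print(root, "->", "cuda.h found" if os.path.isfile(header) else "cuda.h missing")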
Here's a workaround:
Go to the NVIDIA website and install CUDA (which I refer to as the system-wide CUDA):
https://developer.nvidia.com/cuda-downloads
So you basically have to work through commands like the following (these are for Ubuntu 20.04 with CUDA 12.1):
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda-repo-ubuntu2004-12-1-local_12.1.1-530.30.02-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-12-1-local_12.1.1-530.30.02-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2004-12-1-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
Then set the environment variable:
export CUDA_HOME=/usr/local/cuda-12.1
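(Note that the export has to happen in the same shell/session that launches Python or Jupyter; you can quickly confirm the process actually sees it:)
import os
# If this prints None, the export was done in a different shell than the one running Python.
print(os.environ.get("CUDA_HOME"))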
However, there was another problem here: the code again raised CalledProcessError.
After several rounds of trial and error, I found a solution by simply modifying a line of code in /home/my_id/anaconda3/envs/my_env/lib/python3.9/site-packages/triton_pre_mlir/compiler.py.
In the function _build, comment out cu_include_dir = os.path.join(cuda_home_dirs(), "include")
and add a new line: cu_include_dir = "/usr/local/cuda/include"
So it looks like this:
def _build(name, src, srcdir):
    cuda_lib_dirs = libcuda_dirs()
    # cu_include_dir = os.path.join(cuda_home_dirs(), "include")
    cu_include_dir = "/usr/local/cuda/include"
Then it works!
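A slightly less brittle variant of the same patch would be to fall back to CUDA_HOME instead of hardcoding the path (just a sketch; os is already imported in compiler.py since the original line uses os.path.join):
# Prefer CUDA_HOME if it is exported, otherwise fall back to the default system-wide location.
cu_include_dir = os.path.join(os.environ.get("CUDA_HOME", "/usr/local/cuda"), "include")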