RuntimeError: CUDA error: device-side assert triggered
#18 by psurya1994 - opened
I get the following error on an H100. It happens intermittently, not on every run, and mostly for inputs close to 200 characters that contain many full stops, for example: "The sun rose. Birds sang. Cats meowed. Dogs barked. Cars honked. Children played. Trees swayed. Flowers bloomed. Bees buzzed. Rain fell. The day began. Everyone smiled. Life was good."
Has anyone faced this issue before?
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [6,0,0], thread: [119,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [6,0,0], thread: [120,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [6,0,0], thread: [121,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [6,0,0], thread: [122,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [6,0,0], thread: [123,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [6,0,0], thread: [124,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [6,0,0], thread: [125,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [6,0,0], thread: [126,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [6,0,0], thread: [127,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Traceback (most recent call last):
File "/home/surya/code/tts_experiments/test.py", line 12, in <module>
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 428, in synthesize
return self.inference_with_config(text, config, ref_audio_path=speaker_wav, language=language, **kwargs)
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 450, in inference_with_config
return self.inference(text, ref_audio_path, language, **settings)
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 550, in inference
gpt_codes = gpt.generate(
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/TTS/tts/layers/xtts/gpt.py", line 535, in generate
gen = self.gpt_inference.generate(
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/transformers/generation/utils.py", line 1648, in generate
return self.sample(
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/transformers/generation/utils.py", line 2730, in sample
outputs = self(
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/TTS/tts/layers/xtts/gpt_inference.py", line 97, in forward
transformer_outputs = self.transformer(
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 900, in forward
outputs = block(
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 390, in forward
attn_outputs = self.attn(
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 331, in forward
attn_output, attn_weights = self._attn(query, key, value, attention_mask, head_mask)
File "/home/surya/miniconda3/envs/tts/lib/python3.10/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 201, in _attn
mask_value = torch.full([], mask_value, dtype=attn_weights.dtype).to(attn_weights.device)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
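For anyone hitting the same log: the assertion `srcIndex < srcSelectDimSize` fires when an index used in a lookup is not smaller than the size of the dimension being indexed, which with embedding lookups usually means an id that is too large for the table. The sketch below is only an illustration of that check plus the `CUDA_LAUNCH_BLOCKING=1` suggestion from the error message. The helper `find_out_of_range` and the values `table_size` and `ids` are hypothetical and are not taken from XTTS or PyTorch.

```python
import os

# Suggested by the error message: makes CUDA kernel launches synchronous so the
# Python traceback points at the op that actually failed. It has to be set
# before CUDA is initialized, i.e. before the first CUDA call in the process.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def find_out_of_range(ids, table_size):
    """Return (position, id) pairs that would violate `0 <= id < table_size`.

    Hypothetical helper: mirrors the condition behind the
    `srcIndex < srcSelectDimSize` assertion, but in plain Python.
    """
    return [(pos, i) for pos, i in enumerate(ids) if not 0 <= i < table_size]


# Made-up values for illustration only.
table_size = 10
ids = [0, 3, 9, 10, 12]

bad = find_out_of_range(ids, table_size)
print(bad)  # [(3, 10), (4, 12)]
```

Running a check like this on the ids just before the failing lookup (or rerunning on CPU, where the same problem typically surfaces as an ordinary `IndexError`) may help narrow down which input goes out of range.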
These problems are minimized in the V2 model, available here: https://huggingface.co./coqui/XTTS-v2
gorkemgoknar changed discussion status to closed