NameError: name 'CohereLayerNorm' is not defined
Thanks for adding this model!
I am currently getting this error:
Loading model...
==((====))== Unsloth 2024.12.4: Fast Cohere patching. Transformers:4.47.1.
\\ /| GPU: NVIDIA H100 PCIe. Max memory: 79.109 GB. Platform: Linux.
O^O/ \_/ \ Torch: 2.5.1+cu124. CUDA: 9.0. CUDA Toolkit: 12.4. Triton: 3.1.0
\ / Bfloat16 = TRUE. FA [Xformers = 0.0.28.post3. FA2 = True]
"-____-" Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
Traceback (most recent call last):
File "/home/ubuntu/LegalLlama/train.py", line 75, in <module>
model, tokenizer = FastLanguageModel.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/unsloth/models/loader.py", line 256, in from_pretrained
model, tokenizer = dispatch_model.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/unsloth/models/llama.py", line 1663, in from_pretrained
model = AutoModelForCausalLM.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
return model_class.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4130, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/transformers/models/cohere/modeling_cohere.py", line 1078, in __init__
self.model = CohereModel(config)
^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/transformers/models/cohere/modeling_cohere.py", line 809, in __init__
[CohereDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/transformers/models/cohere/modeling_cohere.py", line 604, in __init__
self.self_attn = COHERE_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<string>", line 35, in CohereAttention__init__
NameError: name 'CohereLayerNorm' is not defined
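For context on why the failing frame reads File "<string>", line 35 rather than a real file: Unsloth applies its fast patches by compiling rewritten model source with exec(), and a name the patched code refers to only gets looked up when the function actually runs. If that name was never injected into the exec namespace, you get exactly this kind of deferred NameError. A minimal sketch of the failure mode (MyLayerNorm and patched_init are hypothetical stand-ins, not Unsloth's actual patch code):

```python
# Patched source compiled via exec(), as Unsloth does for model classes.
# The free name MyLayerNorm is resolved against the exec namespace only
# at call time, not at compile time.
patched_source = """
def patched_init(self):
    self.norm = MyLayerNorm()  # looked up in the exec namespace when called
"""

namespace = {}                   # note: MyLayerNorm is NOT provided here
exec(patched_source, namespace)  # compiles fine; the lookup is deferred

class Dummy:
    pass

try:
    namespace["patched_init"](Dummy())
except NameError as err:
    print(err)  # -> name 'MyLayerNorm' is not defined
```

This matches the traceback above: construction of CohereAttention runs exec-compiled code that references CohereLayerNorm, which was apparently never added to the patch's namespace.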
Here is my environment on an Ubuntu 22 machine with an H100 GPU:
accelerate 1.1.1
aiohappyeyeballs 2.4.4
aiohttp 3.11.9
aiosignal 1.3.1
anaconda-anon-usage 0.4.4
archspec 0.2.3
attrs 24.2.0
bitsandbytes 0.44.1
boltons 23.0.0
Brotli 1.0.9
certifi 2024.8.30
cffi 1.17.1
charset-normalizer 3.3.2
click 8.1.7
conda 24.9.2
conda-content-trust 0.2.0
conda-libmamba-solver 24.9.0
conda-package-handling 2.3.0
conda_package_streaming 0.10.0
cryptography 43.0.0
cut-cross-entropy 24.11.4
datasets 3.1.0
dill 0.3.8
distro 1.9.0
docker-pycreds 0.4.0
docstring_parser 0.16
einops 0.8.0
filelock 3.16.1
flash-attn 2.7.0.post2
frozendict 2.4.2
frozenlist 1.5.0
fsspec 2024.9.0
gitdb 4.0.11
GitPython 3.1.43
hf_transfer 0.1.8
huggingface-hub 0.26.3
idna 3.7
inquirerpy 0.3.4
Jinja2 3.1.4
jsonpatch 1.33
jsonpointer 2.1
libmambapy 1.5.8
markdown-it-py 3.0.0
MarkupSafe 3.0.2
mdurl 0.1.2
menuinst 2.1.2
mpmath 1.3.0
multidict 6.1.0
multiprocess 0.70.16
networkx 3.4.2
numpy 2.1.3
nvidia-cublas-cu12 12.4.5.8
nvidia-cuda-cupti-cu12 12.4.127
nvidia-cuda-nvrtc-cu12 12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.2.1.3
nvidia-curand-cu12 10.3.5.147
nvidia-cusolver-cu12 11.6.1.9
nvidia-cusparse-cu12 12.3.1.170
nvidia-nccl-cu12 2.21.5
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.4.127
packaging 24.1
pandas 2.2.3
peft 0.13.2
pfzy 0.3.4
pillow 11.0.0
pip 24.2
platformdirs 3.10.0
pluggy 1.0.0
prompt_toolkit 3.0.48
propcache 0.2.1
protobuf 3.20.3
psutil 6.1.0
pyarrow 18.1.0
pycosat 0.6.6
pycparser 2.21
Pygments 2.18.0
PySocks 1.7.1
python-dateutil 2.9.0.post0
pytz 2024.2
PyYAML 6.0.2
regex 2024.11.6
requests 2.32.3
rich 13.9.4
ruamel.yaml 0.18.6
ruamel.yaml.clib 0.2.8
safetensors 0.4.5
sentencepiece 0.2.0
sentry-sdk 2.19.0
setproctitle 1.3.4
setuptools 75.1.0
shtab 1.7.1
six 1.16.0
smmap 5.0.1
sympy 1.13.1
tensorboardX 2.6.2.2
tokenizers 0.21.0
torch 2.5.1
tqdm 4.66.5
transformers 4.47.1
triton 3.1.0
trl 0.12.1
truststore 0.8.0
typeguard 4.4.1
typing_extensions 4.12.2
tyro 0.9.2
tzdata 2024.2
unsloth 2024.12.4
unsloth_zoo 2024.12.1
urllib3 2.2.3
wandb 0.18.7
wcwidth 0.2.13
wheel 0.44.0
xformers 0.0.28.post3
xxhash 3.5.0
yarl 1.18.3
zstandard 0.23.0
Oops, apologies: Cohere currently isn't supported, but we're adding support for it pretty soon. I'll let you know when it's ready, or you can join our newsletter if you want: https://unsloth.ai/
Great, thanks for letting me know!