NameError: name 'CohereLayerNorm' is not defined

#1 by joelniklaus - opened

Thanks for adding this model!

I am currently getting this error:

Loading model...
==((====))==  Unsloth 2024.12.4: Fast Cohere patching. Transformers:4.47.1.
   \\   /|    GPU: NVIDIA H100 PCIe. Max memory: 79.109 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu124. CUDA: 9.0. CUDA Toolkit: 12.4. Triton: 3.1.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.28.post3. FA2 = True]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
Traceback (most recent call last):
  File "/home/ubuntu/LegalLlama/train.py", line 75, in <module>
    model, tokenizer = FastLanguageModel.from_pretrained(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/unsloth/models/loader.py", line 256, in from_pretrained
    model, tokenizer = dispatch_model.from_pretrained(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/unsloth/models/llama.py", line 1663, in from_pretrained
    model = AutoModelForCausalLM.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4130, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/transformers/models/cohere/modeling_cohere.py", line 1078, in __init__
    self.model = CohereModel(config)
                 ^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/transformers/models/cohere/modeling_cohere.py", line 809, in __init__
    [CohereDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/lib/python3.12/site-packages/transformers/models/cohere/modeling_cohere.py", line 604, in __init__
    self.self_attn = COHERE_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 35, in CohereAttention__init__
NameError: name 'CohereLayerNorm' is not defined
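
For context, here is a minimal sketch of the kind of loading call that hits this error. The model name and arguments are assumptions for illustration, not the exact contents of train.py line 75:

# Minimal sketch of the failing call; the checkpoint name and the
# specific arguments below are assumptions, not the actual train.py.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="CohereForAI/c4ai-command-r7b-12-2024",  # hypothetical Cohere checkpoint
    max_seq_length=4096,
    dtype=None,            # auto-detect; resolves to bfloat16 on an H100
    load_in_4bit=True,
)

The NameError is raised inside Unsloth's patched CohereAttention.__init__ before the weights are even downloaded, so the failure is independent of the arguments passed here.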

Here is my environment on an Ubuntu 22 machine with an H100 GPU:
accelerate 1.1.1
aiohappyeyeballs 2.4.4
aiohttp 3.11.9
aiosignal 1.3.1
anaconda-anon-usage 0.4.4
archspec 0.2.3
attrs 24.2.0
bitsandbytes 0.44.1
boltons 23.0.0
Brotli 1.0.9
certifi 2024.8.30
cffi 1.17.1
charset-normalizer 3.3.2
click 8.1.7
conda 24.9.2
conda-content-trust 0.2.0
conda-libmamba-solver 24.9.0
conda-package-handling 2.3.0
conda_package_streaming 0.10.0
cryptography 43.0.0
cut-cross-entropy 24.11.4
datasets 3.1.0
dill 0.3.8
distro 1.9.0
docker-pycreds 0.4.0
docstring_parser 0.16
einops 0.8.0
filelock 3.16.1
flash-attn 2.7.0.post2
frozendict 2.4.2
frozenlist 1.5.0
fsspec 2024.9.0
gitdb 4.0.11
GitPython 3.1.43
hf_transfer 0.1.8
huggingface-hub 0.26.3
idna 3.7
inquirerpy 0.3.4
Jinja2 3.1.4
jsonpatch 1.33
jsonpointer 2.1
libmambapy 1.5.8
markdown-it-py 3.0.0
MarkupSafe 3.0.2
mdurl 0.1.2
menuinst 2.1.2
mpmath 1.3.0
multidict 6.1.0
multiprocess 0.70.16
networkx 3.4.2
numpy 2.1.3
nvidia-cublas-cu12 12.4.5.8
nvidia-cuda-cupti-cu12 12.4.127
nvidia-cuda-nvrtc-cu12 12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.2.1.3
nvidia-curand-cu12 10.3.5.147
nvidia-cusolver-cu12 11.6.1.9
nvidia-cusparse-cu12 12.3.1.170
nvidia-nccl-cu12 2.21.5
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.4.127
packaging 24.1
pandas 2.2.3
peft 0.13.2
pfzy 0.3.4
pillow 11.0.0
pip 24.2
platformdirs 3.10.0
pluggy 1.0.0
prompt_toolkit 3.0.48
propcache 0.2.1
protobuf 3.20.3
psutil 6.1.0
pyarrow 18.1.0
pycosat 0.6.6
pycparser 2.21
Pygments 2.18.0
PySocks 1.7.1
python-dateutil 2.9.0.post0
pytz 2024.2
PyYAML 6.0.2
regex 2024.11.6
requests 2.32.3
rich 13.9.4
ruamel.yaml 0.18.6
ruamel.yaml.clib 0.2.8
safetensors 0.4.5
sentencepiece 0.2.0
sentry-sdk 2.19.0
setproctitle 1.3.4
setuptools 75.1.0
shtab 1.7.1
six 1.16.0
smmap 5.0.1
sympy 1.13.1
tensorboardX 2.6.2.2
tokenizers 0.21.0
torch 2.5.1
tqdm 4.66.5
transformers 4.47.1
triton 3.1.0
trl 0.12.1
truststore 0.8.0
typeguard 4.4.1
typing_extensions 4.12.2
tyro 0.9.2
tzdata 2024.2
unsloth 2024.12.4
unsloth_zoo 2024.12.1
urllib3 2.2.3
wandb 0.18.7
wcwidth 0.2.13
wheel 0.44.0
xformers 0.0.28.post3
xxhash 3.5.0
yarl 1.18.3
zstandard 0.23.0

Unsloth AI org

Oops, apologies! Cohere currently isn't supported, but we're adding support for it pretty soon. I'll let you know when it is, or you can join our newsletter if you want: https://unsloth.ai/

Great, thanks for letting me know!