RoPE scaling and max_position_embeddings

#12 · opened by ag0

Hello,

In config.json, a linear rope_scaling with a factor of 8 is defined, and max_position_embeddings has been increased to 32768.

However, the Hugging Face Llama 2 documentation specifies that max_position_embeddings should not be updated when a RoPE scaling strategy is used:
https://huggingface.co./docs/transformers/main/model_doc/llama2#transformers.LlamaConfig.rope_scaling

Wouldn't the existing config result in RoPE scaling being applied twice (especially when loading with trust_remote_code=False)?

This should be fixed.
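
For reference, here's a quick way to check both fields as shipped (the repo id below is just a placeholder for this model):

```python
# Minimal check of the two fields in question; "<this-model-repo>" is a
# placeholder for this model's actual repo id.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("<this-model-repo>")
print(config.max_position_embeddings)  # 32768 in the current config.json
print(config.rope_scaling)             # {"type": "linear", "factor": 8.0}
```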

Together org

Hi @ag0, thanks for bringing this up! I think this only affects NTK scaling, not the linear scaling adopted here: https://github.com/huggingface/transformers/blob/fdd81aea12f06e24ab5cf5ba3c7316df3ab1a779/src/transformers/models/llama/modeling_llama.py#L135-L144
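
To make the difference concrete, here is a rough sketch (paraphrasing the linked code, with Llama-2-style values assumed for the head dim, rope base, and original context length): linear scaling only divides the position indices by the factor, so max_position_embeddings just sizes the cached cos/sin table, while the dynamic NTK variant folds max_position_embeddings into its rescaled base.

```python
import torch

dim, base, factor = 128, 10000.0, 8.0   # assumed Llama-2-style head dim and rope base
max_position_embeddings = 4096          # original Llama-2 context length
seq_len = 32768                         # extended context we want to cover

inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))

# Linear scaling (what this repo uses): positions are divided by the factor.
# max_position_embeddings never enters the math; it only determines how large
# a cos/sin cache gets pre-built.
t_linear = torch.arange(seq_len).float() / factor
angles_linear = torch.outer(t_linear, inv_freq)

# Dynamic NTK scaling (for contrast): the rotary base is rescaled using
# max_position_embeddings, so bumping that field *would* change the result.
ntk_base = base * (
    (factor * seq_len / max_position_embeddings) - (factor - 1)
) ** (dim / (dim - 2))
inv_freq_ntk = 1.0 / (ntk_base ** (torch.arange(0, dim, 2).float() / dim))
angles_ntk = torch.outer(torch.arange(seq_len).float(), inv_freq_ntk)
```

So with the linear strategy, the enlarged max_position_embeddings shouldn't cause a second round of scaling; it only affects the cache size.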

Let us know what you think! :)
