Running into "Flash attention implementation does not support kwargs: prompt_length" when using the exact example from the Readme

#49
by HotSauce7 - opened

Hi folks,
thanks for the amazing work. Unfortunately, when I use the exact example from your README (model card), which is:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jinaai/jina-embeddings-v3", trust_remote_code=True)

task = "retrieval.query"
embeddings = model.encode(
    ["What is the weather like in Berlin today?"],
    task=task,
    prompt_name=task,
)

with both sentence-transformers==3.2.0 and sentence-transformers==3.1.0 I get the warning "Flash attention implementation does not support kwargs: prompt_length". If I remove prompt_name=task, the warning disappears, but the resulting embeddings are totally different.
I have flash-attn==2.6.2 installed.
Any ideas what I am missing?
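
To show what I mean by different embeddings, here is a minimal comparison (model.prompts is the standard SentenceTransformer attribute holding the configured instruction prefixes; the exact prefix strings come from the model's config):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jinaai/jina-embeddings-v3", trust_remote_code=True)
text = ["What is the weather like in Berlin today?"]

# With prompt_name, sentence-transformers prepends the configured
# "retrieval.query" instruction prefix to the input before encoding
with_prompt = model.encode(text, task="retrieval.query", prompt_name="retrieval.query")

# Without it, the raw text is encoded, so the vectors come out different
without_prompt = model.encode(text, task="retrieval.query")

print(model.prompts)  # the prompt names and prefix strings the model ships with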

Have you found a solution for this?

I am experiencing the same issue. Any advice would be greatly appreciated.

# https://huggingface.co./jinaai/xlm-roberta-flash-implementation/blob/main/modeling_xlm_roberta.py
# line 671: adapter_mask is the only kwarg the flash path accepts;
# any other non-None kwarg (here, prompt_length) triggers this warning
        adapter_mask = kwargs.pop("adapter_mask", None)
        if kwargs:
            for key, value in kwargs.items():
                if value is not None:
                    logger.warning(
                        "Flash attention implementation does not support kwargs: %s",
                        key,
                    )

The file can be found at ~/.cache/huggingface/modules/transformers_modules/jinaai/xlm-roberta-flash-implementation/12700ba4972d9e900313a85ae855f5a76fb9500e

Maybe we could lower the logger level to debug to suppress the warning.
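
Something like this might work as a user-side stopgap in the meantime (the logger name is an assumption based on how trust_remote_code modules are usually namespaced under transformers_modules; adjust it if the module registers its logger differently):

import logging

# Assumed logger namespace for remotely loaded modules; raising the
# threshold filters out the WARNING-level message shown above.
logging.getLogger("transformers_modules").setLevel(logging.ERROR)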

Jina AI org

Hi, thanks for reporting the issue. The warning doesn't actually affect the model outputs, but I've made a modification to stop passing prompt_length to the model, so you shouldn't see the warning anymore.

As for the argument itself: some models use prompt_length to exclude the prompt tokens during pooling, but jina-embeddings-v3 doesn't do this, so it's not relevant for us.
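
For illustration, a model that does use it would typically mask the prompt tokens out of mean pooling, roughly like this (a generic sketch, not jina-embeddings-v3 code):

import torch

def mean_pool(token_embeddings, attention_mask, prompt_length=None):
    # token_embeddings: (batch, seq, dim); attention_mask: (batch, seq)
    mask = attention_mask.clone()
    if prompt_length is not None:
        # exclude the prepended instruction tokens from the average
        mask[:, :prompt_length] = 0
    mask = mask.unsqueeze(-1).to(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts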

@jupyterjazz Thank you for the quick response. It was more a matter of confusion about whether calling encode with prompt_name=task was correct. The clarification and the fix make perfect sense. Closing the issue!

HotSauce7 changed discussion status to closed