ONNX Models #82
opened by ha1772007
Can you please provide a static quantized version of this model?
https://huggingface.co./ha1772007/all-MiniLM-L6-v2-ONNX/tree/main
I tried this myself, but only the fp16, qint8, and quint8 variants are working.
Alternatively, could you please share the ONNX conversion script, if possible?
Done! See https://huggingface.co./sentence-transformers/all-MiniLM-L6-v2/tree/main/onnx
Also, this describes the usage: https://sbert.net/docs/sentence_transformer/usage/efficiency.html
@tomaarsen can you please share a short, quick way to create the OpenVINO and quantized exports in an existing repo?
- given a model: https://huggingface.co./intfloat/multilingual-e5-small
- convert the model to OpenVINO -> quantize
- push it to a different HF repo
I'm running into an issue. These are my steps:
- download the model locally via `fast_model = SentenceTransformer('intfloat/multilingual-e5-small', backend="openvino")`
- since no OpenVINO .xml file is found there, the model is exported to OpenVINO automatically
- then quantizing and uploading the model to my repo fails:
```python
export_static_quantized_openvino_model(
    fast_model,
    quantization_config=None,
    model_name_or_path="my-repo/multilingual-e5-small-openvino",
    push_to_hub=True,
    create_pr=True,
)
```
I have an Intel CPU with enough memory.

Issue:
```
[CPU] Add node with name '__module.embeddings/aten::add/Add' Exception from src\plugins\intel_cpu\src\shape_inference\custom\eltwise.cpp:45:
Eltwise shape infer input shapes dim index: 1 mismatch
```