Conversion to ONNX

#71
by mph - opened

I've fine-tuned ModernBERT-base, but I can't seem to convert it to ONNX. It's clearly possible, since the creators of this model have published ONNX weights themselves. I have the latest versions of transformers, torch, optimum, triton, etc. Here's my export script:

from pathlib import Path

from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoConfig

model_id = '/opt/markdown/model/ModernBERT-base/checkpoint-2100'

# Try to steer the model away from fused attention kernels and disable
# outputs the exporter doesn't need.
config = AutoConfig.from_pretrained(model_id)
config.attn_implementation = 'eager'
config.use_cache = False
config.output_attentions = False
config.output_hidden_states = False

# export=True converts the PyTorch checkpoint to ONNX while loading it.
model = ORTModelForSequenceClassification.from_pretrained(
    model_id,
    config=config,
    export=True,
    use_io_binding=False,
)

# Save the exported model next to the checkpoint.
output_dir = Path(model_id).parent / 'onnx'
output_dir.mkdir(exist_ok=True)
model.save_pretrained(output_dir)

Here's the error:

triton.compiler.errors.CompilationError: at 32:22:
    # Meta-parameters
    BLOCK_K: tl.constexpr,
    IS_SEQLEN_OFFSETS_TENSOR: tl.constexpr,
    IS_VARLEN: tl.constexpr,
    INTERLEAVED: tl.constexpr,
    CONJUGATE: tl.constexpr,
    BLOCK_M: tl.constexpr,
):
    pid_m = tl.program_id(axis=0)
    pid_batch = tl.program_id(axis=1)
    pid_head = tl.program_id(axis=2)
    rotary_dim_half = rotary_dim // 2
                      ^
IncompatibleTypeErrorImpl('invalid operands of type pointer<int64> and triton.language.int32')

Maybe there's a different approach I should take?

Answer.AI org

cc @Xenova, who might have an answer here. It should be possible with Optimum; support was added a while ago: https://github.com/huggingface/optimum/pull/2131
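If it helps, here's a minimal sketch of that export path (the output directory is illustrative, and this assumes an optimum release that includes the PR):

from optimum.exporters.onnx import main_export

# Export the fine-tuned checkpoint with its sequence-classification head.
main_export(
    '/opt/markdown/model/ModernBERT-base/checkpoint-2100',
    output='/opt/markdown/model/ModernBERT-base/onnx',
    task='text-classification',
)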

@tomaarsen @Xenova

I was able to export with this approach. However, I'm now stumped on the optimization step I normally use with newly exported ONNX models:

#!/bin/bash

# Graph-optimize the exported ONNX model at level O2.
optimum-cli onnxruntime optimize \
    --onnx_model /opt/markdown/model/ModernBERT-base/onnx \
    -o /opt/markdown/model/ModernBERT-base/onnx_optim \
    -O2

Error:

Traceback (most recent call last):
  File "/home/matt/miniconda3/envs/train/lib/python3.12/site-packages/optimum/onnxruntime/optimization.py", line 68, in __init__
    self.normalized_config = NormalizedConfigManager.get_normalized_config_class(self.model_type)(self.config)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/matt/miniconda3/envs/train/lib/python3.12/site-packages/optimum/utils/normalized_config.py", line 308, in get_normalized_config_class
    cls.check_supported_model(model_type)
  File "/home/matt/miniconda3/envs/train/lib/python3.12/site-packages/optimum/utils/normalized_config.py", line 300, in check_supported_model
    raise KeyError(
KeyError: 'modernbert model type is not supported yet in NormalizedConfig. Only albert, bart, bert, big-bird, bigbird-pegasus, blenderbot, blenderbot-small, bloom, falcon, camembert, codegen, cvt, deberta, deberta-v2, deit, distilbert, donut-swin, electra, encoder-decoder, gemma, gpt2, gpt-bigcode, gpt-neo, gpt-neox, gptj, imagegpt, llama, longt5, marian, markuplm, mbart, mistral, mixtral, mpnet, mpt, mt5, m2m-100, nystromformer, opt, pegasus, pix2struct, phi, phi3, phi3small, poolformer, regnet, resnet, roberta, segformer, speech-to-text, splinter, t5, trocr, vision-encoder-decoder, vit, whisper, xlm-roberta, yolos, qwen2, granite are supported. If you want to support modernbert please propose a PR or open up an issue.'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/matt/miniconda3/envs/train/bin/optimum-cli", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/matt/miniconda3/envs/train/lib/python3.12/site-packages/optimum/commands/optimum_cli.py", line 208, in main
    service.run()
  File "/home/matt/miniconda3/envs/train/lib/python3.12/site-packages/optimum/onnxruntime/subpackage/commands/optimize.py", line 87, in run
    optimizer = ORTOptimizer.from_pretrained(self.args.onnx_model, file_names)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/matt/miniconda3/envs/train/lib/python3.12/site-packages/optimum/onnxruntime/optimization.py", line 120, in from_pretrained
    return cls(onnx_model_path, config=config, from_ortmodel=from_ortmodel)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/matt/miniconda3/envs/train/lib/python3.12/site-packages/optimum/onnxruntime/optimization.py", line 70, in __init__
    raise NotImplementedError(
NotImplementedError: Tried to use ORTOptimizer for the model type modernbert, but it is not available yet. Please open an issue or submit a PR at https://github.com/huggingface/optimum.

I tried pip install -U git+https://github.com/huggingface/optimum@main but got the same error.

Indeed, I converted it using the Optimum support that @tomaarsen links to above.

I've seen others hit a similar issue, but I've never had a problem exporting when running in a Colab notebook. Could you try again from there, to see whether it's an issue with your environment?
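To compare environments, here's a quick version dump (just a standard-library sketch):

from importlib.metadata import version

# Print the packages involved in this export path; compare against Colab.
for pkg in ('transformers', 'optimum', 'onnxruntime', 'torch', 'triton'):
    print(pkg, version(pkg))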

For optimization, I used my quantization script here: https://github.com/huggingface/transformers.js/blob/main/scripts/quantize.py
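For context, this is roughly the shape of dynamic quantization with onnxruntime's public API; it isn't the linked script itself, and the file paths are illustrative:

from onnxruntime.quantization import QuantType, quantize_dynamic

# Quantize the exported FP32 weights down to int8.
quantize_dynamic(
    model_input='/opt/markdown/model/ModernBERT-base/onnx/model.onnx',
    model_output='/opt/markdown/model/ModernBERT-base/onnx/model_quantized.onnx',
    weight_type=QuantType.QInt8,
)

Unlike the -O2 graph optimizer, dynamic quantization doesn't consult per-architecture tables, which is why it works even for model types the optimizer doesn't recognize.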

The error suggests modernbert simply isn't listed as one of the supported model types, so this could likely be fixed with a simple PR, if you'd like to open one! Linked issue: https://github.com/huggingface/optimum/issues/2177
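Until that lands, one untested workaround sketch is to register the model type with optimum yourself and run the optimizer from Python. This only clears the NormalizedConfig lookup from the traceback above; it may still trip a later fusion-support check inside ORTOptimizer:

from optimum.onnxruntime import ORTOptimizer
from optimum.onnxruntime.configuration import OptimizationConfig
from optimum.utils.normalized_config import NormalizedConfigManager, NormalizedTextConfig

# Assumption: ModernBERT's config uses the standard hidden_size /
# num_hidden_layers / num_attention_heads attribute names, so the generic
# text config maps onto it.
NormalizedConfigManager._conf['modernbert'] = NormalizedTextConfig

optimizer = ORTOptimizer.from_pretrained('/opt/markdown/model/ModernBERT-base/onnx')
optimizer.optimize(
    optimization_config=OptimizationConfig(optimization_level=2),
    save_dir='/opt/markdown/model/ModernBERT-base/onnx_optim',
)

A proper PR would make the same kind of addition in optimum's own tables, likely mapping modernbert onto an existing fusion pattern such as bert's.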

mph changed discussion status to closed
