prithivMLmods/MBERT-Context-Specifier · no tokenizer file is present in the model

AD233

7 days ago

How can I recreate the tokenizer, or use the model without one?

prithivMLmods

Owner 7 days ago

edited 7 days ago

Load the tokenizer from the base model:

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")

For example:

# Notebook cell: install transformers from source
!pip install -q git+https://github.com/huggingface/transformers.git

from transformers import pipeline, AutoTokenizer

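# Load the tokenizer from the base model, since this repo has no tokenizer files
tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")

# Build the classifier, passing the tokenizer explicitly
classifier = pipeline(
    task="text-classification",
    model="prithivMLmods/MBERT-Context-Specifier",
    tokenizer=tokenizer,
    device=0,  # first CUDA GPU; use device=-1 to run on CPU
)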

# Sample text
sample = "The global market for sustainable technologies has seen rapid growth over the past decade as businesses increasingly prioritize environmental sustainability."

# Run classification
result = classifier(sample)
print(result)

tokenizer_config.json:   0%|          | 0.00/20.8k [00:00<?, ?B/s]
tokenizer.json:   0%|          | 0.00/2.13M [00:00<?, ?B/s]
special_tokens_map.json:   0%|          | 0.00/694 [00:00<?, ?B/s]
config.json:   0%|          | 0.00/2.85k [00:00<?, ?B/s]
model.safetensors:   0%|          | 0.00/599M [00:00<?, ?B/s]
Device set to use cuda:0

[{'label': 'business-and-industrial', 'score': nan}]
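
Note that the score in the output above is nan, so the predicted label should probably not be trusted as-is. The thread does not say what causes it. Below is a minimal sketch of a check that flags such results before they are used; `has_valid_scores` is a hypothetical helper written for illustration, not part of transformers, and the `result` value is copied from the output above.

```python
import math


def has_valid_scores(predictions):
    """Return True only if every prediction has a finite score.

    Hypothetical helper: `predictions` is assumed to be a list of dicts
    with "label" and "score" keys, like the pipeline output shown above.
    """
    return all(math.isfinite(p["score"]) for p in predictions)


# Result copied from the run above (the score came back as nan).
result = [{"label": "business-and-industrial", "score": float("nan")}]

if not has_valid_scores(result):
    print("Warning: the classifier returned a non-finite score; "
          "treat the label as unreliable.")
```

In a real script, `result` would come from `classifier(sample)` rather than being hard-coded.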

prithivMLmods pinned discussion 7 days ago

AD233

3 days ago

Bro, I am trying to make a text formatter that takes in unformatted text and outputs proper Markdown, but I am new to this field and unable to use your model. Can you give me an example script that uses your model to do the text formatting?