no tokenizer file is present in the model
#2, pinned, opened by AD233
How can I recreate or use the model without a tokenizer?
Load the tokenizer from the base model:
tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
For example:
!pip install -q git+https://github.com/huggingface/transformers.git
from transformers import pipeline, AutoTokenizer
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
# Load model
classifier = pipeline(
    task="text-classification",
    model="prithivMLmods/MBERT-Context-Specifier",
    tokenizer=tokenizer,
    device=0,  # first CUDA GPU
)
# Sample text
sample = "The global market for sustainable technologies has seen rapid growth over the past decade as businesses increasingly prioritize environmental sustainability."
# Run classification
result = classifier(sample)
print(result)
tokenizer_config.json: 0%| | 0.00/20.8k [00:00<?, ?B/s]
tokenizer.json: 0%| | 0.00/2.13M [00:00<?, ?B/s]
special_tokens_map.json: 0%| | 0.00/694 [00:00<?, ?B/s]
config.json: 0%| | 0.00/2.85k [00:00<?, ?B/s]
model.safetensors: 0%| | 0.00/599M [00:00<?, ?B/s]
Device set to use cuda:0
[{'label': 'business-and-industrial', 'score': nan}]
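Note that the score in the output above is nan, so the predicted label should not be relied on as-is. A minimal sketch for catching this before using a result, assuming the pipeline returns a list of dicts with "label" and "score" keys as shown above (the helper name find_invalid_scores is hypothetical, not part of transformers):

```python
import math


def find_invalid_scores(results):
    """Return the entries whose 'score' is missing, non-numeric, NaN, or infinite."""
    invalid = []
    for entry in results:
        try:
            ok = math.isfinite(float(entry.get("score")))
        except (TypeError, ValueError):
            # Missing or non-numeric score
            ok = False
        if not ok:
            invalid.append(entry)
    return invalid


# Same shape as the pipeline output shown above
output = [{"label": "business-and-industrial", "score": float("nan")}]
bad = find_invalid_scores(output)
if bad:
    print(f"{len(bad)} prediction(s) have a non-finite score; treat the labels as unreliable.")
```

This only detects the problem; it does not explain why the score is nan.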
prithivMLmods pinned this discussion
Bro, I am trying to make a text formatter that takes in unformatted text and outputs properly formatted Markdown, but I am new to this field and unable to use your model. Can you give me an example script that uses your model to do the text formatting?