metadata
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-large
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: ModernBERT-large-zeroshot-v2.0-2024-12-28-00-13
results: []
ModernBERT-large-zeroshot-v2.0-2024-12-28-00-13
This model is a fine-tuned version of answerdotai/ModernBERT-large; the card does not record the training dataset. It achieves the following results on the evaluation set:
- Loss: 0.1803
- F1 Macro: 0.6624
- F1 Micro: 0.7304
- Accuracy Balanced: 0.6979
- Accuracy: 0.7304
- Precision Macro: 0.6899
- Recall Macro: 0.6979
- Precision Micro: 0.7304
- Recall Micro: 0.7304
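In the list above, Accuracy, F1 Micro, Precision Micro and Recall Micro are all 0.7304, and Accuracy Balanced equals Recall Macro at 0.6979. That pattern is expected when every example has exactly one true label and one predicted label. The sketch below illustrates why, using plain Python and made-up toy labels; it is not the evaluation code used for this model, and the function name `classification_summary` is hypothetical.

```python
from collections import Counter


def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_summary(y_true, y_pred):
    """Compute accuracy plus micro/macro precision, recall and F1 for single-label data."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    total_tp, total_fp, total_fn = sum(tp.values()), sum(fp.values()), sum(fn.values())
    precision_micro = safe_div(total_tp, total_tp + total_fp)
    recall_micro = safe_div(total_tp, total_tp + total_fn)

    # Balanced accuracy is the mean recall over classes that occur in y_true.
    true_classes = sorted(set(y_true))
    balanced = sum(safe_div(tp[c], tp[c] + fn[c]) for c in true_classes) / len(true_classes)

    return {
        "accuracy": total_tp / len(y_true),
        "accuracy_balanced": balanced,
        "precision_micro": precision_micro,
        "recall_micro": recall_micro,
        "f1_micro": safe_div(2 * precision_micro * recall_micro, precision_micro + recall_micro),
        "precision_macro": sum(precisions) / len(labels),
        "recall_macro": sum(recalls) / len(labels),
        "f1_macro": sum(f1s) / len(labels),
    }


# Toy, imbalanced data (illustrative only, not taken from the evaluation set).
y_true = ["a"] * 6 + ["b"] * 2 + ["c"] * 2
y_pred = ["a", "a", "a", "a", "a", "b", "b", "c", "c", "c"]
summary = classification_summary(y_true, y_pred)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Each misclassified example adds one false positive and one false negative, so the micro-averaged precision and recall share the same denominator and both reduce to accuracy. The macro scores weight every class equally, which is why they can sit well below the micro scores on imbalanced data.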
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 9e-06
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 2
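As a rough illustration of how these settings interact, the sketch below derives the effective batch size and the warmup length, then evaluates a linear warmup and decay schedule in plain Python. It rests on several assumptions: a single training device, a total of 67828 optimizer steps (the final step reported in the training results table), and warmup steps rounded up from the ratio. The helper names are hypothetical, and the code is not the actual training code.

```python
import math

LEARNING_RATE = 9e-06
WARMUP_RATIO = 0.06
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2
NUM_DEVICES = 1       # assumption: single device, consistent with 16 * 2 = 32
TOTAL_STEPS = 67828   # assumption: last step listed in the training results table


def effective_batch_size(per_device, accumulation, devices):
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation * devices


def warmup_steps(total_steps, ratio):
    """Warmup length in steps, rounded up (an assumption about the rounding)."""
    return math.ceil(total_steps * ratio)


def linear_lr(step, base_lr, warmup, total_steps):
    """Linear ramp from 0 to base_lr during warmup, then linear decay to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup)


batch = effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES)
warmup = warmup_steps(TOTAL_STEPS, WARMUP_RATIO)
print(f"effective batch size: {batch}")
print(f"warmup steps: {warmup}")
for step in (0, warmup // 2, warmup, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(f"step {step:>6}: lr = {linear_lr(step, LEARNING_RATE, warmup, TOTAL_STEPS):.3e}")
```

Under these assumptions the learning rate peaks at 9e-06 once warmup ends and reaches zero at the final step.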
Training results
Training Loss | Epoch | Step | Validation Loss | F1 Macro | F1 Micro | Accuracy Balanced | Accuracy | Precision Macro | Recall Macro | Precision Micro | Recall Micro |
---|---|---|---|---|---|---|---|---|---|---|---|
0.3865 | 1.0 | 33915 | 0.3321 | 0.8584 | 0.8704 | 0.8600 | 0.8704 | 0.8569 | 0.8600 | 0.8704 | 0.8704 |
0.2456 | 2.0000 | 67828 | 0.4069 | 0.8600 | 0.8728 | 0.8590 | 0.8728 | 0.8610 | 0.8590 | 0.8728 | 0.8728 |
Breakdown by dataset
Datasets | Mean | Mean w/o NLI | mnli_m | mnli_mm | fevernli | anli_r1 | anli_r2 | anli_r3 | wanli | lingnli | wellformedquery | rottentomatoes | amazonpolarity | imdb | yelpreviews | hatexplain | massive | banking77 | emotiondair | emocontext | empathetic | agnews | yahootopics | biasframes_sex | biasframes_offensive | biasframes_intent | financialphrasebank | appreviews | hateoffensive | trueteacher | spam | wikitoxic_toxicaggregated | wikitoxic_obscene | wikitoxic_identityhate | wikitoxic_threat | wikitoxic_insult | manifesto | capsotu |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Accuracy | 0.85 | 0.851 | 0.942 | 0.944 | 0.894 | 0.812 | 0.717 | 0.716 | 0.836 | 0.909 | 0.815 | 0.899 | 0.964 | 0.951 | 0.984 | 0.814 | 0.8 | 0.744 | 0.752 | 0.802 | 0.544 | 0.899 | 0.735 | 0.934 | 0.864 | 0.877 | 0.913 | 0.953 | 0.921 | 0.821 | 0.989 | 0.901 | 0.927 | 0.931 | 0.959 | 0.911 | 0.497 | 0.73 |
F1 macro | 0.834 | 0.835 | 0.935 | 0.938 | 0.882 | 0.795 | 0.688 | 0.676 | 0.823 | 0.898 | 0.814 | 0.899 | 0.964 | 0.951 | 0.984 | 0.77 | 0.753 | 0.763 | 0.69 | 0.805 | 0.533 | 0.899 | 0.729 | 0.925 | 0.864 | 0.877 | 0.901 | 0.953 | 0.855 | 0.821 | 0.983 | 0.901 | 0.927 | 0.931 | 0.952 | 0.911 | 0.362 | 0.662 |
Inference text/sec (GPU, batch=32) | 1116.0 | 1104.0 | 1039.0 | 1241.0 | 1138.0 | 1102.0 | 1124.0 | 1133.0 | 1251.0 | 1240.0 | 1263.0 | 1231.0 | 1054.0 | 559.0 | 795.0 | 1238.0 | 1312.0 | 1285.0 | 1273.0 | 1268.0 | 992.0 | 1222.0 | 894.0 | 1176.0 | 1194.0 | 1197.0 | 1206.0 | 1166.0 | 1227.0 | 541.0 | 1199.0 | 1045.0 | 1054.0 | 1020.0 | 1005.0 | 1063.0 | 1214.0 | 1220.0 |
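The Mean and Mean w/o NLI columns can be checked against the per-dataset values. The sketch below recomputes both for the Accuracy row as unweighted averages, assuming that the NLI group consists of the first eight datasets (mnli_m through lingnli); that grouping is an assumption, not something the table states. The function name `summarize_means` is hypothetical.

```python
from statistics import mean

# Accuracy per dataset, copied from the Accuracy row of the table above.
ACCURACY = {
    "mnli_m": 0.942, "mnli_mm": 0.944, "fevernli": 0.894, "anli_r1": 0.812,
    "anli_r2": 0.717, "anli_r3": 0.716, "wanli": 0.836, "lingnli": 0.909,
    "wellformedquery": 0.815, "rottentomatoes": 0.899, "amazonpolarity": 0.964,
    "imdb": 0.951, "yelpreviews": 0.984, "hatexplain": 0.814, "massive": 0.8,
    "banking77": 0.744, "emotiondair": 0.752, "emocontext": 0.802,
    "empathetic": 0.544, "agnews": 0.899, "yahootopics": 0.735,
    "biasframes_sex": 0.934, "biasframes_offensive": 0.864,
    "biasframes_intent": 0.877, "financialphrasebank": 0.913,
    "appreviews": 0.953, "hateoffensive": 0.921, "trueteacher": 0.821,
    "spam": 0.989, "wikitoxic_toxicaggregated": 0.901, "wikitoxic_obscene": 0.927,
    "wikitoxic_identityhate": 0.931, "wikitoxic_threat": 0.959,
    "wikitoxic_insult": 0.911, "manifesto": 0.497, "capsotu": 0.73,
}

# Assumption: these eight datasets form the NLI group excluded from "Mean w/o NLI".
NLI_DATASETS = {
    "mnli_m", "mnli_mm", "fevernli", "anli_r1",
    "anli_r2", "anli_r3", "wanli", "lingnli",
}


def summarize_means(scores, nli_names):
    """Return (mean over all datasets, mean over non-NLI datasets), rounded to 3 places."""
    overall = mean(scores.values())
    without_nli = mean(v for name, v in scores.items() if name not in nli_names)
    return round(overall, 3), round(without_nli, 3)


overall_mean, mean_without_nli = summarize_means(ACCURACY, NLI_DATASETS)
print(f"Mean: {overall_mean}")
print(f"Mean w/o NLI: {mean_without_nli}")
```

With this grouping the recomputed values agree with the table's 0.85 and 0.851. Because the average is unweighted, low-scoring datasets such as manifesto and empathetic pull the mean down as much as any large dataset would.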
Framework versions
- Transformers 4.48.0.dev0
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0