SetFit with sentence-transformers/all-mpnet-base-v2

This is a SetFit model that can be used for Text Classification. This SetFit model uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
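The contrastive step above trains on sentence pairs labeled similar (same class) or dissimilar (different classes). As a rough illustration of how such pairs can be drawn from a labeled dataset, here is a minimal, self-contained sketch; `generate_pairs` is a hypothetical helper, and SetFit's actual sampling (including its oversampling strategy) differs in detail:

```python
import random

def generate_pairs(texts, labels, num_iterations=20, seed=42):
    """Simplified contrastive pair sampling: for each iteration, draw one
    positive pair (same label) and one negative pair (different label)
    per sentence. Illustrative only, not SetFit's exact algorithm."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)
    pairs = []
    for _ in range(num_iterations):
        for text, label in zip(texts, labels):
            pos = rng.choice(by_label[label])                 # same class
            other = rng.choice([k for k in by_label if k != label])
            neg = rng.choice(by_label[other])                 # different class
            pairs.append((text, pos, 1.0))   # similar pair
            pairs.append((text, neg, 0.0))   # dissimilar pair
    return pairs
```

The embedding model is then fine-tuned so that similar pairs score high and dissimilar pairs score low, after which the classification head is fit on the resulting embeddings.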

Model Details

Model Description

Model Sources

Model Labels

Label Examples

The model predicts two labels, 0 and 1. Example texts from the training data:

  • 'Police officer wounded suspect dead after exchanging shots: RICHMOND Va. (AP) - A Richmond police officer wa... http://t.co/Y0qQS2L7bS'
  • "There's a weird siren going off here...I hope Hunterston isn't in the process of blowing itself to smithereens..."
  • 'Iranian warship points weapon at American helicopter... http://t.co/cgFZk8Ha1R'

Evaluation

Metrics

Label  Accuracy
all    0.9203
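Accuracy here is simply the fraction of evaluation examples whose predicted label matches the gold label; a minimal, dependency-free sketch (the helper name is illustrative):

```python
def label_accuracy(y_true, y_pred):
    # Fraction of predictions that match the gold labels.
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)
```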

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("pEpOo/catastrophy8")
# Run inference
preds = model("You must be annihilated!")

Training Details

Training Set Metrics

Training set  Min  Median   Max
Word count    1    14.5506  54

Label  Training Sample Count
0      438
1      323
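The word-count statistics above are straightforward to reproduce from the raw texts; a small sketch using whitespace tokenization (an assumption, since the card does not state how words were counted):

```python
from statistics import median

def word_count_stats(texts):
    # Min / median / max whitespace-separated word counts per text.
    counts = [len(t.split()) for t in texts]
    return min(counts), median(counts), max(counts)
```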

Training Hyperparameters

  • batch_size: (20, 20)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
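The `loss: CosineSimilarityLoss` setting above trains the embedding model by regressing the cosine similarity of each sentence pair toward its pair label (1.0 for similar, 0.0 for dissimilar). A minimal sketch of that objective for a single pair, written without any deep-learning dependencies:

```python
import math

def cosine_similarity(u, v):
    # Cosine of the angle between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_similarity_loss(u, v, label):
    # Squared error between the pair's cosine similarity and its label,
    # mirroring the per-pair objective of CosineSimilarityLoss.
    return (label - cosine_similarity(u, v)) ** 2
```

Minimizing this pushes embeddings of same-class sentences together and different-class sentences apart, which is what makes the subsequent LogisticRegression head effective on few-shot data.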

Training Results

Epoch Step Training Loss Validation Loss
0.0001 1 0.3847 -
0.0044 50 0.3738 -
0.0088 100 0.2274 -
0.0131 150 0.2747 -
0.0175 200 0.2251 -
0.0219 250 0.2562 -
0.0263 300 0.2623 -
0.0307 350 0.1904 -
0.0350 400 0.2314 -
0.0394 450 0.1669 -
0.0438 500 0.1135 -
0.0482 550 0.1489 -
0.0525 600 0.1907 -
0.0569 650 0.1728 -
0.0613 700 0.125 -
0.0657 750 0.109 -
0.0701 800 0.0968 -
0.0744 850 0.2101 -
0.0788 900 0.1974 -
0.0832 950 0.1986 -
0.0876 1000 0.0747 -
0.0920 1050 0.1117 -
0.0963 1100 0.1092 -
0.1007 1150 0.1582 -
0.1051 1200 0.1243 -
0.1095 1250 0.2873 -
0.1139 1300 0.2415 -
0.1182 1350 0.1264 -
0.1226 1400 0.127 -
0.1270 1450 0.1308 -
0.1314 1500 0.0669 -
0.1358 1550 0.1218 -
0.1401 1600 0.114 -
0.1445 1650 0.0612 -
0.1489 1700 0.0527 -
0.1533 1750 0.1421 -
0.1576 1800 0.0048 -
0.1620 1850 0.0141 -
0.1664 1900 0.0557 -
0.1708 1950 0.0206 -
0.1752 2000 0.1171 -
0.1795 2050 0.0968 -
0.1839 2100 0.0243 -
0.1883 2150 0.0233 -
0.1927 2200 0.0738 -
0.1971 2250 0.0071 -
0.2014 2300 0.0353 -
0.2058 2350 0.0602 -
0.2102 2400 0.003 -
0.2146 2450 0.0625 -
0.2190 2500 0.0173 -
0.2233 2550 0.1017 -
0.2277 2600 0.0582 -
0.2321 2650 0.0437 -
0.2365 2700 0.104 -
0.2408 2750 0.0156 -
0.2452 2800 0.0034 -
0.2496 2850 0.0343 -
0.2540 2900 0.1106 -
0.2584 2950 0.001 -
0.2627 3000 0.004 -
0.2671 3050 0.0074 -
0.2715 3100 0.0849 -
0.2759 3150 0.0009 -
0.2803 3200 0.0379 -
0.2846 3250 0.0109 -
0.2890 3300 0.0019 -
0.2934 3350 0.0154 -
0.2978 3400 0.0017 -
0.3022 3450 0.0003 -
0.3065 3500 0.0002 -
0.3109 3550 0.0025 -
0.3153 3600 0.0123 -
0.3197 3650 0.0007 -
0.3240 3700 0.0534 -
0.3284 3750 0.0004 -
0.3328 3800 0.0084 -
0.3372 3850 0.0088 -
0.3416 3900 0.0201 -
0.3459 3950 0.0002 -
0.3503 4000 0.0102 -
0.3547 4050 0.0043 -
0.3591 4100 0.0124 -
0.3635 4150 0.0845 -
0.3678 4200 0.0002 -
0.3722 4250 0.0014 -
0.3766 4300 0.1131 -
0.3810 4350 0.0612 -
0.3854 4400 0.0577 -
0.3897 4450 0.0235 -
0.3941 4500 0.0156 -
0.3985 4550 0.0078 -
0.4029 4600 0.0356 -
0.4073 4650 0.0595 -
0.4116 4700 0.0001 -
0.4160 4750 0.0018 -
0.4204 4800 0.0013 -
0.4248 4850 0.0008 -
0.4291 4900 0.0832 -
0.4335 4950 0.0083 -
0.4379 5000 0.0007 -
0.4423 5050 0.0417 -
0.4467 5100 0.0001 -
0.4510 5150 0.0218 -
0.4554 5200 0.0001 -
0.4598 5250 0.0012 -
0.4642 5300 0.0002 -
0.4686 5350 0.0006 -
0.4729 5400 0.0223 -
0.4773 5450 0.0612 -
0.4817 5500 0.0004 -
0.4861 5550 0.0 -
0.4905 5600 0.0007 -
0.4948 5650 0.0007 -
0.4992 5700 0.0116 -
0.5036 5750 0.0262 -
0.5080 5800 0.0336 -
0.5123 5850 0.026 -
0.5167 5900 0.0004 -
0.5211 5950 0.0001 -
0.5255 6000 0.0001 -
0.5299 6050 0.0001 -
0.5342 6100 0.0029 -
0.5386 6150 0.0001 -
0.5430 6200 0.0699 -
0.5474 6250 0.0262 -
0.5518 6300 0.0269 -
0.5561 6350 0.0002 -
0.5605 6400 0.0666 -
0.5649 6450 0.0209 -
0.5693 6500 0.0003 -
0.5737 6550 0.0001 -
0.5780 6600 0.0115 -
0.5824 6650 0.0003 -
0.5868 6700 0.0001 -
0.5912 6750 0.0056 -
0.5956 6800 0.0603 -
0.5999 6850 0.0002 -
0.6043 6900 0.0003 -
0.6087 6950 0.0092 -
0.6131 7000 0.0562 -
0.6174 7050 0.0408 -
0.6218 7100 0.0001 -
0.6262 7150 0.0035 -
0.6306 7200 0.0337 -
0.6350 7250 0.0024 -
0.6393 7300 0.0005 -
0.6437 7350 0.0001 -
0.6481 7400 0.0 -
0.6525 7450 0.0001 -
0.6569 7500 0.0002 -
0.6612 7550 0.0004 -
0.6656 7600 0.0125 -
0.6700 7650 0.0005 -
0.6744 7700 0.0157 -
0.6788 7750 0.0055 -
0.6831 7800 0.0 -
0.6875 7850 0.0053 -
0.6919 7900 0.0 -
0.6963 7950 0.0002 -
0.7006 8000 0.0002 -
0.7050 8050 0.0001 -
0.7094 8100 0.0001 -
0.7138 8150 0.0001 -
0.7182 8200 0.0007 -
0.7225 8250 0.0002 -
0.7269 8300 0.0001 -
0.7313 8350 0.0 -
0.7357 8400 0.0156 -
0.7401 8450 0.0098 -
0.7444 8500 0.0 -
0.7488 8550 0.0001 -
0.7532 8600 0.0042 -
0.7576 8650 0.0 -
0.7620 8700 0.0 -
0.7663 8750 0.0056 -
0.7707 8800 0.0 -
0.7751 8850 0.0 -
0.7795 8900 0.013 -
0.7839 8950 0.0 -
0.7882 9000 0.0001 -
0.7926 9050 0.0 -
0.7970 9100 0.0 -
0.8014 9150 0.0 -
0.8057 9200 0.0 -
0.8101 9250 0.0 -
0.8145 9300 0.0007 -
0.8189 9350 0.0 -
0.8233 9400 0.0002 -
0.8276 9450 0.0 -
0.8320 9500 0.0 -
0.8364 9550 0.0089 -
0.8408 9600 0.0001 -
0.8452 9650 0.0 -
0.8495 9700 0.0 -
0.8539 9750 0.0 -
0.8583 9800 0.0565 -
0.8627 9850 0.0161 -
0.8671 9900 0.0 -
0.8714 9950 0.0246 -
0.8758 10000 0.0 -
0.8802 10050 0.0 -
0.8846 10100 0.012 -
0.8889 10150 0.0 -
0.8933 10200 0.0 -
0.8977 10250 0.0 -
0.9021 10300 0.0 -
0.9065 10350 0.0 -
0.9108 10400 0.0 -
0.9152 10450 0.0 -
0.9196 10500 0.0 -
0.9240 10550 0.0023 -
0.9284 10600 0.0 -
0.9327 10650 0.0006 -
0.9371 10700 0.0 -
0.9415 10750 0.0 -
0.9459 10800 0.0 -
0.9503 10850 0.0 -
0.9546 10900 0.0 -
0.9590 10950 0.0243 -
0.9634 11000 0.0107 -
0.9678 11050 0.0001 -
0.9721 11100 0.0 -
0.9765 11150 0.0 -
0.9809 11200 0.0274 -
0.9853 11250 0.0 -
0.9897 11300 0.0 -
0.9940 11350 0.0 -
0.9984 11400 0.0 -
0.0007 1 0.2021 -
0.0329 50 0.1003 -
0.0657 100 0.2282 -
0.0986 150 0.0507 -
0.1314 200 0.046 -
0.1643 250 0.0001 -
0.1971 300 0.0495 -
0.2300 350 0.0031 -
0.2628 400 0.0004 -
0.2957 450 0.0002 -
0.3285 500 0.0 -
0.3614 550 0.0 -
0.3942 600 0.0 -
0.4271 650 0.0001 -
0.4599 700 0.0 -
0.4928 750 0.0 -
0.5256 800 0.0 -
0.5585 850 0.0 -
0.5913 900 0.0001 -
0.6242 950 0.0 -
0.6570 1000 0.0001 -
0.6899 1050 0.0 -
0.7227 1100 0.0 -
0.7556 1150 0.0 -
0.7884 1200 0.0 -
0.8213 1250 0.0 -
0.8541 1300 0.0 -
0.8870 1350 0.0 -
0.9198 1400 0.0 -
0.9527 1450 0.0001 -
0.9855 1500 0.0 -

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.1
  • Sentence Transformers: 2.2.2
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.15.0
  • Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
Model size: 109M params (F32, Safetensors format)
