SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model that can be used for Text Classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a SetFitHead instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer. (A hedged training sketch follows this list.)
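
For illustration, here is a minimal sketch of how a model like this can be trained with the SetFit 1.0 Trainer API. The two-example dataset below is purely illustrative and is not taken from the actual training data.

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Load the pretrained embedding body and attach a differentiable SetFitHead
# with two output classes
model = SetFitModel.from_pretrained(
    "BAAI/bge-small-en-v1.5",
    use_differentiable_head=True,
    head_params={"out_features": 2},
)

# Toy few-shot dataset (illustrative; 1 = SARCASTIC, 0 = NON_SARCASTIC)
train_dataset = Dataset.from_dict({
    "text": [
        "love waiting two hours for a delayed train",
        "the new update fixed my battery issue",
    ],
    "label": [1, 0],
})

# (embedding epochs, classifier epochs), matching num_epochs under
# Training Hyperparameters below
args = TrainingArguments(num_epochs=(3, 8))

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()  # phase 1: contrastive fine-tuning; phase 2: head training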

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: BAAI/bge-small-en-v1.5
  • Classification head: a SetFitHead instance
  • Maximum Sequence Length: 512 tokens
  • Number of Classes: 2 classes
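
These properties can be verified on the loaded model; a minimal sketch, assuming SetFit 1.0 attribute names:

from setfit import SetFitModel

model = SetFitModel.from_pretrained("w11wo/bge-small-en-v1.5-isarcasm")
print(model.model_body.max_seq_length)  # 512
print(model.model_head)                 # the SetFitHead classifier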

Model Sources

  • Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)
  • Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts (https://huggingface.co/blog/setfit)

Model Labels

NON_SARCASTIC examples:
  • 'so the newer devices have the ios screenshot i m still on ios but my ipad mini 1 st gen shows the ios screenshot . odd .'
  • 'why do amazon need a test authorisation when i add a new payment card , as well as the authorisation they get when i actually use it ?'
  • 'waterboarding sounds like a lot of fun until you find out what it is'
SARCASTIC examples:
  • "have you been reading long ? you are not very good at it . it has nothing to do with who i like , especially since i am not a fan of corbyn anyway . it ' s that in one case someone was literally slapped in the face , and in the other someone wore a milkshake . battery > being annoying"
  • 'wish one of the many people dressed as killers were actually one n killed me'
  • 'is it even christmas if there isn t a fight with neighbours and a broken wrist ?'

Evaluation

Metrics

Label   Accuracy   F1       Precision   Recall
all     0.6618     0.3952   0.2891      0.6242
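
A hedged sketch of how these metrics could be recomputed, assuming texts and labels hold your own copy of a labeled test split (neither is bundled with this model):

from setfit import SetFitModel
from sklearn.metrics import classification_report

model = SetFitModel.from_pretrained("w11wo/bge-small-en-v1.5-isarcasm")

# texts: list[str]; labels must use the same format (strings or ints)
# as the model's predictions
preds = model.predict(texts)
print(classification_report(labels, preds, digits=4))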

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference:

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("w11wo/bge-small-en-v1.5-isarcasm")
# Run inference
preds = model("last day in my twenties")

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     2     19.8489   63

Label           Training Sample Count
NON_SARCASTIC   609
SARCASTIC       609
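
These figures are straightforward to recompute from a copy of the training split; a sketch, where train_texts and train_labels are assumed lists (not provided here):

from collections import Counter
import numpy as np

word_counts = [len(text.split()) for text in train_texts]
print(min(word_counts), np.median(word_counts), max(word_counts))
print(Counter(train_labels))  # expects 609 examples per label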

Training Hyperparameters

  • batch_size: (256, 16)
  • num_epochs: (3, 8)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 5e-06)
  • head_learning_rate: 0.002
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: True
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
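
These hyperparameters map directly onto SetFit's TrainingArguments; a sketch of the corresponding instantiation (field names per SetFit 1.0, with distance_metric left at its cosine-distance default):

from sentence_transformers.losses import CosineSimilarityLoss
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(256, 16),            # (embedding phase, classifier phase)
    num_epochs=(3, 8),
    max_steps=-1,
    sampling_strategy="oversampling",
    body_learning_rate=(2e-5, 5e-6),
    head_learning_rate=2e-3,
    loss=CosineSimilarityLoss,
    margin=0.25,
    end_to_end=True,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=True,
)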

Training Results

Epoch Step Training Loss Validation Loss
0.0003 1 0.2571 -
0.0172 50 0.251 -
0.0344 100 0.2556 -
0.0517 150 0.2513 -
0.0689 200 0.2531 -
0.0861 250 0.2518 -
0.1033 300 0.2553 -
0.1206 350 0.2501 -
0.1378 400 0.2546 -
0.1550 450 0.2506 -
0.1722 500 0.2317 -
0.1895 550 0.093 -
0.2067 600 0.0139 -
0.2239 650 0.0166 -
0.2411 700 0.0053 -
0.2584 750 0.0013 -
0.2756 800 0.0121 -
0.2928 850 0.0096 -
0.3100 900 0.0043 -
0.3272 950 0.0014 -
0.3445 1000 0.0009 -
0.3617 1050 0.0117 -
0.3789 1100 0.0144 -
0.3961 1150 0.0084 -
0.4134 1200 0.0006 -
0.4306 1250 0.0005 -
0.4478 1300 0.0081 -
0.4650 1350 0.0144 -
0.4823 1400 0.0045 -
0.4995 1450 0.0042 -
0.5167 1500 0.0005 -
0.5339 1550 0.003 -
0.5512 1600 0.0004 -
0.5684 1650 0.0005 -
0.5856 1700 0.0004 -
0.6028 1750 0.0004 -
0.6200 1800 0.0026 -
0.6373 1850 0.0004 -
0.6545 1900 0.0004 -
0.6717 1950 0.0003 -
0.6889 2000 0.0014 -
0.7062 2050 0.0004 -
0.7234 2100 0.0003 -
0.7406 2150 0.0003 -
0.7578 2200 0.0004 -
0.7751 2250 0.0003 -
0.7923 2300 0.0003 -
0.8095 2350 0.0003 -
0.8267 2400 0.0003 -
0.8440 2450 0.0003 -
0.8612 2500 0.0003 -
0.8784 2550 0.0003 -
0.8956 2600 0.0003 -
0.9128 2650 0.0003 -
0.9301 2700 0.0003 -
0.9473 2750 0.0004 -
0.9645 2800 0.0003 -
0.9817 2850 0.0003 -
0.9990 2900 0.0036 -
1.0162 2950 0.0003 -
1.0334 3000 0.0003 -
1.0506 3050 0.0002 -
1.0679 3100 0.0002 -
1.0851 3150 0.0002 -
1.1023 3200 0.0002 -
1.1195 3250 0.0002 -
1.1368 3300 0.0003 -
1.1540 3350 0.0004 -
1.1712 3400 0.0002 -
1.1884 3450 0.0002 -
1.2056 3500 0.0002 -
1.2229 3550 0.0002 -
1.2401 3600 0.0002 -
1.2573 3650 0.0009 -
1.2745 3700 0.0002 -
1.2918 3750 0.0002 -
1.3090 3800 0.0002 -
1.3262 3850 0.0002 -
1.3434 3900 0.0002 -
1.3607 3950 0.0002 -
1.3779 4000 0.0002 -
1.3951 4050 0.0002 -
1.4123 4100 0.0002 -
1.4296 4150 0.0002 -
1.4468 4200 0.0003 -
1.4640 4250 0.0002 -
1.4812 4300 0.0002 -
1.4984 4350 0.0002 -
1.5157 4400 0.0002 -
1.5329 4450 0.0002 -
1.5501 4500 0.0002 -
1.5673 4550 0.0002 -
1.5846 4600 0.0002 -
1.6018 4650 0.0002 -
1.6190 4700 0.0002 -
1.6362 4750 0.0002 -
1.6535 4800 0.0002 -
1.6707 4850 0.0002 -
1.6879 4900 0.0002 -
1.7051 4950 0.0002 -
1.7224 5000 0.0003 -
1.7396 5050 0.0002 -
1.7568 5100 0.0002 -
1.7740 5150 0.0002 -
1.7913 5200 0.0002 -
1.8085 5250 0.0002 -
1.8257 5300 0.0038 -
1.8429 5350 0.0002 -
1.8601 5400 0.0002 -
1.8774 5450 0.0002 -
1.8946 5500 0.0002 -
1.9118 5550 0.0002 -
1.9290 5600 0.0005 -
1.9463 5650 0.0002 -
1.9635 5700 0.0002 -
1.9807 5750 0.0002 -
1.9979 5800 0.0002 -
2.0152 5850 0.0001 -
2.0324 5900 0.0002 -
2.0496 5950 0.0002 -
2.0668 6000 0.0002 -
2.0841 6050 0.0002 -
2.1013 6100 0.0002 -
2.1185 6150 0.0002 -
2.1357 6200 0.0001 -
2.1529 6250 0.0002 -
2.1702 6300 0.0002 -
2.1874 6350 0.0001 -
2.2046 6400 0.0001 -
2.2218 6450 0.0001 -
2.2391 6500 0.0001 -
2.2563 6550 0.0001 -
2.2735 6600 0.0001 -
2.2907 6650 0.0001 -
2.3080 6700 0.0001 -
2.3252 6750 0.0001 -
2.3424 6800 0.0001 -
2.3596 6850 0.0001 -
2.3769 6900 0.0001 -
2.3941 6950 0.0001 -
2.4113 7000 0.0001 -
2.4285 7050 0.0001 -
2.4457 7100 0.0001 -
2.4630 7150 0.0001 -
2.4802 7200 0.0001 -
2.4974 7250 0.0001 -
2.5146 7300 0.0001 -
2.5319 7350 0.0001 -
2.5491 7400 0.0001 -
2.5663 7450 0.0001 -
2.5835 7500 0.0001 -
2.6008 7550 0.0001 -
2.6180 7600 0.0001 -
2.6352 7650 0.0001 -
2.6524 7700 0.0001 -
2.6697 7750 0.0001 -
2.6869 7800 0.0001 -
2.7041 7850 0.0001 -
2.7213 7900 0.0001 -
2.7385 7950 0.0001 -
2.7558 8000 0.0001 -
2.7730 8050 0.0001 -
2.7902 8100 0.0001 -
2.8074 8150 0.0001 -
2.8247 8200 0.0001 -
2.8419 8250 0.0001 -
2.8591 8300 0.0001 -
2.8763 8350 0.0001 -
2.8936 8400 0.0001 -
2.9108 8450 0.0001 -
2.9280 8500 0.0001 -
2.9452 8550 0.0001 -
2.9625 8600 0.0001 -
2.9797 8650 0.0001 -
2.9969 8700 0.0001 -

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.1
  • Sentence Transformers: 2.2.2
  • Transformers: 4.32.0
  • PyTorch: 2.1.1+cu121
  • Datasets: 2.14.5
  • Tokenizers: 0.13.3
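
To approximately reproduce this environment, the listed versions can be pinned at install time (the +cu121 PyTorch build comes from the CUDA 12.1 wheel index):

pip install setfit==1.0.1 sentence-transformers==2.2.2 transformers==4.32.0 datasets==2.14.5 tokenizers==0.13.3
pip install torch==2.1.1 --index-url https://download.pytorch.org/whl/cu121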

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}