SetFit with intfloat/multilingual-e5-small

This is a SetFit model for text classification. It uses intfloat/multilingual-e5-small as the Sentence Transformer embedding model and a SetFitHead instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
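
Step 1 depends on turning a small labeled set into sentence pairs. The sketch below is a simplified illustration of that idea, not SetFit's own sampler: it enumerates all pairs and gives same-label pairs a target of 1.0 and mixed-label pairs a target of 0.0, which is the kind of target a cosine-similarity loss is trained against. The function name `build_contrastive_pairs` and its parameters are hypothetical.

```python
import itertools
import random


def build_contrastive_pairs(texts, labels, max_pairs=None, seed=42):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration; SetFit's sampler differs.
    """
    pairs = []
    for i, j in itertools.combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    # Shuffle deterministically so a truncated subset mixes both pair types.
    random.Random(seed).shuffle(pairs)
    return pairs if max_pairs is None else pairs[:max_pairs]


# Four of the label examples from this card, two per class.
texts = [
    "query: Értem. Mit csinálunk most?",
    "query: Lijepo je. Hvala.",
    "query: Жөнейін, кейін кездесеміз.",
    "query: Така, ќе се видиме повторно.",
]
labels = [0, 0, 1, 1]

pairs = build_contrastive_pairs(texts, labels)
positives = sum(1 for _, _, target in pairs if target == 1.0)
print(len(pairs), positives)
```

Four examples give six pairs, two of which are same-label. Even a few hundred labeled sentences therefore produce a large pool of training pairs, which is why the approach works in few-shot settings.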

Model Details

Model Description

Model Sources

Model Labels

Label Examples
0
  • 'query: Értem. Mit csinálunk most?'
  • 'query: Ola Luca, que tal? Rematache o traballo?'
  • 'query: Lijepo je. Hvala.'
1
  • 'query: Жөнейін, кейін кездесеміз.'
  • 'query: Така, ќе се видиме повторно.'
  • 'query: ठीक है बाद में बात करते हैं मार्क अच्छा दिन'

Evaluation

Metrics

Label Accuracy
all 0.9333

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

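from setfit import SetFitModel

# Download from the 🤗 Hub ("setfit_model_id" is a placeholder for this model's repository ID)
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference; the input carries the same "query: " prefix as the label examples
preds = model("query: Tôi xin lỗi nhưng tôi phải đi")
print(preds)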
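
Every label example and the inference call above start with "query: ", so inputs probably need the same prefix to match the training data. The sketch below assumes that is the case. It adds the prefix when it is missing and then passes the batch to any prediction callable. `with_query_prefix`, `classify`, and the stand-in `echo` function are hypothetical names used only for illustration.

```python
from typing import Any, Callable, List, Sequence

QUERY_PREFIX = "query: "


def with_query_prefix(text: str) -> str:
    """Prepend the 'query: ' prefix unless the text already has it."""
    text = text.strip()
    return text if text.startswith(QUERY_PREFIX) else QUERY_PREFIX + text


def classify(predict: Callable[[List[str]], Any], texts: Sequence[str]) -> Any:
    """Prefix every text, then hand the whole batch to `predict`."""
    prepared = [with_query_prefix(t) for t in texts]
    return predict(prepared)


def echo(batch: List[str]) -> List[str]:
    """Stand-in for a real model: returns the prepared inputs unchanged."""
    return batch


prepared = classify(echo, ["Tôi xin lỗi nhưng tôi phải đi", "query: Lijepo je. Hvala."])
print(prepared)
```

With the model loaded as shown earlier, the same helper would be called as `classify(model.predict, texts)`.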

Training Details

Training Set Metrics

Training set Min Median Max
Word count 2 7.2168 25
Label Training Sample Count
0 346
1 346
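
The word-count row can be reproduced for any list of texts with a few lines of Python. The sketch below splits on whitespace, which is an assumption about how the counts were produced. It also strips the "query: " prefix from the samples, since the card does not say whether the prefix was counted. A median of whole-number word counts is either a whole number or ends in .5, so the reported 7.2168 is probably a mean. That reading is an inference, not something the card states. The helper `word_count_stats` is hypothetical.

```python
import statistics


def word_count_stats(texts):
    """Whitespace-token counts summarised as min, mean, median, and max."""
    counts = [len(t.split()) for t in texts]
    return {
        "min": min(counts),
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "max": max(counts),
    }


# Three label examples from this card, without the "query: " prefix.
sample = [
    "Lijepo je. Hvala.",
    "Értem. Mit csinálunk most?",
    "Ola Luca, que tal? Rematache o traballo?",
]
stats = word_count_stats(sample)
print(stats)
```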

Training Hyperparameters

  • batch_size: (16, 2)
  • num_epochs: (1, 16)
  • max_steps: 2500
  • sampling_strategy: undersampling
  • body_learning_rate: (1e-06, 1e-06)
  • head_learning_rate: 0.001
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • run_name: multilingual-e5-small
  • eval_max_steps: -1
  • load_best_model_at_end: False
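
The values above can be collected as keyword arguments for reuse. This is a sketch that assumes the SetFit 1.0.x `TrainingArguments` class accepts these field names. As far as I understand the SetFit convention, the tuple values hold one setting per training phase: embedding fine-tuning first, then classifier training. The helper `build_training_arguments` is hypothetical and imports the libraries lazily, so the dictionary itself can be inspected without them installed.

```python
# Hyperparameters copied from the list above.
hyperparameters = {
    "batch_size": (16, 2),
    "num_epochs": (1, 16),
    "max_steps": 2500,
    "sampling_strategy": "undersampling",
    "body_learning_rate": (1e-06, 1e-06),
    "head_learning_rate": 0.001,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "seed": 42,
    "run_name": "multilingual-e5-small",
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments(params):
    """Build TrainingArguments from the dict (assumes SetFit 1.0.x is installed)."""
    from sentence_transformers.losses import (
        BatchHardTripletLossDistanceFunction,
        CosineSimilarityLoss,
    )
    from setfit import TrainingArguments

    return TrainingArguments(
        loss=CosineSimilarityLoss,
        distance_metric=BatchHardTripletLossDistanceFunction.cosine_distance,
        **params,
    )
```

The loss and distance metric are passed as objects rather than strings, which is why they are kept out of the plain dictionary.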

Training Results

Epoch Step Training Loss Validation Loss
0.0002 1 0.3607 -
0.0100 50 0.3634 0.3452
0.0200 100 0.3493 0.3377
0.0300 150 0.3244 0.3234
0.0400 200 0.3244 0.3034
0.0500 250 0.2931 0.2731
0.0600 300 0.2471 0.2398
0.0700 350 0.237 0.2168
0.0800 400 0.1964 0.2082
0.0900 450 0.2319 0.198
0.1000 500 0.2003 0.1968
0.1100 550 0.2014 0.1968
0.1200 600 0.1617 0.1879
0.1300 650 0.2214 0.1798
0.1400 700 0.2498 0.1768
0.1500 750 0.1527 0.1764
0.1600 800 0.1134 0.1733
0.1700 850 0.1393 0.1614
0.1800 900 0.1052 0.1549
0.1900 950 0.1772 0.149
0.2000 1000 0.1065 0.1504
0.2100 1050 0.087 0.1392
0.2200 1100 0.1416 0.1333
0.2300 1150 0.0767 0.1279
0.2400 1200 0.1228 0.1243
0.2500 1250 0.099 0.1128
0.2599 1300 0.1125 0.1106
0.2699 1350 0.1012 0.1156
0.2799 1400 0.0343 0.1022
0.2899 1450 0.0814 0.1012
0.2999 1500 0.0947 0.0965
0.3099 1550 0.0799 0.0964
0.3199 1600 0.113 0.0942
0.3299 1650 0.1125 0.0917
0.3399 1700 0.0507 0.0899
0.3499 1750 0.0986 0.0938
0.3599 1800 0.0885 0.0913
0.3699 1850 0.0712 0.0841
0.3799 1900 0.1131 0.0851
0.3899 1950 0.0701 0.0852
0.3999 2000 0.0805 0.0878
0.4099 2050 0.0375 0.0814
0.4199 2100 0.1236 0.0797
0.4299 2150 0.0532 0.0881
0.4399 2200 0.0265 0.0806
0.4499 2250 0.1268 0.0801
0.4599 2300 0.0557 0.0797
0.4699 2350 0.0956 0.0832
0.4799 2400 0.0671 0.081
0.4899 2450 0.1394 0.0794
0.4999 2500 0.1165 0.0798
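
The log ends at step 2500 with an epoch value of 0.4999, so max_steps stopped the embedding fine-tuning at roughly half of one pass over the generated pairs. The arithmetic below estimates the size of a full epoch from that last row. It assumes the first entry of batch_size (16) is the batch size for this phase. The helper name is hypothetical.

```python
def estimated_steps_per_epoch(step, epoch_fraction):
    """Extrapolate total steps in one epoch from a single logged row."""
    return step / epoch_fraction


embedding_batch_size = 16  # assumed: first entry of batch_size (16, 2)

steps = estimated_steps_per_epoch(2500, 0.4999)
pairs_per_epoch = steps * embedding_batch_size

print(round(steps), round(pairs_per_epoch))
```

Under these assumptions a full epoch would be about 5,000 steps, or about 80,000 sentence pairs.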

Framework Versions

  • Python: 3.10.11
  • SetFit: 1.0.3
  • Sentence Transformers: 2.7.0
  • Transformers: 4.39.3
  • PyTorch: 2.4.0
  • Datasets: 2.20.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
