SetFit with microsoft/deberta-v3-base
This is a SetFit model trained on the bhujith10/multi_class_classification_dataset dataset that can be used for Text Classification. This SetFit model uses microsoft/deberta-v3-base as the Sentence Transformer embedding model. A SetFitHead instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
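This two-phase procedure can be reproduced with the SetFit Trainer, which runs the contrastive fine-tuning and the head training in sequence. Below is a minimal, illustrative sketch, assuming the dataset exposes standard "text" and "label" columns; it is not the exact script used to train this model.

from datasets import load_dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Assumes a "train" split with "text" and "label" columns.
dataset = load_dataset("bhujith10/multi_class_classification_dataset", split="train")

# use_differentiable_head=True gives a SetFitHead (as in this model)
# instead of the default scikit-learn logistic-regression head.
model = SetFitModel.from_pretrained(
    "microsoft/deberta-v3-base",
    use_differentiable_head=True,
    head_params={"out_features": 6},  # 6 classes, per this card
)

args = TrainingArguments(batch_size=4, num_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=dataset)
trainer.train()  # phase 1: contrastive fine-tuning; phase 2: head training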
Model Details
Model Description
- Model Type: SetFit
- Sentence Transformer body: microsoft/deberta-v3-base
- Classification head: a SetFitHead instance
- Maximum Sequence Length: 512 tokens
- Number of Classes: 6 classes
- Training Dataset: bhujith10/multi_class_classification_dataset
Model Sources
- Repository: SetFit on GitHub
- Paper: Efficient Few-Shot Learning Without Prompts
- Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts
Uses
Direct Use for Inference
First install the SetFit library:
pip install setfit
Then you can load this model and run inference.
from setfit import SetFitModel
# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("bhujith10/deberta-v3-base-setfit_finetuned")
# Run inference
preds = model("Title: Influence of Spin Orbit Coupling in the Iron-Based Superconductors,
Abstract: We report on the influence of spin-orbit coupling (SOC) in the Fe-based
superconductors (FeSCs) via application of circularly-polarized spin and
angle-resolved photoemission spectroscopy. We combine this technique in
representative members of both the Fe-pnictides and Fe-chalcogenides with ab
initio density functional theory and tight-binding calculations to establish an
ubiquitous modification of the electronic structure in these materials imbued
by SOC. The influence of SOC is found to be concentrated on the hole pockets
where the superconducting gap is generally found to be largest. This result
contests descriptions of superconductivity in these materials in terms of pure
spin-singlet eigenstates, raising questions regarding the possible pairing
mechanisms and role of SOC therein.")
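The model can also be called on a batch of texts, and, because the head is a SetFitHead, per-class probabilities are available via predict_proba. The input texts below are placeholders:

# Batch inference: one prediction per input text
texts = ["Title: ... Abstract: ...", "Title: ... Abstract: ..."]
preds = model.predict(texts)

# Per-class probabilities, shape (len(texts), 6)
probs = model.predict_proba(texts)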
Training Details
Training Set Metrics
Training set | Min | Median | Max |
---|---|---|---|
Word count | 23 | 148.1 | 303 |
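These statistics are word counts over the training texts. A quick way to recompute them, assuming a "text" column (an assumption; the column name is not stated in this card):

from statistics import median
from datasets import load_dataset

ds = load_dataset("bhujith10/multi_class_classification_dataset", split="train")
counts = [len(ex["text"].split()) for ex in ds]
print(min(counts), median(counts), max(counts))  # card reports 23 / 148.1 / 303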
Training Hyperparameters
- batch_size: (4, 4)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: True
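For reference, these values map directly onto setfit.TrainingArguments. The following is a hedged reconstruction, since the actual training script is not included in this card:

from sentence_transformers.losses import (
    BatchHardTripletLossDistanceFunction,
    CosineSimilarityLoss,
)
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(4, 4),            # (embedding phase, classifier phase)
    num_epochs=(1, 1),
    max_steps=-1,
    sampling_strategy="oversampling",
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    loss=CosineSimilarityLoss,
    distance_metric=BatchHardTripletLossDistanceFunction.cosine_distance,
    margin=0.25,                  # distance_metric/margin only affect triplet-style losses
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=True,
)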
Training Results
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.0002 | 1 | 0.4731 | - |
0.0078 | 50 | 0.4561 | - |
0.0155 | 100 | 0.4156 | - |
0.0233 | 150 | 0.2469 | - |
0.0311 | 200 | 0.2396 | - |
0.0388 | 250 | 0.2376 | - |
0.0466 | 300 | 0.2519 | - |
0.0543 | 350 | 0.1987 | - |
0.0621 | 400 | 0.1908 | - |
0.0699 | 450 | 0.161 | - |
0.0776 | 500 | 0.1532 | - |
0.0854 | 550 | 0.17 | - |
0.0932 | 600 | 0.139 | - |
0.1009 | 650 | 0.1406 | - |
0.1087 | 700 | 0.1239 | - |
0.1165 | 750 | 0.1332 | - |
0.1242 | 800 | 0.1566 | - |
0.1320 | 850 | 0.0932 | - |
0.1398 | 900 | 0.1101 | - |
0.1475 | 950 | 0.1153 | - |
0.1553 | 1000 | 0.0979 | - |
0.1630 | 1050 | 0.0741 | - |
0.1708 | 1100 | 0.0603 | - |
0.1786 | 1150 | 0.1027 | - |
0.1863 | 1200 | 0.0948 | - |
0.1941 | 1250 | 0.0968 | - |
0.2019 | 1300 | 0.085 | - |
0.2096 | 1350 | 0.0883 | - |
0.2174 | 1400 | 0.0792 | - |
0.2252 | 1450 | 0.1054 | - |
0.2329 | 1500 | 0.0556 | - |
0.2407 | 1550 | 0.0777 | - |
0.2484 | 1600 | 0.0922 | - |
0.2562 | 1650 | 0.076 | - |
0.2640 | 1700 | 0.0693 | - |
0.2717 | 1750 | 0.0857 | - |
0.2795 | 1800 | 0.0907 | - |
0.2873 | 1850 | 0.0621 | - |
0.2950 | 1900 | 0.0792 | - |
0.3028 | 1950 | 0.0608 | - |
0.3106 | 2000 | 0.052 | - |
0.3183 | 2050 | 0.056 | - |
0.3261 | 2100 | 0.0501 | - |
0.3339 | 2150 | 0.0559 | - |
0.3416 | 2200 | 0.0526 | - |
0.3494 | 2250 | 0.0546 | - |
0.3571 | 2300 | 0.0398 | - |
0.3649 | 2350 | 0.0527 | - |
0.3727 | 2400 | 0.0522 | - |
0.3804 | 2450 | 0.0468 | - |
0.3882 | 2500 | 0.0465 | - |
0.3960 | 2550 | 0.0393 | - |
0.4037 | 2600 | 0.0583 | - |
0.4115 | 2650 | 0.0278 | - |
0.4193 | 2700 | 0.0502 | - |
0.4270 | 2750 | 0.0413 | - |
0.4348 | 2800 | 0.0538 | - |
0.4425 | 2850 | 0.0361 | - |
0.4503 | 2900 | 0.0648 | - |
0.4581 | 2950 | 0.0459 | - |
0.4658 | 3000 | 0.0521 | - |
0.4736 | 3050 | 0.0288 | - |
0.4814 | 3100 | 0.0323 | - |
0.4891 | 3150 | 0.0335 | - |
0.4969 | 3200 | 0.0472 | - |
0.5047 | 3250 | 0.0553 | - |
0.5124 | 3300 | 0.0426 | - |
0.5202 | 3350 | 0.0276 | - |
0.5280 | 3400 | 0.0395 | - |
0.5357 | 3450 | 0.042 | - |
0.5435 | 3500 | 0.0343 | - |
0.5512 | 3550 | 0.0314 | - |
0.5590 | 3600 | 0.0266 | - |
0.5668 | 3650 | 0.0314 | - |
0.5745 | 3700 | 0.0379 | - |
0.5823 | 3750 | 0.0485 | - |
0.5901 | 3800 | 0.0311 | - |
0.5978 | 3850 | 0.0415 | - |
0.6056 | 3900 | 0.0266 | - |
0.6134 | 3950 | 0.0384 | - |
0.6211 | 4000 | 0.0348 | - |
0.6289 | 4050 | 0.0298 | - |
0.6366 | 4100 | 0.032 | - |
0.6444 | 4150 | 0.031 | - |
0.6522 | 4200 | 0.0367 | - |
0.6599 | 4250 | 0.0289 | - |
0.6677 | 4300 | 0.0333 | - |
0.6755 | 4350 | 0.0281 | - |
0.6832 | 4400 | 0.0307 | - |
0.6910 | 4450 | 0.0312 | - |
0.6988 | 4500 | 0.0488 | - |
0.7065 | 4550 | 0.03 | - |
0.7143 | 4600 | 0.0309 | - |
0.7220 | 4650 | 0.031 | - |
0.7298 | 4700 | 0.0268 | - |
0.7376 | 4750 | 0.0324 | - |
0.7453 | 4800 | 0.041 | - |
0.7531 | 4850 | 0.0349 | - |
0.7609 | 4900 | 0.0349 | - |
0.7686 | 4950 | 0.0291 | - |
0.7764 | 5000 | 0.025 | - |
0.7842 | 5050 | 0.0249 | - |
0.7919 | 5100 | 0.0272 | - |
0.7997 | 5150 | 0.0302 | - |
0.8075 | 5200 | 0.0414 | - |
0.8152 | 5250 | 0.0295 | - |
0.8230 | 5300 | 0.033 | - |
0.8307 | 5350 | 0.0203 | - |
0.8385 | 5400 | 0.0275 | - |
0.8463 | 5450 | 0.0354 | - |
0.8540 | 5500 | 0.0254 | - |
0.8618 | 5550 | 0.0313 | - |
0.8696 | 5600 | 0.0296 | - |
0.8773 | 5650 | 0.0248 | - |
0.8851 | 5700 | 0.036 | - |
0.8929 | 5750 | 0.025 | - |
0.9006 | 5800 | 0.0234 | - |
0.9084 | 5850 | 0.0221 | - |
0.9161 | 5900 | 0.0314 | - |
0.9239 | 5950 | 0.0273 | - |
0.9317 | 6000 | 0.0299 | - |
0.9394 | 6050 | 0.0262 | - |
0.9472 | 6100 | 0.0285 | - |
0.9550 | 6150 | 0.021 | - |
0.9627 | 6200 | 0.0215 | - |
0.9705 | 6250 | 0.0312 | - |
0.9783 | 6300 | 0.0259 | - |
0.9860 | 6350 | 0.0234 | - |
0.9938 | 6400 | 0.0222 | - |
1.0 | 6440 | - | 0.1609 |
Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.3.1
- Transformers: 4.45.2
- PyTorch: 2.4.0
- Datasets: 3.0.1
- Tokenizers: 0.20.0
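To approximate this environment, the versions above can be pinned at install time (a suggestion; this card ships no official requirements file):

pip install setfit==1.1.0 sentence-transformers==3.3.1 transformers==4.45.2 torch==2.4.0 datasets==3.0.1 tokenizers==0.20.0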
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}