SetFit with projecte-aina/ST-NLI-ca_paraphrase-multilingual-mpnet-base

This is a SetFit model for text classification. It uses projecte-aina/ST-NLI-ca_paraphrase-multilingual-mpnet-base as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
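
The contrastive step depends on turning a small labeled set into sentence pairs. The following is a minimal sketch of that idea only, not SetFit's own sampler: texts with the same label form a pair with target similarity 1.0, and texts with different labels form a pair with target 0.0. The function name `make_contrastive_pairs` is hypothetical, and the sample texts are taken from the Model Labels table.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target_similarity) triples from labeled texts.

    Simplified illustration: the target is 1.0 when both texts share a label
    and 0.0 otherwise. SetFit's real sampler differs in detail.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


texts = [
    "Bona nit, com estàs?",
    "Ei, què tal tot?",
    "Quin és el propòsit de la llicència administrativa?",
]
labels = [1, 1, 0]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

Pairs like these are what the Sentence Transformer body is fine-tuned on. In the second step, the embeddings from the fine-tuned body become the input features for the classification head.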

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: projecte-aina/ST-NLI-ca_paraphrase-multilingual-mpnet-base
Classification head: a LogisticRegression instance
Maximum Sequence Length: 128 tokens
Number of Classes: 2

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
1	'Bona nit, com estàs?' 'Ei, què tal tot?' 'Hola, com està el temps?'
0	'Quin és el propòsit de la llicència administrativa?' 'Quin és el benefici de les subvencions per als infants?' "Què acredita el certificat d'empadronament col·lectiu?"

Evaluation

Metrics

Label	Accuracy
all	0.9978

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("adriansanz/greetings-v2")
# Run inference
preds = model("Salut, tanque's")
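
For several inputs at once, a small wrapper can pair each text with a readable label. This is a sketch under assumptions: the label names "greeting" and "other" are hypothetical and inferred from the examples in the Model Labels table, and the helper names `classify` and `load_model` are illustrative only. It relies on `SetFitModel.predict`, which accepts a list of strings.

```python
# Hypothetical names for the two classes, inferred from the Model Labels examples.
LABEL_NAMES = {1: "greeting", 0: "other"}


def classify(model, texts):
    """Predict a label for every text and return (text, label_name) tuples."""
    preds = model.predict(texts)
    return [(text, LABEL_NAMES[int(pred)]) for text, pred in zip(texts, preds)]


def load_model():
    """Download the model from the Hub (requires `pip install setfit`)."""
    from setfit import SetFitModel  # imported here so the helpers above stay dependency-free

    return SetFitModel.from_pretrained("adriansanz/greetings-v2")
```

Usage would look like `classify(load_model(), ["Bona nit, com estàs?", "Quin és el propòsit de la llicència administrativa?"])`.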

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	2	9.8187	23

Label	Training Sample Count
0	100
1	60

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (3, 3)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
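
Several values above are tuples. In SetFit's training arguments, a tuple sets the value for the embedding fine-tuning phase and the classifier training phase respectively, while a single value applies to both. The sketch below unpacks a subset of the listed hyperparameters per phase. The helper `split_by_phase` is hypothetical and not part of SetFit. Because the head here is a LogisticRegression instance and end_to_end is False, the classifier-phase values probably have little effect in this run.

```python
# Subset of the hyperparameters listed above.
HYPERPARAMETERS = {
    "batch_size": (16, 16),
    "num_epochs": (3, 3),
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "warmup_proportion": 0.1,
    "seed": 42,
}


def split_by_phase(hparams):
    """Return (embedding_phase, classifier_phase) dicts.

    Tuple values are split across the two phases; scalar values are shared.
    """
    embedding, classifier = {}, {}
    for name, value in hparams.items():
        if isinstance(value, tuple):
            embedding[name], classifier[name] = value
        else:
            embedding[name] = classifier[name] = value
    return embedding, classifier


embedding_phase, classifier_phase = split_by_phase(HYPERPARAMETERS)
print("embedding phase: ", embedding_phase)
print("classifier phase:", classifier_phase)
```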

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0012	1	0.2127	-
0.0581	50	0.1471	-
0.1163	100	0.0168	-
0.1744	150	0.001	-
0.2326	200	0.0004	-
0.2907	250	0.0002	-
0.3488	300	0.0001	-
0.4070	350	0.0001	-
0.4651	400	0.0001	-
0.5233	450	0.0001	-
0.5814	500	0.0001	-
0.6395	550	0.0001	-
0.6977	600	0.0001	-
0.7558	650	0.0	-
0.8140	700	0.0	-
0.8721	750	0.0	-
0.9302	800	0.0	-
0.9884	850	0.0	-
1.0465	900	0.0	-
1.1047	950	0.0	-
1.1628	1000	0.0	-
1.2209	1050	0.0	-
1.2791	1100	0.0	-
1.3372	1150	0.0	-
1.3953	1200	0.0	-
1.4535	1250	0.0	-
1.5116	1300	0.0	-
1.5698	1350	0.0	-
1.6279	1400	0.0	-
1.6860	1450	0.0	-
1.7442	1500	0.0	-
1.8023	1550	0.0	-
1.8605	1600	0.0	-
1.9186	1650	0.0	-
1.9767	1700	0.0	-
2.0349	1750	0.0	-
2.0930	1800	0.0	-
2.1512	1850	0.0	-
2.2093	1900	0.0	-
2.2674	1950	0.0	-
2.3256	2000	0.0	-
2.3837	2050	0.0	-
2.4419	2100	0.0	-
2.5	2150	0.0	-
2.5581	2200	0.0	-
2.6163	2250	0.0	-
2.6744	2300	0.0	-
2.7326	2350	0.0	-
2.7907	2400	0.0	-
2.8488	2450	0.0	-
2.9070	2500	0.0	-
2.9651	2550	0.0	-
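
The Epoch and Step columns imply 860 steps per epoch (for example, step 2150 falls at epoch 2.5). The sketch below shows one way to arrive at that figure from the training sample counts (100 and 60) and the batch size of 16. It rests on two assumptions about SetFit's internals that this card does not state: the pair sampler counts each sample paired with itself as a positive pair, and the oversampling strategy repeats the smaller pool of pairs until positives and negatives are balanced. The function name is hypothetical.

```python
from math import ceil, comb


def contrastive_steps_per_epoch(class_counts, batch_size):
    """Estimate steps per epoch for pair-based training with oversampling."""
    total = sum(class_counts)
    # Assumption: same-class pairs include each sample paired with itself.
    positives = sum(comb(n, 2) + n for n in class_counts)
    all_pairs = comb(total, 2) + total
    negatives = all_pairs - positives
    # Assumption: oversampling repeats the smaller pool to match the larger one.
    pairs_per_epoch = 2 * max(positives, negatives)
    return ceil(pairs_per_epoch / batch_size)


steps = contrastive_steps_per_epoch([100, 60], batch_size=16)
print(steps)                   # 860
print(round(2150 / steps, 4))  # 2.5, matching the epoch logged at step 2150
```

Under these assumptions there are 6880 positive and 6000 negative pairs, so one epoch covers 13760 pairs, which is far more than the 160 training sentences and explains why the training loss reaches zero within the first epoch.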

Framework Versions

Python: 3.10.12
SetFit: 1.1.0
Sentence Transformers: 3.2.1
Transformers: 4.44.2
PyTorch: 2.5.0+cu121
Datasets: 3.1.0
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
