SetFit with mini1013/master_domain

This is a SetFit model for text classification. It uses mini1013/master_domain as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
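The first step can be pictured as turning a handful of labeled texts into sentence pairs: same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0, which is the kind of supervision a cosine-similarity loss works with. The following is a minimal sketch of that pair construction only, in plain Python. The helper `make_pairs` and the toy texts are hypothetical, and SetFit's own sampler differs in detail (for example, it balances and oversamples pairs).

```python
from itertools import combinations


def make_pairs(texts, labels):
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Toy data (hypothetical, not taken from the training set).
texts = ["bb cream 50g", "cc cream 30g", "stick primer 10g", "primer 30ml"]
labels = [0, 0, 6, 6]

pairs = make_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))  # prints: 6 2 4
```

In the second step, the fine-tuned body embeds each training text once, and those embeddings become the feature vectors for the LogisticRegression head.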

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: mini1013/master_domain
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 7 classes

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
0.0	'콜라겐 비비크림 50g 23호 옵션없음 심완태' '본체청정 물광 커버력 좋은 재생 톤업 bb 비비 크림 연 퍼펙트 매직 50ml 옵션없음 에테르' '빈토르테 미네랄 CC크림 자외선차단 SPF50+ 30g 옵션없음 토스토'
3.0	'바비브라운 코렉터 1.4g 피치 비스크 호이컴퍼니' '더샘 커버 퍼펙션 트리플 팟 컨실러 5colors 04 톤업 베이지 주식회사 더샘인터내셔날' '티핏 tfit 커버 업 프로 컨실러 15G 03 쿨 티핏클래스 주식회사'
1.0	'누즈 케어 톤업 30ml(SPF50+) 옵션없음 달토끼네멋진마켓' 'MAC 맥 스트롭 크림 50ml 피치라이트 호이컴퍼니' '더후 공진향 미 럭셔리 선베이스 45ml33881531 옵션없음 씨플랩몰'
5.0	'에이지투웨니스 벨벳 래스팅 팩트 14g + 14g(리필, SPF50+) 미디움베이지 위브로5' '메리쏘드 릴커버 멜팅팩트 본품 11g + 리필 11g +퍼프2개 내추럴베이지(본품+리필)+퍼프2개 주식회사 벨라솔레' '퓌 쿠션 스웨이드 15g(SPF50+) 누드스웨이드(03) 강원상회'
4.0	'쥬리아 루나리스 실키 핏 스킨카바 23호리필내장 옵션없음 에테르노' 'Almay 프레스드 파우더 올 세트 노 샤인, 마이 베스트 라이트, [100] 0.20 oz 옵션없음 케이피스토어' '철벽보습커버 21호 리필내장 쥬얼성분배합 투웨이케익 옵션없음 후니후니003'
6.0	'VDL 루미레이어 프라이머 30ml 옵션없음 페퍼파우더' '어바웃톤 블러 래스팅 스틱 프라이머 10g AT.블러 래스팅 스틱 프라이머 (주)삐아' '로라 메르시에 퓨어 캔버스 프라이머 25ml - 트래블 사이즈 하이드레이팅 고온누리'
2.0	'후 공진향 미 럭셔리 비비 스페셜 세트 267578 옵션없음 펀펀마켓' '케이트 리얼 커버 리퀴드 파운데이션 세미 매트 + 스틱컨실러 A 세트 케이트' '커버력높은 쿠션팩트 승무원팩트 본품+리필 or 광채CC크림 2종세트 SPF 50+ 뷰디아니'

Evaluation

Metrics

Label	Accuracy
all	0.7155

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mini1013/master_cate_bt4_test")
# Run inference
preds = model("나스 래디언스 프라이머 30ml(SPF35) 옵션없음 블루밍컴퍼니")
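For classifying several product titles at once, a small wrapper along the following lines may be convenient. It is a sketch, not part of the model's API: `predict_batch` and `load_model` are hypothetical helper names, and the conversion to `float` assumes the numeric labels (0.0 to 6.0) listed in this card. `SetFitModel.predict` is the library method that `model(...)` calls under the hood.

```python
def predict_batch(model, texts):
    """Return (text, predicted_label) pairs for a list of product titles.

    `model` is expected to behave like a loaded SetFitModel, i.e. expose
    a `predict` method that accepts a list of strings.
    """
    preds = model.predict(texts)
    # Assumes numeric labels, as in this model card.
    return [(text, float(pred)) for text, pred in zip(texts, preds)]


def load_model(model_id="mini1013/master_cate_bt4_test"):
    """Load the model from the Hub (needs `pip install setfit` and network access)."""
    # Imported lazily so that predict_batch itself has no hard dependency.
    from setfit import SetFitModel

    return SetFitModel.from_pretrained(model_id)
```

Calling `predict_batch(load_model(), [...])` with a list of titles then yields one label per title, in input order.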

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	5	9.7872	19
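Word-count statistics of this kind can be recomputed from raw texts with a few lines of Python. The sketch below assumes that a "word" is a whitespace-separated token, which is an assumption about how the figures were produced. A fractional value such as 9.7872 is what an arithmetic mean yields, so the sketch reports the mean alongside the median. The helper name `word_count_stats` is hypothetical, and the three sample titles are taken from the Model Labels table.

```python
import statistics


def word_count_stats(texts):
    """Summarise whitespace-delimited word counts over a list of texts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "max": max(counts),
    }


samples = [
    "VDL 루미레이어 프라이머 30ml 옵션없음 페퍼파우더",
    "MAC 맥 스트롭 크림 50ml 피치라이트 호이컴퍼니",
    "후 공진향 미 럭셔리 비비 스페셜 세트 267578 옵션없음 펀펀마켓",
]

stats = word_count_stats(samples)
print(stats)
```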

Label	Training Sample Count
0.0	19
1.0	21
2.0	10
3.0	19
4.0	28
5.0	23
6.0	21

Training Hyperparameters

batch_size: (512, 512)
num_epochs: (50, 50)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 60
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
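To reuse these settings, the values can be collected under the field names of `setfit.TrainingArguments`, as in the sketch below. Tuple values give one setting for the embedding fine-tuning phase and one for the classifier training phase. The names `HYPERPARAMETERS` and `build_training_arguments` are hypothetical. `loss` is passed as the class itself, and `distance_metric` is left at the library default, which is assumed here to be the cosine_distance reported above. As far as the SetFit documentation describes them, `margin` and `distance_metric` matter only for triplet losses, and `head_learning_rate`, `l2_weight` and `end_to_end` only for a differentiable head, so they likely had no effect on this LogisticRegression setup.

```python
# Hyperparameters as listed in this card, keyed by TrainingArguments field names.
# Tuples are (embedding fine-tuning phase, classifier training phase).
HYPERPARAMETERS = {
    "batch_size": (512, 512),
    "num_epochs": (50, 50),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "num_iterations": 60,
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "l2_weight": 0.01,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments():
    """Build a TrainingArguments object (needs setfit and sentence-transformers)."""
    # Imported lazily so the dictionary above can be inspected without the libraries.
    from sentence_transformers.losses import CosineSimilarityLoss
    from setfit import TrainingArguments

    # distance_metric is not set here; the library default is assumed to match.
    return TrainingArguments(loss=CosineSimilarityLoss, **HYPERPARAMETERS)
```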

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0588	1	0.499	-
2.9412	50	0.3295	-
5.8824	100	0.0469	-
8.8235	150	0.0217	-
11.7647	200	0.0013	-
14.7059	250	0.0001	-
17.6471	300	0.0001	-
20.5882	350	0.0	-
23.5294	400	0.0	-
26.4706	450	0.0	-
29.4118	500	0.0	-
32.3529	550	0.0	-
35.2941	600	0.0	-
38.2353	650	0.0	-
41.1765	700	0.0	-
44.1176	750	0.0	-
47.0588	800	0.0	-
50.0	850	0.0	-
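The Epoch and Step columns are tied together by simple arithmetic: 850 steps over 50 epochs is 17 steps per epoch, and step 1 therefore falls at epoch 1/17 ≈ 0.0588. The short check below reproduces this from the numbers in this card. Its last part rests on an assumption that the source does not confirm, namely that one epoch covers num_iterations × (number of training samples) pairs; that figure happens to give the same 17 steps at batch size 512.

```python
import math

# Values read from the Training Results table and the hyperparameters.
total_steps = 850
num_epochs = 50
steps_per_epoch = total_steps // num_epochs          # 17
first_logged_epoch = round(1 / steps_per_epoch, 4)   # 0.0588

# Training sample counts per label, from the table in this card.
train_samples = 19 + 21 + 10 + 19 + 28 + 23 + 21     # 141
num_iterations = 60
batch_size = 512

# Assumption (not confirmed by the source): pairs per epoch.
assumed_pairs = num_iterations * train_samples        # 8460
assumed_steps = math.ceil(assumed_pairs / batch_size)  # 17

print(steps_per_epoch, first_logged_epoch, train_samples, assumed_steps)
```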

Framework Versions

Python: 3.10.12
SetFit: 1.1.0
Sentence Transformers: 3.3.1
Transformers: 4.44.2
PyTorch: 2.2.0a0+81ea7a4
Datasets: 3.2.0
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
