SetFit with mini1013/master_domain

This is a SetFit model for text classification. It uses mini1013/master_domain as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
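
The contrastive step can be illustrated with a minimal sketch of how labeled examples become sentence pairs. This is not SetFit's own sampling code: `make_contrastive_pairs` and the toy texts are hypothetical, and the sketch enumerates every pair rather than sampling them. Pairs that share a label get a target similarity of 1.0 and pairs with different labels get 0.0, which is the kind of target a cosine-similarity loss is trained against.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for every unordered pair.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration only; SetFit samples pairs
    instead of enumerating all of them.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data (made up, label values are arbitrary): two texts of class 0, one of class 1
texts = ["lemon shampoo 750ml", "no-rinse dry shampoo", "cherry blossom conditioner 1L"]
labels = [0, 0, 1]
pairs = make_contrastive_pairs(texts, labels)

for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

The Sentence Transformer is then fine-tuned so that embeddings of same-label pairs move closer together and embeddings of different-label pairs move apart; the classification head is fitted afterwards on those embeddings.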

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: mini1013/master_domain
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 8

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
6.0	'CHI 실크 인퓨전 12 Fl oz (관부가세포함) 옵션없음 제이글로벌컴퍼니' '아모스 리페어 샤인 모이스트 에센스 100ml 옵션없음 티비' 'BAO H LAB Hair Loss Care Ampoule 바오에이치랩 탈모케어앰플 옵션없음 주식회사 바오젠'
7.0	'커리쉴 프레스티지 실키 3종 옵션없음 (주)커리쉴' '미쟝센 퍼펙트 매직 스트레이트 샴푸&트리트먼트&세럼 3종 세트+트리트먼트 30ml 아모레퍼시픽' '[르도암 공식]르도암 카멜리아 헤어 2종 세트(샴푸+트리트먼트) LEDOAM1935'
0.0	'실키드 검은콩 코팅 탈모펜슬™ / 머리숱앰플 두피앰플 산후탈모 서리태 비건 에센스 홈 1개 (1개월) 탈모펜슬™ 주식회사 팀오브라만차(Team of la mancha Corp.)' '에버미라클 200ml EM 풀라무 토너 스칼프 토닉 8W98E7F225 옵션없음 파워몰' '포티샤 모발강화 두피세럼 100ml/르네휘테르 옵션없음 롯데쇼핑(주)'
4.0	'[클렌징대전(클렌징밤 )] 로픈 바오밥 세라마이드LPP 프리미엄 헤어트리트먼트 베이비파우더향 1000g 옵션없음 (주)우신뷰티' '허벌리스테 헤어 리페어세럼 150ml 1개 + 헤어 마스크 500ml - 1개 옵션없음 복슬강아지' '[백화점 정품] 모로칸오일 오리지널 오일 트리트먼트 100ml 제3자 배송관련 개인정보활용에 동의함 버니버즈'
2.0	'헤드앤숄더 시트러스 레몬 샴푸 750ml 옵션없음 포에이치제이' '아렌 일진 산성샴푸펌컬러 1000ml 옵션없음 해문인터내셔널' '물없이쓰는샴푸 물없이머리감는 입원준비물 노워시 옵션없음 해피2데이'
5.0	'바이오테닉스 홈케어 매직헬프 바이-페이즈 리컨디셔너 60ml 비너스 클리닉 옵션없음 주식회사 위즈온컴퍼니' '[바이레도] 블랑쉬 헤어퍼퓸 75ml 화이트_F 푸치코리아 유한책임회사' '바이레도 집시 워터 헤어퍼퓸 75ml 백화점 상품 옵션없음 코코스팜'
1.0	'케라시스린스 퍼퓸 체리블라썸 1000ml 옵션없음 땡그리나' '[갤러리아] [비건 NEW] 진저 스캘프 케어 대용량 컨디셔너 400ML(한화갤러리아㈜ 광교점) 옵션없음 한화갤러리아(주)' '케라시스 스위트 앤 플라워리 퍼퓸 린스 1L 옵션없음 해피쭈몰'
3.0	'모비88 아데노신 특허등록 탈모토닉 볼륨업 비듬 제거 옵션없음 달이커머스' '힐텀 어성초 맥주효모 토닉 120ml 옵션없음 현스 마켓' '닥터포헤어 폴리젠 토닉 120ml x 2개 두피 영양공급 탈모증상완화 영양제 코스트코 옵션없음 또또상회'

Evaluation

Metrics

Label	Accuracy
all	0.6042

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mini1013/master_cate_bt12_test")
# Run inference on a single product title; the result is one of the eight labels listed above
preds = model("수앤 오리진 블랙 단백질샴푸700ml,4개 옵션없음 다부자")
print(preds)
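
For several titles at once, a small wrapper along the following lines may be convenient. It is a sketch, not part of the SetFit API: `classify_batch` and `_to_python` are hypothetical helpers, and the only assumption about `model` is that it exposes `predict` and `predict_proba` accepting a list of strings, as `SetFitModel` does.

```python
def _to_python(value):
    """Convert a 0-d tensor or NumPy scalar to a plain Python value."""
    return value.item() if hasattr(value, "item") else value


def classify_batch(model, texts):
    """Return (text, predicted_label, confidence) for each input text.

    confidence is the highest class probability reported by the
    classification head for that text.
    """
    labels = model.predict(texts)
    probabilities = model.predict_proba(texts)
    results = []
    for text, label, row in zip(texts, labels, probabilities):
        confidence = max(float(p) for p in row)
        results.append((text, _to_python(label), confidence))
    return results
```

With the model loaded as above, `classify_batch(model, titles)` returns one tuple per title in the hypothetical list `titles`. The confidence value makes it possible to route uncertain predictions to manual review.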

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	4	9.25	21

Label	Training Sample Count
0.0	12
1.0	23
2.0	19
3.0	14
4.0	18
5.0	20
6.0	28
7.0	18

Training Hyperparameters

batch_size: (512, 512)
num_epochs: (50, 50)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 60
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
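
As a rough guide to reproducing this configuration, the values above could be passed to `setfit.TrainingArguments` as sketched below. This is an assumption about how the list maps onto the API, not the original training script. The names `HYPERPARAMETERS` and `build_training_arguments` are hypothetical. `loss` and `distance_metric` are omitted on the assumption that CosineSimilarityLoss and cosine_distance are the library defaults. Tuple values are understood to apply to the embedding fine-tuning phase and the classifier training phase respectively.

```python
# Values copied from the hyperparameter list above.
HYPERPARAMETERS = {
    "batch_size": (512, 512),
    "num_epochs": (50, 50),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "num_iterations": 60,
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "l2_weight": 0.01,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments():
    """Build a setfit.TrainingArguments object from the values above.

    SetFit is imported lazily so the dictionary can be inspected
    without the library installed.
    """
    from setfit import TrainingArguments

    return TrainingArguments(**HYPERPARAMETERS)
```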

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0556	1	0.4865	-
2.7778	50	0.3392	-
5.5556	100	0.0584	-
8.3333	150	0.0087	-
11.1111	200	0.003	-
13.8889	250	0.0002	-
16.6667	300	0.0001	-
19.4444	350	0.0001	-
22.2222	400	0.0001	-
25.0	450	0.0001	-
27.7778	500	0.0001	-
30.5556	550	0.0	-
33.3333	600	0.0	-
36.1111	650	0.0	-
38.8889	700	0.0	-
41.6667	750	0.0	-
44.4444	800	0.0	-
47.2222	850	0.0	-
50.0	900	0.0	-
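
The Epoch and Step columns are related by simple arithmetic, sketched below using only values from the table and the hyperparameter list. `epoch_at` is a hypothetical helper. The final figure is an upper bound that assumes each step consumes one full batch of 512 sentence pairs.

```python
total_steps = 900      # last Step value in the table
total_epochs = 50.0    # last Epoch value in the table

steps_per_epoch = total_steps / total_epochs


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps, rounded as in the table."""
    return round(step / steps_per_epoch, 4)


batch_size = 512  # embedding-phase batch size from the hyperparameters
# Upper bound, assuming every step processes one full batch of pairs
max_pairs_per_epoch = int(steps_per_epoch) * batch_size

print(steps_per_epoch, epoch_at(1), epoch_at(50), max_pairs_per_epoch)
```

The training loss falls below 0.001 by step 250, roughly epoch 14, so most of the remaining steps add little according to the logged values.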

Framework Versions

Python: 3.10.12
SetFit: 1.1.0
Sentence Transformers: 3.3.1
Transformers: 4.44.2
PyTorch: 2.2.0a0+81ea7a4
Datasets: 3.2.0
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
