SetFit with dunzhang/stella_en_400M_v5

This is a SetFit model for text classification. It uses dunzhang/stella_en_400M_v5 as the Sentence Transformer embedding model and a SetFitHead instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
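
The first step relies on turning a handful of labeled texts into many sentence pairs. The snippet below is a minimal sketch of that idea, not SetFit's actual sampler: the function name `make_contrastive_pairs` and the toy texts are hypothetical, and it simply enumerates every unordered pair, marking same-label pairs as similar (1.0) and different-label pairs as dissimilar (0.0).

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def make_contrastive_pairs(
    texts: Sequence[str], labels: Sequence[int]
) -> List[Tuple[str, str, float]]:
    """Build (text_a, text_b, target) triples from few-shot labeled data.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration; SetFit's own sampling differs.
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Toy data (hypothetical), two examples per class.
toy_texts = ["doctor absent", "nurse absent", "minister charged", "official arrested"]
toy_labels = [1, 1, 0, 0]
toy_pairs = make_contrastive_pairs(toy_texts, toy_labels)
```

Four labeled texts already yield six pairs, which is why contrastive fine-tuning can work with very few examples per class.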

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: dunzhang/stella_en_400M_v5
  • Classification head: a SetFitHead instance
  • Maximum Sequence Length: 512 tokens
  • Number of Classes: 2

Model Sources

Model Labels

Label Examples
1
  • 'Lucknow: Deputy CM Brajesh Pathak recommends dismissal of 17 govt doctors for absenteeism LUCKNOW: State govt has recommended the dismissal of 17 medical officers after they were found absent from duty for several months. In addition to this, disciplinary action has been ordered against three medical officers.The order was issued by deputy CM Brajesh Pathak who also holds the charge of health and medical education departments, said a govt spokesman on Thursday. In his order, Pathak stated: "No doctor or health worker who is negligent in medical services will be forgiven." tnn 'Committed to high-level health services'Strict action will be taken against them. The state is committed to providing high-level health services to the people and no laxity on the count will be tolerated," Pathak stated. Three doctors who will face disciplinary action are Dr Mukul Mishra, orthopedic specialist of District Hospital, Jhansi; Dr Madhavi Singh, ophthalmologist posted at Community Health Centre, Fatehpur, Barabanki and Dr Pramod Kumar Sharma under Chief Medical Officer, Bareilly.'
  • "Kerala model therapy: Govt gives 56 absentee doctors 'show-cause pill' Thiruvananthapuram: The state health and family welfare department has issued show-cause notice to 56 doctors who have been on unauthorised absence in various medical colleges and pharmacy colleges in Kerala. In the notice issued by Rajan Khobragade, additional chief secretary, health and family welfare department, the doctors have been directed to report for duty before the ACS at the secretariat within 15 days."
  • '42% of Nigerian Doctors, Nurse Demand Bribes Before Attending to Patients - NBS Reports The National Bureau of Statistics (NBS) recently published a report titled "NBS Corruption in Nigeria: Patterns and Trend" for 2023, revealing concerning statistics about corruption in the healthcare sector. According to the report, two-thirds of Nigerian doctors, nurses, and midwives demand bribes from patients before providing treatment. Additionally, 42 percent of these health workers accept bribes to expedite procedures, while 15 percent take bribes to ensure the completion of medical procedures. It, however, added that 11 per cent were paid bribes as a "sign of appreciation," which still reflects the purpose of gratification for the healthcare service they received. "As for doctors, nurses and midwives, 11 per cent of bribes were paid as a sign of appreciation, possibly reflecting gratitude for the care received," it stated. The report comes as Nigerians have continued to raise concerns over poor quality health services in the country. With these concerns, a shortage of health workers continues to plague the health system even as practitioners travel abroad to seek better welfare with the "japa syndrome." The NBS report, in collaboration with the United Nations Office on Drugs and Crimes (UNODC), also revealed how Nigerian public officials received nothing less than N721 billion as bribes in 2023'
0
  • 'Malta's former prime minister charged with corruption over hospital scandal Malta's former prime minister Joseph Muscat has been charged with corruption in a hospital privatisation scandal that was once investigated by the murdered investigative journalist Daphne Caruana Galizia. Muscat has been charged with accepting bribes, corruption in public office and money laundering, according to documents seen by AFP. He has described the allegations as "fantasies and lies" and said he was the victim of a political vendetta. Chris Fearne, Malta's deputy prime minister, who is tipped to become Malta's next European commissioner, and the country's former finance minister Edward Scicluna, who is now the governor of Malta's central bank, were charged with fraud, misappropriation and fraudulent gain.'
  • "US Supreme Court gives pharma companies a chance to thwart terrorism-funding lawsuit 21 pharmaceutical and medical equipment companies, including AstraZeneca, Pfizer, GE Healthcare USA, Johnson & Johnson, and F. Hoffmann-La Roche, are accused of illegally helping to fund terrorism in Iraq by providing corrupt payments to the Hezbollah-sponsored militia group Jaysh al-Mahdi to obtain medical supply contracts from Iraq's health ministry. The lawsuit seeks unspecified damages under the Anti-Terrorism Act."
  • 'Health Ministry Official Arrested in Procurement Scandal JAKARTA - Indonesian authorities have arrested a high-ranking Health Ministry official on suspicion of corruption in medical equipment procurement. Agus Sutiyo, 52, Director of Medical Supplies, is accused of accepting bribes totaling $1.2 million from suppliers in exchange for awarding inflated contracts. The Corruption Eradication Commission (KPK) alleges that Sutiyo manipulated tender processes, favoring companies that offered kickbacks. The scheme reportedly cost the government an estimated $10 million in overpayments. KPK spokesperson Febri Diansyah stated, "This case undermines public trust and diverts crucial resources from healthcare services." Sutiyo faces up to 20 years in prison if convicted.'

Evaluation

Metrics

Label  Accuracy
all    0.7778
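
For reference, accuracy is the fraction of predictions that match the reference labels. The sketch below shows that computation on hypothetical toy labels; the `accuracy` helper is illustrative and the evaluation data behind the reported 0.7778 is not part of this card.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], references: Sequence[int]) -> float:
    """Fraction of positions where the prediction equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if len(references) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for p, r in zip(predictions, references) if p == r)
    return correct / len(references)


# Hypothetical toy labels, not the model's real evaluation data.
toy_score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

A score of 0.7778 would result from, for example, 7 correct predictions out of 9, but the size of the evaluation set is not stated here.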

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("twright8/news_cats_2")
# Run inference
preds = model("Global Coffee Prices Surge Amid Brazilian Drought Coffee futures hit a five-year high today as severe drought continues to ravage Brazil's coffee-growing regions. Experts warn consumers may see significant price increases in coming months.")
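
To classify several articles at once, the model's `predict` method accepts a list of strings. The following is a minimal sketch under that assumption; the helper names `load_model` and `classify_batch` are hypothetical, and `setfit` is imported lazily so the helpers can be defined without the library installed.

```python
from typing import List, Sequence


def load_model(model_id: str = "twright8/news_cats_2"):
    """Download the model from the Hub (requires setfit and network access)."""
    from setfit import SetFitModel  # lazy import

    return SetFitModel.from_pretrained(model_id)


def classify_batch(model, texts: Sequence[str]) -> List[int]:
    """Return one integer label (0 or 1 for this model) per input text."""
    preds = model.predict(list(texts))
    # predict may return a tensor or array; convert each element to a plain int.
    return [int(p) for p in preds]
```

Usage would look like `labels = classify_batch(load_model(), ["first article ...", "second article ..."])`.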

Training Details

Training Set Metrics

Training set  Min  Median    Max
Word count    55   153.8462  290

Label  Training Sample Count
0      13
1      13
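
Word-count statistics like those above can be computed by splitting each training text on whitespace. This is a minimal sketch with a hypothetical helper, `word_count_stats`, run on toy strings rather than the real training set.

```python
from statistics import mean, median
from typing import Dict, Sequence


def word_count_stats(texts: Sequence[str]) -> Dict[str, float]:
    """Min, median, mean, and max of whitespace-separated word counts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "median": median(counts),
        "mean": mean(counts),
        "max": max(counts),
    }


# Hypothetical toy texts with 2, 3, 3, and 6 words.
toy_stats = word_count_stats(["a b", "a b c", "d e f", "a b c d e f"])
```

One observation, offered with caution: with 13 + 13 = 26 training samples, the median of integer word counts would be a whole or half number, so the reported 153.8462 looks more like a mean (4000 / 26 ≈ 153.8462) than a median.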

Training Hyperparameters

  • batch_size: (1, 1)
  • num_epochs: (3, 17)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (9.629116538858926e-05, 2.651259436793277e-05)
  • head_learning_rate: 0.02145586669240117
  • loss: CoSENTLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: True
  • use_amp: True
  • warmup_proportion: 0.1
  • max_length: 512
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
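
Several of the values above are tuples. In SetFit's training arguments, a tuple for `batch_size`, `num_epochs`, or `body_learning_rate` gives one value for the embedding fine-tuning phase and one for the classifier training phase, in that order. The sketch below unpacks them into per-phase settings; `PhaseConfig` and `split_phases` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PhaseConfig:
    """Hyperparameters that apply to a single training phase."""

    batch_size: int
    num_epochs: int
    body_learning_rate: float


def split_phases(
    batch_size: Tuple[int, int],
    num_epochs: Tuple[int, int],
    body_learning_rate: Tuple[float, float],
) -> Tuple[PhaseConfig, PhaseConfig]:
    """Return (embedding_phase, classifier_phase) settings."""
    embedding = PhaseConfig(batch_size[0], num_epochs[0], body_learning_rate[0])
    classifier = PhaseConfig(batch_size[1], num_epochs[1], body_learning_rate[1])
    return embedding, classifier


# Values copied from the hyperparameter list above.
embedding_phase, classifier_phase = split_phases(
    batch_size=(1, 1),
    num_epochs=(3, 17),
    body_learning_rate=(9.629116538858926e-05, 2.651259436793277e-05),
)
```

Read this way, the embedding body was fine-tuned for 3 epochs and the classifier phase ran for 17 epochs, both with a batch size of 1.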

Training Results

Epoch   Step  Training Loss  Validation Loss
0.0027  1     0.0            -
0.0549  20    0.0            0.0
0.1099  40    0.0            0.0
0.1648  60    0.0            0.0
0.2198  80    0.0            0.0
0.2747  100   0.0            0.0
0.3297  120   0.0            0.0
  • The bold row denotes the saved checkpoint.
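
The Epoch and Step columns together imply how many optimizer steps make up one epoch. The sketch below estimates that number from the logged rows by dividing each step by its epoch fraction; the first row is skipped because its epoch value (0.0027) is rounded too coarsely to be useful. The variable names are illustrative.

```python
from typing import List, Tuple

# (step, epoch fraction) pairs copied from the training results table.
logged_rows: List[Tuple[int, float]] = [
    (20, 0.0549),
    (40, 0.1099),
    (60, 0.1648),
    (80, 0.2198),
    (100, 0.2747),
    (120, 0.3297),
]

# Each row gives an independent estimate of steps per epoch.
estimates = [round(step / epoch) for step, epoch in logged_rows]

# All rows agree once rounded, so take the common value.
steps_per_epoch = estimates[0]
```

Every row points to roughly 364 steps per epoch, which with a batch size of 1 would correspond to about 364 sentence pairs per epoch in the embedding phase.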

Framework Versions

  • Python: 3.10.13
  • SetFit: 1.0.3
  • Sentence Transformers: 3.0.1
  • Transformers: 4.39.0
  • PyTorch: 2.3.0+cu121
  • Datasets: 2.20.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 434M params (Safetensors, tensor type F32)