---
base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
library_name: peft
---
# Model Card for Fine-Tuned DeepSeek V1 Empath
## Model Summary
Fine-Tuned DeepSeek V1 Empath is a large language model adapted from DeepSeek-R1-Distill-Llama-8B to enhance emotional understanding and generate needs-based responses. It is designed for use in psychology, therapy, conflict resolution, human-computer interaction, and online moderation.
## Model Details
### Model Description
- **Developed by:** AI Medical in collaboration with Ruslanmv.com
- **Shared by:** AI Medical
- **Model type:** Fine-tuned DeepSeek-R1-Distill-Llama-8B
- **Language(s) (NLP):** English
- **License:** Creative Commons Attribution 4.0 International License (CC BY 4.0)
- **Fine-tuned from model:** deepseek-ai/DeepSeek-R1-Distill-Llama-8B
### Model Sources
- **Repository:** [Hugging Face Model Repository](https://huggingface.co./ai-medical/fine_tuned_deepseek_v1_empathy)
- **Demo:** [Empathy Chatbot v1](https://huggingface.co./spaces/ruslanmv/Empathy_Chatbot_v1)
## Uses
### Direct Use
- **Psychology & Therapy:** Assisting professionals in understanding and responding empathetically to patient emotions.
- **Conflict Resolution:** Helping mediators decode emotional expressions and address underlying needs.
- **Human-Computer Interaction:** Enhancing chatbots and virtual assistants with emotionally aware responses.
- **Social Media Moderation:** Reducing toxicity and improving online discourse through need-based responses.
- **Education:** Supporting emotional intelligence training and communication skill development.
### Downstream Use
- Fine-tuning for specialized applications in mental health, conflict resolution, or AI-driven assistance.
- Integration into virtual therapists, mental health applications, and online support systems.
### Out-of-Scope Use
- Not a substitute for professional psychological evaluation or medical treatment.
- Not suitable for high-risk applications requiring absolute accuracy in emotional interpretation.
## Bias, Risks, and Limitations
- **Bias:** As with any NLP model, biases may exist due to the dataset and training methodology.
- **Risk of Misinterpretation:** Emotional expressions are subjective and may be misclassified in complex scenarios.
- **Generalization Limitations:** May not fully capture cultural and contextual variations in emotional expressions.
### Recommendations
Users should verify outputs before applying them in professional or high-stakes settings. Continuous evaluation and user feedback are recommended.
## How to Get Started with the Model
```python
from transformers import pipeline

# Load the fine-tuned model; the PEFT adapter is resolved automatically
# when the `peft` library is installed.
model_name = "ai-medical/fine_tuned_deepseek_v1_empathy"
generator = pipeline("text-generation", model=model_name)

# Generate an empathetic, needs-based response
prompt = "I feel betrayed."
response = generator(prompt, max_new_tokens=50)
print(response[0]["generated_text"])
```
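Since the repository is published as a PEFT adapter (note `library_name: peft` above), it can also be loaded explicitly with the `peft` library; a minimal sketch:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Loads the base model and applies the LoRA adapter in one call
model = AutoPeftModelForCausalLM.from_pretrained(
    "ai-medical/fine_tuned_deepseek_v1_empathy", torch_dtype=torch.float16
)
# Assumes tokenizer files ship with the adapter repo;
# otherwise load the tokenizer from the base model instead.
tokenizer = AutoTokenizer.from_pretrained("ai-medical/fine_tuned_deepseek_v1_empathy")

inputs = tokenizer("I feel betrayed.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```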
## Training Details
### Training Data
- **Dataset:** Annotated dataset mapping evaluative expressions to emotions and needs.
- **Annotations:** 1,500+ labeled examples linking expressions to emotional states and corresponding needs (an illustrative record is sketched below).
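The dataset schema is not published here; the following hypothetical record illustrates the expression-to-emotion-to-need mapping (all field names are assumptions):

```python
# Hypothetical record structure; the actual dataset schema may differ
record = {
    "expression": "You never listen to me.",  # evaluative expression
    "emotion": "frustration",                 # annotated emotional state
    "need": "to be heard and understood",     # annotated underlying need
}
```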
### Training Procedure
#### Preprocessing
- Tokenized using the Hugging Face `transformers` library (see the sketch after this list).
- Augmented with synonym variations and paraphrased sentences.
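A minimal tokenization sketch, assuming each example pairs an evaluative expression with a needs-based response (the field names are illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Llama-8B")

# Hypothetical training pair; the real dataset fields may differ
example = {
    "expression": "I feel betrayed.",
    "response": "It sounds like you are hurt and need trust and honesty.",
}

# Concatenate prompt and target for causal-LM fine-tuning
text = f"{example['expression']}\n{example['response']}"
tokens = tokenizer(text, truncation=True, max_length=512)
print(tokens["input_ids"][:10])
```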
#### Training Hyperparameters
- **Training regime:** Mixed-precision training using QLoRA (a setup sketch follows this list).
- **Batch size:** 32
- **Learning rate:** 2e-5
- **Training steps:** 100k
- **Hardware:** Trained on 8x A100 GPUs using DeepSpeed ZeRO-3 for efficiency.
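A minimal QLoRA setup sketch consistent with these hyperparameters; the LoRA rank, target modules, and output path are illustrative assumptions rather than the released configuration, and the DeepSpeed ZeRO-3 wiring is omitted:

```python
import torch
from transformers import (AutoModelForCausalLM, BitsAndBytesConfig,
                          TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Quantize the frozen base model to 4-bit (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Llama-8B", quantization_config=bnb_config
)
base = prepare_model_for_kbit_training(base)

# Attach trainable low-rank adapters; r and target_modules are assumptions
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(base, lora)

args = TrainingArguments(
    output_dir="deepseek_v1_empathy",  # illustrative path
    per_device_train_batch_size=32,    # batch size above (global vs. per-device split is an assumption)
    learning_rate=2e-5,                # learning rate above
    max_steps=100_000,                 # training steps above
    bf16=True,                         # mixed precision
)
```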
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
- Held-out dataset containing unseen evaluative expressions.
#### Factors
- Performance across different emotional expression categories.
- Sensitivity to nuanced phrasing and variations.
#### Metrics
- **Accuracy:** Measures correct classification of emotions and needs.
- **Precision & Recall:** Precision measures how often predicted emotions are correct; recall measures how many true emotions are captured.
- **F1-Score:** Harmonic mean of precision and recall (a computation sketch follows this list).
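For reference, these metrics can be computed with scikit-learn; a minimal sketch over placeholder labels (not the actual evaluation data):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Placeholder gold and predicted emotion labels, for illustration only
y_true = ["hurt", "anger", "sadness", "hurt", "fear"]
y_pred = ["hurt", "anger", "hurt", "hurt", "fear"]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.3f}  precision={precision:.3f}  "
      f"recall={recall:.3f}  f1={f1:.3f}")
```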
### Results
- **Accuracy:** 89.5%
- **F1-Score:** 87.2%
- **Latency:** <500 ms per response
## Environmental Impact
- **Hardware Type:** A100 GPUs
- **Training Time:** 120 hours
- **Carbon Emitted:** Estimated using [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute).
## Technical Specifications
### Model Architecture and Objective
- **Base model:** DeepSeek-R1-Distill-Llama-8B
- **Fine-tuning method:** QLoRA for parameter-efficient training (a deployment sketch follows below).
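Since the repository ships LoRA adapter weights (per `library_name: peft`), the adapter can optionally be folded into the base weights for standalone deployment; a minimal sketch:

```python
import torch
from peft import AutoPeftModelForCausalLM

# Load the base model plus adapter, then merge the LoRA weights into the base
model = AutoPeftModelForCausalLM.from_pretrained(
    "ai-medical/fine_tuned_deepseek_v1_empathy", torch_dtype=torch.float16
)
merged = model.merge_and_unload()
merged.save_pretrained("deepseek_v1_empathy_merged")  # illustrative path
```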
### Compute Infrastructure
- **Hardware:** AWS spot instances (8x A100 GPUs)
- **Software:** Hugging Face `transformers`, DeepSpeed, PyTorch
## Citation
If you use this model, please cite:
```bibtex
@misc{ai-medical_2025,
  author       = {{AI Medical} and {ruslanmv.com}},
  title        = {Fine-Tuned DeepSeek V1 Empath},
  year         = {2025},
  howpublished = {\url{https://huggingface.co./ai-medical/fine_tuned_deepseek_v1_empathy}}
}
```
## More Information
- **Model Card Authors:** AI Medical Team, ruslanmv.com
- **Framework Versions:** PEFT 0.14.0