---
base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
library_name: peft
---

# Model Card for Fine-Tuned DeepSeek V1 Empath

## Model Summary

Fine-Tuned DeepSeek V1 Empath is a large language model fine-tuned to enhance emotional understanding and generate needs-based responses. This model is designed for use in psychology, therapy, conflict resolution, human-computer interaction, and online moderation.

## Model Details

### Model Description

- **Developed by:** AI Medical in collaboration with Ruslanmv.com
- **Shared by:** AI Medical
- **Model type:** Causal language model; PEFT (QLoRA) fine-tune of DeepSeek-R1-Distill-Llama-8B
- **Language(s) (NLP):** English
- **License:** Creative Commons Attribution 4.0 International License (CC BY 4.0)
- **Fine-tuned from model:** deepseek-ai/DeepSeek-R1-Distill-Llama-8B

### Model Sources

- **Repository:** [Hugging Face Model Repository](https://huggingface.co./ai-medical/fine_tuned_deepseek_v1_empathy)
- **Demo:** [Empathy Chatbot v1](https://huggingface.co./spaces/ruslanmv/Empathy_Chatbot_v1)

## Uses

### Direct Use

- **Psychology & Therapy:** Assisting professionals in understanding and responding empathetically to patient emotions.
- **Conflict Resolution:** Helping mediators decode emotional expressions and address underlying needs.
- **Human-Computer Interaction:** Enhancing chatbots and virtual assistants with emotionally aware responses.
- **Social Media Moderation:** Reducing toxicity and improving online discourse through needs-based responses.
- **Education:** Supporting emotional intelligence training and communication skill development.

### Downstream Use

- Fine-tuning for specialized applications in mental health, conflict resolution, or AI-driven assistance.
- Integration into virtual therapists, mental health applications, and online support systems.

### Out-of-Scope Use

- Not a substitute for professional psychological evaluation or medical treatment.
- Not suitable for high-risk applications requiring absolute accuracy in emotional interpretation.

## Bias, Risks, and Limitations

- **Bias:** As with any NLP model, biases may exist due to the dataset and training methodology.
- **Risk of Misinterpretation:** Emotional expressions are subjective and may be misclassified in complex scenarios.
- **Generalization Limitations:** May not fully capture cultural and contextual variations in emotional expression.

### Recommendations

Users should verify outputs before applying them in professional or high-stakes settings. Continuous evaluation and user feedback are recommended.

## How to Get Started with the Model

```python
from transformers import pipeline

# Recent transformers releases can resolve a PEFT adapter repo directly,
# pulling in the base model automatically (requires `peft` to be installed).
model_name = "ai-medical/fine_tuned_deepseek_v1_empathy"
generator = pipeline("text-generation", model=model_name)

prompt = "I feel betrayed."
# max_new_tokens bounds the generated continuation rather than
# the combined prompt + output length.
response = generator(prompt, max_new_tokens=50)
print(response[0]["generated_text"])
```
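
Because this repository ships a PEFT adapter (see `library_name: peft` above), you can also load the base model and attach the adapter explicitly. A minimal sketch, assuming the adapter weights live in this repo and `peft` is installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
adapter_id = "ai-medical/fine_tuned_deepseek_v1_empathy"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach the fine-tuned LoRA adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("I feel betrayed.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```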

## Training Details

### Training Data

- **Dataset:** Annotated dataset mapping evaluative expressions to emotions and needs.
- **Annotations:** 1,500+ labeled examples linking expressions to emotional states and corresponding needs.

### Training Procedure

#### Preprocessing

- Tokenized with the Hugging Face `transformers` library (see the sketch after this list).
- Augmented with synonym variations and paraphrased sentences.
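
For illustration only, a sketch of the tokenization step described above; the example pair, its annotation format, and the maximum length are assumptions, not the published pipeline:

```python
from transformers import AutoTokenizer

# Tokenizer of the base model; the fine-tune reuses its vocabulary.
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Llama-8B")

# Hypothetical annotated pair: evaluative expression -> emotion and need.
example = "You never listen to me. -> feeling: hurt; need: to be heard"
encoded = tokenizer(example, truncation=True, max_length=512)
print(len(encoded["input_ids"]), "tokens")
```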

#### Training Hyperparameters

- **Training regime:** Mixed-precision training using QLoRA (a configuration sketch follows this list)
- **Batch size:** 32
- **Learning rate:** 2e-5
- **Training steps:** 100k
- **Hardware:** Trained on 8x A100 GPUs using DeepSpeed ZeRO-3 for efficiency.
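
The exact LoRA rank, target modules, and quantization settings were not published; the sketch below shows what a QLoRA setup for this base model might look like with `transformers`, `peft`, and `bitsandbytes`. All LoRA-specific values are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Quantize the frozen base model to 4-bit NF4 (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# Illustrative LoRA settings; the values actually used were not released.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```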

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

- Held-out dataset containing unseen evaluative expressions.

#### Factors

- Performance across different emotional expression categories.
- Sensitivity to nuanced phrasing and variations.

#### Metrics

- **Accuracy:** Fraction of expressions whose emotions and needs are classified correctly.
- **Precision & Recall:** Precision measures how many predicted labels are correct; recall measures how many true labels are recovered.
- **F1-Score:** Harmonic mean of precision and recall.
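
A minimal sketch of how such metrics can be computed with scikit-learn; the label lists are hypothetical placeholders, not the actual evaluation harness:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical gold vs. predicted emotion labels on a held-out set.
y_true = ["anger", "sadness", "fear", "sadness", "joy"]
y_pred = ["anger", "sadness", "anger", "sadness", "joy"]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```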

### Results

- **Accuracy:** 89.5%
- **F1-Score:** 87.2%
- **Latency:** <500 ms per response

## Environmental Impact

- **Hardware Type:** A100 GPUs
- **Training Time:** 120 hours
- **Carbon Emitted:** Not reported; it can be estimated with the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute).

## Technical Specifications

### Model Architecture and Objective

- **Base model:** DeepSeek-R1-Distill-Llama-8B
- **Objective:** Causal language modeling, fine-tuned using QLoRA for parameter-efficient training.

### Compute Infrastructure

- **Hardware:** AWS spot instances (8x A100 GPUs)
- **Software:** Hugging Face `transformers`, DeepSpeed, PyTorch

## Citation

If you use this model, please cite:

```bibtex
@misc{ai-medical_2025,
  author       = {{AI Medical} and {ruslanmv.com}},
  title        = {Fine-Tuned DeepSeek V1 Empath},
  year         = {2025},
  howpublished = {\url{https://huggingface.co./ai-medical/fine_tuned_deepseek_v1_empathy}}
}
```

## More Information

- **Model Card Authors:** AI Medical Team, ruslanmv.com
- **Framework Versions:** PEFT 0.14.0