
Jivi-MedX-v2: The Next-Generation Medical Language Model

Jivi-MedX-v2 is a cutting-edge medical language model developed by Jivi AI to support a wide range of clinical applications. Built on the Meta-Llama-3-8B architecture, it has been fine-tuned with Supervised Fine-Tuning (SFT) and Odds Ratio Preference Optimization (ORPO) to enhance its medical reasoning capabilities. The model is designed to generate accurate, domain-specific responses, making it well suited for clinical decision support, medical research, and healthcare automation.

Purpose-Built for the Medical Domain:

Jivi-MedX-v2 is specifically fine-tuned to meet the complex language and knowledge demands of the healthcare industry. Trained on a high-quality, domain-specific dataset, it excels in comprehending and generating precise medical text, making it an invaluable tool for clinical decision support, research, and patient education.

Training Process:

Built on the Meta-Llama-3-8B architecture, Jivi-MedX-v2 has been meticulously refined using Supervised Fine-Tuning (SFT) and Odds Ratio Preference Optimization (ORPO). This ensures that the model aligns effectively with medical terminology and reasoning while maintaining learning efficiency. Hyperparameter tuning strategies have been carefully implemented to prevent catastrophic forgetting, ensuring consistent performance across various tasks.
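ORPO augments the standard SFT cross-entropy loss with an odds-ratio term that rewards preferred (chosen) responses over rejected ones, without needing a separate reference model. A minimal sketch of the per-example loss in plain Python, assuming average per-token log-probabilities as inputs; the weighting value lam here is illustrative, not Jivi's actual training setting:

```python
import math

def orpo_loss(logp_chosen: float, logp_rejected: float, lam: float = 0.1) -> float:
    """Per-example ORPO loss from average per-token log-probabilities.

    odds(p) = p / (1 - p); the odds-ratio term pushes the model to
    assign higher probability to the chosen completion than the rejected one.
    """
    def log_odds(logp: float) -> float:
        p = math.exp(logp)
        return math.log(p) - math.log(1.0 - p)

    # Log of the odds ratio between chosen and rejected completions.
    log_or = log_odds(logp_chosen) - log_odds(logp_rejected)
    # L_OR = -log sigmoid(log_or).
    l_or = math.log(1.0 + math.exp(-log_or))
    # L_SFT is the usual negative log-likelihood of the chosen completion.
    l_sft = -logp_chosen
    return l_sft + lam * l_or

# The loss is larger when the preference is inverted (rejected more likely).
assert orpo_loss(-0.5, -2.0) < orpo_loss(-2.0, -0.5)
```

In practice this objective is typically applied with a preference-pair dataset via a trainer such as TRL's ORPOTrainer rather than computed by hand.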

Data Preparation:

Jivi-MedX-v2 has been trained on a curated dataset of over 1,000,000 data points, covering diverse medical literature, clinical notes, research papers, and diagnostic guidelines. This comprehensive dataset enhances its ability to generate accurate and contextually relevant medical information.
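The card does not describe the exact preprocessing pipeline, but multiple-choice records like those in the benchmark sets are commonly flattened into prompt/completion pairs for SFT. A minimal sketch, assuming a hypothetical record schema (the field names are illustrative, not Jivi's actual data format) and the prompt style shown in the usage section:

```python
# Hypothetical record shape; field names are illustrative, not Jivi's schema.
record = {
    "question": "Which vitamin deficiency causes scurvy?",
    "options": ["A. Vitamin A", "B. Vitamin B12", "C. Vitamin C", "D. Vitamin D"],
    "answer": "C. Vitamin C",
}

def to_sft_example(rec: dict) -> dict:
    """Flatten a multiple-choice record into a prompt/completion pair."""
    prompt = "Question: {}\n{}\nAnswer:\n".format(
        rec["question"], "\n".join(rec["options"])
    )
    return {"prompt": prompt, "completion": rec["answer"]}

example = to_sft_example(record)
```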

Benchmarks:

With 8 billion parameters, Jivi-MedX-v2 outperforms other models of similar size, delivering best-in-class results on multiple medical benchmarks. On average across the evaluations below, it also surpasses larger proprietary and open-source models, including GPT-4o and DeepSeek-R1, setting a new standard for AI-driven medical intelligence.

| Model | MedMCQA | MedQA | MMLU Anatomy | MMLU Clinical Knowledge | MMLU College Biology | MMLU College Medicine | MMLU Medical Genetics | MMLU Professional Medicine | PubMedQA | Average |
|---|---|---|---|---|---|---|---|---|---|---|
| Jivi-MedX-v2 | 81.78% | 86.57% | 96.30% | 97.74% | 99.31% | 95.38% | 99.00% | 98.90% | 77.40% | 92.49% |
| DeepSeek-R1 | 78.91% | 91.99% | 91.11% | 94.34% | 98.61% | 87.86% | 100.00% | 95.96% | 74.00% | 90.31% |
| OpenAI GPT-4o | 74.85% | 87.70% | 87.41% | 91.69% | 93.06% | 82.08% | 96.00% | 94.85% | 74.80% | 86.94% |
| DeepSeek-R1-Distill-Llama-70B | 73.86% | 87.82% | 82.22% | 88.68% | 96.53% | 84.97% | 96.00% | 94.85% | 77.20% | 86.90% |
| DeepSeek-R1-Distill-Llama-8B | 40.00% | 43.36% | 51.11% | 58.11% | 55.56% | 51.45% | 63.00% | 56.62% | 68.40% | 54.18% |
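The Average column is the unweighted mean of the nine benchmark scores, rounded to two decimals; a quick check for the top two rows:

```python
# Benchmark scores copied from the table above (nine tasks per model).
scores = {
    "Jivi-MedX-v2": [81.78, 86.57, 96.30, 97.74, 99.31, 95.38, 99.00, 98.90, 77.40],
    "DeepSeek-R1": [78.91, 91.99, 91.11, 94.34, 98.61, 87.86, 100.00, 95.96, 74.00],
}

# Unweighted mean across all nine tasks, as reported in the Average column.
averages = {model: round(sum(vals) / len(vals), 2) for model, vals in scores.items()}
print(averages)  # {'Jivi-MedX-v2': 92.49, 'DeepSeek-R1': 90.31}
```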

How to use

Use with transformers

Please ensure transformers>=4.45.2

import torch
import transformers

model_id = "jiviai/medX_v2"

# Load the model in bfloat16 and place it automatically on available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

prompt = "Question: A 20-year-old man comes to the physician because of worsening gait unsteadiness and bilateral hearing loss for 1 month. He has had intermittent tingling sensations on both cheeks over this time period. He has no history of serious medical illness and takes no medications. Audiometry shows bilateral sensorineural hearing loss. Genetic evaluation shows a mutation of a tumor suppressor gene on chromosome 22 that encodes merlin. This patient is at increased risk for which of the following conditions?\nA. Renal cell carcinoma\nB. Meningioma\nC. Astrocytoma\nD. Vascular malformations\nAnswer:\n"
gen_kwargs = {
    "return_full_text": False,
    "max_new_tokens": 100,
}
print(pipeline(prompt, **gen_kwargs))

Supported Languages: This model currently supports English only. Multilingual support is planned for a future release.

Feedback: To send feedback or questions, please use the community section of the model.

Intended use

The data, code, and model checkpoints are intended to be used solely for:

  1. Future research on medical query answering.
  2. Reproducibility of the experimental results reported in the reference paper.

Disclaimer: The data, code, and model checkpoints are not intended to be used in clinical care or for any clinical decision-making purposes.
