This model is a fine-tuned version of Qwen/Qwen2-72B-Instruct, intended to improve on its natural language understanding and generation. My goal was to create a versatile and robust model that performs well across a wide range of benchmarks and real-world applications.
This model is suitable for a wide range of applications, including but not limited to:
All GGUF models are available here: MaziyarPanahi/calme-2.1-qwen2-72b-GGUF
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 43.61 |
IFEval (0-Shot) | 81.63 |
BBH (3-Shot) | 57.33 |
MATH Lvl 5 (4-Shot) | 36.03 |
GPQA (0-shot) | 17.45 |
MuSR (0-shot) | 20.15 |
MMLU-PRO (5-shot) | 49.05 |
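As a quick sanity check, the Avg. row appears to be the unweighted mean of the six benchmark scores. That reading is an assumption, since the aggregation method is not stated here. A minimal Python sketch that recomputes it from the values in the table:

```python
# Scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 81.63,
    "BBH (3-Shot)": 57.33,
    "MATH Lvl 5 (4-Shot)": 36.03,
    "GPQA (0-shot)": 17.45,
    "MuSR (0-shot)": 20.15,
    "MMLU-PRO (5-shot)": 49.05,
}

# Assumption: "Avg." is a plain arithmetic mean with equal weight per benchmark.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 43.61
```

The recomputed value matches the reported 43.61, which is consistent with an equal-weight average.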
Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
---|---|---|---|---|---|---|---|
truthfulqa_mc2 | 2 | none | 0 | acc | 0.6761 | ± | 0.0148 |
winogrande | 1 | none | 5 | acc | 0.8248 | ± | 0.0107 |
arc_challenge | 1 | none | 25 | acc | 0.6852 | ± | 0.0136 |
arc_challenge | 1 | none | 25 | acc_norm | 0.7184 | ± | 0.0131 |
gsm8k | 3 | strict-match | 5 | exact_match | 0.8582 | ± | 0.0096 |
gsm8k | 3 | flexible-extract | 5 | exact_match | 0.8893 | ± | 0.0086 |
This model uses the ChatML prompt template:
<|im_start|>system
{System}
<|im_end|>
<|im_start|>user
{User}
<|im_end|>
<|im_start|>assistant
{Assistant}
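To make the template concrete, here is a minimal sketch that renders a list of chat messages into the layout shown above. The function name `format_chatml` and its `add_generation_prompt` parameter are hypothetical helpers for illustration, not part of the model's tooling, and the line breaks follow the template exactly as printed here. In practice, the tokenizer's own `apply_chat_template` method is the authoritative way to build prompts and may differ in whitespace.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render {"role", "content"} dicts using the ChatML layout printed above.

    Hypothetical helper for illustration only; it mirrors the template text
    literally rather than calling the tokenizer's built-in chat template.
    """
    parts = []
    for message in messages:
        # One block per turn: start tag with role, content, then end tag.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}\n<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
])
print(prompt)
```

Leaving the final assistant block open corresponds to the `{Assistant}` slot in the template: the model's generated text fills that position.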
# Use a pipeline as a high-level helper
from transformers import pipeline
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="MaziyarPanahi/calme-2.1-qwen2-72b")
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/calme-2.1-qwen2-72b")
model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/calme-2.1-qwen2-72b")
As with any large language model, users should be aware of potential biases and limitations. We recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.
Base model: Qwen/Qwen2-72B