AlphaMonarch-dora


AlphaMonarch-dora is a DPO fine-tune of mlabonne/NeuralMonarch-7B, trained with DoRA on the argilla/OpenHermes2.5-dpo-binarized-alpha preference dataset. It scores slightly lower on the Nous and Open LLM leaderboards than the base AlphaMonarch and AlphaMonarch-laser. The model was trained for 1080 steps, and all hyperparameters were kept consistent across these experiments. A rough sketch of the training setup is shown below.
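DoRA (Weight-Decomposed Low-Rank Adaptation) splits each adapted weight into a magnitude and a direction and applies a LoRA-style low-rank update to the direction; in PEFT it is enabled by setting `use_dora=True` on a regular `LoraConfig`. The snippet below is a minimal sketch of how such a DPO + DoRA run can be wired up with trl and peft, assuming the trl 0.7-era `DPOTrainer` keywords; the rank, alpha, target modules, DPO beta, and dataset preprocessing are illustrative assumptions, not the exact values used for this model.

```python
# Minimal sketch of a DPO + DoRA run (assumed settings, not the exact training script).
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base_model = "mlabonne/NeuralMonarch-7B"
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# DoRA is a flag on the standard LoRA config (requires peft >= 0.9).
peft_config = LoraConfig(
    r=16,                     # assumed rank
    lora_alpha=16,            # assumed scaling
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    use_dora=True,
    task_type="CAUSAL_LM",
)

# The raw preference data has to be mapped to plain-text "prompt"/"chosen"/"rejected"
# columns before DPO; that preprocessing is omitted here.
dataset = load_dataset("argilla/OpenHermes2.5-dpo-binarized-alpha", split="train")

trainer = DPOTrainer(
    model,
    ref_model=None,           # with a PEFT adapter, the frozen base model acts as the reference
    beta=0.1,                 # assumed DPO temperature
    args=TrainingArguments(output_dir="AlphaMonarch-dora"),  # full hyperparameters listed below
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```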

πŸ† Evaluation results

OpenLLM Benchmark

(Open LLM Leaderboard results chart)

Nous Benchmark

AGIEVAL

| Task | Version | Accuracy | Accuracy StdErr | Normalized Accuracy | Normalized Accuracy StdErr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | 28.35% | 2.83% | 26.38% | 2.77% |
| agieval_logiqa_en | 0 | 38.71% | 1.91% | 38.25% | 1.90% |
| agieval_lsat_ar | 0 | 23.91% | 2.82% | 23.48% | 2.80% |
| agieval_lsat_lr | 0 | 52.55% | 2.21% | 53.73% | 2.21% |
| agieval_lsat_rc | 0 | 66.91% | 2.87% | 66.54% | 2.88% |
| agieval_sat_en | 0 | 78.64% | 2.86% | 78.64% | 2.86% |
| agieval_sat_en_without_passage | 0 | 45.15% | 3.48% | 44.17% | 3.47% |
| agieval_sat_math | 0 | 33.64% | 3.19% | 31.82% | 3.15% |

AVG = 45.976

GPT4ALL

| Task | Version | Accuracy | Accuracy StdErr | Normalized Accuracy | Normalized Accuracy StdErr |
|---|---|---|---|---|---|
| arc_challenge | 0 | 65.87% | 1.39% | 67.92% | 1.36% |
| arc_easy | 0 | 86.49% | 0.70% | 80.64% | 0.81% |
| boolq | 1 | 87.16% | 0.59% | - | - |
| hellaswag | 0 | 69.86% | 0.46% | 87.51% | 0.33% |
| openbookqa | 0 | 39.00% | 2.18% | 49.20% | 2.24% |
| piqa | 0 | 83.03% | 0.88% | 84.82% | 0.84% |
| winogrande | 0 | 80.98% | 1.10% | - | - |

AVG = 73.18

TRUTHFUL-QA

| Task | Version | MC1 Accuracy | MC1 Accuracy StdErr | MC2 Accuracy | MC2 Accuracy StdErr |
|---|---|---|---|---|---|
| truthfulqa_mc | 1 | 62.91% | 1.69% | 78.48% | 1.37% |

AVG = 70.69
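
The Nous-style scores above are the kind of numbers produced by EleutherAI's lm-evaluation-harness. As a reproduction aid only, here is a hedged sketch using the lm-eval v0.4 Python API and its GPT4All-era task names; the exact harness version and task list used for this card are not stated, so treat the names and arguments as assumptions.

```python
# Hedged sketch: scoring the GPT4All-style tasks with lm-evaluation-harness (v0.4 API).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=QueryloopAI/AlphaMonarch-dora,dtype=bfloat16",
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag",
           "openbookqa", "piqa", "winogrande"],
    batch_size=8,
)
print(results["results"])
```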

Training hyperparameters

The following hyperparameters were used during training (an assumed `TrainingArguments` mapping is sketched after the list):

  • learning_rate: 5e-7
  • train_batch_size: 2
  • eval_batch_size: Not specified
  • seed: Not specified
  • gradient_accumulation_steps: 8
  • total_train_batch_size: Not specified
  • optimizer: PagedAdamW with 32-bit precision
  • lr_scheduler_type: Cosine
  • lr_scheduler_warmup_steps: 100
  • training_steps: 1080
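
In `transformers.TrainingArguments` terms (or trl's `DPOConfig` on newer versions), those settings map roughly to the configuration below; the output directory, precision, seed, and logging cadence are assumptions because the card does not report them.

```python
# Assumed mapping of the reported hyperparameters onto transformers.TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="AlphaMonarch-dora",     # assumed
    learning_rate=5e-7,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,      # 2 x 8 = 16 samples per optimizer step per device
    optim="paged_adamw_32bit",          # paged AdamW with 32-bit optimizer states (needs bitsandbytes)
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=1080,
    bf16=True,                          # assumed; precision is not reported in the card
    logging_steps=10,                   # assumed
)
```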

Framework versions

  • Transformers 4.39.0.dev0
  • Peft 0.9.1.dev0
  • Datasets 2.18.0
  • torch 2.2.0
  • accelerate 0.27.2
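
For completeness, a minimal inference sketch, assuming the repository loads with the standard transformers text-generation pipeline and that the tokenizer ships a chat template:

```python
# Minimal inference sketch (assumes a standard text-generation pipeline and a bundled chat template).
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="QueryloopAI/AlphaMonarch-dora",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain DoRA in one paragraph."}]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
output = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95)
print(output[0]["generated_text"])
```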