# merge
This is a merge of pre-trained language models created using [mergekit](https://github.com/arcee-ai/mergekit).
## Merge Details
### Merge Method
This model was merged using the DARE TIES merge method, with Qwen/Qwen2.5-14B as the base model.
### Models Merged
The following models were included in the merge:
- allknowingroger/QwenSlerp6-14B
- allknowingroger/QwenStock3-14B
- CultriX/SeQwence-14B-EvolMerge
- CultriX/Qwen2.5-14B-Wernicke
- VAGOsolutions/SauerkrautLM-v2-14b-DPO
### Configuration
The following YAML configuration was used to produce this model:
```yaml
### CONFIG SuperiorMerge-14B-From-2-to-10 ###
models:
  - model: VAGOsolutions/SauerkrautLM-v2-14b-DPO
    parameters:
      weight: 0.25   # Prioritize top IFEval
      density: 0.6   # Keep a large portion for strong factual baseline
  - model: allknowingroger/QwenSlerp6-14B
    parameters:
      weight: 0.25   # High weight for MATH and balanced reasoning
      density: 0.6   # Retain robust reasoning capabilities
  - model: CultriX/SeQwence-14B-EvolMerge
    parameters:
      weight: 0.20   # Important for best BBH and near-top MUSR
      density: 0.5   # Moderate density to ensure these strengths blend well
  - model: CultriX/Qwen2.5-14B-Wernicke
    parameters:
      weight: 0.15   # Adds top GPQA performance
      density: 0.5   # Sufficient to preserve QA strengths
  - model: allknowingroger/QwenStock3-14B
    parameters:
      weight: 0.15   # For top MMLU-PRO, enhancing domain knowledge
      density: 0.5   # Balanced integration of diverse subject expertise
base_model: Qwen/Qwen2.5-14B
merge_method: dare_ties
parameters:
  normalize: true    # Ensures parameter scaling compatibility
  int8_mask: true    # Memory and computational efficiency
dtype: bfloat16
tokenizer_source: Qwen/Qwen2.5-14B-Instruct
### END OF CONFIG SuperiorMerge-14B-From-2-to-10 ###
```
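To reproduce the merge, the configuration above can be saved to a YAML file and passed to mergekit. The snippet below is a minimal sketch using mergekit's Python entry point; the file name `dare_ties_config.yaml` and the output directory are placeholders, and the exact `MergeOptions` fields assume a recent mergekit release.

```python
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the DARE TIES configuration shown above (path is a placeholder).
with open("dare_ties_config.yaml", "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

# Run the merge and write the resulting model to ./merged-model.
run_merge(
    merge_config,
    out_path="./merged-model",
    options=MergeOptions(
        cuda=True,            # Use a GPU if one is available
        copy_tokenizer=True,  # Copy the tokenizer named in tokenizer_source
        lazy_unpickle=True,   # Reduce peak memory while loading checkpoints
    ),
)
```

The same configuration can also be run from the command line with mergekit's `mergekit-yaml` script, which takes the config path and an output directory.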
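For local use, the merged model loads like any other Qwen2.5-based causal language model. The sketch below assumes the published repository id `CultriX/Qwen2.5-14B-Wernickev3` and standard transformers APIs.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CultriX/Qwen2.5-14B-Wernickev3"  # Repository id of this merge

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # Matches the dtype used for the merge
    device_map="auto",
)

prompt = "Summarize the idea behind DARE TIES model merging in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```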