KytheraMix-7B is crafted using semi-automated merging with YAML templates. As with AgoraMix, two DELLA merge trees converge: one for instruction following and one for reasoning. A SLERP merge then blends them along a gradient, and a final TIES merge normalizes the weights.
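As a rough illustration of the final stage, a mergekit SLERP template with a gradient `t` might look like the sketch below. This is a minimal sketch, not the actual recipe: the local paths stand in for the (hypothetical) outputs of the two DELLA trees, and the layer ranges and `t` values are illustrative.

```yaml
# Hypothetical SLERP stage, assuming the two DELLA trees have already been
# merged to the local paths below. Paths, layer ranges, and t values are
# illustrative placeholders, not the published KytheraMix-7B configuration.
merge_method: slerp
base_model: ./kytheramix-della-instruct    # hypothetical output of the instruction-following tree
slices:
  - sources:
      - model: ./kytheramix-della-instruct
        layer_range: [0, 28]
      - model: ./kytheramix-della-reason   # hypothetical output of the reasoning tree
        layer_range: [0, 28]
parameters:
  t:
    # Gradient across layers: 0.0 keeps the instruction-following tree,
    # 1.0 takes the reasoning tree; interpolated in between.
    - value: [0.0, 0.25, 0.5, 0.75, 1.0]
dtype: bfloat16
```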
Ancestor Models
newsbang/Homer-v0.5-Qwen2.5-7B - The strongest contributor to the instruction-following side of KytheraMix.
sethuiyer/Qwen2.5-7B-Anvita - Well-rounded for both instruction following and reasoning.
jeffmeloy/jeffmeloy_Qwen2.5-7B-minperplexity-1 - Strong knowledge and recall, thanks to a layer-by-layer composition that keeps the lowest-perplexity layers drawn from many models.
jeffmeloy/Qwen2.5-7B-nerd-uncensored-ties - A model_stock and TIES merge of jeffmeloy/Qwen2.5-7B-nerd-uncensored-v0.9, jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0, and jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.8. These models are themselves the product of ner_merge, which selects individual layers from many other merges.
fblgit/cybertron-v4-qw7B-UNAMGS - Strong coding and knowledge representation.