sometimesanotion PRO

https://ko-fi.com/sometimesanotion

AI & ML interests

Agentic LLM services, model merging, finetunes, distillation

Recent Activity

new activity 4 days ago

wanlige/li-14b-v0.4-slerp0.1:Fusion vs. SLERP?

updated a model 5 days ago

sometimesanotion/Lamarck-14B-v0.7

replied to their post 6 days ago

I have tracked down a blocker preventing Lamarck releases to a della_linear bug in newer mergekit versions. If you use slices in della_linear merges that have multiple models - as you'd expect of a merge! - an attempt to load the output model in torch will get you:

```
ValueError: Trying to set a tensor of shape torch.Size([1, 5120]) in "weight" (which has shape torch.Size([5120])), this looks incorrect.
```

This strategy was key to Lamarck v0.6 and v0.7's success. Their merge recipes haven't been working with newer mergekits.

These work:

```yaml
models:
  - model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3
  - model: sthenno-com/miscii-14b-0218
```

```yaml
slices:
  - sources:
      - { layer_range: [ 0, 2 ], model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3 }
  - sources:
      - { layer_range: [ 2, 6 ], model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3 }
```

This does not:

```yaml
slices:
  - sources:
      - { layer_range: [ 0, 2 ], model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3 }
      - { layer_range: [ 0, 2 ], model: sthenno-com/miscii-14b-0218 }
  - sources:
      - { layer_range: [ 2, 6 ], model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3 }
      - { layer_range: [ 2, 6 ], model: sthenno-com/miscii-14b-0218 }
```

@Crystalcareai, do you know of any work on this? Will @arcee-ai need a detailed report? These della_linear recipes used to work.

Overall, thank you for all the cool work, I hope to get this fixed!
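The error quoted in the post describes a tensor that arrives with shape `[1, 5120]` where `[5120]` is expected, that is, with one extra leading dimension of size 1. As a rough diagnostic sketch, and not part of mergekit, the helper below compares two name-to-shape mappings and lists the tensors that show this pattern. The function name `find_extra_leading_dims` and the tensor names in the example are hypothetical; the post does not say which tensors are affected, and how the shapes are read from a checkpoint is left open here.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_extra_leading_dims(
    expected: Dict[str, Shape], actual: Dict[str, Shape]
) -> List[str]:
    """Return the names of tensors whose actual shape is the expected
    shape with one spurious leading dimension of size 1."""
    suspects = []
    for name, want in expected.items():
        got = actual.get(name)
        # Match only the exact pattern from the error: (1, *expected_shape).
        if got is not None and got == (1, *want):
            suspects.append(name)
    return sorted(suspects)


# Hypothetical tensor names and shapes, chosen only to mirror the error message.
expected = {
    "model.layers.0.input_layernorm.weight": (5120,),
    "model.layers.0.self_attn.q_proj.weight": (5120, 5120),
}
actual = {
    "model.layers.0.input_layernorm.weight": (1, 5120),
    "model.layers.0.self_attn.q_proj.weight": (5120, 5120),
}

print(find_extra_leading_dims(expected, actual))
```

Running the same comparison on the output of a working recipe and a failing one would show whether the extra dimension is limited to one kind of tensor, which may be useful detail for a bug report.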

sometimesanotion's activity

liked a model 7 days ago

Lunzima/NQLSG-Qwen2.5-14B-MegaFusion-v8

Text Generation • Updated about 21 hours ago • 187 • 2

liked a model 8 days ago

TimeLordRaps/DS-R1-Lamarckvergence-14B-1M-test3

Text Generation • Updated 9 days ago • 9 • 1

liked 6 models 9 days ago

liked a model 11 days ago

wanlige/li-14b-v0.4

Text Generation • Updated 10 days ago • 1.17k • 14

liked 4 models 13 days ago

CultriX/Qwen2.5-14B-ReasoningMerge

Text Generation • Updated 20 days ago • 269 • 3

YOYO-AI/Qwen2.5-14B-1M-YOYO-V3

Text Generation • Updated 7 days ago • 191 • 3

mlx-community/Lamarck-14B-v0.7-6bit

Text Generation • Updated 19 days ago • 35 • 1

mlx-community/Lamarck-14B-v0.7-4bit

Text Generation • Updated 19 days ago • 35 • 1

liked 4 models 14 days ago

mradermacher/Lamarck-14B-v0.7-Fusion-GGUF

Updated 14 days ago • 402 • 3

mistralai/Pixtral-12B-Base-2409

Updated Feb 2 • 94

MaziyarPanahi/Lamarck-14B-v0.7-Fusion-GGUF

Text Generation • Updated 14 days ago • 421 • 1

mradermacher/Lamarck-14B-v0.7-Fusion-i1-GGUF

Updated 14 days ago • 1.58k • 2

liked a model 15 days ago

CultriX/Qwen2.5-14B-DeepResearch

Text Generation • Updated 20 days ago • 49 • 3

liked 2 models 17 days ago

arcee-ai/Arcee-Blitz

Text Generation • Updated 11 days ago • 2.73k • 62

arcee-ai/Arcee-Maestro-7B-Preview

Text Generation • Updated 17 days ago • 4.6k • 36