sometimesanotion
AI & ML interests
Agentic LLM services, model merging, finetunes, distillation
Recent Activity
- New activity 5 days ago in wanlige/li-14b-v0.4-slerp0.1: "Fusion vs. SLERP?"
- Updated a model 5 days ago: sometimesanotion/Lamarck-14B-v0.7
- Replied to their post 6 days ago:
I have tracked down the blocker preventing Lamarck releases: a della_linear bug in newer mergekit versions.
If the slices in a della_linear merge draw on multiple models - as you'd expect of a merge! - attempting to load the output model in torch gets you:
```
ValueError: Trying to set a tensor of shape torch.Size([1, 5120]) in "weight" (which has shape torch.Size([5120])), this looks incorrect.
```
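For context, here is a minimal sketch of how the failure shows up, assuming a hypothetical merge output directory; simply loading the merged checkpoint through transformers is enough to trigger the shape check:

```python
# Minimal sketch of hitting the load failure. "./lamarck-merge" is a
# hypothetical output directory from a multi-model della_linear merge.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "./lamarck-merge",           # hypothetical merge output path
    torch_dtype=torch.bfloat16,  # dtype is an assumption; the shape check fails regardless
)
# Raises: ValueError: Trying to set a tensor of shape torch.Size([1, 5120]) in "weight"
# (which has shape torch.Size([5120])), this looks incorrect.
```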
This slicing strategy was key to the success of Lamarck v0.6 and v0.7, and their merge recipes haven't been working with newer mergekit versions.
These work:
```yaml
models:
  - model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3
  - model: sthenno-com/miscii-14b-0218
```
```yaml
slices:
  - sources:
      - { layer_range: [ 0, 2 ], model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3 }
  - sources:
      - { layer_range: [ 2, 6 ], model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3 }
```
This does not:
```yaml
slices:
  - sources:
      - { layer_range: [ 0, 2 ], model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3 }
      - { layer_range: [ 0, 2 ], model: sthenno-com/miscii-14b-0218 }
  - sources:
      - { layer_range: [ 2, 6 ], model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3 }
      - { layer_range: [ 2, 6 ], model: sthenno-com/miscii-14b-0218 }
```
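Until this is fixed upstream, a quick way to confirm the damage is to scan the merged safetensors shards for tensors that picked up a spurious leading dimension. This is a hedged diagnostic sketch, not part of the original recipes; the shard path is hypothetical, and it assumes the affected tensors are 1-D weights saved as [1, N]:

```python
# Hedged diagnostic sketch: list tensors in a merged shard whose shape has a
# leading singleton dimension such as [1, 5120]. The shard path is hypothetical.
from safetensors import safe_open

shard = "./lamarck-merge/model-00001-of-00006.safetensors"  # hypothetical path
with safe_open(shard, framework="pt") as f:
    for name in f.keys():
        shape = tuple(f.get_tensor(name).shape)
        if len(shape) == 2 and shape[0] == 1:
            print(f"{name}: {shape}  <- suspicious extra leading dimension")
```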
@Crystalcareai, do you know of any work on this? Will @arcee-ai need a detailed report? These della_linear recipes used to work. Thank you for all the cool work overall; I hope to see this fixed!
sometimesanotion's activity
- Fusion vs. SLERP? · 10 · #2 opened 9 days ago by sometimesanotion
- I think what you're doing here is really helpful · 1 · #2 opened 13 days ago by sometimesanotion
- Excellent model! · 16 · #3 opened about 1 month ago by nixudos
- This merge makes sense · 4 · #1 opened about 1 month ago by sometimesanotion
- The bar graphs are a bit suspect · 1 · #4 opened 23 days ago by sometimesanotion
- MATH results have changed · 2 · #1102 opened 23 days ago by sometimesanotion
- This is starting to look a bit like the Lamarck process · #1 opened 26 days ago by sometimesanotion
- Impressive fusion · 1 · #2 opened 26 days ago by jpacifico
- Congratulations! · #2 opened about 1 month ago by sometimesanotion
- No, this is promising · 6 · #1 opened about 1 month ago by CultriX
- A tour of 14B finetuning · 1 · #1 opened about 1 month ago by sometimesanotion
- Censored · 8 · #2 opened about 1 month ago by jongames
- What is the instruct template? · 1 · #1 opened about 1 month ago by Poro7
- This is promising · 2 · #1 opened about 1 month ago by sometimesanotion
- C4ai-command-r-plus Tokenizing? · 3 · #1 opened about 1 month ago by Reithan
- How are its various parameters · 5 · #1 opened about 1 month ago by Inschrift-Spruch-Raum
- This release? No Deepseek R1 · #1 opened about 1 month ago by sometimesanotion
- Nuslerp parameters? · 2 · #1 opened about 2 months ago by sometimesanotion
- Extra SLERP parameters · 7 · #1 opened 2 months ago by sometimesanotion
- Goals and outcome · 1 · #1 opened about 1 month ago by sometimesanotion