sometimesanotion
AI & ML interests
Agentic LLM services, model merging, finetunes, distillation
Recent Activity
- New activity 5 days ago in wanlige/li-14b-v0.4-slerp0.1: "Fusion vs. SLERP?"
- Updated a model 5 days ago: sometimesanotion/Lamarck-14B-v0.7
- Replied to their post 6 days ago:
I have tracked down the blocker preventing Lamarck releases: a della_linear bug in newer mergekit versions.
If the slices in a della_linear merge draw on multiple models - as you'd expect of a merge! - attempting to load the output model in torch gets you:
```
ValueError: Trying to set a tensor of shape torch.Size([1, 5120]) in "weight" (which has shape torch.Size([5120])), this looks incorrect.
```
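For context, here is a minimal sketch of how the failure shows up, assuming a hypothetical merge output directory; simply loading the merged checkpoint through transformers is enough to trigger the shape check:

```python
# Minimal sketch of hitting the load failure. "./lamarck-merge" is a
# hypothetical output directory from a multi-model della_linear merge.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "./lamarck-merge",           # hypothetical merge output path
    torch_dtype=torch.bfloat16,  # dtype is an assumption; the shape check fails regardless
)
# Raises: ValueError: Trying to set a tensor of shape torch.Size([1, 5120]) in "weight"
# (which has shape torch.Size([5120])), this looks incorrect.
```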
This slicing strategy was key to the success of Lamarck v0.6 and v0.7, and their merge recipes haven't been working with newer mergekit versions.
These work:
```yaml
models:
  - model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3
  - model: sthenno-com/miscii-14b-0218
```
```yaml
slices:
  - sources:
      - { layer_range: [ 0, 2 ], model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3 }
  - sources:
      - { layer_range: [ 2, 6 ], model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3 }
```
This does not:
```yaml
slices:
  - sources:
      - { layer_range: [ 0, 2 ], model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3 }
      - { layer_range: [ 0, 2 ], model: sthenno-com/miscii-14b-0218 }
  - sources:
      - { layer_range: [ 2, 6 ], model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3 }
      - { layer_range: [ 2, 6 ], model: sthenno-com/miscii-14b-0218 }
```
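Until this is fixed upstream, a quick way to confirm the damage is to scan the merged safetensors shards for tensors that picked up a spurious leading dimension. This is a hedged diagnostic sketch, not part of the original recipes; the shard path is hypothetical, and it assumes the affected tensors are 1-D weights saved as [1, N]:

```python
# Hedged diagnostic sketch: list tensors in a merged shard whose shape has a
# leading singleton dimension such as [1, 5120]. The shard path is hypothetical.
from safetensors import safe_open

shard = "./lamarck-merge/model-00001-of-00006.safetensors"  # hypothetical path
with safe_open(shard, framework="pt") as f:
    for name in f.keys():
        shape = tuple(f.get_tensor(name).shape)
        if len(shape) == 2 and shape[0] == 1:
            print(f"{name}: {shape}  <- suspicious extra leading dimension")
```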
@Crystalcareai, do you know of any work on this? Will @arcee-ai need a detailed report? These della_linear recipes used to work. Thank you for all the cool work overall; I hope to see this fixed!
sometimesanotion's activity
- Fusion vs. SLERP? · 10 · #2 opened 9 days ago by sometimesanotion
- I think what you're doing here is really helpful · 1 · #2 opened 13 days ago by sometimesanotion
- Excellent model! · 16 · #3 opened about 1 month ago by nixudos
- This merge makes sense · 4 · #1 opened about 1 month ago by sometimesanotion
- The bar graphs are a bit suspect · 1 · #4 opened 23 days ago by sometimesanotion
- MATH results have changed · 2 · #1102 opened 23 days ago by sometimesanotion
- This is starting to look a bit like the Lamarck process · #1 opened 26 days ago by sometimesanotion
- Impressive fusion · 1 · #2 opened 26 days ago by jpacifico
- Congratulations! · #2 opened about 1 month ago by sometimesanotion
- No, this is promising · 6 · #1 opened about 1 month ago by CultriX
- A tour of 14B finetuning · 1 · #1 opened about 1 month ago by sometimesanotion
- Censored · 8 · #2 opened about 1 month ago by jongames
- What is the instruct template? · 1 · #1 opened about 1 month ago by Poro7
- This is promising · 2 · #1 opened about 1 month ago by sometimesanotion
- C4ai-command-r-plus Tokenizing? · 3 · #1 opened about 1 month ago by Reithan
- How are its various parameters · 5 · #1 opened about 1 month ago by Inschrift-Spruch-Raum
- This release? No Deepseek R1 · #1 opened about 1 month ago by sometimesanotion
- Nuslerp parameters? · 2 · #1 opened about 2 months ago by sometimesanotion
- Extra SLERP parameters · 7 · #1 opened 2 months ago by sometimesanotion
- Goals and outcome · 1 · #1 opened about 1 month ago by sometimesanotion