
Alex Daminger

Handgun1773

AI & ML interests

None yet

Recent Activity

Organizations

None yet

Handgun1773's activity

New activity in Qwen/Qwen2.5-Coder-7B-Instruct about 2 months ago

The updated weights

25
#12 opened about 2 months ago by
QuantPanda
New activity in bartowski/SuperNova-Medius-GGUF 2 months ago

exl2

2
#1 opened 2 months ago by
Handgun1773
New activity in opendatalab/PDF-Extract-Kit 3 months ago

Dev/Test Help

1
#3 opened 4 months ago by
yismet
replied to bartowski's post 3 months ago

Yes, exactly. When converting from float16 to float32 for fine-tuning (as I thought), we need to extend the mantissa with 13 zero bits and widen the exponent by 3 bits (re-encoding it from bias 15 to bias 127), rather than simply zero-filling the last 16 bits.

Ok I get your point now.
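The bit-level point above can be checked with Python's standard `struct` module, which supports half-precision via the `e` format. This is only an illustrative sketch using the value 1.5; the layout comments reflect the IEEE 754 binary16 and binary32 formats.

```python
import struct

# Bit patterns of the same value (1.5) in half and single precision.
(bits16,) = struct.unpack(">H", struct.pack(">e", 1.5))  # fp16: 1 sign + 5 exponent + 10 mantissa
(bits32,) = struct.unpack(">I", struct.pack(">f", 1.5))  # fp32: 1 sign + 8 exponent + 23 mantissa

print(f"{bits16:016b}")  # 0011111000000000
print(f"{bits32:032b}")  # 00111111110000000000000000000000

# Naively shifting the fp16 bits left by 16 does NOT give the fp32 pattern:
assert bits32 != bits16 << 16
# The mantissa gains 13 trailing zero bits, but the exponent field is
# re-encoded (bias 15 -> bias 127), so a plain 16-bit zero-fill is wrong.
```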

replied to bartowski's post 3 months ago

I don't understand much about this, but maybe the model in F32 is just redundant. Maybe the other half of most weights is filled with zeros. It was scaled this way to fine-tune it, or to make it impossible for people with few resources to run it 😁

32 and 16 refer to the amount of memory each weight takes (in bits), not to a number of weights. You can look into floating-point 32 (FP32) and 16 (FP16) in computer science to better grasp what this means.
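A minimal sketch of that distinction, using Python's standard `struct` module; the 7B parameter count here is purely an illustrative figure, not a claim about any particular model:

```python
import struct

# "32" and "16" name the storage per weight, not a weight count.
bytes_fp32 = struct.calcsize("f")  # 4 bytes = 32 bits per weight
bytes_fp16 = struct.calcsize("e")  # 2 bytes = 16 bits per weight

# Rough weight memory for a hypothetical 7B-parameter model:
params = 7_000_000_000
print(params * bytes_fp32 / 1e9, "GB in fp32")  # 28.0 GB
print(params * bytes_fp16 / 1e9, "GB in fp16")  # 14.0 GB
```

Halving the bits per weight halves the memory footprint, which is why the same model in FP32 takes twice the space of its FP16 version.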

New activity in LoneStriker/Yi-Coder-9B-Chat-8.0bpw-h8-exl2 4 months ago

Do you plan to do 1.5B?

#1 opened 4 months ago by
Handgun1773