15 13

Pablo Dias

Anderson452

AI & ML interests

None yet

Recent Activity

liked a model 2 months ago

mradermacher/Darkest-muse-v1-GGUF

liked a model 2 months ago

sam-paech/Darkest-muse-v1

Organizations

None yet

Anderson452's activity

liked 2 models 2 months ago

mradermacher/Darkest-muse-v1-GGUF

Updated Oct 24 • 1.1k • 3

sam-paech/Darkest-muse-v1

Updated Oct 25 • 268 • 28

New activity in ifable/gemma-2-Ifable-9B 3 months ago

Thank you very much, this model is the best

#3 opened 3 months ago by

New activity in mattshumer/Reflection-Llama-3.1-70B 4 months ago

Please, 8B version

#8 opened 4 months ago by

New activity in lemon07r/Gemma-2-Ataraxy-9B 4 months ago

I ask you for a huge favor

#5 opened 4 months ago by

liked 5 Spaces 4 months ago

Running on CPU Upgrade

Open LLM Leaderboard

Track, rank and evaluate open LLMs and chatbots

Running on L40S

CogVideoX-5B

Text-to-Video

MagicAnimate

Running on Zero

Cinemo

Multimodal Image-to-Video

FLUX.1 [Inpainting]

reacted to bartowski's post with ❤️ 4 months ago

Post

10046

So turns out I've been spreading a bit of misinformation when it comes to imatrix in llama.cpp

The first part is true: imatrix runs the model against a corpus of text and tracks the activations to determine which weights are most important.

However what the quantization then does with that information is where I was wrong.

I think I made the accidental connection between imatrix and exllamav2's measuring, where ExLlamaV2 decides how many bits to assign to which weight depending on the goal BPW

Instead, llama.cpp with imatrix attempts to select a scale for each quantization block that most accurately returns the important weights to their original values, i.e. it minimizes the dequantization error weighted by the importance of the activations.

The mildly surprising part is that it actually just does a relatively brute-force search: it picks a bunch of candidate scales, tries each one, and keeps whichever gives the minimum error for the weights deemed important in the group.

But yeah, turns out, the quantization scheme is always the same, it's just that the scaling has a bit more logic to it when you use imatrix
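As a rough illustration of the idea described in this post, here is a minimal sketch of a brute-force scale search for a single block. It is not llama.cpp's actual code: the function names (`quantize`, `weighted_error`, `search_scale`), the symmetric 4-bit integer range, the 0.7x to 1.3x candidate range, and the number of candidates are all assumptions made for the example. The integer format stays fixed; only the choice of scale uses the importance values.

```python
def quantize(weights, scale, qmin=-8, qmax=7):
    """Round each weight to the nearest integer level for the given scale."""
    return [min(qmax, max(qmin, round(w / scale))) for w in weights]


def weighted_error(weights, importance, scale, q):
    """Importance-weighted squared dequantization error for one block."""
    return sum(imp * (w - scale * qi) ** 2
               for w, qi, imp in zip(weights, q, importance))


def search_scale(weights, importance, qmin=-8, qmax=7, n_candidates=32):
    """Try a handful of scales and keep the one with the lowest weighted error.

    Hypothetical helper: the candidate range around the naive scale is an
    assumption for illustration only.
    """
    amax = max(abs(w) for w in weights)
    if amax == 0.0:
        return 0.0, [0] * len(weights)  # all-zero block, nothing to search

    naive = amax / qmax  # scale that maps the largest weight onto qmax
    candidates = [naive * (0.7 + 0.6 * i / (n_candidates - 1))
                  for i in range(n_candidates)]
    candidates.append(naive)  # always consider the naive scale as well

    best_err, best_scale, best_q = None, None, None
    for scale in candidates:
        q = quantize(weights, scale, qmin, qmax)
        err = weighted_error(weights, importance, scale, q)
        if best_err is None or err < best_err:
            best_err, best_scale, best_q = err, scale, q
    return best_scale, best_q
```

Because the naive scale is always among the candidates, the searched scale can never give a higher weighted error than the naive one on the same block. Setting all importance values to 1 reduces this to an ordinary unweighted search.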

Huge shoutout to @compilade for helping me wrap my head around it - feel free to add/correct as well if I've messed something up

5 replies

updated a Space 5 months ago

Google Gemma 2b

New activity in Sao10K/L3-8B-Stheno-v3.2 5 months ago

So hear me out...

#7 opened 6 months ago by

InvictusCreations

New activity in microsoft/Phi-3-small-128k-instruct 7 months ago

Please, add GGUF version!

#2 opened 7 months ago by

liked a model 8 months ago

Dampfinchen/Llama-3-8B-Ultra-Instruct

Text Generation • Updated May 11 • 133 • 15

New activity in Dampfinchen/Llama-3-8B-Ultra-Instruct 8 months ago

Great Model!!

#4 opened 8 months ago by

New activity in vicgalle/Roleplay-Llama-3-8B 8 months ago

Great Model!

#1 opened 8 months ago by

New activity in Undi95/Llama-3-Unholy-8B-GGUF 8 months ago

It's an excellent version, but...

#1 opened 8 months ago by

New activity in Muhammad2003/Llama3-8B-OpenHermes-DPO 8 months ago

Error

#1 opened 8 months ago by

New activity in lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF 8 months ago

Thank you so much!

#1 opened 8 months ago by