Corwin Black

Mescalamba

AI & ML interests

image denoising

Recent Activity

replied to m-ric's post 5 days ago

After 6 years, BERT, the workhorse of encoder models, finally gets a replacement: 𝗪𝗲𝗹𝗰𝗼𝗺𝗲 𝗠𝗼𝗱𝗲𝗿𝗻𝗕𝗘𝗥𝗧! 🤗

We talk a lot about ✨Generative AI✨, meaning "Decoder version of the Transformers architecture", but this is only one of the ways to build LLMs: encoder models, which turn a sentence into a vector, are maybe even more widely used in industry than generative models.

The workhorse for this category has been BERT since its release in 2018 (that's prehistory for LLMs). It's not a fancy 100B-parameter supermodel (just a few hundred million), but it's an excellent workhorse, kind of a Honda Civic for LLMs. Many applications use BERT-family models: the top models in this category have accumulated millions of downloads on the Hub.

➡️ Now a collaboration between Answer.AI and LightOn has just introduced BERT's replacement: ModernBERT.

𝗧𝗟;𝗗𝗥:

🏛️ Architecture changes:
⇒ First, standard modernizations:
- Rotary positional embeddings (RoPE)
- Replace GeLU with GeGLU
- Use Flash Attention 2

✨ The team also introduced innovative techniques like alternating attention instead of full attention, and sequence packing to get rid of padding overhead.

🥇 As a result, the model tops the game of encoder models: it beats the previous standard, DeBERTaV3, with 1/5th the memory footprint, and runs 4x faster!

Read the blog post 👉 https://huggingface.co./blog/modernbert

replied to mlabonne's post 5 days ago

✂️ Uncensor any LLM with abliteration

I wrote an article about abliteration and how NeuralDaredevil-8B was created. Beyond removing alignment, I believe it's an interesting technique with a lot of potential. It's basically fine-tuning without retraining.

In this article, we see how it works, implement it in Google Colab, and heal the abliterated model to recover the performance drop due to this technique. The final model is an uncensored and high-quality model with the highest MMLU score on the Open LLM Leaderboard (8B category).

https://huggingface.co./blog/mlabonne/abliteration

new activity 11 days ago

kxdw2580/Qwen2.5-3B-Instruct-Uncensored-Test: Very good!

Organizations

None yet

Mescalamba's activity

New activity in kxdw2580/Qwen2.5-3B-Instruct-Uncensored-Test 11 days ago

Very good!

#1 opened 11 days ago by

New activity in Djrango/Qwen2vl-Flux 29 days ago

It does seem really cool but..

#9 opened 29 days ago by

New activity in migtissera/Llama-3-8B-Synthia-v3.5 about 1 month ago

It's really good but..

#3 opened about 1 month ago by

New activity in nyanko7/flux-dev-de-distill 2 months ago

It's a great experimental project! How can I run the model on ComfyUI and get the best result?

#1 opened 3 months ago by

New activity in Freepik/flux.1-lite-8B-alpha 2 months ago

Possible to work with 8GB VRAM and 16GB RAM?

#7 opened 2 months ago by

New activity in city96/flux.1-lite-8B-alpha-gguf 2 months ago

need fp8 for speed

#1 opened 2 months ago by