Régis Pierrard

regisss

AI & ML interests

None yet

Organizations

Hugging Face, Habana AI, Hugging Face Optimum, group2, Hugging Face H4, Hugging Face OSS Metrics, HuggingFace Doc Builds, Blog-explorers, AI Energy Score Project, Social Post Explorers, Hugging Face Machine Learning Optimization, Optimum Internal Testing, SLLHF, Privacy Preserving AI Hackathon (Zama, Hugging Face, Entrepreneur First)

regisss's activity

posted an update 7 days ago
New activity in Habana/mamba 19 days ago

Upload 2 files

#2 opened 19 days ago by zzhang37

New activity in Habana/mamba 23 days ago

Upload 2 files

#1 opened 23 days ago by zzhang37
reacted to onekq's post with 🔥 2 months ago
I'm now working on finetuning coding models. If you are GPU-hungry like me, you will find quantized models very helpful. But quantization for finetuning and quantization for inference are different and incompatible, so I made two collections here.

Inference (GGUF, via Ollama, CPU is enough)
onekq-ai/ollama-ready-coding-models-67118c3cfa1af2cf04a926d6

Finetuning (Bitsandbytes, QLoRA, GPU is needed)
onekq-ai/qlora-ready-coding-models-67118771ce001b8f4cf946b2

When it comes to quantization, inference models are far more popular on HF than finetuning models. I use https://huggingface.co./QuantFactory to generate inference models (GGUF), and there are a few other choices.

But there hasn't been such a service for finetuning models. DIY isn't too hard though. I made a few myself and you can find the script in the model cards. If the original model is small enough, you can even do it on a free T4 (available via Google Colab).

If you know a (small) coding model worthy of quantization, please let me know and I'd love to add it to the collections.
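
Below is a minimal sketch of the DIY route the post describes, not the author's actual script: quantizing a model to 4-bit with bitsandbytes so it is ready for QLoRA finetuning. The model id is a hypothetical example of a small coding model; the rest uses standard transformers/peft APIs.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "Qwen/Qwen2.5-Coder-1.5B"  # hypothetical choice of small coding model

# 4-bit NF4 quantization, the setup QLoRA expects
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Attach LoRA adapters on top of the frozen 4-bit base model
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(r=16, lora_alpha=32, target_modules="all-linear", task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)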
posted an update 2 months ago
Interested in performing inference with an ONNX model? ⚡️

The Optimum docs on model inference with ONNX Runtime are now much clearer and simpler!

You want to deploy your favorite model from the Hub but don't know how to export it to the ONNX format? You can do it in one line of code:
from optimum.onnxruntime import ORTModelForSequenceClassification

# Load the model from the hub and export it to the ONNX format
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
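
For completeness (this usage example is not in the original post), the exported model is compatible with the standard transformers pipeline API, so running inference looks like this:

from transformers import AutoTokenizer, pipeline

# The ORT model plugs into the regular transformers pipeline
tokenizer = AutoTokenizer.from_pretrained(model_id)
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("Optimum makes ONNX export easy!"))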

Check out the whole guide 👉 https://huggingface.co./docs/optimum/onnxruntime/usage_guides/models
upvoted an article 2 months ago

Organizing a Privacy-preserving Hackathon

By binoua
New activity in optimum/llm-perf-leaderboard 3 months ago
New activity in hf-doc-build/doc-build 3 months ago