Ilyas Moutawwakil

IlyasMoutawwakil

AI & ML interests

Optimization, LLMs, Hardware, Backends, ...

Recent Activity

updated a model about 9 hours ago

optimum-internal-testing/tiny-random-whisper

upvoted an article about 11 hours ago

Welcome to Inference Providers on the Hub 🔥

upvoted an article 1 day ago

Timm ❤️ Transformers: Use any timm model with transformers

Organizations

IlyasMoutawwakil's activity

upvoted an article about 11 hours ago

Welcome to Inference Providers on the Hub 🔥

Article • 2 days ago • 145

upvoted an article 1 day ago

Timm ❤️ Transformers: Use any timm model with transformers

Article • 14 days ago • 36

upvoted an article 13 days ago

Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference

Article • 14 days ago • 61

upvoted an article 5 months ago

The 5 Most Under-Rated Tools on Hugging Face

Article • Aug 22, 2024 • 86

upvoted a paper 8 months ago

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 67

upvoted an article 8 months ago

CPU Optimized Embeddings with 🤗 Optimum Intel and fastRAG

Article • Mar 15, 2024 • 8

upvoted a collection 9 months ago

Fast-RAG Inference Endpoints

Collection

An extremely easy to deploy RAG Pipeline using Inference Endpoints • 3 items • Updated Jun 3, 2024 • 1

upvoted an article 9 months ago

Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval

Article • Mar 22, 2024 • 70

upvoted a collection about 1 year ago

Neural Network Compression & Quantization

Collection

Tracks papers and links about neural network compression and quantization techniques • 4 items • Updated Sep 22, 2023 • 1

upvoted 2 papers over 1 year ago

BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 96

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Paper • 2306.15626 • Published Jun 27, 2023 • 17

Articles

Benchmarking Language Model Performance on 5th Gen Xeon at GCP

AMD + 🤗: Large Language Models Out-of-the-Box Acceleration with AMD GPU

Overview of natively supported quantization schemes in 🤗 Transformers
