@burtenshaw on Hugging Face: "NEW UNIT in the Hugging Face Reasoning course. We dive deep into the algorithm…"

burtenshaw

posted an update 6 days ago

NEW UNIT in the Hugging Face Reasoning course. We dive deep into the algorithm behind DeepSeek R1 with an advanced and hands-on guide to interpreting GRPO.

🔗 reasoning-course

This unit is super useful if you’re tuning models with reinforcement learning. It will help with:

- interpreting loss and reward progression during training runs
- selecting effective parameters for training
- reviewing and defining effective reward functions

This unit also leads smoothly into the existing practical exercises from @mlabonne and Unsloth.

📣 Shout out to @ShirinYamani who wrote the unit. Follow for more great content.

Alian95

4 days ago

Really?
