ben burtenshaw

burtenshaw


Recent Activity

updated a dataset 16 minutes ago
agents-course/certificates
updated a dataset about 4 hours ago
reasoning-course/certificates

Organizations

Hugging Face, Hugging Face Course, Argilla, Blog-explorers, MLX Community, distilabel-internal-testing, Data Is Better Together, Social Post Explorers, Hugging Face Discord Community, argilla-internal-testing, Open Human Feedback, Argilla Warehouse, open/ acc, Data Is Better Together Contributor, Open Source AI Research Community, FeeL (Feedback Loop), Hugging Face Agents Course, Agents Course Students, Agents Course Finishers, Open R1, Hugging Face Reasoning Course

Posts 20

I’m super excited to work with @mlabonne to build the first practical example in the reasoning course.

🔗 https://huggingface.co/reasoning-course

Here's a quick walkthrough of the first drop of material, which works toward the use case:

- It starts with a fundamental introduction to reinforcement learning, answering questions like ‘what is a reward?’ and ‘how do we create an environment for a language model?’

- Then it focuses on DeepSeek R1 by walking through the paper and highlighting key aspects. This is an old-school way to learn ML topics, but it always works.

- Next, it takes you to TRL (Transformer Reinforcement Learning) and demonstrates potential reward functions you could use. This is cool because it uses Marimo notebooks to visualise the reward.

- Finally, Maxime walks us through a real training notebook that uses GRPO to reduce generation length. I’m really into this because it works, and Maxime took the time to validate it and share assets and logging from his own runs for you to compare with.
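
To make the idea of a reward for shorter generations concrete, here is a minimal sketch of what such a function could look like. It is not the reward used in Maxime's notebook or in the course material. The name `length_reward`, the `target_len` parameter, and the choice to measure length in characters are all assumptions made for illustration. The signature (a list of completions in, one float per completion out, extra keyword arguments ignored) follows the shape TRL expects from custom reward functions, and the sketch assumes completions arrive as plain strings.

```python
def length_reward(completions, target_len=200, **kwargs):
    """Hypothetical reward: 0.0 at or under target_len characters,
    falling linearly to -1.0 as the completion grows past it."""
    rewards = []
    for completion in completions:
        # Number of characters beyond the allowed budget (never negative).
        overflow = max(0, len(completion) - target_len)
        # Scale the penalty by the budget and cap it at 1.0.
        rewards.append(0.0 - min(1.0, overflow / target_len))
    return rewards


if __name__ == "__main__":
    samples = ["short answer", "x" * 300, "x" * 1000]
    print(length_reward(samples, target_len=200))  # [0.0, -0.5, -1.0]
```

Capping the penalty keeps one very long completion from dominating the comparison within a group, which is one possible design choice among many. A real setup would likely count tokens rather than characters and combine this with a correctness reward.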

Maxime’s work and notebooks have been a major part of the open source community over the last few years. I, like everyone, have learnt so much from them.

Articles 12

❤️ a love letter to the Open AI inference client