Haitham Bou Ammar PRO

hba123

AI & ML interests

LLMs, VLMs, Robotics, Reinforcement Learning, Bayesian Optimisation

Organizations

None yet

Posts 1

Blindly applying algorithms without understanding the math behind them is not a good idea, from my point of view. So, I am on a quest to fix this!

I wrote my first Hugging Face article on how to derive closed-form solutions for KL-regularised reinforcement learning problems, which is what DPO builds on.


Check it out: https://huggingface.co/blog/hba123/derivingdpo
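
For readers who want a quick sanity check of the kind of closed form the post refers to, here is a minimal sketch that is not taken from the linked article. It assumes the standard objective of maximising expected reward minus beta times KL(pi || pi_ref) over a finite set of responses, whose well-known maximiser is pi*(y) proportional to pi_ref(y) * exp(r(y) / beta). The reference policy, rewards, and beta below are made-up toy values, and the function names are hypothetical.

```python
import math
import random


def kl_regularised_optimum(ref_probs, rewards, beta):
    """Closed-form maximiser of E_pi[r] - beta * KL(pi || pi_ref) on a finite set."""
    weights = [q * math.exp(r / beta) for q, r in zip(ref_probs, rewards)]
    z = sum(weights)  # partition function Z
    return [w / z for w in weights]


def objective(probs, ref_probs, rewards, beta):
    """Expected reward minus beta times the KL divergence from the reference."""
    expected_reward = sum(p * r for p, r in zip(probs, rewards))
    kl = sum(p * math.log(p / q) for p, q in zip(probs, ref_probs) if p > 0)
    return expected_reward - beta * kl


def random_policy(n, rng):
    """Draw a random point on the probability simplex (normalised exponentials)."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


# Toy values, chosen only for illustration.
ref_probs = [0.5, 0.3, 0.2]
rewards = [1.0, 2.0, 0.5]
beta = 0.7

pi_star = kl_regularised_optimum(ref_probs, rewards, beta)
best = objective(pi_star, ref_probs, rewards, beta)

# The optimal value should equal beta * log Z.
log_z = math.log(sum(q * math.exp(r / beta) for q, r in zip(ref_probs, rewards)))

# No randomly sampled policy should score higher than the closed-form one.
rng = random.Random(0)
worst_gap = min(
    best - objective(random_policy(3, rng), ref_probs, rewards, beta)
    for _ in range(1000)
)
```

If the closed form holds, `best` matches `beta * log_z` up to floating-point error and `worst_gap` is non-negative. Random sampling only spot-checks optimality, so this is an illustration rather than a proof.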

models

None public yet

datasets

None public yet