Alex Havrilla

Dahoas

AI & ML interests

NLP, RL

Recent Activity

updated a dataset about 1 month ago: Dahoas/MATH
published a dataset about 1 month ago: Dahoas/MATH
updated a dataset 3 months ago: Dahoas/numina-synthetic

Organizations

CarperAI, DuckAI, Critiquers, An optimal synthetic data sampling strategy for MATH

Articles 1

Illustrating Reinforcement Learning from Human Feedback (RLHF)