OpenRLHF

community

https://github.com/OpenRLHF

Recent Activity

Longhui98 authored three papers about 1 month ago:

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision

Forward-Backward Reasoning in Large Language Models for Mathematical Verification

models 10

OpenRLHF/Llama-3-8b-rm-mixture

Updated Nov 30, 2024 • 6k

OpenRLHF/Llama-2-7b-rm-anthropic_hh-lmsys-oasst-webgpt

Updated Nov 30, 2024 • 26 • 1

OpenRLHF/Llama-3-8b-rm-700k

Updated Nov 30, 2024 • 2.91k • 3

OpenRLHF/Mistral-7b-PRM-Math-Shepherd

Updated Oct 30, 2024 • 18 • 1

OpenRLHF/Llama-3-8b-iter-dpo-179k

Text Generation • Updated Jul 28, 2024 • 21

OpenRLHF/Llama-3-8b-rlhf-100k

Text Generation • Updated Jun 24, 2024 • 475 • 4

OpenRLHF/Llama-3-8b-sft-mixture

Text Generation • Updated Jun 14, 2024 • 22.5k • 1

OpenRLHF/Llama-2-7b-sft-model-ocra-500k

Text Generation • Updated Jun 9, 2024 • 33

OpenRLHF/Llama-2-13b-rm-anthropic_hh-lmsys-oasst-webgpt

Updated Jan 24, 2024 • 22

OpenRLHF/Llama-2-13b-sft-model-ocra-500k

Text Generation • Updated Jan 5, 2024 • 310 • 1

datasets 4

OpenRLHF/prompt-collection-v0.1-dev-100k

Viewer • Updated Dec 13, 2024 • 102k • 126

OpenRLHF/preference_700K

Viewer • Updated Jul 13, 2024 • 700k • 142 • 1

OpenRLHF/prompt-collection-v0.1

Viewer • Updated Jun 14, 2024 • 179k • 4.43k • 7

OpenRLHF/preference_dataset_mixture2_and_safe_pku

Viewer • Updated Jun 14, 2024 • 555k • 1.05k • 4