Rui Yang

Ray2333

https://yangrui2015.github.io

YangRui2015

AI & ML interests

Deep Reinforcement Learning

Recent Activity

upvoted a paper 1 day ago

The BrowserGym Ecosystem for Web Agent Research

upvoted a paper 6 days ago

Predictive Data Selection: The Data That Predicts Is the Data That Teaches

new activity 9 days ago

microsoft/Magma-8B:generation_args in the example

Collections 1

Papers 4

arxiv:2502.09560

arxiv:2411.00836

arxiv:2406.10216

arxiv:2402.10207

Models 15

Ray2333/Gemma-2B-rewardmodel-baseline

Text Classification • Updated Feb 5 • 818

Ray2333/GRM-llama3-8B-distill

Text Classification • Updated Feb 5 • 137 • 6

Ray2333/reward-model-Mistral-7B-instruct-Unified-Feedback

Text Classification • Updated Feb 5 • 1.98k • 11

Ray2333/GRM-Gemma-2B-rewardmodel-ft

Updated Feb 5 • 70 • 1

Ray2333/Gemma-2B-rewardmodel-ft

Updated Feb 5 • 30 • 1

Ray2333/GRM-llama3.2-3B-sftreg

Text Classification • Updated Feb 5 • 474 • 1

Ray2333/GRM-Gemma-2B-sftreg

Text Classification • Updated Feb 5 • 260 • 3

Ray2333/GRM-llama3-8B-sftreg

Text Classification • Updated Feb 5 • 277 • 5

Ray2333/GRM-Gemma2-2B-sftreg

Text Classification • Updated Feb 5 • 21 • 1

Ray2333/GRM-gemma2-2B-rewardmodel-ft

Text Classification • Updated Feb 5 • 1.04k • 6

Datasets 1

Ray2333/RiC_harmless_helpful

Viewer • Updated Jul 12, 2024 • 291k • 97