Huaijie Wang

jwhj

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

Offline Reinforcement Learning for LLM Multi-Step Reasoning

authored a paper 3 days ago

Offline Reinforcement Learning for LLM Multi-Step Reasoning

updated a model 14 days ago

jwhj/Qwen2.5-Math-1.5B-OREO-Value

Organizations

None yet

Papers 2

arxiv:2412.16145

arxiv:2310.11453

Models 3

jwhj/Qwen2.5-Math-1.5B-OREO-Value

Updated 14 days ago • 26

jwhj/Qwen2.5-Math-1.5B-OREO

Text Generation • Updated 14 days ago • 32

jwhj/Qwen2.5-Math-1.5B-SFT

Text Generation • Updated 16 days ago • 291

Datasets

None public yet