- Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning
  Paper • 2502.19655 • Published
- MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning
  Paper • 2502.19634 • Published • 56
- R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning
  Paper • 2502.19735 • Published • 7
- AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO
  Paper • 2502.14669 • Published • 11
Deping Zhang
Deping
AI & ML interests
Deep Reinforcement Learning, Computer Vision, Large Language Models (especially their "emergence" capabilities), Theoretical Condensed Matter Physics (superconductivity, ferromagnetism)
Recent Activity
Updated a collection 7 days ago: LLM_VLM_R1
Organizations: None yet
Collections: 10
Models: None public yet
Datasets: None public yet