PRefLexOR (Collection): Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking • 4 items • Updated Oct 25
Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment • Paper • 2410.14148 • Published Oct 18