428 189

siyeng feng

siyengfeng

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

Token-Efficient Long Video Understanding for Multimodal LLMs

reacted to onekq's post with 👍 2 days ago

QwQ-32B is amazing! It ranks below o1-preview, but beats DeepSeek v3 and all Gemini models. https://huggingface.co./spaces/onekq-ai/WebApp1K-models-leaderboard Now we have such a powerful model that can fit into a single GPU, can someone finetune a web app model to push SOTA of my leaderboard? 🤗

upvoted a paper 2 days ago

Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks

Organizations

None yet

siyengfeng's activity

upvoted 4 papers 2 days ago

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published 4 days ago • 65

Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks

Paper • 2503.04378 • Published 4 days ago • 6

FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion

Paper • 2503.04222 • Published 4 days ago • 12

LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation

Paper • 2503.02972 • Published 5 days ago • 23

upvoted 14 papers 3 days ago

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published 3 days ago • 76

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?

Paper • 2502.19361 • Published 11 days ago • 26

Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems

Paper • 2502.19328 • Published 11 days ago • 21

Rank1: Test-Time Compute for Reranking in Information Retrieval

Paper • 2502.18418 • Published 12 days ago • 25

Towards an AI co-scientist

Paper • 2502.18864 • Published 12 days ago • 41

CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale

Paper • 2502.16645 • Published 14 days ago • 21

FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving

Paper • 2502.20238 • Published 10 days ago • 24

R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts

Paper • 2502.20395 • Published 10 days ago • 43

SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers

Paper • 2502.20545 • Published 10 days ago • 20

Multi-Turn Code Generation Through Single-Step Rewards

Paper • 2502.20380 • Published 10 days ago • 29

Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published 12 days ago • 44

Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published 8 days ago • 51

KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding

Paper • 2503.02951 • Published 5 days ago • 25

Process-based Self-Rewarding Language Models

Paper • 2503.03746 • Published 4 days ago • 33

upvoted 2 papers 5 days ago

IterPref: Focal Preference Learning for Code Generation via Iterative Debugging

Paper • 2503.02783 • Published 5 days ago • 5

MPO: Boosting LLM Agents with Meta Plan Optimization

Paper • 2503.02682 • Published 5 days ago • 23