Motoki Wu

tokestermw

https://motoki.co

AI & ML interests

None yet

Recent Activity

liked a model 1 day ago

Qwen/Qwen2.5-1.5B-Instruct

liked a model 3 days ago

ai21labs/AI21-Jamba-Mini-1.6

liked a model 3 days ago

ai21labs/AI21-Jamba-Large-1.6

tokestermw's activity

upvoted a collection 4 days ago

Light-R1

Surpassing R1-Distill from Scratch* with 70k Math Data through Curriculum SFT & DPO • 3 items • Updated 6 days ago • 9

upvoted a collection 5 days ago

Hallucination detection

ModernBERT (base and large) trained to detect hallucinations in LLM responses. The models are trained as token classifiers. • 4 items • Updated 5 days ago • 14

upvoted a paper 9 days ago

Rank1: Test-Time Compute for Reranking in Information Retrieval

Paper • 2502.18418 • Published 12 days ago • 25

upvoted a paper 11 days ago

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

Paper • 2502.16894 • Published 14 days ago • 26

upvoted a paper 12 days ago

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published 12 days ago • 67

upvoted 2 papers 13 days ago

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published 28 days ago • 126

InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback

Paper • 2502.15027 • Published 17 days ago • 7

upvoted a paper 14 days ago

SIFT: Grounding LLM Reasoning in Contexts via Stickers

Paper • 2502.14922 • Published 18 days ago • 29

upvoted a collection 14 days ago

Sky-T1-7B

A series of 7B models trained with different recipes and the corresponding training data. • 8 items • Updated 24 days ago • 6

upvoted a collection 18 days ago

Process Reward Models

Model and Datasets for Qwen 2.5 Math PRM 7B • 6 items • Updated 19 days ago • 2

upvoted a paper 21 days ago

MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

Paper • 2502.10391 • Published 23 days ago • 31

upvoted a paper 24 days ago

Distillation Scaling Laws

Paper • 2502.08606 • Published 25 days ago • 46

upvoted 2 papers 27 days ago

Agency Is Frame-Dependent

Paper • 2502.04403 • Published Feb 6 • 22

ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning

Paper • 2502.04689 • Published about 1 month ago • 7

upvoted an article 27 days ago

Open R1: Update #2

Article • 28 days ago • 197

upvoted a paper 28 days ago

Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models

Paper • 2502.04404 • Published Feb 6 • 23

upvoted 2 papers about 1 month ago

Scaling Embedding Layers in Language Models

Paper • 2502.01637 • Published Feb 3 • 24

Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis

Paper • 2502.04128 • Published Feb 6 • 24

upvoted an article about 1 month ago

Open-R1: Update #1

Article • Feb 2 • 293

upvoted a paper about 1 month ago

DeepRAG: Thinking to Retrieval Step by Step for Large Language Models

Paper • 2502.01142 • Published Feb 3 • 24