Ken Tsui PRO

kenhktsui

AI & ML interests

ML Engineer Lead. Researcher on small language models: building classifiers to find high-quality data; reasoning benchmarks; synthetic data.

Recent Activity

liked a dataset about 9 hours ago

callanwu/WebWalkerQA

liked a model about 15 hours ago

Qwen/Qwen2.5-7B-Instruct-1M

liked a model 4 days ago

openbmb/MiniCPM-o-2_6

Articles

Embodied AI == Unlimited Training Data

15 days ago

• 2

∞🧙🏼‍♂️AnyClassifier - Generating Synthetic Data For Text Classification

Aug 19, 2024

• 8

Low Latency CPU Based Educational Value Classifier With Generic Educational Value

Jun 12, 2024

• 9

kenhktsui's activity

upvoted a paper 18 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 20 days ago • 249

upvoted a paper about 1 month ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 89

upvoted a paper about 2 months ago

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 79

upvoted 4 articles 5 months ago

Article

How NuminaMath Won the 1st AIMO Progress Prize

Jul 11, 2024

• 111

Article

Merge Large Language Models with mergekit

Jan 9, 2024

• 89

Article

Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging

Aug 19, 2024

• 76

Article

∞🧙🏼‍♂️AnyClassifier - Generating Synthetic Data For Text Classification

Aug 19, 2024

• 8

upvoted an article 6 months ago

Article

ZebraLogic: Benchmarking the Logical Reasoning Ability of Language Models

Jul 27, 2024

• 30

upvoted a paper 6 months ago

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18, 2024 • 54

upvoted an article 8 months ago

Article

Low Latency CPU Based Educational Value Classifier With Generic Educational Value

Jun 12, 2024

• 9

upvoted 2 papers 8 months ago

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

Paper • 2404.03715 • Published Apr 4, 2024 • 61

LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published May 15, 2024 • 88

upvoted a paper 9 months ago

InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation

Paper • 2404.19427 • Published Apr 30, 2024 • 72

upvoted a paper 10 months ago

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Paper • 2404.00399 • Published Mar 30, 2024 • 42

upvoted a collection 12 months ago

Tiny Series

Collection

Tiny datasets that empower the foundation of Small Language Model! • 11 items • Updated Jan 26, 2024 • 36