Yizhe Xiong

Bostoncake

AI & ML interests

None yet

Recent Activity

authored a paper 9 days ago

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

authored a paper 9 days ago

Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal

authored a paper 9 days ago

MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts

Organizations

None yet

Bostoncake's activity

authored 7 papers 9 days ago

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

Paper • 2403.09192 • Published Mar 14, 2024

Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal

Paper • 2404.17808 • Published Apr 27, 2024

MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts

Paper • 2407.09816 • Published Jul 13, 2024 • 1

LBPE: Long-token-first Tokenization to Improve Large Language Models

Paper • 2411.05504 • Published Nov 8, 2024 • 1

CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts

Paper • 2410.16077 • Published Oct 21, 2024 • 1

Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models

Paper • 2412.07171 • Published Dec 10, 2024 • 1

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published 27 days ago • 52

liked 2 datasets 9 months ago

PY007/slimpajama_llama_tokenized_upsample_4096_chunk_1M

Viewer • Updated Apr 19, 2024 • 5.04k • 90 • 2

PY007/slimpajama_llama_tokenized_upsample_4096_chunk_256K

Viewer • Updated Apr 19, 2024 • 3.94k • 62 • 1

liked a model over 1 year ago

baichuan-inc/Baichuan-7B

Text Generation • Updated Jan 9, 2024 • 16.3k • 838

liked 3 Spaces over 1 year ago

SegGPT

Runtime error • 149

ChatReviewer

Running • 108

ChatAssistant

Build error

updated a Space over 1 year ago

Build error

💩