Max Ryabinin

mryab

https://mryab.github.io/

AI & ML interests

Distributed training, natural language generation, efficient architectures for DL

Articles

Deep Learning over the Internet: Training Language Models Collaboratively

Jul 15, 2021 • 4

mryab's activity

authored a paper about 1 month ago

RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19 • 47

authored a paper 6 months ago

Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees

Paper • 2110.03313 • Published Oct 7, 2021 • 1

upvoted a paper 6 months ago

SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices

Paper • 2406.02532 • Published Jun 4 • 13

authored 5 papers 8 months ago

The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models

Paper • 2404.05904 • Published Apr 8 • 8

authored a paper 10 months ago

Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements

Paper • 2401.06766 • Published Jan 12 • 2

authored 3 papers about 1 year ago

Distributed Inference and Fine-tuning of Large Language Models Over The Internet

Paper • 2312.08361 • Published Dec 13, 2023 • 25

Training Transformers Together

Paper • 2207.03481 • Published Jul 7, 2022 • 5

Hypernymy Understanding Evaluation of Text-to-Image Models via WordNet Hierarchy

Paper • 2310.09247 • Published Oct 13, 2023 • 3

upvoted a paper about 1 year ago

Hypernymy Understanding Evaluation of Text-to-Image Models via WordNet Hierarchy

Paper • 2310.09247 • Published Oct 13, 2023 • 3

authored 4 papers over 1 year ago

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU

Paper • 2303.06865 • Published Mar 13, 2023 • 1

SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient

Paper • 2301.11913 • Published Jan 27, 2023 • 1

Distributed Deep Learning in Open Collaborations

Paper • 2106.10207 • Published Jun 18, 2021 • 2

It's All in the Heads: Using Attention Heads as a Baseline for Cross-Lingual Transfer in Commonsense Reasoning

Paper • 2106.12066 • Published Jun 22, 2021 • 1

updated 2 models about 2 years ago

mryab/test-bloomd-560m-int8

Updated Dec 23, 2022

mryab/test-bloomd-560m-fp16

Feature Extraction • Updated Dec 23, 2022 • 7

updated a model over 2 years ago

yandex/RuLeanALBERT

Fill-Mask • Updated Sep 15, 2022 • 60 • 33