Nathan Cooper

ncoop57

https://nathancooper.io

AI & ML interests

The intersection of Software Engineering, Deep Learning, NLP, and Graph Networks.

Recent Activity

liked a dataset about 2 hours ago

simplescaling/s1K

liked a model 2 days ago

Qwen/Qwen2.5-Coder-0.5B-Instruct

upvoted a paper 2 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Articles

Finally, a Replacement for BERT: Introducing ModernBERT

Dec 19, 2024 • 523

ncoop57's activity

upvoted a paper 2 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 7 days ago • 95

upvoted an article 7 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • 8 days ago • 613

upvoted an article 27 days ago

Synthetic Data Generation with FastData and Hugging Face

Article • 28 days ago • 14

upvoted a collection about 1 month ago

ModernBERT

Collection

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 132

upvoted a paper about 2 months ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 125

upvoted an article 6 months ago

Uncensor any LLM with abliteration

Article • Jun 13, 2024 • 415

upvoted a collection 8 months ago

Qwen2

Collection

Qwen2 language models, with pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 356

upvoted a paper 11 months ago

Stable LM 2 1.6B Technical Report

Paper • 2402.17834 • Published Feb 27, 2024 • 3