Peter Szemraj PRO

pszemraj

https://pszemraj.carrd.co/

AI & ML interests

metallic intuition

pszemraj's activity

upvoted a paper 1 day ago

RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19, 2024 • 53

upvoted a paper 3 days ago

HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs

Paper • 2503.02003 • Published 6 days ago • 37

liked a model 4 days ago

Qwen/QwQ-32B

Text Generation • Updated 3 days ago • 103k • 1.69k

upvoted 2 papers 5 days ago

When an LLM is apprehensive about its answers -- and when its uncertainty is justified

Paper • 2503.01688 • Published 6 days ago • 19

From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens

Paper • 2502.18890 • Published 12 days ago • 23

upvoted a collection 6 days ago

ProX Dataset

Collection

a collection of pre-training corpora refined by ProX • 6 items • Updated 23 days ago • 7

liked a model 7 days ago

chandar-lab/NeoBERT

Feature Extraction • Updated 7 days ago • 2.77k • 93

upvoted 2 papers 7 days ago

LongRoPE2: Near-Lossless LLM Context Window Scaling

Paper • 2502.20082 • Published 10 days ago • 31

NeoBERT: A Next-Generation BERT

Paper • 2502.19587 • Published 11 days ago • 38

liked a dataset 10 days ago

gair-prox/DCLM-pro

Viewer • Updated 22 days ago • 366M • 9.89k • 7

upvoted a paper 12 days ago

Thus Spake Long-Context Large Language Model

Paper • 2502.17129 • Published 13 days ago • 67

liked a model 13 days ago

HuggingFaceTB/SmolLM2-1.7B-Instruct-16k

Text Generation • Updated 16 days ago • 1.62k • 6

upvoted 2 papers 14 days ago

How to Get Your LLM to Generate Challenging Problems for Evaluation

Paper • 2502.14678 • Published 17 days ago • 16

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 17 days ago • 94

liked a model 15 days ago

Shengkun/DarwinLM-2.7B

Text Generation • Updated 13 days ago • 65 • 1

New activity in pszemraj/xtremedistil-l12-h384-uncased-CoLA 15 days ago

Adding `safetensors` variant of this model

#2 opened 15 days ago by SFconvertbot

New activity in ml4pubmed/bluebert-pubmed-uncased-L-12-H-768-A-12_pub_section 15 days ago

Adding `safetensors` variant of this model

#1 opened 15 days ago by SFconvertbot

upvoted 2 papers 17 days ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published 20 days ago • 28

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 18 days ago • 157

liked a model 20 days ago

tomg-group-umd/huginn-0125

Text Generation • Updated 14 days ago • 8.95k • 242