Jesse Mu

jayelm

http://cs.stanford.edu/~muj

AI & ML interests

None yet

Recent Activity

authored a paper about 1 month ago

Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming

authored a paper about 1 year ago

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

updated a model almost 2 years ago

jayelm/flan-t5-xxl-neg_control-1

Organizations

None yet

Papers 2

arxiv:2501.18837

arxiv:2401.05566

Models 6

jayelm/flan-t5-xxl-neg_control-1

Text2Text Generation • Updated Apr 24, 2023 • 14

jayelm/flan-t5-xxl-gist-1

Text2Text Generation • Updated Apr 24, 2023 • 66 • 3

jayelm/flan-t5-xxl-pos_control-1

Text2Text Generation • Updated Apr 24, 2023 • 15

jayelm/llama-7b-neg_control-1

Text Generation • Updated Apr 24, 2023 • 16

jayelm/llama-7b-pos_control-1

Text Generation • Updated Apr 24, 2023 • 17

jayelm/llama-7b-gist-1

Text Generation • Updated Apr 24, 2023 • 209 • 5

Datasets 1

jayelm/natural-instructions

Viewer • Updated Jan 29, 2023 • 2.39M • 652 • 4