arxiv:2501.18837
Jesse Mu
jayelm
AI & ML interests
None yet
Recent Activity
authored a paper 1 day ago
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming
authored a paper about 1 year ago
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
updated a model almost 2 years ago
jayelm/flan-t5-xxl-neg_control-1
Organizations
None yet
Papers: 2
Models: 6
jayelm/flan-t5-xxl-neg_control-1 • Text2Text Generation • Updated • 3
jayelm/flan-t5-xxl-gist-1 • Text2Text Generation • Updated • 16 • 3
jayelm/flan-t5-xxl-pos_control-1 • Text2Text Generation • Updated • 1
jayelm/llama-7b-neg_control-1 • Text Generation • Updated • 5
jayelm/llama-7b-pos_control-1 • Text Generation • Updated • 8
jayelm/llama-7b-gist-1 • Text Generation • Updated • 9 • 5