arxiv:2501.18837
Samuel Bowman
samuelpbowman
AI & ML interests
None yet
Recent Activity
authored a paper 1 day ago
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming
authored a paper about 2 months ago
Alignment faking in large language models
Organizations
None yet