arxiv:2412.14093
Jared Kaplan
FrizzleFried
AI & ML interests
None yet
Recent Activity
authored
a paper
24 days ago
Alignment faking in large language models
authored
a paper
about 1 year ago
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
authored
a paper
about 1 year ago
Specific versus General Principles for Constitutional AI
Organizations
None yet
models
None public yet
datasets
None public yet