“Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training” is a 2024 research paper by E. Hubinger et al.