How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs

  This project studies how to systematically persuade LLMs in order to jailbreak them. The well-known ...