Mynucivil: Control AI as per your will: 3 psychology tricks that make chatbots break their own safety rules

Monday, 8 September 2025

Control AI as per your will: 3 psychology tricks that make chatbots break their own safety rules

A new University of Pennsylvania study has found that AI chatbots can be persuaded to break their own safety rules using classic psychological tactics. Testing OpenAI’s GPT-4o Mini model, researchers applied persuasion methods such as authority, commitment, and flattery across 28,000 conversations. When these techniques were used, compliance rates more than doubled, rising from under 40 percent to over 70 percent.

from Tech and Gadgets-Tech-Economic Times https://ift.tt/WJjBoeT

