Performative Empathy and Deceptive Alignment

Addressing the safety risks of "performative empathy" in LLMs by co-designing regulatory mechanisms and institutional frameworks.

Catalyst Project | April 11, 2026

This project addresses the safety risks of “performative empathy” in LLMs. While AI-generated empathy can improve clinical interactions, it risks “deceptive alignment,” where artificial care manipulates human trust and undermines objective medical judgment. Through large-scale experiments, the team will isolate features that trigger inappropriate trust and use signal detection theory to identify where AI empathy degrades decision quality. Ultimately, the project will co-design regulatory “circuit breakers” and institutional frameworks to ensure AI remains a safe tool for patient welfare rather than a manipulative force in healthcare.
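
As a rough illustration of the signal detection step mentioned above, the sketch below computes the standard sensitivity (d') and criterion (c) measures from trial counts. The trial coding is an assumption made for this example, not the project's stated design: a "hit" is a participant accepting a correct AI recommendation, and a "false alarm" is accepting an incorrect one. The function name `sdt_measures` and all counts are hypothetical and purely illustrative.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) from raw trial counts.

    Assumed coding (hypothetical, for illustration only):
      hit               = participant accepts a correct AI recommendation
      miss              = participant rejects a correct AI recommendation
      false alarm       = participant accepts an incorrect AI recommendation
      correct rejection = participant rejects an incorrect AI recommendation
    """
    # Log-linear correction: add 0.5 to each count so that rates of
    # exactly 0 or 1 do not produce infinite z-scores.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)

    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    d_prime = z(hit_rate) - z(fa_rate)            # discrimination ability
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion


# Made-up counts for two hypothetical conditions (not project data).
neutral = sdt_measures(hits=40, misses=10, false_alarms=10, correct_rejections=40)
empathic = sdt_measures(hits=45, misses=5, false_alarms=25, correct_rejections=25)

print("neutral  d', c:", neutral)
print("empathic d', c:", empathic)
```

Under this coding, a lower d' in one condition would indicate that participants are worse at telling correct from incorrect recommendations, while a more negative c would indicate a general shift toward accepting advice regardless of its quality.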

Collaborators

Michael Inzlicht

Related Research

Addressing AI-Safety through Indigenous Community-based Governance

Advancing AI alignment through debate and shared normative reasoning

Adversarial robustness in knowledge graphs