Testing Red-Team Safeguards Against AI Persuasion in Democratic Governance

Addressing the risk of AI systems manipulating democratic deliberation through real-time 'red-teaming'.

Catalyst Project | April 11, 2026

As AI becomes a primary information source for governance, current safeguards like human peer review are often too slow to be effective. This project addresses the risk of advanced AI systems manipulating democratic deliberation through persuasive bias. Through large-scale survey experiments and deliberative mini-publics, the team will test the efficacy of “red-team” AI safeguards designed to detect and neutralize biased information in real time. By providing empirical evidence on AI’s persuasive power, the research offers actionable guidance for Canadian policy, helping democratic decision-making remain resilient against manipulation in high-stakes public consultations.

Collaborators

Sam Johnson
Seth Wynes

Related Research

Safe autonomous chemistry labs

Safety assurance and engineering for multimodal foundation model-enabled AI systems

Sampling latent explanations from LLMs for safe and interpretable reasoning