Towards Socially Grounded AI Safety: Integrating Causal and Institutional Reasoning in Language Models

Addressing the risk of "sociologically naïve" AI systems that lack an understanding of cultural norms and institutional contexts.

Catalyst Project | April 11, 2026

This project addresses the risk of “sociologically naïve” AI systems, meaning systems that lack an understanding of cultural norms and institutional contexts. Current models are technically advanced, but they often rely on surface-level correlations that can inadvertently reproduce harmful social biases. By bridging sociology and computer science, the team will develop agentic architectures and causal reasoning modules that embed social context into AI behaviour. The research aims to move AI safety beyond simple behavioural control toward relational accountability, so that AI systems can navigate complex human ecologies appropriately and foster more trustworthy human-AI coordination.

Collaborators

Zhijing Jin
Canada CIFAR AI Chair, Vector Institute; University of Toronto
Matt Ratto

Related Research

Catalyst Project

Socio-Technical Solutions to Improve Information Integrity and AI Literacy

Testing Red-Team Safeguards Against AI Persuasion in Democratic Governance