Towards Socially Grounded AI Safety: Integrating Causal and Institutional Reasoning in Language Models
Addressing the risk of "sociologically naïve" AI systems that lack an understanding of cultural norms and institutional contexts.
This project addresses the risk of "sociologically naïve" AI systems that lack an understanding of cultural norms and institutional contexts. Although current models are technically advanced, they often rely on surface-level correlations that can inadvertently reproduce harmful social biases. By bridging sociology and computer science, the team will develop agentic architectures and causal reasoning modules that embed social context into AI behaviour. Ultimately, this research aims to move AI safety beyond simple behavioural control toward relational accountability, so that AI systems can navigate complex human ecologies appropriately and support more trustworthy human-AI coordination.
Collaborators
Zhijing Jin
Canada CIFAR AI Chair, Vector Institute; University of Toronto
Matt Ratto

