Formalizing constraints for assessing and mitigating agentic risk

Developing a distributed governance model to mitigate the risks of agentic AI and further responsible AI deployment in industry.

Catalyst Project | April 11, 2026

As organizations increasingly deploy AI agents in semi-autonomous roles, concerns about the associated risks have grown alongside their use. Canada CIFAR AI Chair Sheila McIlraith will develop concrete tools for a technical safety solution, combining approaches such as context-specific evaluation, reward modeling and alignment. The project focuses on Desired Behavior Specifications, which are encoded in representations from which human-interpretable rules can be derived, for example by designing a system that extracts a set of formal rules. Ultimately, by developing a distributed governance model to mitigate the risks of agentic AI, the project aims to further responsible AI deployment in industry.

Collaborators

Sheila McIlraith
Canada CIFAR AI Chair, Vector Institute; University of Toronto

Related Research

Performative Empathy and Deceptive Alignment

Repetition, Resistance, and Reinforcement: Longitudinal Effects of Conversational AI on Political Attitudes

Safe autonomous chemistry labs