Formalizing constraints for assessing and mitigating agentic risk

Developing a distributed governance model to mitigate the risks of agentic AI and further responsible AI deployment in industry.

Catalyst Project | April 11, 2026


As organizations increasingly deploy AI agents in semi-autonomous roles, concerns about the associated risks have grown. Canada CIFAR AI Chair Sheila McIlraith will develop concrete tools for a technical safety solution, combining approaches like context-specific evaluation, reward modeling and alignment. This project focuses on Desired Behavior Specifications, which are encoded in representations from which human-interpretable rules can be derived, for example by designing a system that can extract a set of formal rules. Ultimately, by developing a distributed governance model to mitigate the risks of agentic AI, the project aims to further responsible AI deployment in industry.

Collaborators

Sheila McIlraith
Canada CIFAR AI Chair, Vector Institute; University of Toronto

Related Research

Adversarial robustness of large language model (LLM) safety

CIPHER: Countering influence through pattern highlighting and evolving responses

Democratic Alignment of LLMs Through Economic Theory: Relative Preferences and Strategic Coordination