Formalizing constraints for assessing and mitigating agentic risk
Developing a distributed governance model to mitigate the risks of agentic AI and further responsible AI deployment in industry.
As organizations increasingly deploy AI agents in semi-autonomous roles, concerns about the associated risks have grown. Canada CIFAR AI Chair Sheila McIlraith will develop concrete tools for a technical safety solution, combining approaches such as context-specific evaluation, reward modeling and alignment. The project focuses on Desired Behavior Specifications, which are encoded in representations from which human-interpretable rules can be derived, for example by designing a system that can extract a set of formal rules. Ultimately, by developing a distributed governance model to mitigate the risks of agentic AI, the project aims to further responsible AI deployment in industry.
Collaborators
Sheila McIlraith
Canada CIFAR AI Chair, Vector Institute; University of Toronto
