Let’s talk artificial intelligence (AI) explainability and legal governance. AI often seems like a magic show: impressive but hard to understand. However, with increasing reliance on AI, legal frameworks now demand transparency. Recent advances, particularly Anthropic’s research on circuit tracing, have made it possible to see inside generative AI systems like Claude. These developments help meet the transparency requirements of key laws such as the General Data Protection Regulation (GDPR), the EU AI Act, and the UK’s Data Protection Act 2018.
AI explainability: making AI systems understandable for legal governance
Anthropic’s circuit-tracing research helps us understand how AI systems make decisions by visualising their internal ‘thought’ processes. Researchers trace specific paths or ‘circuits’ inside AI models, identifying concepts that guide the model’s outputs.
For example, when the generative AI Claude was asked to complete the line “He saw a carrot and had to grab it,” it internally selected the concept ‘rabbit’ before writing: “His hunger was like a starving rabbit.” Researchers could alter Claude’s internal settings to weaken the ‘rabbit’ concept, changing the output to: “His hunger was a powerful habit.” Claude also activates similar internal concepts across multiple languages, indicating it uses a universal reasoning approach.
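To make the idea of weakening an internal concept concrete, here is a minimal, purely illustrative sketch. It invents a toy ‘rabbit’ concept direction in a small hidden-state vector, projects that direction out, and shows how the scores for two candidate completions shift. The vectors, scores, and names are made up for illustration; this is not Anthropic’s actual tooling and bears no relation to Claude’s real internals.

```python
import numpy as np

# Toy illustration only: a made-up "concept direction" in an 8-dimensional
# hidden state, not Anthropic's method or Claude's real internals.
rng = np.random.default_rng(0)
hidden_dim = 8

# Stand-in for the internal 'rabbit' concept (a unit vector).
rabbit_direction = rng.normal(size=hidden_dim)
rabbit_direction /= np.linalg.norm(rabbit_direction)

# A hidden state that strongly activates the 'rabbit' concept, plus noise.
hidden_state = 3.0 * rabbit_direction + 0.5 * rng.normal(size=hidden_dim)

# Toy scoring directions for two candidate completions.
w_rabbit = rabbit_direction                      # completion ending in 'rabbit'
w_habit = rng.normal(size=hidden_dim)            # completion ending in 'habit'
w_habit /= np.linalg.norm(w_habit)

def completion_scores(h):
    """Score each candidate completion against the hidden state."""
    return {"rabbit": float(h @ w_rabbit), "habit": float(h @ w_habit)}

print("before intervention:", completion_scores(hidden_state))

# 'Weaken' the concept by projecting it out of the hidden state:
# the 'rabbit' score collapses to zero, so that completion is no longer favoured.
suppressed = hidden_state - (hidden_state @ rabbit_direction) * rabbit_direction
print("after suppressing 'rabbit':", completion_scores(suppressed))
```

The point of the sketch is simply that an intervention on an identified internal feature changes the output in a predictable way, which is what makes this kind of interpretability evidence useful for demonstrating how a decision was reached.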
Legal requirements for explainability
Data protection laws make AI explainability essential. Article 22 of the GDPR gives individuals the right not to be subject to decisions based solely on automated processing where those decisions have legal or similarly significant effects on them.
AI explainability tools help satisfy these laws by showing how decisions are made. Organisations can demonstrate transparency and enable meaningful human oversight by tracing the internal pathways a model follows to reach its conclusions.
Transparency requirements under the EU AI Act
The EU AI Act demands transparency and traceability for high-risk AI systems. Article 13 requires that high-risk systems be designed so their operation is transparent enough for those deploying them to interpret and appropriately use their outputs, with related provisions requiring accuracy, robustness, and record-keeping so that outputs can be traced.
AI explainability directly addresses these requirements by making it possible to document how AI systems reach their decisions. Organisations can use these insights to maintain accurate records, meet transparency standards, and ensure accountability.
Embedding AI interpretability for legal governance
AI explainability can significantly improve governance practices. AI impact assessments, like those we provide, help organisations identify and manage AI risks. Incorporating interpretability methods into these assessments makes it easier to detect and prevent legal and ethical issues.
The Michalsons’ Trustworthy AI programme helps integrate interpretability into regular governance, including training sessions, policy updates, and workshops. This approach allows organisations to meet legal and ethical standards consistently.
Contractual and liability considerations
Specifying interpretability requirements in contracts and licensing agreements is crucial. Defining methods like circuit tracing within AI contracts ensures a clear allocation of risks and responsibilities between the parties. Transparent documentation also helps protect organisations against legal and reputational damage arising from consumer protection and product liability claims.
Challenges and future directions for AI explainability legal governance
Despite these advances, AI explainability still faces practical challenges. The process is labour-intensive, requires specialised skills, and raises potential intellectual property concerns. However, emerging standards such as the OECD AI Principles and progress in automated interpretability methods are promising, and organisations should adapt to them proactively.
Actions you can take next
Advances in AI explainability directly address legal demands for transparency. Compliance with laws such as the GDPR, EU AI Act, and UK Data Protection Act 2018 increasingly depends on understanding AI’s internal workings. Organisations must embed interpretability into their governance and contractual practices to remain compliant and trustworthy. You can:
- Strive towards legal compliance by conducting an AI impact assessment that incorporates interpretability methods. You can contact us for help with this.
- Improve transparency and manage risks by updating your AI governance policies and procurement contracts to require circuit tracing or similar interpretability techniques. We can help you with these and other aspects of AI legal governance.
- Learn more about AI explainability by reading about Anthropic’s research.