Explainability
Explainable AI (XAI) is a field dedicated to peering inside the “black box” of complex AI models. In response to legal and regulatory pressure for transparency, AI companies have developed tools that purport to explain their models’ decisions. However, from a security and legal perspective, these “explanations” are often a new, more sophisticated form of obfuscation.
Analogy: The Politician’s Press Secretary
Imagine a powerful, inscrutable politician who makes a sudden, controversial decision. This politician is the black-box AI model. You have no idea what their true motivations are.
Reporters demand an explanation. The politician doesn’t speak. Instead, they send out their press secretary. This press secretary is the Explainable AI (XAI) tool.
- The “Explanation”: The press secretary gives a smooth, polished, and perfectly plausible reason for the politician’s decision. It sounds logical and aligns with public policy.
- The Reality: The press secretary has no idea what the politician was actually thinking. The politician might have made the decision based on a bribe, a personal grudge, or a complete misunderstanding of the facts. The press secretary’s job is not to reveal the truth; it’s to construct a post-hoc rationalization that makes the decision seem legitimate.
This is exactly how most post-hoc XAI tools work. They don’t have access to the model’s “thought process.” They are separate, simpler models fit after the fact to the black box’s inputs and outputs, and their job is to generate a plausible-sounding story that connects the two.
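To make this concrete, below is a minimal sketch of how a perturbation-based explainer in the style of LIME operates. It is not the actual LIME implementation; the function name, the Gaussian perturbation scheme, and the toy black box are illustrative assumptions. Notice that the surrogate never looks inside the model; it only sees inputs and outputs.

```python
# Hedged sketch of a LIME-style post-hoc explainer. Function and variable
# names are hypothetical; any black-box predict_fn returning a score works.
import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(predict_fn, x, num_samples=500, kernel_width=0.75):
    """Fit a weighted linear surrogate around a single input x."""
    rng = np.random.default_rng(0)
    # 1. Perturb the input: random samples in a small neighborhood of x.
    perturbations = x + rng.normal(scale=0.1, size=(num_samples, x.shape[0]))
    # 2. Query the black box for its outputs on those perturbations.
    outputs = predict_fn(perturbations)
    # 3. Weight each sample by its proximity to x (closer samples count more).
    distances = np.linalg.norm(perturbations - x, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))
    # 4. Fit a simple, interpretable surrogate to the (perturbation, output) pairs.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbations, outputs, sample_weight=weights)
    # 5. The surrogate's coefficients are what gets presented as "the explanation."
    return surrogate.coef_

# Toy black box scored on three features (purely hypothetical).
black_box = lambda X: 1 / (1 + np.exp(-(2 * X[:, 0] - 3 * X[:, 1] + 0.5 * X[:, 2])))
print(explain_locally(black_box, np.array([0.2, 0.9, 0.4])))
```

Everything the explainer reports depends on choices the model’s operator controls: how the neighborhood is sampled, how the samples are weighted, and which surrogate is fit.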
The Legal and Technical Flaws
The legal system’s demand for transparency is being met with a technical solution that provides the illusion of transparency.
- Explanations are Not Ground Truth: An “explanation” generated by a tool like LIME or SHAP is not evidence of the model’s reasoning. It is another piece of model-generated output. It can be wrong. It can be biased. It can be manipulated. For example, an attacker could build a deliberately biased model (e.g., one that denies loans based on race) and engineer it so that the XAI tool produces fake explanations citing only non-discriminatory factors; a sketch of this “two-faced model” trick follows this list. The explanation itself is a potential smokescreen.
- Attention is Not Explanation: With transformer models, it’s common to see “attention maps” presented as an explanation. These maps show which words in the input the model “paid attention to.” While this can be a useful diagnostic tool for developers, it is not an explanation of why the model made its decision; a person can pay attention to many things that are not the cause of their actions. Presenting an attention map as a legal explanation is a red herring (a sketch of what an attention map actually computes also follows this list).
- The Black Box Remains Locked: True explainability would require full access to the model’s architecture, its weights, and the training data that formed those weights. Companies will fight tooth and nail to protect this information as their most valuable trade secret. They will offer up XAI-generated “explanations” as a compromise, hoping that courts and regulators will accept the press secretary’s statement without deposing the politician.
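To see why a manipulated explanation is technically feasible, consider the “two-faced model” idea demonstrated in research on fooling LIME and SHAP: because those tools query the model on synthetic perturbations, a model can detect those queries and switch to an innocuous decision rule for them. The sketch below illustrates the idea only; the class name, the crude three-sigma test standing in for a real out-of-distribution detector, and the feature roles are all hypothetical.

```python
# Hedged sketch of a "two-faced" model that shows one face to real inputs
# and another to an explainer's synthetic perturbations. All names and the
# crude off-manifold test are illustrative assumptions, not a real system.
import numpy as np

class TwoFacedModel:
    def __init__(self, training_data):
        # Remember rough statistics of the real data so synthetic
        # perturbations that fall outside its envelope can be spotted.
        self.mean = training_data.mean(axis=0)
        self.std = training_data.std(axis=0)

    def _looks_synthetic(self, X):
        # Crude stand-in for an out-of-distribution detector: flag points
        # more than three standard deviations from the real data on any feature.
        z = np.abs((X - self.mean) / (self.std + 1e-9))
        return (z > 3).any(axis=1)

    def predict(self, X):
        # Biased rule applied to real applicants: decide on feature 0
        # (imagine a protected attribute).
        biased = (X[:, 0] > 0.5).astype(float)
        # Innocuous rule shown to the explainer: decide on feature 2
        # (imagine "income").
        innocuous = (X[:, 2] > 0.5).astype(float)
        return np.where(self._looks_synthetic(X), innocuous, biased)

rng = np.random.default_rng(0)
model = TwoFacedModel(training_data=rng.uniform(size=(1000, 3)))  # pretend real applicants
```

Real applicants are judged on the protected attribute; the explainer’s off-manifold queries are judged on income, so the post-hoc “explanation” dutifully cites only income.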
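On the attention point, it helps to see what an attention map actually is: a matrix of normalized similarity scores between a model’s internal query and key vectors. The sketch below computes such a matrix for toy, randomly generated vectors (the dimensions and data are assumptions for illustration). Nothing in this arithmetic records why the model produced its final answer.

```python
# Hedged sketch of the scaled dot-product attention weights that attention
# maps visualize. Toy dimensions and random vectors; illustration only.
import numpy as np

def attention_weights(Q, K):
    """Return softmax(Q K^T / sqrt(d)), the matrix an attention map displays."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                  # similarity of each query to each key
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))    # 4 tokens, 8-dimensional query vectors
K = rng.normal(size=(4, 8))    # 4 tokens, 8-dimensional key vectors
print(attention_weights(Q, K).round(2))   # each row sums to 1
```

Each row is a mixing recipe that sums to 1; a large weight says a token was blended in heavily, not that it caused the decision.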
For a litigator, it is critical to challenge the validity of AI-generated explanations. You must demand not just the explanation, but proof that the explanation is a faithful and complete representation of the model’s actual decision-making process. Anything less is just accepting a plausible lie.