Counterfactuability
Contextuality: III
Desiderata: Fidelity
Explanation Type: WBS
References: Pornprasit et al. (2021)
To assess the expressiveness and impact of rule-based explanations (either generated directly or extracted from surrogate models such as trees), we can use them to guide the generation of counterfactual perturbations. This tests whether the rules actually affect the model's predictions and reflect its real decision logic, rather than being purely descriptive or spurious.
Specifically, given an input instance that is covered by a rule, [Pornprasit et al. (2021)] perturb the input such that the rule conditions are violated. If the rule truly captures the rationale behind the prediction, breaking it should change the black-box model's output. This can be quantified in two ways:
• By counting the number of perturbed instances that change the predicted class label.
• By measuring the aggregate change in predicted class probability before and after perturbation.
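The two quantities above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from Pornprasit et al. (2021): the rule format (a list of `(feature_index, op, threshold)` conditions), the perturbation strategy (pushing one feature at a time just across its threshold by a fixed `margin`), the use of the mean as the aggregate, and the toy `black_box` function standing in for a real model.

```python
import math

# Hypothetical rule format: list of (feature_index, op, threshold), op in {"<=", ">"}.
def covers(rule, x):
    """True if instance x satisfies every condition of the rule."""
    return all((x[i] <= t) if op == "<=" else (x[i] > t) for i, op, t in rule)

def violate(rule, x, margin=0.1):
    """Return one perturbed copy of x per condition, each breaking that condition.

    Assumed strategy: move the feature just past the threshold by `margin`.
    """
    perturbed = []
    for i, op, t in rule:
        z = list(x)
        z[i] = t + margin if op == "<=" else t - margin
        perturbed.append(z)
    return perturbed

def counterfactuability(predict_proba, rule, x, margin=0.1):
    """Return (number of label flips, mean drop in original-class probability)."""
    if not covers(rule, x):
        raise ValueError("the rule does not cover this instance")
    p0 = predict_proba(x)
    c0 = max(range(len(p0)), key=p0.__getitem__)  # originally predicted class
    variants = violate(rule, x, margin)
    flips, total_drop = 0, 0.0
    for z in variants:
        p = predict_proba(z)
        c = max(range(len(p)), key=p.__getitem__)
        flips += int(c != c0)
        total_drop += p0[c0] - p[c0]
    return flips, total_drop / len(variants)

# Toy stand-in for a black-box model: only feature 0 matters.
def black_box(x):
    p1 = 1.0 / (1.0 + math.exp(-4.0 * (x[0] - 0.5)))
    return [1.0 - p1, p1]

rule = [(0, ">", 0.5), (1, "<=", 2.0)]  # "x0 > 0.5 AND x1 <= 2.0"
instance = [1.0, 1.0]
flips, mean_drop = counterfactuability(black_box, rule, instance)
print(flips, round(mean_drop, 3))
```

In this toy setup only the condition on feature 0 has any effect on the model, so breaking it flips the label while breaking the condition on feature 1 changes nothing. A rule whose violations rarely change the prediction would score low on both quantities, suggesting it is descriptive rather than decision-relevant.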