Counterfactuability - Research Metrics Explorer


Counterfactuability

Contextuality: III

Desiderata: Fidelity

Explanation Type: WBS

References: Pornprasit et al. (2021)


To assess the expressiveness and impact of rule-based explanations (either generated directly or extracted from surrogate models such as trees), we can use them to guide the generation of counterfactual perturbations. This evaluates whether the rules have predictive leverage and reflect real decision logic, rather than being purely descriptive or spurious.
Specifically, given an input instance that is covered by a rule, Pornprasit et al. (2021) perturb the input so that the rule's conditions are violated. If the rule truly captures the rationale behind the prediction, breaking it should change the black-box model's output for that instance. This can be quantified in two ways:
• By counting the number of perturbed instances that change the predicted class label.
• By measuring the aggregate change in predicted class probability before and after perturbation.
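
The two quantities above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of Pornprasit et al. (2021): the rule format (a list of `(feature_index, op, threshold)` conditions), the helper names `violate_rule` and `counterfactuability`, the `margin` parameter, the one-perturbation-per-condition strategy, and the toy `predict_proba` model are all hypothetical. The "aggregate change" is taken here to be the mean drop in the probability of the originally predicted class.

```python
import numpy as np

def violate_rule(x, rule, margin=1.0):
    """Return one perturbed copy of x per rule condition, each breaking that condition.

    rule: list of (feature_index, op, threshold) with op in {"<=", ">"} (assumed format).
    """
    perturbed = []
    for idx, op, thr in rule:
        x_new = x.copy()
        # Move the feature just across the threshold so the condition no longer holds.
        x_new[idx] = thr + margin if op == "<=" else thr - margin
        perturbed.append(x_new)
    return np.array(perturbed)

def counterfactuability(predict_proba, x, rule, margin=1.0):
    """Count label flips and measure the mean probability drop after breaking the rule."""
    p_orig = predict_proba(x[None, :])[0]
    label = int(np.argmax(p_orig))
    p_pert = predict_proba(violate_rule(x, rule, margin))
    # Number of perturbed instances whose predicted class differs from the original.
    flips = int(np.sum(np.argmax(p_pert, axis=1) != label))
    # Mean decrease in the probability of the originally predicted class.
    mean_drop = float(np.mean(p_orig[label] - p_pert[:, label]))
    return flips, mean_drop

def predict_proba(X):
    """Toy black box (assumption): class 1 probability depends only on feature 0."""
    p1 = 1.0 / (1.0 + np.exp(-(X[:, 0] - 5.0)))
    return np.column_stack([1.0 - p1, p1])

x = np.array([7.0, 1.0])
rule = [(0, ">", 5.0), (1, "<=", 2.0)]  # x satisfies both conditions
flips, mean_drop = counterfactuability(predict_proba, x, rule)
print(flips, round(mean_drop, 3))
```

In this toy setup only the condition on feature 0 matters to the model, so breaking it flips the label while breaking the condition on feature 1 leaves the prediction unchanged. A rule whose conditions never move the prediction when violated would score zero on both measures, suggesting it is descriptive rather than decision-relevant.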

Guided Perturbation Fidelity

Prediction Neighborhood Continuity