References:
Yeh et al. (2020), Dasgupta et al. (2022)
An explanation should be complete enough to serve as a sufficient reason for a given model output. Ideally, the explanans alone should allow us to predict the outcome. This property reflects the completeness of the explanans and can be assessed in different ways.
[Yeh et al. (2020)] train a secondary model that maps the explanans back to the black-box's activation space. The predictive performance using the mapped explanans indicates how much information the explanation preserves about the original model's decision process.
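A rough sketch of this idea follows, assuming explanantia and activations are available as fixed-length vectors. The names (`E`, `A`, `predict_from_activations`) and the choice of a ridge regressor as the secondary model are illustrative assumptions, not details of Yeh et al.'s actual setup:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

def completeness_score(E, A, predict_from_activations):
    """Train a secondary model mapping explanans vectors back to the
    black-box's activations, then measure how often predictions made
    from the reconstructed activations agree with those made from the
    true activations.

    E: (n, d_expl) array of explanans vectors, one per instance (assumed)
    A: (n, d_act) array of the black-box's hidden activations (assumed)
    predict_from_activations: the black-box's head, activations -> labels (assumed)
    """
    E_tr, E_te, A_tr, A_te = train_test_split(E, A, test_size=0.3, random_state=0)
    mapper = Ridge().fit(E_tr, A_tr)            # secondary model: explanans -> activations
    A_hat = mapper.predict(E_te)                # reconstructed activations
    y_true = predict_from_activations(A_te)     # outputs from the real activations
    y_hat = predict_from_activations(A_hat)     # outputs from the reconstruction
    return float(np.mean(y_true == y_hat))      # agreement = information preserved
```

The higher the agreement, the more of the decision-relevant information the explanans retains; a richer secondary model could of course replace the linear one.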
Alternatively, [Dasgupta et al. (2022)] evaluate whether explanations are sufficient for consistent outcomes: Given an explanans $e_x$ for an input $x$, we identify other instances whose explanantia are equivalent (or sufficiently similar) to $e_x$ and compute the fraction that shares the same model prediction as $x$. A higher agreement indicates a more sufficient explanation.
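A minimal sketch of this agreement measure, assuming explanantia are given as rows of a matrix `E`, model predictions as `y_pred`, and a distance threshold `tol` standing in for "equivalent or sufficiently similar" (all hypothetical names, not Dasgupta et al.'s interface):

```python
import numpy as np

def sufficiency(E, y_pred, i, tol=1e-6):
    """Fraction of instances whose explanans is (near-)equivalent to E[i]
    and which receive the same model prediction as instance i.

    E: (n, d_expl) array of explanans vectors (assumed representation)
    y_pred: (n,) array of the model's predictions
    i: index of the query instance
    """
    dists = np.linalg.norm(E - E[i], axis=1)   # distance between explanantia
    mask = dists <= tol                        # "equivalent or sufficiently similar"
    mask[i] = False                            # exclude the query instance itself
    if not mask.any():
        return float("nan")                    # no comparable instances found
    return float(np.mean(y_pred[mask] == y_pred[i]))
```

Averaging this score over all instances with at least one comparable neighbor yields a dataset-level sufficiency estimate.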

