Output Similarity
Contextuality: II
Desiderata: Plausibility
Explanation Type: ExE
References: Plumb et al. (2020)
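To ensure plausibility, counterfactuals should yield an output distribution similar to that of real instances of the target class. [Plumb et al. (2020)] evaluate whether each counterfactual $z$ matches the output activation of at least one training sample of its target class, i.e.:

$$\exists x' \in \mathcal{X}_{y^*_z} ~:~ \delta\big(\theta(x'), \theta(z)\big) < \epsilon$$

Here $\mathcal{X}_{y^*_z}$ is the set of training samples belonging to the target class $y^*_z$ of $z$, $\theta(\cdot)$ is the model output, $\delta$ is a distance between outputs, and $\epsilon$ is a tolerance threshold.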
This confirms that the counterfactual aligns with typical model behavior for that class.
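The check above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from Plumb et al. (2020): the function names are hypothetical, the model outputs $\theta(\cdot)$ are assumed to be precomputed as arrays, $\delta$ is assumed to be the Euclidean distance, and class membership is assumed to be given by a label array.

```python
import numpy as np


def has_close_real_output(cf_output, class_outputs, eps):
    """Return True if at least one real sample's output lies within eps
    of the counterfactual's output (delta assumed to be Euclidean)."""
    cf_output = np.asarray(cf_output, dtype=float)
    class_outputs = np.asarray(class_outputs, dtype=float)
    dists = np.linalg.norm(class_outputs - cf_output, axis=1)
    return bool(np.any(dists < eps))


def output_similarity_rate(cf_outputs, cf_targets, train_outputs, train_labels, eps):
    """Fraction of counterfactuals whose output matches some training
    sample of their target class (hypothetical aggregate score)."""
    train_outputs = np.asarray(train_outputs, dtype=float)
    train_labels = np.asarray(train_labels)
    hits = []
    for out, target in zip(cf_outputs, cf_targets):
        # Restrict the search to training samples of the target class.
        same_class = train_outputs[train_labels == target]
        hits.append(len(same_class) > 0 and has_close_real_output(out, same_class, eps))
    return float(np.mean(hits)) if hits else 0.0


# Toy usage: two training samples, two counterfactuals.
train_outputs = [[0.9, 0.1], [0.2, 0.8]]
train_labels = [0, 1]
cf_outputs = [[0.85, 0.15], [0.5, 0.5]]
cf_targets = [0, 1]
rate = output_similarity_rate(cf_outputs, cf_targets, train_outputs, train_labels, eps=0.1)
```

In this toy example only the first counterfactual has a target-class training output within the tolerance. The choice of $\delta$ and $\epsilon$ strongly affects the result, so both should be reported alongside any score.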