References:
Plumb et al. (2020)
To ensure plausibility, a counterfactual should yield an output distribution similar to that of real instances of the target class. [Plumb et al. (2020)] evaluate whether each counterfactual matches the output activation of at least one training sample.
This confirms that the counterfactual aligns with typical model behavior for that class.
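A minimal sketch of such a check, under stated assumptions: the function name `matches_some_training_output`, the tolerance `tol`, and the choice of maximum absolute difference as the matching criterion are all hypothetical and are not taken from Plumb et al. (2020). The inputs are assumed to be model output vectors (for example, class probabilities) that have already been computed.

```python
def matches_some_training_output(cf_outputs, train_outputs, tol=1e-2):
    """For each counterfactual output vector, report whether it lies within
    `tol` (max absolute difference) of at least one training output vector.

    The matching criterion and `tol` are illustrative assumptions.
    """
    flags = []
    for cf in cf_outputs:
        matched = any(
            # Largest per-component gap between the two output vectors.
            max(abs(c - t) for c, t in zip(cf, tr)) <= tol
            for tr in train_outputs
        )
        flags.append(matched)
    return flags


# Toy outputs for a two-class model (made-up numbers for illustration).
train_outputs = [[0.9, 0.1], [0.2, 0.8]]
cf_outputs = [[0.895, 0.105], [0.5, 0.5]]

flags = matches_some_training_output(cf_outputs, train_outputs)
print(flags)
```

Here the first counterfactual lies close to a training output and passes the check, while the second has no nearby training output and would be flagged as implausible. A stricter variant might compare only against training samples of the target class.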

