How Should AI Apologize?
An apology that deflects blame repairs trust better than one that accepts it.
A new study in AI & Society (Turel and Cui, 2026) asks a critical question: can AI systems repair trust with users by apologizing after an error? This relates to broader issues of explanation in AI systems (i.e., explaining why an error was made) and responsibility attribution (i.e., how the system attributes the cause of the error in its apology).
Apologies are statements that typically express regret and sometimes include an attribution that explains why an undesired outcome came about. Their social utility lies in acknowledging blame and communicating a willingness to improve in the future, or in explaining why the apologizer couldn’t control the outcome. Three broad apology types are studied in the literature: (1) the basic “we’re sorry,” which doesn’t identify the source of the problem; (2) internal blame, which acknowledges the role of the apologizer; and (3) external blame, which attributes the error to factors outside the apologizer’s control.
Through a series of experiments with human respondents, Turel and Cui find that some types of apologies, for some tasks, do repair reliance on the AI system they study. Simple apologies didn’t work at all. But apologies that blamed external factors (i.e., that the data fed into the system was insufficient) repaired reliance better than apologies that attributed the error internally to the system’s own limitations (i.e., that the AI tool lacked the capability to do the task). This held for what the authors call “objective” tasks, in this case estimating a person’s weight from an image.
What I find troubling about the findings is that deflecting blame worked better than acknowledging it. The risk for accountability is that people subject to AI errors might inappropriately brush an error aside rather than contesting it and asking for a deeper explanation. Perhaps AI system apologies should instead identify the human actors behind the system who are offering the apology. The findings also remind us of the importance of faithfulness in AI explanations. Given the issues with sycophancy in LLMs, it’s not hard to imagine models learning to deflect blame in their “apologies” by offering not-so-faithful accounts that attribute an error to some external factor beyond the system. While there are certainly legitimate situations where an explanation citing external factors is warranted, policy needs to grapple with the validity of explanations rendered directly by AI systems.
References
Turel O and Cui T (2026) Apologizing artificial intelligence: designing and evaluating effective AI apologies after errors. AI & SOCIETY: 1–18.
