AI Ethics Principles and Accountability
To move from high-level values to effective accountability, we still need to bridge the gap between abstractions and quantifiable, data-driven metrics.
Establishing norms and behavioral standards for AI systems is central to the AI Accountability Problem. Over the years, private companies, government agencies, non-profits, and other organizations have put forth a number of AI ethics principles to serve this purpose. A principle acts as a behavioral guideline—essentially a value defining what is “good” or “desirable” (van de Poel, 2020). In assessing AI behavior, such principles help define what is (in)appropriate and thus what behavior might call for accountability, either retrospectively for some observed AI failure, or prospectively towards preventing undesirable outcomes.
Early analyses of the numerous published AI guidelines have identified a few core principles. These include privacy, fairness/justice, accountability/responsibility/explicability, transparency, beneficence, non-maleficence/safety, and human autonomy (Jobin et al, 2019; Hagendorff, 2020; Floridi et al, 2018). Although the underlying ideas are mostly stable, the exact terminology varies across documents, which leads to a lack of clarity (Morley et al, 2021). A longer tail of principles includes ideas like trust, sustainability, dignity, and solidarity (Jobin et al, 2019).
Principles can come from different sources and so be biased in different ways, for instance towards ideas from dominant geographies or from power holders such as experts or companies (Hickok, 2021). They can come from researchers and experts in the field (Floridi et al, 2018), from professional codes of conduct in domains of practice (Diakopoulos et al, 2024), or from broad consensus documents like the UN's Universal Declaration of Human Rights (Latonero, 2018), and they can be further informed by public evaluations (Kieslich et al, 2024). What is the most legitimate source of principles for AI accountability? While a treaty like the Framework Convention on Artificial Intelligence has reached broad consensus, large swaths of the world still haven't signed on. Achieving truly global principles will require ongoing political work.
Besides their potential to reflect biases, AI principles are also hard to implement in practice. Big abstractions need to be translated into concrete operationalizations (Hagendorff, 2020) if they are going to be used to measure AI system failures or to guide AI system design to support prevention. Moreover, abstractions like fairness can hide contested ideas with conflicting perspectives (Mittelstadt, 2019): statistical definitions such as demographic parity and equalized odds, for example, often cannot both be satisfied, so choosing among them is itself a value judgment. This underlines the need to consider context-specific tradeoffs.
Prem (2023) analyzed more than 100 approaches from the literature for bridging the gap between principles and implementation, including AI ethics criteria and checklists, metrics, process models, and codes of practice. He distinguishes approaches used during the design of a system (ex-ante) from those applied to an AI system after development, or perhaps iteratively during development (ex-post). Ex-ante methods are relevant to prospective accountability, whereas ex-post methods are geared towards retrospective accountability (and also prospective accountability if used iteratively during development). He notes that “Generally, there is a strong focus on those aspects for which technical solutions can be built,” exposing a further bias in the research on this topic.
While designers and developers can adopt approaches to help prevent negative outcomes, AI system behavior itself should also be measured to assess adherence to principles. The idea of Ethics-Based Auditing (EBA) applies the logic of auditing to the challenge of assessing system behavior “for consistency with relevant principles or norms” (Mökander et al, 2021). This gets at a core issue: operationalizing principles into metrics that can evaluate (mis)alignment with a value. Principles only set the direction; effective accountability requires quantifiable performance metrics, which in turn requires supporting data access to inform those measurements.
Rismani and colleagues (2025) reviewed hundreds of these measures in the literature as they relate to different system components, hazards, harms, and principles. Ninety percent of the measures they found related to just four principles: fairness, transparency, privacy, and trust. To be useful for accountability, a metric needs a defined threshold indicating that the principle has been violated and the system may create a hazard, and therefore that a call for accountability is warranted. Thresholds may be context-dependent, vary by domain, and depend on the risk tolerance of different stakeholders, yet they are rarely discussed in the literature (Rismani et al, 2025). This returns us to the normative question: How do you define an acceptable vs. unacceptable level of a measure of a principle? At what level might reasonable people agree there should be accountability? Public perceptions of acceptability may play a role here.
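To make the move from principle to metric to threshold concrete, the sketch below shows one way an ex-post audit check might be structured. It is a minimal illustration under stated assumptions, not a method from the cited works: the choice of demographic parity difference as the fairness metric, the 0.10 threshold, and all names and data fields are hypothetical.

```python
# Illustrative sketch: operationalizing one principle (fairness) as a
# quantifiable metric with an explicit, context-dependent threshold.
# The metric choice, the 0.10 threshold, and all field names are assumptions.

from dataclasses import dataclass

@dataclass
class Decision:
    group: str        # protected attribute value, e.g. "A" or "B"
    approved: bool    # the AI system's binary decision

def demographic_parity_difference(decisions: list[Decision]) -> float:
    """Absolute difference in approval rates between the two groups."""
    rates = {}
    for g in ("A", "B"):
        in_group = [d for d in decisions if d.group == g]
        rates[g] = sum(d.approved for d in in_group) / len(in_group)
    return abs(rates["A"] - rates["B"])

def audit_fairness(decisions: list[Decision], threshold: float = 0.10) -> dict:
    """Ex-post check: does the measured disparity exceed the agreed threshold?

    The threshold is a normative choice set per domain and per stakeholder
    risk tolerance, not a technical constant.
    """
    disparity = demographic_parity_difference(decisions)
    return {
        "metric": "demographic_parity_difference",
        "value": round(disparity, 3),
        "threshold": threshold,
        "violation": disparity > threshold,  # would trigger a call for accountability
    }

if __name__ == "__main__":
    # Toy decision log standing in for the data access an auditor would need.
    log = [Decision("A", True)] * 70 + [Decision("A", False)] * 30 \
        + [Decision("B", True)] * 50 + [Decision("B", False)] * 50
    print(audit_fairness(log))  # disparity of 0.2 exceeds 0.10 -> violation: True
```

Even in this stripped-down form, the technical part is the easy part; the threshold and the choice of metric embody exactly the normative, context-specific judgments discussed above, and the decision log stands in for the data access an accountability forum would need.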
Principles serve as orienting ideas for what is valued. They can be used to determine what constitutes inappropriate behavior, necessitating accountability either retrospectively (blame for failure) or prospectively (prevention of harm). Bringing them into formal accountability forums (e.g. administrative, legal) hinges on mitigating biases in their enumeration and reaching a high degree of consensus. But implementing them in practice remains a challenge. They need to be translated into practices that designers and developers can use to mitigate the hazards created by an AI system, or to metrics with clear thresholds that can measure AI system behavior for signs of deviation. Policy should support the development of context- and domain-specific operationalizations of metrics and thresholds that are indicative of violations of principles by AI systems, as well as the data access provisions that would enable those measurements by the relevant accountability forums.
References
Diakopoulos N, Trattner C, Jannach D, et al. (2024) Leveraging Professional Ethics for Responsible AI. Communications of the ACM.
Floridi L, Cowls J, Beltrametti M, et al. (2018) AI4People—An Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations. Minds and Machines 28(4): 689–707.
Hagendorff T (2020) The Ethics of AI Ethics: An Evaluation of Guidelines. Minds and Machines 30(1): 99–120.
Hickok M (2021) Lessons learned from AI ethics principles for future actions. AI and Ethics 1(1): 41–47.
Jobin A, Ienca M and Vayena E (2019) The global landscape of AI ethics guidelines. Nature Machine Intelligence 1(9): 389–399.
Kieslich K, Helberger N and Diakopoulos N (2024) My Future with My Chatbot: A Scenario-Driven, User-Centric Approach to Anticipating AI Impacts. Conference on Fairness, Accountability, and Transparency: 2071–2085.
Latonero M (2018) Governing Artificial Intelligence: Upholding Human Rights & Dignity. Data & Society. https://datasociety.net/library/governing-artificial-intelligence/
Mittelstadt B (2019) Principles alone cannot guarantee ethical AI. Nature Machine Intelligence 1(11): 501–507.
Morley J, Kinsey L, Elhalal A, et al. (2021) Operationalising AI ethics: barriers, enablers and next steps. AI & SOCIETY 38(1): 411–423.
Mökander J, Morley J, Taddeo M, et al. (2021) Ethics-Based Auditing of Automated Decision-Making Systems: Nature, Scope, and Limitations. Science and Engineering Ethics 27(4): 44.
Poel I van de (2020) Embedding Values in Artificial Intelligence (AI) Systems. Minds and Machines 30(3): 385–409.
Prem E (2023) From ethical AI frameworks to tools: a review of approaches. AI and Ethics 3(3): 699–716.
Rismani S, Shelby R, Davis L, et al. (2025) Measuring What Matters: Connecting AI Ethics Evaluations to System Attributes, Hazards, and Harms. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society 8(3): 2199–2213.
