Abstract
Background: In computer science research, "value alignment" commonly refers to the process of aligning the behaviour of artificial intelligence systems with human desires, but the phrase is often used imprecisely.
Objectives: In this paper, we conduct a systematic literature review to advance the understanding of value alignment in artificial intelligence by characterising the topic in the context of its research literature. We use this to suggest a more precise definition of the term.
Methods: We analyse the abstracts, introductions and conclusions of 172 value alignment research articles published in recent years and synthesise their content using thematic analysis. From these 172 papers, we select 85 using structured criteria for deeper analysis, coding these papers in full.
Results: Our analysis leads to six themes: value alignment drivers & approaches; challenges in value alignment; values in value alignment; cognitive processes in humans and AI; human-agent teaming; and designing and developing value-aligned systems.
Conclusions: By analysing these themes in the context of the literature, we define value alignment as an ongoing process between humans and autonomous agents that aims to express and implement abstract values in diverse contexts, while managing the cognitive limits of both humans and AI agents and balancing the conflicting ethical and political demands that the values generate in different groups. Our analysis gives rise to a set of research challenges and opportunities in the field of value alignment for future work.
| Original language | English |
|---|---|
| Number of pages | 41 |
| Journal | Journal of Artificial Intelligence Research |
| DOIs | |
| Publication status | Acceptance date - 20 Nov 2025 |
Data Availability Statement
The data used in this paper is publicly available at https://github.com/JamMack/Understanding-Value-Alignment-as-a-Process-a-Survey?tab=readme-ov-file.

Keywords
- Value alignment
- artificial intelligence
- Human-AI Interaction
- systematic literature review