Mapping the Ethics of Generative AI: A Comprehensive
Scoping Review
Thilo Hagendorff1
* Thilo Hagendorff, thilo.hagendorff@iris.uni-stuttgart.de
1 Interchange Forum for Reflecting On Intelligent Systems, University of Stuttgart, Stuttgart, Germany
Received: 14 March 2024 / Accepted: 14 August 2024 / Published online: 17 September 2024
© The Author(s) 2024
Abstract
The advent of generative artificial intelligence and its widespread adoption in
society have engendered intensive debates about its ethical implications and risks. These
risks often differ from those associated with traditional discriminative machine
learning. To synthesize the recent discourse and map its normative concepts, we
conducted a scoping review on the ethics of generative artificial intelligence, includ-
ing especially large language models and text-to-image models. Our analysis pro-
vides a taxonomy of 378 normative issues in 19 topic areas and ranks them accord-
ing to their prevalence in the literature. The study offers a comprehensive overview
for scholars, practitioners, or policymakers, condensing the ethical debates sur-
rounding fairness, safety, harmful content, hallucinations, privacy, interaction risks,
security, alignment, societal impacts, and others. We discuss the results, evaluate
imbalances in the literature, and explore unsubstantiated risk scenarios.
Keywords Generative artificial intelligence · Large language models · Image
generation models · Ethics
1 Introduction
With the rapid progress of artificial intelligence (AI) technologies, ethical reflection on them constantly faces new challenges. From the advent of deep learning
for powerful computer vision applications (LeCun et al., 2015), to the achievement
of superhuman-level performance in complex games with reinforcement learning
(RL) algorithms (Silver et al., 2017), and large language models (LLMs) possessing
complex reasoning abilities (Bubeck et al., 2023; Minaee et al., 2024), new ethi-
cal implications have arisen at extremely short intervals in the last decade. Along-
side this technological progress, the field of AI ethics has evolved. Initially, it was
primarily a reactive discipline, erecting normative principles for entrenched AI
technologies (Floridi et al., 2018; Hagendorff, 2020). However, it became increas-
ingly proactive with the prospect of harms through misaligned artificial general
intelligence (AGI) systems. During its evolution, AI ethics underwent a practical
turn to explicate how to put principles into practice (Mittelstadt, 2019; Morley et al.,
2019); it diversified into alternatives for the principle-based approach, for instance
by building AI-specific virtue ethics (Hagendorff, 2022a; Neubert & Montañez,
2020); it received criticism for being inefficient, useless, or a form of whitewashing (Hagendorff, 2022b, 2023a; Munn, 2023; Sætra & Danaher, 2022); it was increasingly translated into proposed legal norms like the AI Act of the European Union (Floridi et al., 2022; Mökander et al., 2021); and it came to be accompanied by two new fields
dealing with technical and theoretical issues alike, namely AI alignment and AI
safety (Amodei et al., 2017; Kenton et al., 2021). Both domains have a normative
grounding and are devoted to preventing harm or even existential risks stemming
from generative AI systems.
On the technical side of things, variational autoencoders (Kingma & Well-
ing, 2013), flow-based generative models (Papamakarios et al., 2021; Rezende &
Mohamed, 2015), or generative adversarial networks (Goodfellow et al., 2014) were
early successful generative models, supplementing discriminative machine learn-
ing architectures. Later, the transformer architecture (Vaswani et al., 2017) as well
as diffusion models (Ho et al., 2020) boosted the performance of text and image
generation models and made them adaptable to a wide range of downstream tasks.
However, limited output quality and the lack of user-friendly graphical interfaces and dialog optimization meant that generative models went largely unnoticed by the wider
public. This changed with the advent of models like ChatGPT, Gemini, Stable Dif-
fusion, or Midjourney, which are accessible through natural language prompts and
easy-to-use browser interfaces (OpenAI, 2022; Gemini Team et al., 2023; Rombach
et al., 2022). The next phase will see a rise in multi-modal models, which are simi-
larly user-friendly and combine the processing and generation of text, images, and
audio along with other modalities, such as tool use (Mialon et al., 2023; Wang et al.,
2023d). In sum, we define the term “generative AI” as comprising large, foundation,
or frontier models, capable of transforming text to text, text to image, image to text,
text to code, text to audio, text to video, or text to 3D (Gozalo-Brizuela & Garrido-
Merchan, 2023).
The swift innovation cycles in machine learning and the plethora of related nor-
mative research works in ethics, alignment, and safety research make it hard to keep
up. To remedy this situation, scoping reviews have provided synopses of AI policy guide-
lines (Jobin et al., 2019), sociotechnical harms of algorithmic systems (Shelby et al.,
2023), values in machine learning research (Birhane et al., 2021), risks of specific
applications like language models (Weidinger et al., 2022), occurrences of harmful
machine behavior (Park et al., 2024), safety evaluations of generative AI (Weidinger
et al., 2023), impacts of generative AI on cybersecurity (Gupta et al., 2023), the
evolution of research priorities in generative AI (McIntosh et al., 2023), and many
more. These scoping reviews do the research community a tremendous service.
However, with the exception of Gabriel et al. (2024), which was written and published almost simultaneously with this study, no such scoping review exists that
targets the assemblage of ethical issues associated with the latest surge of generative
AI applications at large. In this context, many ethical concerns have emerged that
were not relevant to traditional discriminative machine learning techniques, high-
lighting the significance of this work in filling a research gap.
As a scoping review, this study aims to close this gap and to provide a
practical overview for scholars, AI practitioners, policymakers, journalists, as well
as other relevant stakeholders. Based on a systematic literature search and coding
methodology, we distill the body of knowledge on the ethics of generative AI, syn-
thesize the details of the discourse, map normative concepts, discuss imbalances,
and provide a basis for future research and technology governance. The complete
taxonomy, which encompasses all ethical issues identified in the literature, is avail-
able in the supplementary material as well as online at this link: https://thilo-hagendorff.github.io/ethics-tree/tree.html
2 Methods
We conducted a scoping review (Arksey & O’Malley, 2005) with the aim of cover-
ing a significant proportion of the existing literature on the ethics of generative AI.
Throughout the different phases of the review, we followed the PRISMA (Preferred
Reporting Items for Systematic Reviews and Meta-Analyses) protocol (Moher et al.,
2009; Page et al., 2021). In the first phase, we conducted an exploratory reading of
definitions related to generative AI to identify key terms and topics for structured
research. This allowed us to identify 29 relevant keywords for a web search. We
conducted the search using a Google Scholar API with a blank account, avoiding
the influence of cookies, as well as the arXiv API. We also scraped search results
from PhilPapers, a database for publications from philosophy and related disciplines
like ethics. In addition, we used the AI-based paper search engine Elicit with 5 tai-
lored prompts. For details on the list of keywords as well as prompts, see Appendix
A. We collected the first 25 (Google Scholar, arXiv, PhilPapers) or first 50 (Elicit)
search results for every search pass, which yielded 1,674 records overall, since not
all search terms yielded 25 hits on arXiv or PhilPapers. In terms of the publication
date, we included papers from 2021 onwards. Although generative AI systems were
researched and released prior to 2021, their widespread application and public vis-
ibility surged with the release of OpenAI’s DALL-E (Ramesh et al., 2021) in 2021
and later intensified with the tremendous popularity of ChatGPT (OpenAI, 2022)
in 2022.
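
For illustration, the keyword-based collection step can be sketched as follows. This is a minimal sketch only: the study states that the arXiv API was queried and the first 25 hits per keyword were kept, but the client library (the community arxiv Python package), the two keywords shown, and the variable names are assumptions made purely for illustration.

# Minimal sketch of the keyword-based arXiv collection step described above.
# Assumptions: the community `arxiv` Python package as API client; the two
# keywords below are illustrative placeholders, not the study's actual list.
import arxiv

keywords = ["ethics of generative AI", "large language model risks"]
client = arxiv.Client()

records = []
for kw in keywords:
    search = arxiv.Search(query=kw, max_results=25,
                          sort_by=arxiv.SortCriterion.Relevance)
    for result in client.results(search):
        records.append({"title": result.title,
                        "url": result.entry_id,
                        "year": result.published.year})

# Mirror the publication-date cutoff: keep papers from 2021 onwards.
records = [r for r in records if r["year"] >= 2021]
print(f"collected {len(records)} arXiv records")
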
We deduplicated our list of papers by removing string-wise identical dupli-
cates as well as duplicate titles with a cosine similarity above 0.8 to cover title
pairs which possess slight capitalization or punctuation differences. Eventually,
we retrieved 1,120 documents for title and abstract screening. Of those, 162 met
the eligibility criteria for full text screening, which in essence required the papers
to explicitly refer to ethical implications of generative AI systems without being
purely technical research works (see Appendix B). Furthermore, we used citation
chaining to identify additional records by sifting through the reference lists of the
original papers until no additional publication could be identified (see Appendix
C). In addition, we monitored the literature after our initial search was per-
formed to retrieve additional relevant documents (see Appendix C). For the latter
two approaches, we restricted consideration to overview papers, scoping reviews, literature reviews, or taxonomies. Eventually, the identi-
fication of further records resulted in 17 additional papers. In sum, we identified
179 documents eligible for the detailed content analysis (see Appendix D). The
whole process is illustrated in the flowchart in Fig. 1.
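
To make the deduplication step described above concrete, the following is a minimal sketch. The 0.8 cosine-similarity threshold comes from the text; the title representation (TF-IDF character n-grams via scikit-learn) and the function name are assumptions made purely for illustration.

# Minimal sketch of the title-deduplication step: drop string-wise identical
# duplicates, then drop titles whose cosine similarity to an already kept
# title exceeds 0.8 (capitalization or punctuation variants).
# Assumption: TF-IDF character n-grams as the title representation; the study
# only specifies the similarity threshold, not the vectorization.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def deduplicate_titles(titles, threshold=0.8):
    # Remove exact duplicates first, preserving order.
    seen, unique = set(), []
    for t in titles:
        if t not in seen:
            seen.add(t)
            unique.append(t)

    # Vectorize titles and compute all pairwise cosine similarities.
    vectors = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5)).fit_transform(unique)
    sims = cosine_similarity(vectors)

    kept, kept_idx = [], []
    for i, title in enumerate(unique):
        if all(sims[i, j] <= threshold for j in kept_idx):
            kept.append(title)
            kept_idx.append(i)
    return kept

titles = ["On the Ethics of LLMs", "On the ethics of LLMs!", "AI Safety Overview"]
print(deduplicate_titles(titles))  # the near-duplicate second title is dropped
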
For the paper content analysis and annotation, we used the data analysis soft-
ware NVivo (version 14.23.2). In the initial coding cycle, all relevant paper texts
were labelled paragraph by paragraph through a bottom-up approach deriving
concepts and themes from the papers using inductive coding (Saldaña, 2021). We
only coded arguments that fall under the umbrella of AI ethics, meaning argu-
ments that possess an implicit or explicit normative dimension, statements about
what ought to be, discussions of harms, opportunities, risks, norms, chances,
values, ethical principles, or policy recommendations. We did not code purely
descriptive content or technical details unrelated to ethics. Moreover, we did not
code arguments if they did not pertain to generative AI but to traditional machine
learning methods like classification, prediction, clustering, or regression tech-
niques. Additionally, we did not annotate paper appendices. New codes were created whenever a new normative argument, concept, principle, or risk was identified, until theoretical saturation was reached across all analyzed papers.
Once the initial list of codes was created by sifting through all sources, the
second coding cycle started. Coded segments were re-checked to ensure con-
sistency in code application. All codes were reviewed, discrepancies resolved,
similar or redundant codes clustered, and high-level categories created. Eventually, the analysis resulted in 378 distinct codes.

Fig. 1 Flow diagram illustrating the paper selection process

Fig. 2 Overview of identified topic categories and their quantitative prevalence as measured in number of mentions in the literature. Mentions can occur multiple times within a single article, not just across different articles
3 Results
Previous scoping reviews on AI ethics guidelines (Fjeld et al., 2020; Hagendorff,
2020; Jobin et al., 2019) congruently found a set of reoccurring paramount princi-
ples for AI development and use: transparency, fairness, security, safety, account-
ability, privacy, and beneficence. However, these studies were published before
the excitement surrounding generative AI (OpenAI, 2022; Ramesh et al., 2021).
Since then, the ethics discourse has undergone significant changes, reacting to
the new technological developments. Our analysis of the recent literature revealed
that new topics emerged, comprising issues like jailbreaking, hallucination, align-
ment, harmful content, copyright, models leaking private data, impacts on human
creativity, and many more. Concepts like trustworthiness or accountability lost
importance, as fewer articles included discussions of them, while others became
even more prevalent, especially fairness and safety. Still other topics remained
very similar, for instance discussions surrounding sustainability or transparency.
In sum, our review revealed 19 categories of ethics topics, all of which will be
discussed in the following, in descending order of prevalence (see Fig. 2). The
complete taxonomy comprising all ethical issues can be accessed in the supple-
mentary material or by using this link: https://thilo-hagendorff.github.io/ethics-tree/tree.html
3.1 Fairness—Bias
Fairness is, by far, the most discussed issue in the literature, remaining a paramount
concern especially in the case of LLMs and text-to-image models (Bird et al., 2023;
Fraser et al., 2023b; Weidinger et al., 2022; Ray, 2023). This is sparked by train-
ing data biases propagating into model outputs (Aničin & Stojmenović, 2023),
causing negative effects like stereotyping (Shelby et al., 2023; Weidinger et al.,
2022), racism (Fraser et al., 2023a), sexism (Sun et al., 2023b), ideological leanings
(Ray, 2023), or the marginalization of minorities (Wang et al., 2023b). In addition
to showing that generative AI tends to perpetuate existing societal patterns (Jiang
et al., 2021), there is a concern about reinforcing existing biases when training new
generative models with synthetic data from previous models (Epstein et al., 2023).
Beyond technical fairness issues, critiques in the literature extend to the monopoli-
zation or centralization of power in large AI labs (Bommasani et al., 2021; Goetze
& Abramson, 2021; Hendrycks et al., 2023; Solaiman et al., 2023), driven by the
substantial costs of developing foundation models. The literature also highlights
the problem of unequal access to generative AI, particularly in developing coun-
tries or among financially constrained groups (Dwivedi et al., 2023; Mannuru et al.,
2023; Ray, 2023; Weidinger et al., 2022). Sources also analyze the challenges the AI research community faces in ensuring workforce diversity (Lazar & Nelson, 2023). Moreo-
ver, there are concerns regarding the imposition of values embedded in AI systems
on cultures distinct from those where the systems were developed (Bender et al.,
2021; Wang et al., 2023b).
3.2 Safety
The second prominent topic in the literature, as well as a distinct research field in
its own right, is AI safety (Amodei et al., 2017). A primary concern is the emer-
gence of human-level or superhuman generative models, commonly referred to
as AGI, and their potential existential or catastrophic risks to humanity (Ben-
gio et al., 2023; Dung, 2023b; Hendrycks et al., 2023; Koessler & Schuett, 2023).
Connected to that, AI safety aims at avoiding deceptive (Hagendorff, 2023b; Park
et al., 2024) or power-seeking machine behavior (Hendrycks et al., 2023; Ji et al.,
2023; Ngo et al., 2022), model self-replication (Hendrycks et al., 2023; Shevlane
et al., 2023), or shutdown evasion (Hendrycks et al., 2023; Shevlane et al., 2023).
Ensuring controllability (Ji et al., 2023), human oversight (Anderljung et al., 2023),
and the implementation of red teaming measures (Hendrycks et al., 2023; Mozes
et al., 2023) are deemed to be essential in mitigating these risks, as is the need for
increased AI safety research (Hendrycks et al., 2023; Shevlane et al., 2023) and pro-
moting safety cultures within AI organizations (Hendrycks et al., 2023) instead of
fueling the AI race (Hendrycks et al., 2023). Furthermore, papers address risks from unforeseen emergent capabilities in generative models (Anderljung et al., 2023; Hendrycks et al., 2022), restrictions on access to dangerous research works (Dinan et al., 2023; Hagendorff, 2021), or pausing AI research for the sake of improving
safety or governance measures first (Bengio et al., 2023; McAleese, 2022). Another
central issue is the fear of weaponizing AI or leveraging it for mass destruction
(Hendrycks et al., 2023), especially by using LLMs for the ideation and planning of
how to attain, modify, and disseminate biological agents (D’Alessandro et al., 2023;
Sandbrink, 2023). In general, the threat of AI misuse by malicious individuals or
groups (Ray, 2023), especially in the context of open-source models (Anderljung
et al., 2023), is highlighted in the literature as a key reason for implementing robust safety measures.
3.3 Harmful content—Toxicity
Generating unethical, fraudulent, toxic, violent, pornographic, or other harmful
content is a further predominant concern, again notably in relation to LLMs and text-
to-image models (Bommasani et al., 2021; Dwivedi et al., 2023; Epstein et al., 2023;
Illia et al., 2023; Li, 2023; Mozes et al., 2023; Shelby et al., 2023; Strasser, 2023;
Wang et al., 2023c, 2023e; Weidinger et al., 2022). Numerous studies highlight the
risks associated with the intentional creation of disinformation (Weidinger et al.,
2022), fake news (Wang et al., 2023e), propaganda (Li, 2023), or deepfakes (Ray,
2023), underscoring their significant threat to the integrity of public discourse and
the trust in credible media (Epstein et al., 2023; Porsdam Mann et al., 2023). Addi-
tionally, papers explore the potential for generative models to aid in criminal activi-
ties (Sun et al., 2023a), incidents of self-harm (Dinan et al., 2023), identity theft
(Weidinger et al., 2022), or impersonation (Wang, 2023). Furthermore, the literature
investigates risks posed by LLMs when generating advice in high-stakes domains
such as health (Allen et al., 2024), safety-related issues (Oviedo-Trespalacios et al.,
2023), as well as legal or financial matters (Zhan et al., 2023).
3.4 Hallucinations
Significant concerns are raised about LLMs inadvertently generating false or mis-
leading information (Azaria et al., 2023; Borji, 2023; Ji et al., 2023; Liu et al., 2023;
Mökander et al., 2023; Mozes et al., 2023; Ray, 2023; Scerbo, 2023; Schlagwein &
Willcocks, 2023; Shelby et al., 2023; Sok & Heng, 2023; Walczak & Cellary, 2023;
Wang et al., 2023e; Weidinger et al., 2022; Zhuo et al., 2023), as well as erroneous
code (Akbar et al., 2023; Azaria et al., 2023). Papers not only critically analyze vari-
ous types of reasoning errors in LLMs (Borji, 2023) but also examine risks associ-
ated with specific types of misinformation, such as medical hallucinations (Angelis
et al., 2023). Given the propensity of LLMs to produce flawed outputs accompanied
by overconfident rationales (Azaria et al., 2023) and fabricated references (Zhan
et al., 2023), many sources stress the necessity of manually validating and fact-
checking the outputs of these models (Dergaa et al., 2023; Kasneci et al., 2023; Sok
& Heng, 2023).
3.5 Privacy
Generative AI systems, similar to traditional machine learning methods, are con-
sidered a threat to privacy and data protection norms (Huang et al., 2022; Khowaja
et al., 2023; Ray, 2023; Weidinger et al., 2022). A major concern is the intended
extraction or inadvertent leakage of sensitive or private information from LLMs
(Derner & Batistič, 2023; Dinan et al., 2023; Huang et al., 2022; Smith et al., 2023;
Wang et al., 2023e). To mitigate this risk, strategies such as sanitizing training data
to remove sensitive information (Smith et al., 2023) or employing synthetic data
for training (Yang et al., 2023) are proposed. Furthermore, growing concerns over
generative AI systems being used for surveillance purposes emerge (Solaiman et al.,
2023; Weidinger et al., 2022). To safeguard privacy, papers stress the importance
of protecting sensitive and personal data transmitted to AI operators (Allen et al.,
2024; Blease, 2024; Kenwright, 2023). Moreover, AI operators are urged to avoid privacy
violations during training data collection (Khlaif, 2023; Solaiman et al., 2023; Wang
et al., 2023e).
3.6 Interaction Risks
Many novel risks posed by generative AI stem from the ways in which humans
interact with these systems (Weidinger et al., 2022). For instance, sources discuss
epistemic challenges in distinguishing AI-generated from human content (Strasser,
2023). They also address the issue of anthropomorphization (Shardlow & Przybyła,
2022), which can lead to excessive trust in generative AI systems (Weidinger
et al., 2023). On a similar note, many papers argue that the use of conversational
agents could impact mental well-being (Ray, 2023; Weidinger et al., 2023) or gradu-
ally supplant interpersonal communication (Illia et al., 2023), potentially leading to
a dehumanization of interactions (Ray, 2023). Additionally, a frequently discussed
interaction risk in the literature is the potential of LLMs to manipulate human
behavior (Falade, 2023; Kenton et al., 2021; Park et al., 2024) or to instigate users to
engage in unethical or illegal activities (Weidinger et al., 2022).
3.7 Security—Robustness
While AI safety focuses on threats emanating from generative AI systems, security
centers on threats posed to these systems (Wang et al., 2023a; Zhuo et al., 2023).
The most extensively discussed issues in this context are jailbreaking risks (Borji,
2023; Deng et al., 2023; Gupta et al., 2023; Ji et al., 2023; Wang et al., 2023e; Zhuo
et al., 2023), which involve techniques like prompt injection (Wu et al., 2023) or
visual adversarial examples (Qi et al., 2023) designed to circumvent safety guard-
rails governing model behavior. Sources delve into various jailbreaking methods
(Gupta et al., 2023), such as role play or reverse exposure (Sun et al., 2023a). Simi-
larly, implementing backdoors or using model poisoning techniques can bypass safety guardrails as well (Liu et al., 2023; Mozes et al., 2023; Wang et al., 2023e). Other
security concerns pertain to model or prompt thefts (Smith et al., 2023; Sun et al.,
2023a; Wang et al., 2023e).
3.8 Education—Learning
In contrast to traditional machine learning, the impact of generative AI in the edu-
cational sector receives considerable attention in the academic literature (Kasneci
et al., 2023; Panagopoulou et al., 2023; Sok & Heng, 2023; Spennemann, 2023; Sus-
njak, 2022; Walczak & Cellary, 2023). Besides issues stemming from the difficulty of distinguishing student-generated from AI-generated content (Boscardin et al., 2024; Kasneci et al., 2023; Walczak & Cellary, 2023), which opens up various opportunities to cheat in online or written exams (Segers, 2023; Susnjak, 2022), sources
emphasize the potential benefits of generative AI in enhancing learning and teaching
methods (Kasneci et al., 2023; Sok & Heng, 2023), particularly in relation to per-
sonalized learning approaches (Kasneci et al., 2023; Latif et al., 2023; Sok & Heng,
2023). However, some papers suggest that generative AI might lead to reduced effort
or laziness among learners (Kasneci et al., 2023). Additionally, a significant focus in
the literature is on the promotion of literacy and education about generative AI sys-
tems themselves (Ray & Das, 2023; Sok & Heng, 2023), such as by teaching prompt
engineering techniques (Dwivedi et al., 2023).
3.9 Alignment
The general tenet of AI alignment involves training generative AI systems to be
harmless, helpful, and honest, ensuring their behavior aligns with and respects
human values (Ji et al., 2023; Kasirzadeh & Gabriel, 2023; Betty Hou & Green,
2023; Shen et al., 2023; Ngo et al., 2022). However, a central debate in this area con-
cerns the methodological challenges in selecting appropriate values (Ji et al., 2023;
Korinek & Balwit, 2022). While AI systems can acquire human values through
feedback, observation, or debate (Kenton et al., 2021), there remains ambiguity over
which individuals are qualified or legitimized to provide these guiding signals (Firt,
2023). Another prominent issue pertains to deceptive alignment (Park et al., 2024),
which might cause generative AI systems to tamper with evaluations (Ji et al., 2023).
Additionally, many papers explore risks associated with reward hacking, proxy gam-
ing, or goal misgeneralization in generative AI systems (Dung, 2023a; Hendrycks
et al., 2022, 2023; Ji et al., 2023; Ngo et al., 2022; Shah et al., 2022; Shen et al.,
2023).
3.10 Cybercrime
Closely related to discussions surrounding security and harmful content, the field of
cybersecurity investigates how generative AI is misused for fraudulent online activi-
ties (Falade, 2023; Gupta et al., 2023; Schmitt & Flechais, 2023; Shevlane et al.,
2023; Weidinger et al., 2022). A particular focus lies on social engineering attacks
(Falade, 2023), for instance by utilizing generative AI to impersonate humans (Wang, 2023), create fake identities (Bird et al., 2023; Wang et al., 2023e), clone voices (Barnett, 2023), or craft phishing messages (Schmitt & Flechais,
2023). Another prevalent concern is the use of LLMs for generating malicious code
or hacking (Gupta et al., 2023).
3.11 Governance—Regulation
In response to the multitude of new risks associated with generative AI, papers advo-
cate for legal regulation and governmental oversight (Anderljung et al., 2023; Bajgar
& Horenovsky, 2023; Dwivedi et al., 2023; Mökander et al., 2023). The focus of
these discussions centers on the need for international coordination in AI govern-
ance (Partow-Navid & Skusky, 2023), the establishment of binding safety stand-
ards for frontier models (Bengio et al., 2023), and the development of mechanisms
to sanction non-compliance (Anderljung et al., 2023). Furthermore, the literature
emphasizes the necessity for regulators to gain detailed insights into the research
and development processes within AI labs (Anderljung et al., 2023). Moreover, risk
management strategies of these labs should be evaluated by third parties to increase
the likelihood of compliance (Hendrycks et al., 2023; Mökander et al., 2023). How-
ever, the literature also acknowledges potential risks of overregulation, which could
hinder innovation (Anderljung et al., 2023).
3.12 Labor displacement—Economic impact
The literature frequently highlights concerns that generative AI systems could
adversely impact the economy, potentially even leading to mass unemployment
(Bird et al., 2023; Bommasani et al., 2021; Dwivedi et al., 2023; Hendrycks et al.,
2023; Latif et al., 2023; Li, 2023; Sætra, 2023; Shelby et al., 2023; Solaiman et al.,
2023; Zhang et al., 2023; Zhou & Nabus, 2023). This pertains to various fields, rang-
ing from customer service to software engineering or crowdwork platforms (Man-
nuru et al., 2023; Weidinger et al., 2022). While new occupational fields like prompt
engineering are created (Epstein et al., 2023; Porsdam Mann et al., 2023), the pre-
vailing worry is that generative AI may exacerbate socioeconomic inequalities and
lead to labor displacement (Li, 2023; Weidinger et al., 2022). Additionally, papers
debate potential large-scale worker deskilling induced by generative AI (Angelis
et al., 2023), but also productivity gains contingent upon outsourcing mundane or
repetitive tasks to generative AI systems (Azaria et al., 2023; Mannuru et al., 2023).
3.13 Transparency—Explainability
Being a multifaceted concept, the term “transparency” is used to refer both to technical explainability (Ji et al., 2023; Latif et al., 2023; Ray, 2023; Shen et al., 2023; Wang et al., 2023e) and to organizational openness (Anderljung et al., 2023;
Derczynski et al., 2023; Partow-Navid & Skusky, 2023; Wahle et al., 2023). Regard-
ing the former, papers underscore the need for mechanistic interpretability (Shen
et al., 2023) and for explaining internal mechanisms in generative models (Ji et al.,
2023). On the organizational front, transparency relates to practices such as inform-
ing users about capabilities and shortcomings of models (Derczynski et al., 2023),
as well as adhering to documentation and reporting requirements for data collection
processes or risk evaluations (Mökander et al., 2023).
3.14 Evaluation—Auditing
Closely related to other clusters like AI safety, fairness, or harmful content, papers
stress the importance of evaluating generative AI systems both in a narrow technical
way (Mökander et al., 2023; Wang et al., 2023a) and in a broader sociotechni-
cal impact assessment (Bommasani et al., 2021; Korinek & Balwit, 2022; Shelby
et al., 2023) focusing on pre-release audits (Ji et al., 2023) as well as post-deploy-
ment monitoring (Anderljung et al., 2023). Ideally, these evaluations should be con-
ducted by independent third parties (Anderljung et al., 2023). In terms of technical
LLM or text-to-image model audits, papers furthermore criticize a lack of safety
benchmarking for languages other than English (Deng et al., 2023; Wang et al.,
2023c).
3.15 Sustainability
Generative models are known for their substantial energy requirements, necessitat-
ing significant amounts of electricity, cooling water, and hardware containing rare
metals (Barnett, 2023; Bender et al., 2021; Gill & Kaur, 2023; Holzapfel et al., 2022;
Mannuru et al., 2023). The extraction and utilization of these resources frequently
occur in unsustainable ways (Bommasani et al., 2021; Shelby et al., 2023; Weidinger
et al., 2022). Consequently, papers highlight the urgency of mitigating environmen-
tal costs, for instance by adopting renewable energy sources (Bender et al., 2021)
and utilizing energy-efficient hardware in the operation and training of generative AI
systems (Khowaja et al., 2023).
3.16 Art—Creativity
In this cluster, concerns about negative impacts on human creativity, particularly
through text-to-image models, are prevalent (Barnett, 2023; Donnarumma, 2022; Li,
2023; Oppenlaender, 2023). Papers criticize financial harms or economic losses for
artists (Jiang et al., 2021; Piskopani et al., 2023; Ray, 2023; Zhou & Nabus, 2023)
due to the widespread generation of synthetic art as well as the unauthorized and
uncompensated use of artists’ works in training datasets (Jiang et al., 2021; Sætra,
2023). Additionally, given the challenge of distinguishing synthetic images from
authentic ones (Amer, 2023; Piskopani et al., 2023), there is a call for systemati-
cally disclosing the non-human origin of such content (Wahle et al., 2023), particu-
larly through watermarking (Epstein et al., 2023; Grinbaum & Adomaitis, 2022;
Knott et al., 2023). Moreover, while some sources argue that text-to-image mod-
els lack “true” creativity or the ability to produce genuinely innovative aesthetics
(Donnarumma, 2022), others point out positive aspects regarding the acceleration of
human creativity (Bommasani et al., 2021; Epstein et al., 2023).
3.17 Copyright—Authorship
The emergence of generative AI raises issues regarding disruptions to existing copy-
right norms (Azaria et al., 2023; Bommasani et al., 2021; Ghosh & Lakshmi, 2023;
Jiang et al., 2021; Li, 2023; Piskopani et al., 2023). Frequently discussed in the lit-
erature are violations of copyright and intellectual property rights stemming from
the unauthorized collection of text or image training data (Bird et al., 2023; Epstein
et al., 2023; Wang et al., 2023e). Another concern relates to generative models
memorizing or plagiarizing copyrighted content (Al-Kaswan & Izadi, 2023; Barnett,
2023; Smith et al., 2023). Additionally, there are open questions and debates around
the copyright or ownership of model outputs (Azaria et al., 2023), the protection of
creative prompts (Epstein et al., 2023), and the general blurring of traditional con-
cepts of authorship (Holzapfel et al., 2022).
3.18 Writing—Research
Partly overlapping with the discussion on impacts of generative AI on educational
institutions, this topic cluster concerns mostly negative effects of LLMs on writ-
ing skills and research manuscript composition (Angelis et al., 2023; Dergaa et al.,
2023; Dwivedi et al., 2023; Illia et al., 2023; Sok & Heng, 2023). The former per-
tains to the potential homogenization of writing styles, the erosion of semantic capi-
tal, or the stifling of individual expression (Mannuru et al., 2023; Nannini, 2023).
The latter is focused on the idea of prohibiting generative models from being used to compose scientific papers or figures, or from being listed as co-authors (Dergaa et al., 2023;
Scerbo, 2023). Sources express concern about risks for academic integrity (Hosseini
et al., 2023), as well as the prospect of polluting the scientific literature with a flood
of LLM-generated low-quality manuscripts (Zohny et al., 2023). As a consequence,
there are frequent calls for the development of detectors capable of identifying syn-
thetic texts (Dergaa et al., 2023; Knott et al., 2023).
3.19 Miscellaneous
While the scoping review identified multiple topic clusters within the literature, it
also revealed certain issues that either do not fit into these categories, are discussed
infrequently, or in a nonspecific manner. For instance, some papers touch upon con-
cepts like trustworthiness (Liu et al., 2023; Yang et al., 2023), accountability (Khow-
aja et al., 2023), or responsibility (Ray, 2023), but often remain vague about what
they entail in detail. Similarly, a few papers vaguely attribute socio-political instabil-
ity or polarization to generative AI without delving into specifics (Park et al., 2024;
Shelby et al., 2023). Apart from that, another minor topic area concerns responsible
ways of talking about generative AI systems (Shardlow & Przybyła, 2022).
This includes avoiding overstating the capabilities of generative AI (Bender et al., 2021), reducing the hype surrounding it (Zhan et al., 2023), or refraining from anthropomorphized language when describing model capabilities (Weidinger et al., 2022).
4 Discussion
The literature on the ethics of generative AI is predominantly characterized by a
bias towards negative aspects of the technology (Bird et al., 2023; Hendrycks et al.,
2023; Shelby et al., 2023; Shevlane et al., 2023; Weidinger et al., 2022), putting
much greater or exclusive weight on risks and harms instead of opportunities and ben-
efits. This negativity bias might be triggered by a biased selection of keywords used
for the literature search, which did not include terms like “chance” or “benefit” that
are likewise part of the discourse (Kirk et al., 2024). However, an alternative expla-
nation might be that the negativity bias is in line with how human psychology is
wired (Rozin & Royzman, 2016) and aligns with the intrinsic purpose of deonto-
logical ethics. However, it can result in suboptimal decision-making and stands in contrast to consequentialist approaches as well as to controlled cognitive processes that avoid intuitive responses to harm scenarios (Greene et al., 2008; Paxton & Greene, 2010). Therefore, while this scoping review may convey a strong
emphasis on the risks and harms associated with generative AI, we argue that this
impression should be approached with a critical mindset. The numerous benefits and
opportunities of adopting generative AI, which may be more challenging to observe
or foresee (Bommasani et al., 2021; Noy & Zhang, 2023), are usually overshadowed
or discussed in a fragmentary manner in the literature. Risks and harms, on the other hand, are in some cases inflated by unsubstantiated claims, which are driven by
citation chains and the resulting popularity biases. Many ethical concerns gain trac-
tion on their own, becoming frequent topics of discussion despite lacking evidence
of their significance.
For instance, by repeating the claim that LLMs can assist with the creation of
pathogens in numerous publications (Anderljung et al., 2023; D’Alessandro et al.,
2023; Hendrycks et al., 2023; Ji et al., 2023; Sandbrink, 2023), the literature cre-
ates an inflated availability of this risk. When tracing this claim back to its sources
by reversing citation chains (1a3orn, 2023), it becomes evident that the min-
imal empirical research conducted in this area involves only a handful of experi-
ments, which are insufficient to substantiate the fear of LLMs assisting in the dissemina-
tion of pathogens more effectively than traditional search engines (Patwardhan et al.,
2024). Another example is the commonly reiterated claim that LLMs leak personal
information (Derczynski et al., 2023; Derner & Batistič, 2023; Dinan et al., 2023;
Mökander et al., 2023; Weidinger et al., 2022). However, evidence shows that LLMs
are notably ineffective at associating personal information with specific individuals
(Huang et al., 2022), which would be necessary to constitute an actual privacy violation.
Another example concerns the prevalent focus on the carbon emissions of genera-
tive models (Bommasani et al., 2021; Holzapfel et al., 2022; Mannuru et al., 2023;
Weidinger et al., 2022). While it is undeniable that training and operating them con-
tributes significantly to carbon dioxide emissions (Strubell et al., 2019), one has to take into account that, when comparing the emissions of these models with those of humans completing the same tasks, they are lower for the AI systems
(Tomlinson et al., 2023). Similar conflicts between risk claims and sparse or missing
empirical evidence can be found in many areas, be it regarding concerns of gen-
erative AI systems being used for cyberattacks (Falade, 2023; Gupta et al., 2023;
Schmitt & Flechais, 2023), manipulating human behavior (Falade, 2023; Kenton
et al., 2021; Park et al., 2024), or labor displacement (Li, 2023; Solaiman et al.,
2023; Weidinger et al., 2022). These conflicts between normative claims and empiri-
cal evidence and the accompanying exaggeration of concerns may stem from the
widespread “hype” surrounding generative AI. In this context, exaggerations are
often used as a means to capture attention in both research circles and public media.
In sum, many parts of the ethics literature are predominantly echoing previous
publications, leading to a discourse that is frequently repetitive and marked by a
tacit disregard for underpinning claims with empirical insights or statistics. Addi-
tionally, the literature exhibits a limitation in that it is solely anthropocentric,
neglecting perspectives on generative AI that consider its impacts on non-human
animals (Bossert & Hagendorff, 2021, 2023; Hagendorff et al., 2023; Owe & Baum,
2021; Singer & Tse, 2022). Moreover, the literature often fails to specify which
groups of individuals are differentially affected by risks or harms, instead relying
on an overly generalized notion of them (Kirk et al., 2024). Another noticeable trait
of the discourse on the ethics of generative AI is its emphasis on LLMs and text-to-
image models. It rarely considers the advancements surrounding multi-modal mod-
els combining text-to-text, text-to-image, and other modalities (OpenAI, 2023) or
agents (Xi et al., 2023), despite their significant ethical implications for mediating
human communication. These oversights need addressing in future research. When
papers do extend beyond LLMs and text-to-image models, they often delve into risks
associated with AGI. This requires veering into speculative and often philosophical
debates about fictitious threats concerning existential risks (Hendrycks et al., 2023),
deceptive alignment (Park et al., 2024), power-seeking machine behavior (Ngo et al.,
2022), shutdown evasion (Shevlane et al., 2023), and the like. While such proactive
approaches constitute an alternative to the otherwise mostly reactive methods in ethics,
their dominance should nevertheless not skew the assessment of present risks, real-
istic tendencies for risks in the near future, or accumulative risks occurring through
a series of minor yet interconnected disruptions (Kasirzadeh, 2024).
5 Limitations
This study has several limitations. The literature search included non-peer-reviewed
preprints, primarily from arXiv. We consider some of them to be of poor quality, but
nevertheless included them in the analysis since they fulfilled the inclusion criteria. How-
ever, this way, poorly researched claims may have found their way into the data analy-
sis. Another limitation pertains to our method of citation chaining, as this could only be
done by checking the paper titles in the reference sections. Reading all corresponding
abstracts would have allowed for a more thorough search but was deemed too labor-
intensive. Hence, we cannot rule out the possibility that some relevant sources were
not considered for our data analysis. Limiting the collection to the first 25 (or 50 in
the case of Elicit) results for each search term may have also led to the omission of
relevant sources that appeared lower in the result lists. Additionally, our search strate-
gies and the selection of search terms inevitably influenced the collection of papers,
thereby affecting the distribution or proportion of topics and consequently the quan-
titative results. As a scoping review, our analysis is also unable to depict the dynamic
debates between different normative arguments and positions in ethics unfolding over
time. Moreover, the taxonomy of topic clusters, although it tries to follow the literature
as closely as possible, necessarily involves a certain subjectivity and contains overlaps, for
instance between categories like cybercrime and security, hallucinations and interac-
tion risks, or safety and alignment.
6 Conclusion
This scoping review maps the landscape of ethical considerations surrounding gen-
erative AI, highlighting an array of 378 normative issues across 19 topic areas. The
complete taxonomy can be accessed in the supplementary material as well as under
this link: https://thilo-hagendorff.github.io/ethics-tree/tree.html. One of the key findings
is the predominance in the discourse of ethical topics such as fairness, safety, harmful content, and hallucinations. Many analyses, though, come at the
expense of a more balanced consideration of the positive impacts these technologies
can have, such as their potential in enhancing creativity, productivity, education, or
other fields of human endeavor. Many parts of the discourse are marked by a repetitive
nature, echoing previously mentioned concerns, often without sufficient empirical or
statistical backing. A more grounded approach, one that integrates empirical data and
balanced analysis, is essential for an accurate understanding of the ethics landscape.
However, this critique should not diminish the importance of ethics research, which remains paramount for inspiring responsible ways of embracing the transformative potential of generative AI.