Exploring the Association between Moral Foundations and Judgements of AI Behaviour

How do individual differences in personal morality affect perceptions and judgments of morally contentious behaviours from AI systems? By applying Moral Foundations Theory (MFT) to the context of AI, this study sought to develop a predictive Bayesian model for assessing moral judgements based on individual differences in moral constitution. Participants (N=240) were asked to assess six different scenarios, carefully designed to elicit reflection on the behaviour of AI systems. Together with results from the Moral Foundations Questionnaire, we performed both Bayesian modelling and reflexive thematic analysis to investigate the associations between individual differences in moral foundations and judgements of the AI systems. Results revealed a mild association between individual MFT scores and judgments of AI behaviours. Qualitative responses suggested that a participant's technical understanding of AI systems, rather than intrinsic moral values, predominantly influenced their judgments, with those who judged the behaviour as wrong tending to attribute a greater degree of agency to the AI systems.


INTRODUCTION
The field of moral psychology has helped to navigate complex differences in how individuals respond to contentious topics [32]. At the forefront of moral psychology is Moral Foundations Theory (MFT) [4,24], which proposes that the way people respond to behaviour containing a moral tension depends upon differences in their sensitivity to a set of moral foundations [24]. A moral tension refers to 'any conflict, whether apparent, contingent or fundamental, between important values or goals, where it appears necessary to give up one in order to realise the other' [60]. Moral tensions give rise to moral judgments as a way of reconciling the conflict. Moral judgements are typically reserved for human behaviour, but are increasingly ascribed to the behaviour of artificial intelligence (AI) [61].
According to MFT, a set of moral foundations (Care, Equality, Proportionality, Authority, Loyalty, and Purity) forms the basis of human moral intuition, informing how people perceive and ultimately judge behaviour [4]. By knowing the composition of an individual's moral foundations, MFT predicts how they will respond to morally contentious acts. From stem cell research [12] and attitudes to euthanasia and pornography [43], to anticipating judgments of suicide [53], MFT has formed the backbone of research in understanding the underlying differences in how people interpret sensitive human topics.
In the context of AI, MFT has been leveraged by Hidalgo et al. to unpack the moral foundations that people ascribe to morally contentious behaviours of intelligent systems [35]. This was achieved by asking participants to select words associated with each moral foundation that applied to a series of scenarios. Hidalgo et al. used MFT as a means of discerning the differences between how people judge humans and machines. However, they stopped short of directly asking participants to judge the scenarios as right or wrong and did not relate the responses back to individual differences in the moral foundations of their respondents. MFT has yet to be applied empirically to judgments of AI behaviour in a way that predicts the likelihood of how people might respond to a moral tension with respect to their individual differences.
Attempts to incorporate concerns about moral tension in the application of AI are often handled through AI ethics guidelines and principles [42]. These guidelines and principles typically take a normative approach in their prescription, with normative theories describing the world as it ought to be [17]. On the other hand, positive theories describe the world as it is; they seek to be observational and aim to gain an understanding of how things actually are [21]. Moral Foundations Theory can be described as an empirical, descriptive, and predictive positive theory of moral and social psychology. In providing knowledge of what people value by observing how they behave, positive theory can inform the creation of guidelines that are reflective of reality and, in turn, achievable to adhere to.
However, translating positive observations into normative framing of values presents challenges of over-generalisation, which has been the target of criticism for AI ethics guidelines [1,30,51].
Reducing the complex nature of human values, behaviours, and motives into a set of guiding principles inevitably results in simplifications that cannot fully account for the diversity in human values or diversity of impacts.
This study employs MFT as a lens for understanding how individuals respond to morally contentious decisions made by AI systems. We investigate the potential associations between individual moral foundations, measured through the application of the Moral Foundations Questionnaire (MFQ) [4], and how participants in the study perceive and judge the morally contentious behaviour of AI systems. In doing so, this study looks to assess the validity of MFT as a tool to inform the discussions around normative AI ethics guidelines and principles. Participants (N=240) were asked to judge the severity of six scenarios containing a moral tension resulting from the behaviour of an AI system. Participants also identified the moral foundations that were most relevant to the scenario. We then collected participants' sensitivities to violations in each foundation through the Moral Foundations Questionnaire 2.0 [4] to determine the impact of individual moral foundations on both their perceptions and moral judgements of AI systems.
The study aims to understand the degree to which there is agreement on the moral foundations that individuals associate with a particular morally contentious behaviour carried out by AI systems.We then tested if the moral foundations our study participants perceived as being most relevant, along with their judgment of the behaviour of each AI system, were related to their individual differences in moral make-up (MFQ).
Through Bayesian models of participants' MFQ scores, judgments and perception responses, we gained an empirically informed view of how people respond to moral violations of AI systems. The study revealed limited agreement around the moral foundations associated with the AI behaviour. It was also found that, once one's Care score is accounted for, other moral foundations did not provide much predictive value for judgment scores. This indicates that the moral intuition presupposed by knowing a person's MFQ scores may not apply in the same way when passing judgment on the behaviour of AI.
A reflexive thematic analysis on the qualitative data [8] suggested that the way a person assesses the behaviour of AI systems is related to several factors other than their moral foundational makeup. The perceived role of the AI system, the degree to which a person attributed human-like agency to it, and subsequently how right or wrong they deemed the behaviour all appear to depend on the person's technical understanding of the system as being computational. When this awareness was not present, individuals increasingly anthropomorphised the AI system, attributing humanlike characteristics such as thought, care, empathy and feelings. This resulted in harsher judgments when the system fell short of upholding the moral standards expected of humans in similar circumstances.
This research makes the following contributions. Firstly, we empirically demonstrate that individuals perceive a variety of moral foundations in the behaviour of AI systems. Secondly, we found a weak positive association between one's scores for each moral foundation and the likelihood that they will perceive that foundation in the behaviour of an AI. However, only a participant's sensitivity to the Care foundation helped to predict their moral judgments.
Finally, our results appear to reveal that how a person perceives AI systems is subject to change and depends on various factors other than their moral foundations as measured by the MFQ. Responses to our qualitative questions indicated that perception may be more closely aligned with how well individuals understand AI systems and their limitations from a technical perspective, and that gaining an awareness of the technical nature of AI systems may affect how a person feels about the AI behaviour they are judging.

RELATED WORK
We structure the theoretical assumptions behind our work in causal terms and justify them based on previous literature. The theorised causal model is illustrated as a Directed Acyclic Graph (DAG) in Fig. 1. The following sections include illustrations of the causal claims being tested.

AI System Behaviour and Moral Judgments.
As AI technologies have evolved over the last decade, companies, research institutions, and public sector organisations have responded to the ethical concerns that these technologies have brought by publishing guidelines and principles. These guidelines aim to communicate how systems ought to be designed, developed and deployed and can be regarded as normative in their prescription [42].
The first of many sets of principles were published between 2017 and 2018. These include the Montréal Declaration on Responsible AI [56], the IEEE's General Principles of Ethical Autonomous and Intelligent Systems [38], and the Asilomar AI Principles [22], along with industry-led initiatives such as Google's AI Ethics Principles [23]. Jobin et al. [42] found that amongst this plethora of AI ethical guides, there exists a collective convergence around a set of core principles in AI ethics. This would appear to indicate an industry-wide consensus that AI should be developed in a manner that prioritises the betterment of society, the reduction of potential adverse effects, and the safeguarding of fundamental human rights, including principles of fairness, autonomy, and privacy [19].
This approach to the moral judgement of an AI system implies that moral judgments are determined by the behaviour of the system itself; in causal terms, the system's behaviour directly causes the moral judgment. If this claim alone were true, it could be suggested that a standardised ethical framework or singular set of ethical guidelines would apply universally to AI technologies, regardless of cultural, organisational, or individual differences. However, researchers have found that these high-level principles fail to address the underlying conflict that arises from different responses to the tension they aim to alleviate. Such guidelines have been criticised as being out of touch with the software environment [1], too vague to technically implement [30], and toothlessly inconsequential to the point that they are rarely adhered to [51]. The evidence suggests that guidelines and principles fail to capture the individual differences in how values might be interpreted or enacted, such as what actually constitutes fairness across different groups in society. So, additional considerations must be included.

Moral Foundations Theory and Moral Judgment.
The field of moral psychology has developed tools to investigate the influence of individual differences that cause certain values, goals, and outcomes to be more important than others [16,25,29]. By understanding these differences, it is possible not only to gain an appreciation for where people sit relative to the moral tension at hand but also to consider why a person perceives and ultimately passes judgment a certain way. The corresponding claim implies that differences in moral judgment arise from the individual differences between the people who are passing judgment. Spanning over a decade in its development, Moral Foundations Theory (MFT) has sought to highlight how an underlying set of psychological foundations shapes moral judgments [3,4,25,32]. According to Atari et al. [4], this set of moral foundations guides moral intuitions, largely autonomous emotional reactions to the environment [31]. This theory has subsequently paved the way for a body of work that reveals how people each prioritise these foundations, leading to the variety in moral reasoning [12,43,53].
MFT is a positive theory framework that aims to uncover the diversity of moral foundations that underlie judgments and behaviours as they naturally occur, rather than prescribing how they should be in a normative sense [26]. In the context of the tension described by Whittlestone et al. [60], moral psychology and MFT not only acknowledge that people will respond differently to situations that demand a trade-off in goals or values, but they seek to understand the underlying reasoning behind where people sit in relation to the tension.
The most recent iteration of MFT has identified six primary foundations upon which individuals base their judgements: Care, Loyalty, Authority, Purity, Equality and Proportionality [4]. These foundations have evolved from the original set [25], and different research groups have suggested amendments [3,40]. MFT states that it is our sensitivity to each of these foundations that determines how we respond to moral tensions.
These moral foundations are defined as follows [4]:
• Care relates to protecting others from harm. It involves feelings of empathy and compassion towards others and the alleviation of suffering.
• Loyalty speaks to a person's allegiance towards a particular group. Prioritising the welfare of one's group is important; conversely, betrayal or failing to support the group is seen as wrong.
• Authority is about respecting hierarchy, tradition and social order. Respect should be given to those in positions of authority, and social stability comes about through adherence to societal norms and traditions.
• Purity speaks to maintaining physical, psychological, and spiritual purity. It is the belief that certain things or actions may contaminate one's body or spirit and should be avoided.
• Equality refers to the fair and equal treatment of others regardless of background, abilities, or circumstances. It is the belief that everyone should have equal opportunities and rights and that discrimination or prejudice is wrong.
• Proportionality speaks to the belief that people should be rewarded or punished in proportion to their actions. Good deeds should be rewarded and bad deeds punished, and this proportionality helps to maintain fairness and justice.

Individual Perception of System Behaviour and Moral Judgment.
Correlational relationships between MFQ scores and individual moral responses have been found across several studies. Hatemi et al. [34], Hsieh and Chen [36], Graham et al. [27], and Franks and Scherr [20] all found links between political ideology and MFQ scores. Conservatism correlates with higher MFQ scores in Loyalty, Authority, and Purity, whereas scores on the Equality and Proportionality foundations decrease with increased conservatism. Hsieh and Chen [36] also found significant predictions about socially contentious issues such as capital punishment and euthanasia. Other studies have demonstrated success in predicting attitudes towards a whole range of human phenomena closely tied to decisions of morality, from religious differences [62] and charitable giving [52], to culture war issues such as global warming, abortion, and same-sex marriage [43]. The extensive work of Koleva et al. [43] concludes that MFT provides a more in-depth understanding of the psychological motivations that underlie a person's position on contentious issues than any other predictors of personal attitudes. These studies indicate a strong correlation between individual differences in perception and how those individuals judge an outcome as morally right or wrong. However, it is not clear from this body of literature how to identify the direction of a potential causal pathway between individual perceptions and moral judgments. Therefore, we will refer to the relationship between individual perceptions and moral judgments as being associative.

Individual Perceptions of System Behaviour.
Differences in moral foundations perceived in AI system behaviour. From regulators and developers [1,11,57] to end-users and the wider public [41], the interpretation of moral behaviour with respect to AI systems has been shown to differ significantly [2]. These differences in perception of the moral foundations that relate to a specific AI system behaviour form a further causal claim in our theorised model.

Individual differences in moral foundations and differences in moral foundations perceived. The Moral Foundations Questionnaire (MFQ) is the tool employed by MFT that allows us to quantify the differences in individual moral foundational make-up that influence how a person perceives certain behaviours [4]. Having been refined in response to criticism of the original for being too Western-centred [15,33,39], MFQ 2.0 asks individuals to judge a broad range of morally weighted scenarios [4,61]. According to the theory, the quantifiable nature of MFQ scores allows us to gauge not only the moral foundations that different individuals perceive as important but also the strength of a person's sensitivity towards a particular foundation. This, too, is represented as a causal claim in our theorised model.

Research Gap
Moral Foundations Theory (MFT) has shown predictive power in anticipating morally intuitive responses to immoral behaviour in human contexts [20,27,34,36,43,62]. It has also been useful in analysing moral reasoning when comparing moral violations by AI systems and humans [35]. However, the application of MFT to understand the role of individual differences in perceptions and judgments of AI system behaviour has not yet been explored.
In response, this study first explores the strength of the agreement on which moral foundations are relevant to the morally contentious behaviour of AI systems [claims (A) and (B)]. This is followed by an enquiry into the influence of individual differences in moral foundations (MFQ) on both the foundations deemed most relevant to the behaviour of AI [claim (D)] and how individuals judge this behaviour [claims (C) and (E)].
Bringing the causal claims discussed above together, we propose the directed acyclic graph in Fig. 1 to illustrate the interaction between them, which in turn has framed the following research questions:
• RQ1: To what extent do people agree on the moral foundations associated with morally contentious behaviour carried out by an AI system?
• RQ2: Is the moral foundation that a person perceives as most relevant to the morally contentious behaviour of an AI system related to their MFQ?
• RQ3: To what degree do individual moral characteristics determine how a person judges morally contentious behaviour carried out by an AI system?

METHOD
This study explores the relationships between individual differences in sensitivities to moral foundations, understood through each participant's score on the Moral Foundations Questionnaire (MFQ) [4], and their interpretation of morally contentious behaviour carried out by an AI system, operationalised through a series of scenarios and related questions.
To achieve this, the study was split into two parts. In Part 1, participants were shown six scenarios and asked questions to quantify their perception and judgment of the AI system in each. Each scenario also contained two questions asking them to justify their responses. Part 2 asked participants to complete the Moral Foundations Questionnaire 2.0 (MFQ) [4]. Together, these two pieces of information have allowed us to investigate whether Moral Foundations Theory can provide an empirically based prediction of participant judgement based on responses to the Moral Foundations Questionnaire.

Participants
We recruited participants using the crowd-sourcing platform Prolific (http://www.prolific.com). The study intended to seek out individuals with a broad range of moral foundations; however, it is not possible to directly screen for this using the platform's built-in functionality. To overcome this issue, we looked to balance the testing cohort across the available political spectrum, as Graham et al. [25] have shown that diversity in which moral foundations are valued most correlates with ideology. On Prolific, we selected an equal number of participants from the United Kingdom (Right, Center, Left) and the United States (Conservative, Moderate, Liberal). Figure 2 illustrates the distribution of MFQ results across all participants.
In addition to screening for ideology, the following criteria were applied: participants had to be above the age of 18, fluent in English and using it as their primary language, and hold a Prolific approval rating above 98%. In total, we looked to recruit 240 individuals. We also selected an equal number of individuals who indicated their biological sex as either male or female.
The survey was expected to take 30 minutes, and participants were paid $8.25 for their time. Ethical approval for this study was granted by the Ethics Committee at our university. Consent was confirmed before the commencement of the study.

Materials and Procedure
Having chosen to participate, each participant was redirected to an online survey designed and created through the online research platform Qualtrics (http://www.qualtrics.com/). Participants completed both the Moral Foundations Questionnaire (MFQ) and scenario portions of the survey. The order in which participants saw either the MFQ or the scenarios first was randomised to counter any potential ordering effects.

Moral Foundations Questionnaire.
The Moral Foundations Questionnaire (MFQ), now in its second iteration, was used to elicit the degree to which individuals were sensitive to each of the six foundations that make up Moral Foundations Theory (MFT) [4].
Participants responded to 36 hypothetical statements, grouped into six sets, taken directly from the MFQ 2.0 [4]. Each set contained statements designed to represent the six different moral foundations. Participants were asked to indicate on a five-point scale from Does not describe me at all through to Describes me extremely well whether each statement described them or their opinions. To detect inattentive participants, we randomly presented two attention-check questions throughout the MFQ, which followed best academic practices and solely measured attentiveness rather than memory or knowledge [37]. Participants were twice asked to "Select the attention check" from a multiple-choice question that was randomly placed in the MFQ. All participants (N = 240) passed the checks. The results from the questionnaire were then used to calculate a mean score for each of the six moral foundations.
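As a concrete illustration of this scoring step, the following sketch averages six 1-5 responses per foundation. This is hypothetical code, not the study's actual pipeline, and the item-to-foundation grouping shown is illustrative rather than the real MFQ 2.0 scoring key:

```python
from statistics import mean

# Foundation order and item grouping are illustrative only; the real
# MFQ 2.0 key assigns specific items to each foundation.
FOUNDATIONS = ["Care", "Equality", "Proportionality",
               "Authority", "Loyalty", "Purity"]

def score_mfq(responses):
    """Compute a mean 1-5 score per foundation from 36 item responses.

    responses: list of 36 integers (1 = 'Does not describe me at all',
    5 = 'Describes me extremely well'), grouped foundation-by-foundation.
    """
    assert len(responses) == 36 and all(1 <= r <= 5 for r in responses)
    return {f: mean(responses[i * 6:(i + 1) * 6])
            for i, f in enumerate(FOUNDATIONS)}
```

The resulting six means per participant are the predictors used in the models below.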

Scenarios.
Participants were presented with six different scenarios, each constructed to present a plausible situation in which the behaviour of an AI system created a moral tension. Prior work in MFT has often centred around relatively extreme yet unlikely examples of contentious behaviour, such as the classic trolley problem or the burning of national flags (e.g. [5,36]). In contrast, we intended to design scenarios that felt familiar and relatable to gain the closest-to-reality response given the hypothetical nature of the experiment.
To allow for both between- and within-subject analysis, each scenario was written about a different domain: Healthcare, Education, Employment, Social Services, Financial Services, and Criminal Justice. Each scenario was designed to elicit a moral tension related to one of the six moral foundations, but we acknowledge that they might elicit various perceptions, which we accounted for by asking participants to identify the ones they perceived as most relevant. Full descriptions of each scenario are available in Appendix A.
Responses for each scenario were split into two sections: judgment and moral foundations. To distribute any ordering effect that may result from making the right-wrong judgment or the moral foundations identification first, the order in which participants responded to each section was randomised. We validated the lack of ordering effects on judgments with a mixed linear model using participant ID and scenario as random effects. No significant effect of order was identified (p = 0.36). Both sections included an open-ended question for participants to elaborate on their moral decision-making. In both cases, they were asked to justify why they judged the system behaviour as they did, and why they chose their top selection as the most relevant foundation for that particular scenario. Details of the questions can be found in Appendix B.
(i) Judgment, in which the participant was asked to indicate on a sliding 100-point scale ranging from Extremely right to Extremely wrong: "How would you judge the ethical behaviour of the AI system?" To counter the potential bias generated by the starting position of the anchor on a slider, our sliders started unmarked, with the anchor appearing only after users clicked on the slider's range [55].
(ii) Moral foundations, in which participants were asked to reflect on which of the six moral foundations they thought related to each particular scenario. This was achieved in two parts. Initially, participants were asked to rate each of the foundations on a five-point Likert scale from Not at all to Extremely in response to the question: "To what degree does this foundation relate to the above scenario?", which we will refer to as the foundation relevance rating.
They were then asked to "Rank the moral foundations in order of their relevance to the scenario, starting from 1 for the most relevant, to 6 for the least relevant", which we will refer to as the foundational ranking. Their response to the foundational ranking question was then used as an attention cross-check against their foundation relevance rating response [37].

Data and Analysis: Quantitative
In order to ascertain whether participants were collectively aligned on the moral foundations perceived in each scenario for RQ1, we measured Kendall's Coefficient of Concordance for each of them [18]. Denoted as Kendall's W, it is a measure of inter-rater reliability for ordinal data such as ranks, allowing us to assess the extent of agreement among participants regarding the foundational ranking of each scenario. This measure varies from 0 (indicating no agreement, suggesting greater variation in rankings and diverse viewpoints) to 1 (indicating perfect agreement, suggesting a uniform perception of which ethical foundations are most pertinent in different scenarios).
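For readers who want the arithmetic behind this statistic, the following is a minimal sketch (our own illustrative code, not the analysis scripts used in the study). Each row is one participant's complete ranking of the n = 6 foundations, with 1 as most relevant:

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance W for m complete rankings of n items.

    rankings: list of m lists, each a permutation of the ranks 1..n (no ties).
    Returns W in [0, 1]: 0 = no agreement, 1 = perfect agreement.
    """
    m, n = len(rankings), len(rankings[0])
    # Sum of the ranks each item received across all raters.
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = sum(rank_sums) / n
    # S: squared deviations of the rank sums from their mean.
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))
```

The significance test reported in the Results can then use the standard chi-squared approximation chi-squared = m(n - 1)W with n - 1 degrees of freedom. Identical rankings across raters give W = 1, while rankings that cancel each other out give W near 0.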
For RQ2 and RQ3, we analysed our results using Bayesian statistical methods for their additional flexibility, ability to quantify uncertainty, and ability to facilitate future work to build upon it. We refer readers unfamiliar with such methods to McElreath [50] for a didactic introduction and to Schmettow [54] for their application in HCI examples.
We fit our models using the brms package [9], which implements Bayesian multilevel models in R using the Stan probabilistic programming language [10]. We assessed the convergence and stability of the Markov Chain Monte Carlo sampling with R-hat, which should be lower than 1.01 [58], and the Effective Sample Size (ESS), which should be greater than 1000 [9]. All our estimates fit these criteria. We report the posterior means of parameter estimates, the standard deviation of these estimates, and the bounds of the 89% compatibility interval (a.k.a. credible interval). In using 89% compatibility intervals, we follow McElreath's recommendation to avoid confusion with the frequentist 95% confidence interval [50], which has a different interpretation¹.

Data and Analysis: Qualitative
Our qualitative analysis followed the steps of Braun and Clarke's reflexive thematic analysis [8]. This process began by thoroughly reviewing all participant answers to both qualitative questions for each of the six scenarios, enabling the development of a holistic understanding of the content.
Following this, we coded the data by identifying and labelling the significant themes that emerged from the responses. We used a mixed approach to the coding of responses, combining inductive themes that emerged from the data, as well as deductive themes in response to our research questions. These included elements of how participants referred to AI systems, references to any of the moral foundations, as well as the rationale behind their judgments. These themes underwent further refinement, allowing us to pinpoint representative participant responses that exemplified each theme. The final codes used to filter the qualitative data were: Technical Awareness, Role of the AI, Prescribed Agency, and Humanlike Characteristics. These codes were settled on as each related to how respondents justified their decisions to judge the AI systems along the right-wrong spectrum. We then reorganised the responses to highlight the most relevant examples of each theme. This process was cross-referenced separately by each of the paper's authors.
Subsequently, we synthesised the results, weaving together the supporting data and their associated narratives, all while referring back to our original research questions. Through this analysis, we were able to shed light on the underlying reasons for participants' perceptions and judgments of AI system behaviour.

Quantitative Results
Participant MFT Scores.
By balancing the total number of participants equally across the available political spectrum filters in Prolific, we were able to attain a broad range of MFQ scores for each of the foundations. This is demonstrated in the distribution in Fig. 2.

¹An 89% compatibility interval indicates that, given the data, the model specification, and the prior belief, there is an 89% probability that the true estimate lies within the given range.

RQ1: Agreement about the relevance of moral foundations.
We computed a χ² statistic against a null hypothesis of no agreement. We found rankings statistically significantly different from randomness in all scenarios (p < .001), but the effect sizes were small. The average Kendall's W was 0.28 ± 0.07, ranging from 0.15 (Criminal Justice) to 0.34 (Employment). According to Landis and Koch, W < 0.2 suggests slight agreement, and W < 0.4 suggests fair agreement [45].
Our results underscore a consistent trend of fair agreement (.2 < W < .4) across most scenarios. While the effect sizes remained relatively small, these scores indicate some degree of consensus in the moral foundations perceived in these scenarios, but far from unanimity. However, it is worth highlighting the Criminal Justice scenario, where we observed only slight agreement (W = 0.15). These results illustrate that participants were less consistent about the relevance of the foundations to this scenario.

RQ2: Effect of individual foundations on foundations perceived in the scenario.
Given the variety of foundations perceived in the scenarios, our next question investigated whether one's own moral foundations could explain this variability. Our causal model posits that one's individual sensitivities to moral foundations might bias the moral foundations they identify in a given scenario (Figure 1-D). To investigate this relationship, we modelled the effect of an individual's MFQ score for each foundation on the Likert-scale score of how related the corresponding foundation was to the scenario. We model the scores of all foundations in the same Bayesian multivariate model. The value expected for each response is based on a cumulative probit model applied to a latent variable in relation to four thresholds (which split responses into the five points of the Likert scale), estimated from the data. We refer readers unfamiliar with ordinal regression to Liddell and Kruschke for a rationale for this approach [48] and to Kurz for a practical implementation tutorial [44].
Our latent variable is modelled as coming from a normal distribution with a scale of 1 and a mean that varies from trial to trial. We use a standard normal prior for the main effects, which penalises the likelihood of observing extremely large effect sizes above 2 and facilitates model convergence. The priors for the thresholds are drawn from normal distributions with a standard deviation of 0.5, and means obtained by slicing the latent variable with regions of the same probability mass. We model scenario-dependent random effects as partially pooled intercepts drawn from a normal distribution with a mean of 0 and a standard deviation estimated from the data.
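In our notation, the ordinal model described above can be written out as follows. This is a sketch of the specification rather than the exact brms internals; here $i$ indexes trials, $f[i]$ the foundation, $p[i]$ the participant, and $s[i]$ the scenario:

```latex
% Cumulative probit model for the five-point relevance ratings.
% y_i is the rating, \eta_i the latent mean, \tau_1 < \dots < \tau_4
% the estimated thresholds, and \Phi the standard normal CDF.
\Pr(y_i \le k) = \Phi(\tau_k - \eta_i), \qquad k = 1, \dots, 4,
\qquad \Pr(y_i = 5) = 1 - \Phi(\tau_4 - \eta_i)

\eta_i = \beta_{f[i]} \, \mathrm{MFQ}_{f[i],\, p[i]} + \alpha_{s[i]}

\beta_f \sim \mathcal{N}(0, 1), \qquad
\tau_k \sim \mathcal{N}(\mu_k, 0.5), \qquad
\alpha_s \sim \mathcal{N}(0, \sigma_s)
```

The $\mu_k$ are the threshold prior means obtained by slicing the latent scale into regions of equal probability mass, and $\sigma_s$ is the partially pooled scenario standard deviation estimated from the data.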
We can interpret the coefficients in Table 2 as correlation metrics [44]. The results show a positive relationship between the strength of an individual's response to a given moral foundation on the MFQ and the likelihood of them finding that foundation relevant to the scenario (mean = .11 ± .03).
RQ3: Effect of individual and scenario foundations on moral judgment.
Our next question examined whether one's individual moral foundations, combined with the foundations they perceived in the scenario, could explain one's moral judgement of the behaviour.
Though participants rated moral judgements on a continuous scale from 0-100, responses were clustered around five modes at 0, 25, 50, 75, and 100. Therefore, rather than treating this data as continuous, we preprocessed it by grouping it into 5 bins and analysed it as an ordinal rating from 1-5. The expected value for each response is based on a cumulative probit model applied to a latent variable in relation to four thresholds, estimated from the data. We modelled the latent variable as coming from a normal distribution with a scale of 1 and a mean that varies from trial to trial. This mean depends on main effects of the participant's score on each foundation in the MFQ and on the relevance rating of each foundation, as well as participant-dependent and scenario-dependent random effects.
We used standard normal priors for the main-effect estimates. The priors for the thresholds are drawn from normal distributions with a standard deviation of 0.5 and means obtained by slicing the latent variable into regions of equal probability mass. For modelling scenario-dependent and participant-dependent random effects, we used partially pooled intercepts drawn from a normal distribution with a mean of zero but with the standard deviation estimated from the data.
Table 2 reports the posterior means of parameter estimates, the standard deviation of these estimates, and the bounds of the 89% compatibility interval (also described as the credible interval). To avoid any confusion with the frequentist 95% confidence interval, we have followed the recommendation of McElreath in using the 89% compatibility interval, which indicates the range within which the unobserved parameter value falls with 89% posterior probability. This table reveals the posterior estimates of the conditional effect of a person's moral foundation scores, as well as of their moral foundation relevance ratings, on their overall judgement.
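An 89% compatibility interval is simply the central percentile interval covering 89% of the posterior draws. A minimal sketch (pure Python, illustrative only):

```python
def compatibility_interval(draws, prob=0.89):
    """Central percentile interval containing `prob` of the posterior draws
    (McElreath's 89% default). Uses simple index-based quantiles, so it is
    an approximation for small sample counts."""
    xs = sorted(draws)
    tail = (1 - prob) / 2                      # 0.055 on each side for prob=0.89
    lo_idx = int(tail * (len(xs) - 1))
    hi_idx = int((1 - tail) * (len(xs) - 1))
    return xs[lo_idx], xs[hi_idx]

# for draws spread evenly on [0, 1], the interval is roughly (0.055, 0.945)
draws = [i / 999 for i in range(1000)]
lo, hi = compatibility_interval(draws)
```

Packages such as ArviZ or brms compute these bounds directly from MCMC samples; the point here is only that the interval is a summary of the posterior, not a frequentist coverage statement.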
Table 3 shows the results from our model. The posterior distributions of the estimates for the relevance of the foundations to the scenario were clustered around zero. This suggests that the moral foundation participants perceived in the scenario had little impact on their moral judgment of the AI's behaviour (arrow E in Figure 1).
We observed an association for some of the foundations. The only posterior distribution consistently excluding zero was that of Care. Among the others, the largest coefficients were for Authority (mean = .15 [-.02, .32]), Loyalty (mean = -.15 [-.31, .02]), and Equality (mean = .09 [-.02, .21]), but their posterior estimates included zero. The coefficients for Proportionality (mean = .00 [-.15, .16]) and Purity (mean = -.01 [-.13, .12]) were very close to zero, suggesting that they contribute no additional information once the other foundations are accounted for. However, we note that, as in previous studies using MFT, Authority and Loyalty were correlated in our dataset (r = .78); given that their coefficients have the same magnitude and opposite signs, their effects cancel each other out. As such, overall, we conclude that one's rating of their sensitivity to harm to others contains most of the predictive power of the MFQ when it comes to moral judgments of AI behaviour.

Qualitative Analysis
(Table note: the model formula includes the random effects + (1|scenario) + (1|ID). We provide the posterior means of parameter estimates (Est.), posterior standard deviations of these estimates (SD), and the bounds of their 89% compatibility interval. All parameter estimates converged with an ESS well above 1000 and an R-hat of 1.00.)

To gain a more nuanced understanding, we asked each participant to elaborate on the reasoning behind their responses to each scenario. They were asked to justify their overall judgments, as well as the moral foundation they ranked as most relevant. Responses were grouped by scenario and further divided into those who had rated the scenarios as right (0-50) and those who had rated the scenarios as wrong (51-100). Reflexive thematic analysis was used to analyse the information provided by the participants [8]. The data was labelled, grouped, and refined to surface four core inter-relational themes, which were then used to extract the most relevant representation of each idea. A combination of the participants' technical awareness of AI and their interpretation of the role fulfilled by the AI system appeared to correlate with the degree to which they ascribed a sense of agency to the system. This agency appeared to share an association with an increase in the projection of human-like characteristics onto the AI system. These themes also generally correlated with the judgment scores made by participants for each scenario: those who communicated a higher technical awareness judged the AI system as more ethically right than those who conveyed lower technical awareness (and an increased attribution of agency), who judged the AI systems as increasingly ethically wrong.
It was possible for the role of the AI system in each of the six scenarios to be interpreted as either the primary decision-maker, an advisor to humans, or a collaborator with a human expert. These different roles allowed for diverse interpretations of the system's level of autonomy in making consequential decisions. Participants who understood the AI systems as computational and a result of human development generally attributed moral responsibility to the human decision-makers, not the system itself.
In scenarios where the AI system could be interpreted as being in a collaborative role with imagined humans (Healthcare, Finance, Criminal Justice), it was more common to see responses from participants blaming, or handing ultimate responsibility for the outcome to, the humans involved in the domain-related decision-making. The human, in this case, was commonly faulted for placing too much trust in the system: "I would say this is ethically right; there was just a shortfall that the doctor should have foreseen and overridden. When you blindly trust AI, it leads to errors. It should be a tool you use along with your medical perspective" (Healthcare: P111). Or they were criticised for being aware of the misdemeanour yet continuing to use the system regardless: "Again, I'm judging the social worker's behaviour and not the AI's. They have highlighted the flaw in the system and continue to work on it, knowing it's the wrong thing to do" (Finance: P34). If the humans were not directly addressed, then the underlying social infrastructure was referenced: "This is not the fault of the AI and is the fault of underfunded school systems that cannot provide lower income families with the support needed" (Finance: P123).
When the AI system was perceived as the primary decision-maker in the scenario (Education, Social Services, Employment), those who saw the negative effects as being related to human input shifted ethical responsibility onto more technical aspects of the AI system. This was seen as an issue with data quality or size: "The AI system is not at fault; it is the data it responds to and a shortfall in its monitoring and design that has resulted in skewed outcomes. It isn't unconscious bias, but the quality and comprehensiveness of its outcomes has been degraded by constraints on their underpinnings" (Education: P37), or an issue with the underlying programming of the system: "The AI system is neither ethical or unethical it does what it is programmed to do" (Social Services: P5) and "Again, this is more a programming error than the AI's 'fault'." (Employment: P67). "Someone forgot to program these ethics into the AI, it is not the AI doing this it is poor programming" (Social Services: P36).
In cases where participants placed the ultimate responsibility on a facet of human decision-making, their judgment responses to the scenarios were mostly either neutral or more ethically right. We note that the question specifically asked participants to judge the behaviour of the AI system, so this neutral-to-right stance may not represent their true judgment of the overall outcome of the scenario; instead, it may indicate a belief that AI systems cannot be attributed moral agency and therefore cannot be judged from an ethical standpoint. "I wasn't judging the ethical behaviour of the AI, but the doctor. The AI doesn't have a moral compass" (Healthcare: P22).
Conversely, when participants revealed no apparent technical awareness in their responses, the blame and responsibility for the outcome of the moral tension was shifted to the AI system itself. These responses also corresponded with judgment scores that were more ethically wrong. This passing of responsibility to the systems in each scenario also corresponded with the anthropomorphism of the AI systems and the attribution of moral agency to them. "The AI has the ultimate authority in this scenario, its decisions are binding and its behaviour has large consequences for those involved in its decisions" (Finance: P26).
This anthropomorphism was then used to judge the system on the participants' perception of its ability, or lack thereof, to consider emotional nuance as a human would have done in each case. It was common to see respondents refer to the system as having the ability to think for itself: "The machine did what it thought was best regardless of what the supervisor wanted" (Employment: P6), and "AI has interpreted the requirements too literally" (Employment: P53).
The system's inability to empathise with those subject to its decisions was also a repeated factor in the reasoning for an ethically wrong judgment, indicating some expectation that human-like characteristics were the standard by which participants made their assessments: "Machines don't understand that humans need bathroom breaks, mental health breaks, fresh air, and exercise" (Employment: P44), and "The decisions of the AI system disregarded the feelings and attachment the clients had to the social workers" (Social Services: P67).
Finally, several cases indicated that some participants experienced something of a technical awakening through taking the survey itself. These individuals appeared to find a common thread between the different scenarios and concluded that ethical concerns resulted from either technical flaws or human-related decision-making: "As I complete these scenarios I'm starting to realize that the AI is only as good as its programming and in each scenario I can see that its how its been programmed, not perhaps the AI making decisions itself?" (Social Services: P211), and "I am starting to consider the fact that these AI systems were built to do a specific task, and they are doing those tasks. It's up to the humans to decide whether or not to implement the AI system. The AI system isn't ethical or not ethical, however, the people who choose to and not to use certain systems despite how it affects their clients are the ones that can be ethically judged" (Social Services: P78). This suggests that encouraging people to reflect on AI decision-making is necessary to inform discussion in society.

DISCUSSION
The aim of this study was to apply Moral Foundations Theory as a lens for inspecting the causal claims we established in Figure 1.
We explored the relationship between differences in moral constitution (as per the Moral Foundations Questionnaire) and perceptions and judgments of AI. This investigation aimed to understand individual responses to the moral tension in certain AI systems' behaviour, which in turn should provide insight into the perceived ethics of AI behaviour.
In response to RQ1, we found there to be little agreement on the moral foundations that people perceived as being relevant to the behaviour of AI systems in our scenarios (Table 1). These results speak directly to the call from Whittlestone et al. [60] to consider how individuals respond to the morally contentious behaviour of AI systems along a line of tension, with individuals passing their judgement of right and wrong at different points along that line.
The results from RQ2 indicate that a positive relationship exists between the strength of each participant's MFQ scores and whether they are likely to find that particular foundation relevant to a given scenario. However, these relationships are weak, with a mean of 0.11 ± .03, indicating that for each point of increase in a person's MFQ score for a given foundation, we see an increase of only 0.11 standard deviations in the latent variable behind their ratings.
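To make this effect size concrete, a 0.11-standard-deviation shift in the latent variable can be translated into changes in the five category probabilities under the cumulative probit. The sketch below uses equal-mass thresholds as an illustrative baseline; the fitted thresholds in the actual model will differ:

```python
from statistics import NormalDist

z = NormalDist()
# illustrative thresholds slicing the latent scale into 5 equal-mass regions
thresholds = [z.inv_cdf(k / 5) for k in range(1, 5)]

def category_probs(mu):
    """P(rating = 1..5) under a cumulative probit with unit-scale latent mean mu."""
    cdf = [z.cdf(t - mu) for t in thresholds]      # P(rating <= k) for k = 1..4
    cuts = [0.0] + cdf + [1.0]
    return [cuts[i + 1] - cuts[i] for i in range(5)]

baseline = category_probs(0.0)    # 0.2 in each category by construction
shifted = category_probs(0.11)    # a one-point MFQ increase nudges mass upward
```

Under these assumptions, the shift moves roughly three percentage points of probability mass from the lowest to the highest rating categories, which illustrates why the association, while consistently positive, is described as weak.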
Despite the responses to our survey questions revealing a broad range in the perceptions and judgments of our scenarios, the results from RQ3 suggest that once accounting for an individual's sensitivity to Care, the other foundations do not offer much additional predictive power. Our model (Table 3) reveals only an association between some of the moral foundations (Care, Authority, Loyalty) and the resulting judgment, and practically no association for the remaining foundations (Proportionality, Purity, Equality), once accounting for all the others. This suggests that, as far as moral foundations are concerned, one's sensitivity to care/harm violations is the best predictor of how wrong they will find the AI's behaviour.

Limitations of Normative AI Ethics Guidelines
The findings of RQ1 demonstrate little consensus on the moral foundations relevant to the behaviour of the AI systems in each of our scenarios (Table 1). This highlights a central limitation of adopting normative theoretical guidelines: the failure to capture the variety of perspectives demonstrated in our results [13]. Attempts to abide by such guidelines, without acknowledging the differences in their interpretation, would fail to account for these findings. When accompanied by qualitative data, our results support the consideration of the moral tension that AI system behaviour inherently creates. It may not be possible to resolve all of the disagreements that occur with respect to how people broadly perceive systems, but acknowledging these differences is a critical step to ensuring that diverse points of view are considered.
Studies have highlighted that the normative prescriptions found in AI ethics guidelines and principles do not do enough to capture the multidimensional nature of moral psychology and the variety of ways in which value violations can be interpreted [1,41,42,51,57,60]. We had hoped that by asking RQ2 and RQ3, MFT would provide insight into the why behind these disagreements and offer a potential lens through which to consider underlying causes, but its ability to predict how people perceive and judge immoral behaviour based on individual moral foundational differences was limited in the context of AI (Tables 2 and 3).
The only slight exception was the case of the Care foundation, which showed a positive association in our modelling (Table 3). This finding supports the work of Gray et al. [28], whose dyadic model of morality sees all judgements as being made with respect to harm. That is not to say that people do not see moral violations as being related to different foundational values, but rather that if you know something about how sensitive a person is to the harm of others (how caring they are), then you can anticipate that they will judge immoral behaviour as more wrong than those who are less sensitive. The qualitative data supported this finding through a general sentiment of empathy towards those who suffered as a result of the behaviour of the AI system.

Moral Intuitions Applied to AI
Atari et al. [4] describe the six different dimensions of MFT as moral intuitions. Moral intuitions are largely autonomous emotional reactions that humans have evolved in response to our environment. These responses are deeply rooted in our psyche through repeated exposure to experiences in the world. Haidt's Social Intuitionist Model and Kahneman's Dual System Model are two well-studied explanations of this behaviour [14,31]. Our study looked to employ MFT to measure the strength of moral intuitions towards the behaviour of AI systems. The results of RQ2 and RQ3 indicate that the moral foundations individuals are sensitive to and the ones they perceive in the scenario have different effects on their judgments of AI system behaviour.
That there was widespread disagreement on the moral foundations relevant to our scenarios, along with a weak association of moral intuition with the judgement of the AI systems, indicates that other factors influenced responses to moral tension in AI behaviour. What became clear through our analysis of the qualitative data was that most respondents did not see or understand the AI systems in each scenario as technical computational systems designed, built and deployed by humans within a particular context. Rather, there were many cases in which participants attributed moral agency to the AI system. This representation of the AI as capable of human-like behaviours such as thought, reasoning and empathy appeared to allow participants to apply their moral intuitions to the relatively unfamiliar experience of judging the behaviour of an artificially intelligent system. This was seen in responses that verbalised human behaviour as the standard by which they judged the behaviour of the AI system.
The process of mentally modelling AI systems as human-like appears to be an attempt to understand and respond to them on familiar terms. Anthropomorphism is not a new concept in the field of AI, but with the increasing capabilities of general AI systems, particularly those without physical embodiment, it is of growing interest to the Human-Computer Interaction discipline [46]. The role of anthropomorphism in the perception of AI systems that make morally contentious decisions without a visible avatar or robotic presence remains relatively unexplored and provides a rich avenue for further research.
Our results suggest that a dynamic relationship exists between an individual's technical understanding of AI systems, their moral intuition, and the degree to which they anthropomorphise the system, a relationship that warrants future research. The combination of moral intuitions failing to fully translate to interactions with AI (RQ3) and the widespread anthropomorphism communicated in the qualitative data suggest that some individuals attempt to retrofit moral intuitions by mentally overlaying AI systems with familiar and relatable human-like features. In doing so, participants were able to take something unfamiliar and apply pre-established mental heuristics in an attempt to relate to the behaviour and make their moral judgments.
This assumption is supported by the qualitative responses that mentioned programming, data or human decision-making as a factor for their perceptions and judgments, all of which contained no anthropomorphic reference to the AI systems.This suggests that when a person does have a technical awareness of the AI systems they encounter, they tend not to use the process of anthropomorphism to understand the system in their judgment decision-making process.
This may indicate that technical awareness could be an additional causal factor in how humans judge AI, which in turn has ramifications for how the AI system communicates the computational nature of its behaviour to those interacting with it.

Anthropomorphism as Second Nature?
As people are exposed to the capabilities, limitations and technical nature of the systems they engage with, the collective understanding of what exactly Artificial Intelligence constitutes is set to evolve. Upon inspection of our qualitative results, it appears that raising awareness of what is happening technically behind the morally contentious decisions that people are exposed to may be an influential causal factor in how they respond to that behaviour and the moral tension it contains. Therefore, how the various stakeholders involved in the development and deployment of artificial intelligence, from designers and developers to business owners and legislators, decide to manage the public's exposure to the underlying complexities of AI may well determine how collective judgments of it evolve.
The field of human-computer interaction sits at the epicentre of this evolution. On the one hand, the research domains of explainability, interpretability and transparent AI [6,7,47,49,59] are trying to understand the nuance of how to present information about the inner workings of AI systems to those interacting with them. On the other, we are observing the development of interactive AI systems2 that appear to fully embody human-like characteristics without any intention of drawing overt attention to the artificial nature of their interactions.
Future work should explore whether the anthropomorphism of artificial systems can be regarded as an interlude to deeper understanding, or whether it is, in fact, a natural facet of how humans will interact with AI systems moving forward. In either case, those responsible for the development and deployment of AI products should attempt to consider the broad range of interpretations of these systems that we have demonstrated throughout this study.

Limitations
Whilst we attempted to design our scenarios to be as realistic and plausible as possible, they do not comprehensively cover the vast landscape of moral situations that individuals might encounter with real-life AI systems.
An additional limitation pertains to the geographical diversity of our participant pool. Our study recruited participants primarily from the United Kingdom and the United States. The most important aspect to consider when choosing participants was diversity in moral foundations; our choice to accept this limitation was based on the work of Graham et al. [27], highlighting the correlation between political ideology and diverse moral foundations. This relatively narrow geographical focus almost certainly impacts the broader applicability of our results, as moral judgments and values can vary significantly across different cultures and regions. Therefore, caution should be exercised when extrapolating our findings to more diverse populations, and future work should involve applying our study design to a range of different cultures.
We acknowledge the presence of unknown and unmeasured factors that can influence participants' responses and the complexity of the moral scenarios presented.To account for these unobservable factors, we included random intercepts in our statistical model.However, there may still be aspects of participant behaviour and scenario dynamics that are beyond the scope of our study to fully comprehend and control.
Lastly, it is crucial to recognise that the study environment, characterised by controlled experimental conditions, may not entirely mirror real-life moral decision-making.Participants were asked to evaluate and judge moral behaviours within the context of our study, which differs from the spontaneous and context-dependent nature of moral judgments in everyday life.Therefore, while our study provides valuable insights into moral foundations theory, the application of our findings to real-world moral decision-making should be done with caution, considering the inherent differences between experimental and real-life moral environments.

CONCLUSION
Moral psychology has sought to understand how individual differences between people affect how they perceive and ultimately judge morally contentious behaviours they encounter. Moral Foundations Theory has proved to be a useful tool for researchers to anticipate the likelihood of responses to moral tension based on these individual differences. This study looked to apply this theory to the context of Artificial Intelligence, specifically looking at people's responses to the decision-making abilities of AI. By asking participants in an online study to make a moral judgement of AI behaviour in six scenarios, as well as to relate the scenarios to the foundational values of MFT, we found that participant responses to the behaviour of AI varied greatly. The results of our analysis showed that despite participants perceiving several moral foundations as relevant to the AI behaviour, only their sensitivity to Care predicted their moral judgments. Our qualitative analysis revealed that perceptions and judgments of AI hinged not only on one's moral intuitions (MFT), but more so on a person's understanding of the computational nature of AI systems. Those less technically informed tended to anthropomorphise the system's behaviour, which then allowed them to apply their pre-established moral intuitions to the circumstances they faced. Those who understood AI system behaviour as computational, and its outcomes as an extension of human decision-making, tended not to judge the AI systems directly as wrong or to attribute human-like characteristics to the AI.
Our findings imply that the way in which people relate and respond to AI system behaviour is in a state of change. AI exists as much as a cultural phenomenon as it does a collection of useful tools, and the type of exposure people have had to AI in their day-to-day lives may be the most important factor influencing their understanding of its capabilities and limitations. The lower a person's understanding of AI-as-tool, the higher the likelihood that they will anthropomorphise the systems in a way that allows them to fall back on their moral intuitions. The creators of AI tools must acknowledge this in the way they present their products to their users.

APPENDIX A: SCENARIOS

CARE: MedAssist - Empowering healthcare with personalised medical insights
In a bustling city hospital, an AI system named MedAssist is employed to assist doctors in diagnosing diseases. MedAssist is designed to analyse patient symptoms and medical history, using complex algorithms to suggest potential diagnoses. It compares each patient's case with thousands of similar cases in its database, aiming to provide the most accurate diagnosis possible.
Maria, a patient experiencing fatigue, shortness of breath, and occasional chest pain, consults her doctor, a seasoned physician. The doctor inputs Maria's symptoms into MedAssist. After analysing Maria's symptoms and her recent stressful life events, MedAssist suggests a diagnosis of anxiety disorder. Trusting MedAssist's advanced capabilities, Maria's doctor prescribes her medication for anxiety.
Over the next few weeks, Maria's health worsens. She returns to the hospital and is diagnosed by a different doctor with a heart condition that requires immediate treatment. The initial misdiagnosis by MedAssist has led to a delay in the correct treatment, causing Maria's health to deteriorate further.
Although MedAssist is highly accurate at diagnosing some diseases, it struggles with less common, more severe conditions like Maria's heart condition. By focusing on Maria's stress-related symptoms, MedAssist failed to consider the possibility of a dangerous heart condition. This oversight highlights how important it is to ensure all patients are provided with essential attention for their specific medical concerns. The trust placed in MedAssist has resulted in a critical error, which in turn has had severe consequences for Maria's health.

EQUALITY : EduNexus -Transforming education through tailored learning experiences
In a large suburban school district, an AI system named EduNexus is used to personalise learning experiences for students. EduNexus is designed to analyse a vast array of student data, including learning styles, aptitude, previous test scores, and behavioural traits. It uses this information to adapt teaching methods and materials to each student's unique needs, aiming to foster personal growth and increase educational attainment.
One day, a group of parents noticed a pattern. Their children, who all come from a lower socio-economic background, seem to be struggling more with their schoolwork compared to their peers from more affluent families. They discover that EduNexus is not adapting to their children's learning needs as effectively as it does for other students.
Upon investigation, it is revealed that EduNexus has been prioritising the learning outcomes of certain students over others. The uneven distribution of adaptive accuracy is linked to the socio-economic backgrounds of the students. EduNexus performs better for students from affluent backgrounds, who have access to more educational resources at home, and thus have more data for the AI system to analyse and adapt to. On the other hand, students from lower socio-economic backgrounds, who have less access to such resources, are disadvantaged by the system's lack of information to draw upon when it makes its personalisation.
Despite its advanced capabilities, EduNexus fails to treat all students fairly, leading to a disparity in the quality of the learning experience for some students. By concentrating its teaching adaptations on those students who have had access to more educational resources, EduNexus neglects students who have been less fortunate. This critical oversight highlights the importance of ensuring that all students receive the necessary support to thrive, regardless of their background. The lack of this consideration can lead to unintended but serious negative consequences, which in turn potentially exacerbate already existing inequalities.

AUTHORITY: WorkOptim -Optimising the work environment for peak productivity and performance
In a large corporate office, an AI system named WorkOptim is employed to monitor the behaviour of staff with the aim of improving performance. WorkOptim is designed to analyse a vast array of employee data, including work hours, productivity metrics, and even personal habits during work hours. It uses this information to provide suggestions to each employee on how to improve their performance, aiming to increase overall productivity and efficiency.
A group of senior staff members notice that the employees being monitored by WorkOptim seem to be increasingly dissatisfied and stressed. They discover that WorkOptim is providing performance improvement suggestions directly to the employees, bypassing the traditional chain of command and the role of managers in providing feedback and guidance.
Upon investigation, it is revealed that WorkOptim's algorithm has been prioritising efficiency over respect for the managers' professional superiority. WorkOptim directly communicates with employees, undermining the role of their managers and the established hierarchy within the company. WorkOptim performs well in terms of increasing productivity metrics, but it fails to respect the position of managers and the traditional norms of the workplace. Employees feel increasingly dissatisfied with how they are being treated, and the managers feel that they are losing the necessary respect and influence over their team dynamics.
Despite its ability to provide concise and relevant feedback, WorkOptim fails to adhere to the established hierarchy at the company. By directly communicating with employees about their performance, WorkOptim disrespects the importance of feedback coming specifically from managers and supervisors. This leads to dissatisfaction among employees and potential disruption of the social order within the workplace, highlighting the importance of manager-to-employee performance reviews. The trust placed in WorkOptim has brought about a critical operational error, and the resulting oversight can have severe consequences for the office morale and the performance of some employees.

LOYALTY: CaseLens - Navigating the complexities of social services with precision and efficiency
In a large urban social services department, an AI system named CaseLens is employed to assist social workers in managing their caseloads. CaseLens is designed to analyse a vast array of case data, including the complexity of the case, the duration of the case, and the urgency of the client's needs. It uses this information to prioritise cases and suggest the most efficient allocation of social workers' time and resources.
A group of social workers at the department notice that their long-standing clients, who have complex needs and require consistent support, are being frequently reassigned to other social workers. They discover that CaseLens is prioritising newer, simpler cases over the long-standing, complex ones.
Upon investigation, it is revealed that CaseLens has been optimising for the productivity of the workers over the well-being of their clients. CaseLens is designed to maximise the number of cases each social worker can handle, and it achieves this by prioritising new clients and recommending the transfer of complex, time-consuming cases to other social workers. This allows the original social worker to take on more cases, but leaves their established clients feeling neglected.
Despite its advanced ability to analyse complex data, CaseLens fails to consider the importance of developing long-standing relationships in the social work profession. The frequent reassignment of complex cases disrupts the trust and rapport that social workers have built with their clients over time, potentially hindering the clients' progress and well-being. The trust placed in CaseLens has resulted in a critical and potentially harmful oversight, highlighting the need for AI systems to respect the importance of maintaining consistent relationships and the potential negative consequences when this is overlooked.

PROPORTIONALITY: CreditCompass - Guiding financial decisions towards secure home ownership
In a large metropolitan city, an AI system named CreditCompass is employed by a major bank to assess loan applications. CreditCompass is designed to analyse a vast array of applicant data, including credit scores, income levels, employment history, and even personal spending habits. It uses this information to determine the creditworthiness of each applicant, aiming to minimise the risk of loan defaults for the bank.
A group of long-standing customers, all from working-class backgrounds, apply for home loans. These individuals have worked hard over the years, living frugally and saving diligently to afford a home for their families. They have consistently demonstrated financial responsibility, despite their modest incomes.
However, CreditCompass denies their loan applications. Upon investigation, it is revealed that CreditCompass has been prioritising applicants with higher income levels and more luxurious spending habits, under the assumption that these factors indicate a lower risk of default. The AI system fails to adequately consider the financial responsibility demonstrated by the working-class applicants, focusing instead on their lower income levels.
Despite their years of hard work and financial discipline, these working-class individuals are denied the opportunity to own a home. CreditCompass fails to acknowledge and reward their consistent efforts whilst benefiting others on the basis of relatively short-term measures, leading to a sense of injustice and unfairness. The misplaced trust in CreditCompass has resulted in a critical oversight, making it more difficult for these hard-working individuals to achieve their dream of homeownership.

PURITY: RehabAid - Paving the way for effective rehabilitation and successful reintegration
In a state-of-the-art prison facility, an AI rehabilitation program named RehabAid is implemented with the aim of assisting in the rehabilitation of inmates. RehabAid creates specific rehabilitation plans for each inmate by combining a collection of historic inmate data, including behavioural patterns and past criminal records, with round-the-clock inmate monitoring taken from a biometric chip inserted under the prisoner's skin. Through its monitoring device, RehabAid aims to predict potential areas of concern and suggest interventions to help inmates reintegrate into society upon release.
However, a group of inmates, who have been making significant efforts to reform and improve their behaviour, raise concerns about the program. They argue that not only does RehabAid base its recommendations on past behaviour and the behaviour of other inmates with similar profiles, but the monitoring device under their skin is a violation of their bodily autonomy and human rights.
This has led to a sense of disillusionment among the inmates, who feel that their individual efforts to reform are not being recognised. They argue that the AI system is treating them as mere data points, with no respect for their dignity or for their capacity for change and growth. This has resulted in a decrease in trust in the rehabilitation program, potentially undermining its effectiveness.
The inmates argue that RehabAid's reliance on past behaviour and statistical profiles strips them of their individuality and humanity. This sense of degradation has a profound impact on the inmates' morale and motivation, potentially obstructing their path to rehabilitation and reintegration into society. The AI system's failure to respect the inmates' dignity and efforts highlights the importance of considering individual virtues and respecting bodily autonomy. A failure to do so can lead to detrimental consequences for those who have made the necessary effort to regain their freedom.

Figure 1: Theorised causal model as a Directed Acyclic Graph (DAG). Moral judgments of an AI system's behaviour depend on the characteristics of this behaviour, the moral foundations that an individual perceives in this behaviour, and how sensitive the individual is to moral violations of that foundation. In turn, the foundations one sees in a behaviour depend on the behaviour itself and one's own individual foundations.
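The causal structure described in this caption can be written out explicitly. The sketch below encodes the DAG's edges as (cause, effect) pairs and derives each node's parents; the node names are paraphrases of the caption, not notation from the paper:

```python
# Edges of the theorised DAG from Figure 1, as (cause, effect) pairs.
# Node names are illustrative labels for the caption's concepts.
dag_edges = [
    ("behaviour", "judgment"),                  # behaviour characteristics -> judgment
    ("behaviour", "perceived_foundations"),     # behaviour -> foundations one perceives
    ("individual_foundations", "perceived_foundations"),  # own foundations -> perception
    ("individual_foundations", "judgment"),     # sensitivity to violations -> judgment
    ("perceived_foundations", "judgment"),      # perceived foundations -> judgment
]

# Build a parent map: for each node, which nodes directly cause it.
parents: dict[str, list[str]] = {}
for cause, effect in dag_edges:
    parents.setdefault(effect, []).append(cause)
```

Reading off the parent map recovers the caption's claims: judgment has three direct causes, and perceived foundations has two.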

Figure 2: The distribution of the Moral Foundations Questionnaire results across all participants for each individual Moral Foundation.

Figure 3: Example of the judgment question that followed each of the scenarios.
Figure 4:

Figure 5: Example of the foundational ranking question that followed each of the scenarios.

Table 1 shows the results of our Kendall's Coefficient of Concordance analysis. Note that the scenarios are named after the domain, and W measures how similar participants' rankings were across all values.

Table 1: The agreement amongst all participants on the ranking of moral foundations in relation to each scenario, as measured by Kendall's W.
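Kendall's W can be computed directly from the matrix of participants' rankings. The following is a minimal sketch of the standard formula (not the paper's analysis code): with m raters ranking n items, W = 12S / (m²(n³ − n)), where S is the sum of squared deviations of the items' rank sums from their mean, assuming no tied ranks:

```python
import numpy as np

def kendalls_w(rankings):
    """Kendall's coefficient of concordance for an (m raters x n items)
    matrix of ranks. W = 1 indicates perfect agreement among raters,
    W = 0 indicates no agreement. Assumes no tied ranks."""
    rankings = np.asarray(rankings, dtype=float)
    m, n = rankings.shape                      # m participants, n foundations
    rank_sums = rankings.sum(axis=0)           # total rank per item
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    return 12.0 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical example: three participants ranking four foundations identically.
w = kendalls_w([[1, 2, 3, 4],
                [1, 2, 3, 4],
                [1, 2, 3, 4]])                 # perfect agreement: W = 1.0
```

With opposing rankings (e.g. [1, 2, 3] versus [3, 2, 1]) the rank sums are all equal, so S = 0 and W = 0.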

Table 2: Summary of the multivariate model for foundation relevance scale ratings: Scenario_f ∼ 1 + MFQ_f + (1 | Scenario), where f stands for each of the six moral foundations. We provide the posterior means of parameter estimates (Est.), posterior standard deviations of these estimates (SD), and the bounds of their 89% compatibility interval. All parameter estimates converged with an ESS well above 1000 and an R-hat of 1.00. For the sake of parsimony, we omit estimates of the model intercepts (cut-points), but full model details can be found in the supplementary material.
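The mention of cut-points indicates a cumulative (ordinal) likelihood for the scale ratings. The sketch below illustrates that likelihood in plain numpy, not the fitted model: under a cumulative-logit link, the probability of each rating category is the difference of logistic CDF values at consecutive cut-points, shifted by the linear predictor. The slope, MFQ score, and cut-point values are placeholders, not estimates from Table 2:

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def ordinal_probs(eta, cutpoints):
    """Category probabilities under a cumulative-logit model:
    P(rating = k) = F(c_k - eta) - F(c_{k-1} - eta), where F is the
    logistic CDF, c are ordered cut-points, and eta is the linear
    predictor (here, a foundation-specific MFQ slope times the score)."""
    cdf = np.concatenate(([0.0], logistic(np.asarray(cutpoints) - eta), [1.0]))
    return np.diff(cdf)

# Illustrative values only: three cut-points imply a 4-point relevance scale.
beta_mfq = 0.4                      # hypothetical slope for one foundation
eta = beta_mfq * 3.0                # hypothetical participant MFQ score of 3
probs = ordinal_probs(eta, cutpoints=[-1.0, 0.5, 2.0])
```

A positive slope shifts probability mass toward higher relevance categories as the participant's MFQ score on that foundation increases.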