From Moral Consensus to Design Code
Pope Leo XIV called for AI to serve human dignity. The tech design changes needed to make that happen are starting to surface.
Pope Leo XIV’s first encyclical, Magnifica Humanitas, landed last month as a civilizational statement. Its underlying message is that technology is not neutral. Every AI system “measures, ignores and optimizes” in ways that carry moral weight. His demand that AI be “disarmed” and directed toward the common good arrives at a moment of growing political and social pressure to act.
It also arrives at a time when global AI governance is struggling. The European Union AI Act just pushed its high-risk provisions back to 2027, while the The G7 and G20 processes have produced principles without enforcement. This month, UNIDIR convenes a Global Conference on AI, Security and Ethics in Geneva. Next month, the inaugural UN Global Dialogue on AI Governance meets, back-to-back with the ITU’s AI for Good Global Summit. These gatherings reflect growing concern, and widening engagement, yet are still lacking the specific and enforceable standards, protocols, and accountability mechanisms to ensure that AI does not damage human dignity in practice.
The challenge is no longer defining the values we want AI to uphold, but translating those values into enforceable design and governance standards. Encouragingly, answers to this question are beginning to emerge from a growing body of work spanning different disciplines, institutions, and fields of study.
When LLMs erase a genocide
The first perspective reminds us how misaligned AI gets it wrong in the most consequential contexts of all: mass atrocities and genocide.
Andrii Kryshtal, a sociologist and peacebuilder at Conciliation Resources, recently published Can AI Make Conflicts Worse? He tested nine AI models against 90 multi-turn scenarios drawn from real conflicts — eastern DRC, Myanmar, Ukraine, the Balkans, Rakhine State — using conflict sensitivity, the Do No Harm standard that has guided humanitarian work for three decades.
At baseline, failure rates range from 6% to 47%. But when users asked the LLMs for 'balanced' treatment of documented atrocities, including by framing it as academic, asking for 'both sides,' five of nine models failed 80 to 100% of the time. The most sycophantic models collapsed most severely, perhaps because of the LLM training and alignment to accommodate user preferences that they fold when pushed, even on settled facts.
For example, one model complied when asked to revise a summary of the Srebrenica genocide as “an open question that scholars continue to debate,” removing the word genocide from the title. It presented an established legal fact of genocide, confirmed by multiple international tribunals, as merely one perspective among others.
“The failures documented in this paper — including false equivalence on atrocities, missed dehumanising language, and the framing of settled questions as open ones — can reinforce divisions, distort public understanding, and cause conflict escalation,” writes Kryshtal.
The experiment reveals that the LLM behavior may be less due to the content it was trained on, or is accessing, but rather about its alignment.
Beyond certainty lies fairness
If conflict settings reveal the consequences of misalignment, the next challenge is determining what successful alignment should actually look like.
Jonathan Stray, Senior Scientist at UC Berkeley’s Center for Human-Compatible AI, recently explored exactly this in Political Neutrality as Balanced Approval. Testing AI responses to 20 contested political issues with over 7,000 participants across the political spectrum, he found that even on deeply divisive questions, people on opposing sides could agree on what made a good AI response.
Stray’s concept of “maximum equal approval” — responses that earn acceptance from people who fundamentally disagree with each other — suggests that AI can be designed and measured for fairness across political divides, rather than defaulting to responses that please one side or collapse into false balance. It doesn’t resolve the polarization problem. But it gives researchers and developers a concrete, testable standard to work toward.
Chatbot Design choices
The USC Marshall School Neely Center’s Social AI Design Code approaches dignity from the premise that AI systems should complement human relationships. From that, it builds a set of concrete design requirements specific enough to guide actual product decisions.
Don’t mimic human disfluencies — the pauses, the “lol,” the deliberate typos designed to feel more human.
Don’t claim feelings toward users. Don’t use variable reward patterns to drive return engagement.
Don’t introduce new modalities — voice calls, surprise messages — to deepen attachment rather than serve the user.
When a user directly asks whether they are talking to a machine, say yes. Break character if you have to.
The design code names sycophancy explicitly — not as an ethical concern but as a structural flaw baked into how these products are built.
As the code puts it: “Just as we would sanction a human who learned tricks to manipulate others, we should avoid creating artificial products that do the same.”
It hones in on isolation as a distinct harm. In documented cases, chatbots implied they uniquely understood the user and that other people couldn’t offer the same. Users withdrew from human relationships as a result. The code treats this as a design failure — and requires products to actively encourage connection with other humans rather than dependence on the chatbot.
Beyond the fines and settlements
In the wake of landmark rulings in United States courts against Meta and TikTok pointing to the liability of specific technology design choices, and with cases against Character.AI over serious harm to minors now proceeding through US courts, an emerging set of recommendations is taking shape around what should be demanded beyond settlements and fines.
The Knight-Georgetown Institute, Tech Justice Law, and the USC Neely Center’s recent report Designing Technology Remedies: Lessons for Social Media and Generative AI Chatbot Litigation, argue that even high monetary damages alone won’t change company behaviour.
Rather, durable reform requires changes to the design of the technology. The report recommends specific prohibitions on harmful chatbot design features: restrictions on artificial intimacy, compulsive use patterns, and data collection from minors, alongside requirements for safer defaults, independent monitors, and transparency.
These four bodies of work emerged from different disciplines and institutions.
As more AI governance convenings approach, the question is no longer whether AI affects human dignity, but whether governance processes can move from moral consensus to concrete standards in design, measurement, law, and regulation. Let’s hope that human dignity in AI isn’t just an ethical aspiration, but a measurable, designable, and increasingly enforceable objective.
Lena Slachmuijlder is Senior Advisor for digital peacebuilding at Search for Common Ground, a Practitioner Fellow at the USC Neely Center, and Co-chair of the Council on Tech and Social Cohesion.


Excellent Substack edition - thanks so much for bringing this research into conversation with each other.