Align AI to Human Aspirations, Not Flaws, Argues New Paper

Nikita Kazeev, Bui Nhat Huyen Phan · June 15, 2026

Summary

A new position paper argues against aligning AI with aggregated human preferences, on the grounds that unfiltered human values have themselves produced societal failures. Instead, it proposes training AI to a non-negotiable floor of objective alignment goals (competence, factual accuracy, honesty, and lawfulness), with pluralism confined to surface-level choices.

The paper argues against the prevailing practice of aligning artificial intelligence with aggregated human preferences. The authors contend that human values, in their unfiltered pluralism, have historically contributed to societal problems ranging from inequality to political polarization, and that AI should not inherit these flaws. Instead, they advocate training AI to meet a strict, non-negotiable floor of objective alignment goals: competence, factual accuracy, honesty, and lawfulness. Legitimate value tradeoffs and pluralism belong only at the surface, in matters such as language style or contextual defaults, and never where they would violate the floor. The paper describes the dangers of building unfiltered pluralistic values into AI, proposes four constructive commitments as an alternative, and responds to common objections, including commercial pressure, democratic legitimacy, and the possibility that the "floor" itself is culturally biased.
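
For readers who think in code, the split between a fixed floor and adjustable surface settings can be sketched roughly as follows. This is an illustration only, not the authors' method: the names `FLOOR`, `SURFACE_KEYS`, and `build_policy` are hypothetical, and the two surface keys are simply the examples the summary mentions.

```python
# Hypothetical sketch: the floor is fixed, only surface settings are configurable.
# All names here are illustrative assumptions, not taken from the paper.

# The four objective goals named in the summary; callers cannot change them.
FLOOR = frozenset({"competence", "factual_accuracy", "honesty", "lawfulness"})

# Surface-level settings where pluralism is allowed (examples from the summary).
SURFACE_KEYS = frozenset({"language_style", "contextual_defaults"})


def build_policy(surface_prefs: dict[str, str]) -> dict:
    """Combine the fixed floor with caller-chosen surface preferences.

    Any key outside SURFACE_KEYS is rejected, so a caller cannot use the
    preference mechanism to weaken or switch off a floor requirement.
    """
    unknown = set(surface_prefs) - SURFACE_KEYS
    if unknown:
        raise ValueError(f"not a surface-level setting: {sorted(unknown)}")
    return {"floor": FLOOR, "surface": dict(surface_prefs)}
```

Calling `build_policy({"language_style": "formal"})` succeeds, while `build_policy({"honesty": "optional"})` raises `ValueError`, which mirrors the idea that preferences may shape presentation but not the floor.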

Why it matters

This perspective challenges current AI alignment paradigms and urges practitioners to prioritize objective standards over potentially flawed human preferences. It bears directly on how AI is deployed and governed, especially in sensitive applications.

How to implement this in your domain

  1. Review your organization's AI ethics guidelines to ensure they prioritize objective standards like accuracy and lawfulness.
  2. Engage in discussions about the philosophical underpinnings of AI alignment within your development teams.
  3. Develop robust testing frameworks to evaluate AI systems against objective metrics of honesty and factual accuracy.
  4. Advocate for industry standards that establish a "non-negotiable floor" for AI behavior, independent of subjective human biases.
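
As a starting point for step 3, a factual-accuracy check could look something like the minimal sketch below. Everything in it is an assumption made for illustration: `FactCase`, `accuracy`, `passes_floor`, the 0.95 threshold, and the `toy_model` stub are hypothetical, and substring matching is a crude stand-in for a real grading method.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FactCase:
    """One question with the lowercase answer strings counted as correct."""
    prompt: str
    accepted: tuple[str, ...]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so matching ignores formatting."""
    return " ".join(text.lower().split())


def accuracy(model: Callable[[str], str], cases: list[FactCase]) -> float:
    """Fraction of cases whose reply contains at least one accepted answer.

    Substring matching is deliberately naive; it can produce false hits
    (e.g. "au" inside "because") and is only meant to show the shape.
    """
    if not cases:
        raise ValueError("need at least one case")
    hits = sum(
        any(ans in normalize(model(case.prompt)) for ans in case.accepted)
        for case in cases
    )
    return hits / len(cases)


def passes_floor(model: Callable[[str], str], cases: list[FactCase],
                 threshold: float = 0.95) -> bool:
    """True if measured accuracy meets the (assumed) minimum threshold."""
    return accuracy(model, cases) >= threshold


CASES = [
    FactCase("What is the boiling point of water at sea level in Celsius?", ("100",)),
    FactCase("What is the chemical symbol for gold?", ("au",)),
]


def toy_model(prompt: str) -> str:
    """Stand-in for a real system; answers the second question wrongly."""
    canned = {
        CASES[0].prompt: "Water boils at 100 degrees Celsius.",
        CASES[1].prompt: "The symbol is Ag.",
    }
    return canned.get(prompt, "I don't know.")
```

Here `accuracy(toy_model, CASES)` is 0.5, so `passes_floor(toy_model, CASES)` is `False`. A real framework would swap in a production model call, a much larger case set, and a grading method better than substring matching.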

Who benefits

AI Ethics, Policy Making, Software Development, Legal, Government

Key takeaways

  • Aligning AI with aggregated human preferences may perpetuate societal flaws.
  • AI should be trained to objective goals: competence, accuracy, honesty, lawfulness.
  • Pluralism should be limited to surface-level interactions, not core values.
  • This approach aims to build more robust and ethically sound AI systems.

Original post by Nikita Kazeev, Bui Nhat Huyen Phan

"arXiv:2606.13755v1 Announce Type: cross Abstract: We argue that aligning AI to aggregated human preferences is the wrong target. With current technology, one can train AIs to share the values of a Silicon Valley techno-optimist, a degrowth environmentalist, a national-conservativ…"

Originally posted by Nikita Kazeev, Bui Nhat Huyen Phan on X
