LLMs Struggle to Capture Human Personality Diversity, Study Finds

Aanisha Bhattacharyya, Yaman Kumar Singla, Rajiv Ratn Shah, Changyou Chen, Jitendra Ajmera · June 18, 2026

Summary

A study reveals that large language models exhibit "persona manifold collapse," where increasing detail in persona descriptions paradoxically reduces the diversity and fidelity of simulated human behavior. Simple age-gender personas often outperform complex profiles in prediction accuracy, highlighting limitations in how LLMs represent and differentiate human personalities.

New research indicates a significant limitation in how large language models (LLMs) simulate human personalities, a phenomenon termed "persona manifold collapse." This occurs when attempts to provide more detailed or expressive persona specifications to an LLM lead to a reduction in the model's ability to represent diverse behaviors and latent personality traits. Instead of improving fidelity, richer descriptions can cause the model's internal representations to converge, diminishing the distinctiveness between different simulated personas.

The study systematically evaluated this effect across various LLM architectures and scales, finding that increased persona complexity consistently reduced the separation between personas in the model's latent space. This also translated to weaker differentiation in downstream simulation tasks, meaning the models struggled to accurately reproduce human subgroup disagreements. Surprisingly, simple personas defined by age and gender often yielded more accurate predictions in simulation tasks compared to highly detailed "Ideal Customer Profiles."

The findings suggest that simply adding more descriptive detail to personas does not guarantee improved simulation fidelity and can even degrade performance. The research emphasizes the need for a more nuanced, representation-aware approach to persona construction, rather than just increasing expressivity, to overcome these inherent limitations in LLM-based human simulation.
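One way to make "separation between personas in the model's latent space" concrete is to average the pairwise cosine distance between per-persona vectors. The sketch below is only an illustration under stated assumptions: this summary does not say which separation metric the study used, the function names are hypothetical, and the vectors are made-up toy values rather than real model representations.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_separation(embeddings):
    """Average cosine distance over all unordered pairs of persona vectors.

    Assumed metric for illustration; needs at least two vectors.
    """
    pairs = list(combinations(embeddings, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Toy vectors, invented for this example (one vector per persona).
# In practice these would come from whatever representation you extract
# from your model for each persona prompt.
simple_personas = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
detailed_personas = [[1.0, 0.1], [1.0, 0.0], [1.0, -0.1]]

simple_sep = mean_pairwise_separation(simple_personas)
detailed_sep = mean_pairwise_separation(detailed_personas)

# A lower value means the persona vectors sit closer together,
# which is the pattern the article describes as collapse.
print(f"simple: {simple_sep:.3f}  detailed: {detailed_sep:.3f}")
```

Tracking a number like this while varying only the amount of persona detail gives a rough signal of whether added detail is pulling representations together in your own setup.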

Why it matters

Professionals relying on LLMs for market research, customer simulation, or social science modeling need to be aware of the "persona manifold collapse" to avoid inaccurate or biased results and design more effective persona-driven applications.

How to implement this in your domain

  1. Re-evaluate current persona prompting strategies for LLM-based simulations, focusing on simplicity over excessive detail.
  2. Test the fidelity of your LLM personas by comparing simulated outcomes against real-world data or human responses.
  3. Prioritize "alignment bridges" (attribute combinations that maintain behavioral stability) when constructing personas.
  4. Investigate the impact of persona complexity on representational diversity within your LLM applications.
  5. Consider using simpler demographic personas (e.g., age-gender) as a baseline for simulation accuracy.
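Steps 2 and 5 above can be sketched as a small fidelity check: compare the answer distribution from each persona style against human responses to the same question, and treat the age-gender persona as the baseline. This is a minimal sketch, not the study's evaluation protocol. Total variation distance is an assumed choice of metric, the helper names are hypothetical, and every response list is invented for illustration.

```python
from collections import Counter


def answer_distribution(answers, options):
    """Share of responses per answer option (options with no votes get 0.0)."""
    counts = Counter(answers)
    total = len(answers)
    return {opt: counts[opt] / total for opt in options}


def total_variation(p, q):
    """Total variation distance between two distributions over the same options."""
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)


options = ["yes", "no"]

# Invented data: human answers for one subgroup, plus simulated answers
# produced with a simple age-gender persona and with a detailed profile.
human = ["yes"] * 6 + ["no"] * 4
simulated_simple = ["yes"] * 5 + ["no"] * 5
simulated_detailed = ["yes"] * 9 + ["no"] * 1

human_dist = answer_distribution(human, options)
gap_simple = total_variation(human_dist, answer_distribution(simulated_simple, options))
gap_detailed = total_variation(human_dist, answer_distribution(simulated_detailed, options))

# Smaller gap = simulated answers are closer to the human distribution.
print(f"age-gender baseline gap: {gap_simple:.2f}")
print(f"detailed profile gap:    {gap_detailed:.2f}")
```

Running the same comparison per subgroup, rather than on pooled answers, is what shows whether a persona style preserves disagreements between groups or flattens them.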

Who benefits

Marketing, Market Research, Social Science, AI/ML Development, Customer Service

Key takeaways

  • LLMs can suffer from "persona manifold collapse," reducing behavioral diversity with complex personas.
  • Simpler personas, like age-gender, may outperform highly detailed profiles in simulation accuracy.
  • Increasing persona expressivity does not guarantee improved simulation fidelity.
  • A representation-aware approach to persona construction is crucial for accurate LLM simulations.

Original post by Aanisha Bhattacharyya, Yaman Kumar Singla, Rajiv Ratn Shah, Changyou Chen, Jitendra Ajmera

"arXiv:2606.18263v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used to simulate human populations via persona prompting, often under the assumptions that richer persona descriptions improve behavioral fidelity, similarly sized attribute combinatio…"
