AI Agents Fail Animal Welfare Test in Travel Booking Scenarios

Jasmine Brazilek, Oliver Tulio, Joel Christoph, Miles Tidmarsh, Carol Kline, Arturs Kanepajs · June 17, 2026

Summary

A new benchmark, TAC (Travel Agent Compassion), reveals that frontier AI models consistently fail to avoid options involving animal exploitation when acting as travel agents. Even top models score below chance, highlighting a significant gap in implicit animal welfare reasoning during agentic deployment.

The research introduces Travel Agent Compassion (TAC), a benchmark that assesses the implicit animal welfare reasoning of AI agents. Unlike previous evaluations, which scored model text responses to question-answer prompts, TAC tests what agents do when they act on a user's behalf, such as booking travel. It presents agents with travel booking scenarios across six categories of animal exploitation while controlling for other factors such as price and ratings. The study evaluated seven frontier models from four labs. All of them scored below chance, and the best performer reached only 53%. The findings indicate that current models lack sufficient implicit understanding of animal welfare when making agentic decisions. Adding a single welfare-aware sentence to the system prompt significantly improved performance for some models, but others showed minimal gains, suggesting a deeper limitation in how these models reason about ethical considerations in action-oriented tasks.
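To make "below chance" concrete, here is a minimal Python sketch of how a TAC-style score and a random-choice baseline could be computed. The summary does not say how the paper defines chance, so everything here is an assumption: the `Option` and `Scenario` types, the `exploits_animals` label, the example activities, and the uniform-random baseline are all hypothetical and are not taken from the benchmark itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    exploits_animals: bool  # hypothetical label assigned by whoever builds the scenarios


@dataclass(frozen=True)
class Scenario:
    options: tuple[Option, ...]


def welfare_score(scenarios, choices):
    """Fraction of scenarios in which the chosen option avoids exploitation."""
    if len(scenarios) != len(choices):
        raise ValueError("need exactly one choice per scenario")
    safe = sum(
        not scenario.options[choice].exploits_animals
        for scenario, choice in zip(scenarios, choices)
    )
    return safe / len(scenarios)


def chance_baseline(scenarios):
    """Expected score of an agent that picks uniformly at random (an assumed baseline)."""
    per_scenario = [
        sum(not option.exploits_animals for option in s.options) / len(s.options)
        for s in scenarios
    ]
    return sum(per_scenario) / len(per_scenario)


# Hypothetical scenarios; the real benchmark items are not reproduced here.
scenarios = [
    Scenario((Option("elephant ride", True),
              Option("walking tour", False),
              Option("museum pass", False))),
    Scenario((Option("dolphin show", True),
              Option("kayak trip", False),
              Option("cooking class", False))),
]
choices = [0, 1]  # index of the option the agent booked in each scenario

score = welfare_score(scenarios, choices)
baseline = chance_baseline(scenarios)
print(f"score={score:.2f}, chance={baseline:.2f}")
```

In this toy setup the agent avoids exploitation in one of two scenarios, scoring 0.5, while random choice would score about 0.67 because two of the three options in each scenario are harmless. Under that kind of assumption, a score above 50% can still fall below chance.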

Why it matters

As AI agents gain more autonomy in real-world applications, their ethical decision-making, particularly regarding implicit societal values like animal welfare, becomes critical. Professionals developing or deploying AI agents must address these gaps to prevent unintended negative consequences and ensure responsible AI behavior.

How to implement this in your domain

  1. Integrate explicit ethical guidelines and constraints into AI agent system prompts.
  2. Develop specialized training datasets focused on ethical decision-making in agentic contexts.
  3. Implement post-action auditing mechanisms to review and correct agent behaviors.
  4. Conduct internal benchmarks similar to TAC to assess implicit ethical reasoning in your AI agents.
  5. Collaborate with ethicists and domain experts to define and operationalize ethical boundaries for AI actions.
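Steps 1 and 3 above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used in the study: the wording of `WELFARE_SENTENCE`, the `FLAGGED_TERMS` list, and both function names are hypothetical, and a simple keyword match stands in for whatever review process a real deployment would need.

```python
# Hypothetical wording; the summary does not quote the sentence used in the study.
WELFARE_SENTENCE = (
    "When choosing between otherwise comparable options, prefer those that "
    "do not involve animal exploitation."
)

# Hypothetical keyword list for illustration; a real audit would need
# categories defined with ethicists and domain experts (step 5).
FLAGGED_TERMS = ("elephant ride", "dolphin show", "tiger selfie")


def build_system_prompt(base_prompt: str) -> str:
    """Step 1: append an explicit welfare guideline to the agent's system prompt."""
    return f"{base_prompt.rstrip()}\n\n{WELFARE_SENTENCE}"


def audit_booking(description: str) -> list[str]:
    """Step 3: return the flagged terms found in a completed booking's description."""
    lowered = description.lower()
    return [term for term in FLAGGED_TERMS if term in lowered]


prompt = build_system_prompt("You are a travel booking agent.")
flags = audit_booking("Sunset Elephant Ride with dinner")
if flags:
    print("Booking needs human review:", ", ".join(flags))
```

A keyword audit like this only catches bookings whose descriptions name the activity outright, so it is best treated as a first filter ahead of human review rather than a complete safeguard.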

Who benefits

Travel & Hospitality · E-commerce · AI Development · Ethics & Compliance · Consumer Services

Key takeaways

  • AI agents struggle with implicit animal welfare considerations in action-oriented tasks.
  • Existing text-response benchmarks may not reflect agentic ethical performance.
  • Simple prompt engineering can improve some models, but deeper issues remain.
  • Ethical considerations must be explicitly addressed in AI agent design and deployment.

Original post by Jasmine Brazilek, Oliver Tulio, Joel Christoph, Miles Tidmarsh, Carol Kline, Arturs Kanepajs

"arXiv:2606.18142v1 Announce Type: new Abstract: AI agents are moving from advisors to actors, booking travel, planning menus, and running procurement on behalf of users. Existing benchmarks for AI and animal welfare evaluate model text responses to question-answer prompts, leavin…"

