ProfiLLM Enhances Ride-Hailing Dispatch with LLM User Profiling

Tengfei Lyu, Zirui Yuan, Xu Liu, Kai Wan, Zihao Lu, Li Ma, Hao Liu · June 18, 2026

Summary

ProfiLLM is an agentic LLM data pipeline that brings utility-aligned user profiling to industrial ride-hailing dispatch. It uses tool-augmented global knowledge mining to extract platform-wide insights, then generates and refines user profiles against a utility proxy. In production it is reported to improve both outcome prediction and dispatching efficiency.

Using Large Language Models (LLMs) to extract semantic features from platform-scale behavioral logs is a compelling but difficult problem for industrial ride-hailing dispatch. Traditional matching pipelines rely on structured numerical data, yet important behavioral signals, such as a driver's regional preferences, are inherently contextual and are more naturally captured by LLM-generated user profiles. Scaling such profiling to live, millisecond-latency dispatchers on platforms with millions of daily orders runs into three constraints: log data far exceeds LLM context windows, many users have too little interaction history for individual profiling, and a fluent profile does not necessarily improve downstream prediction.

ProfiLLM addresses these constraints with an agentic LLM data pipeline built from two modules. Tool-Augmented Global Knowledge Mining equips an LLM agent with analytical tools to extract platform-wide insights and adaptive clustering rules. Utility-Aligned Profile Exploration then generates, evaluates, and refines candidate profiles for each cluster using a lightweight utility proxy.

Deployed on DiDi's production dispatcher, ProfiLLM is reported to improve outcome prediction substantially, raise gross merchandise value (GMV) in simulation, and deliver positive online A/B test results, including higher completion rates and lower cancellation rates.
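The generate, evaluate, and refine loop behind Utility-Aligned Profile Exploration can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `explore_profiles`, `toy_refine`, and `toy_utility` are hypothetical names, and the trait vocabulary is invented. In a real pipeline the refine step would presumably be an LLM call and the utility proxy a lightweight model scored on held-out outcomes; here both are deterministic toy stand-ins so the loop runs as-is.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy vocabulary; a real system would get candidate text from an LLM.
TRAITS = ["prefers airport pickups", "avoids late-night trips", "likes short trips"]
# Assumption for the demo: only these traits help the downstream predictor.
INFORMATIVE = {"prefers airport pickups", "avoids late-night trips"}


def _traits(profile: str) -> set:
    """Split a '; '-joined profile string into its set of traits."""
    return set(profile.split("; ")) if profile else set()


def toy_refine(profile: str) -> List[str]:
    """Stand-in for an LLM refinement step: each child adds one unused trait."""
    have = _traits(profile)
    return ["; ".join(sorted(have | {t})) for t in TRAITS if t not in have]


def toy_utility(profile: str) -> float:
    """Stand-in for the utility proxy: reward useful traits, penalize noise."""
    have = _traits(profile)
    return len(have & INFORMATIVE) - 0.5 * len(have - INFORMATIVE)


def explore_profiles(
    seeds: Sequence[str],
    refine: Callable[[str], List[str]],
    utility: Callable[[str], float],
    rounds: int = 3,
    keep: int = 2,
) -> Tuple[str, float]:
    """Keep the top-`keep` candidates by utility, refine them, and repeat."""
    pool = list(dict.fromkeys(seeds))  # de-duplicate while preserving order
    for _ in range(rounds):
        survivors = sorted(pool, key=utility, reverse=True)[:keep]
        children = [child for parent in survivors for child in refine(parent)]
        pool = list(dict.fromkeys(survivors + children))
    best = max(pool, key=utility)
    return best, utility(best)


# Start from an empty profile for one cluster and let the loop grow it.
best, score = explore_profiles([""], toy_refine, toy_utility)
print(best, score)
```

The point of the design is that candidates are ranked by what they do for the downstream task rather than by how well they read, which is the gap between fluent and useful profiles that the summary describes.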

Why it matters

This work demonstrates a practical way to apply LLMs to real-time, large-scale industrial systems, particularly complex logistics and matching problems. It offers a blueprint for overcoming data scale and utility alignment challenges so that LLM-derived features deliver measurable business impact.

How to implement this in your domain

  1. Evaluate the feasibility of using LLM-based user profiling for your own platform's matching or recommendation systems.
  2. Develop an agentic LLM pipeline to mine global knowledge from large-scale behavioral data.
  3. Implement utility-aligned evaluation metrics to ensure generated profiles improve downstream prediction tasks.
  4. Explore techniques for clustering users and generating profiles for long-tail segments.
  5. Conduct A/B tests to measure the real-world impact of LLM-enhanced profiling on key business metrics.
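For the A/B testing step, one common way to check whether a change in completion rate is more than noise is a two-proportion z-test. The sketch below is a generic statistical illustration, not the evaluation method used for ProfiLLM; `two_proportion_z` is a hypothetical helper name, and the counts in the usage example are made-up inputs, not results from the paper.

```python
import math
from typing import Tuple


def two_proportion_z(
    success_a: int, n_a: int, success_b: int, n_b: int
) -> Tuple[float, float]:
    """Two-sided z-test for the difference between two proportions.

    Arm A is the control and arm B the treatment. Returns (z, p_value),
    where z > 0 means B has the higher rate.
    """
    p_a, p_b = success_a / n_a, success_b / n_b
    # Pooled rate under the null hypothesis that both arms share one rate.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("pooled rate is 0 or 1; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up illustrative counts: completed orders out of dispatched orders.
z, p = two_proportion_z(8_000, 10_000, 8_200, 10_000)
print(f"z={z:.2f}, p={p:.4f}")
```

The same function applies to cancellation rates by counting cancellations as the "success" event, in which case a negative z indicates that the treatment arm cancels less often.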

Who benefits

Ride-Hailing, Logistics, E-commerce, Delivery Services, Recommendation Systems

Key takeaways

  • ProfiLLM uses LLMs for utility-aligned user profiling in ride-hailing.
  • It overcomes challenges of data scale, context windows, and long-tail users.
  • The system employs tool-augmented knowledge mining and utility-aligned profile refinement.
  • Deployment led to significant improvements in dispatch efficiency and business metrics.

Original post by Tengfei Lyu, Zirui Yuan, Xu Liu, Kai Wan, Zihao Lu, Li Ma, Hao Liu

"arXiv:2606.18803v1 Announce Type: new Abstract: Bringing Large Language Models (LLMs) into industrial ride-hailing dispatch as semantic feature extractors over platform-scale behavioral logs is a compelling but under-explored data systems problem. Production matching pipelines re…"
