New Foundation Model for Tax-Aware Personalized Portfolio Management

Ramin Pishehvar · July 1, 2026

Summary

Researchers introduce a three-phase deep reinforcement learning system for personalized portfolio management that overcomes limitations of prior work. It features a ticker-identity-free cross-asset encoder, a Mixture of Experts (MoE) actor-critic for diverse investment goals, and a personalization layer fine-tuned on individual transaction history.

The paper describes a three-phase deep reinforcement learning system for personalized, tax-aware portfolio management. It targets three limitations the authors attribute to prior financial RL work: ticker lock-in, monolithic objectives, and static user models.

Phase one pretrains a cross-asset encoder with self-supervised learning on a multi-asset corpus, augmented by Chronos, a time series foundation model. The encoder uses a 50-dimensional metadata vector in place of ticker identity, so it can generalize to any publicly traded asset without retraining.

Phase two fine-tunes a Mixture of Experts (MoE) portfolio actor-critic with Proximal Policy Optimization (PPO), optimizing for six investment goals simultaneously, including tax-loss harvesting. An intent router blends the experts' outputs based on the active objective and market conditions.

Phase three adds a lightweight personalization layer. It adapts to an individual user at inference time by fine-tuning a LoRA (low-rank adaptation) module on real brokerage transaction history, so investment objectives are inferred from actual trading behavior rather than from questionnaires. A natural language parser converts free-form goals into structured parameters.
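The blending step performed by the intent router can be illustrated with a minimal sketch. It assumes the router emits one logit per expert and that each expert proposes a vector of portfolio weights. The softmax gate, the function names, and the example numbers are all hypothetical, since the summary does not specify how the paper's router is built.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def blend_experts(expert_weights, router_logits):
    """Blend per-expert portfolio weights with a softmax gate.

    expert_weights: one weight vector per expert (each sums to 1).
    router_logits:  one score per expert (assumed router output).
    Returns (gates, blended_weights).
    """
    gates = softmax(router_logits)
    n_assets = len(expert_weights[0])
    blended = [
        sum(g * w[i] for g, w in zip(gates, expert_weights))
        for i in range(n_assets)
    ]
    return gates, blended


# Hypothetical example: three experts proposing weights over three assets.
expert_weights = [
    [0.6, 0.3, 0.1],  # e.g. a growth-oriented expert
    [0.2, 0.3, 0.5],  # e.g. an income-oriented expert
    [0.1, 0.1, 0.8],  # e.g. a tax-loss-harvesting expert
]
router_logits = [2.0, 0.5, 0.5]  # router favors the first expert

gates, blended = blend_experts(expert_weights, router_logits)
print("gates:", [round(g, 3) for g in gates])
print("blended weights:", [round(w, 3) for w in blended])
```

Because the gates sum to 1 and each expert's weights sum to 1, the blended allocation is itself a valid set of portfolio weights.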

Why it matters

If the approach holds up in practice, it could let AI-managed portfolios adapt to each investor's goals and tax situation instead of optimizing a single fixed objective. That makes it relevant to financial institutions that offer personalized advice.

How to implement this in your domain

  1. Investigate integrating time series foundation models like Chronos for enhanced financial data encoding.
  2. Explore Mixture of Experts (MoE) architectures to manage diverse and potentially conflicting investment objectives.
  3. Develop personalization layers that infer user intent from historical behavior rather than explicit inputs.
  4. Implement natural language processing for converting free-form investment goals into structured parameters.
  5. Assess the potential for tax-aware portfolio optimization in existing financial product offerings.
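As one possible starting point for the tax-aware assessment in the last step, the sketch below screens tax lots for harvestable losses. It is not the paper's method. The `TaxLot` type, the `min_loss` threshold, and the flat 20% `tax_rate` are hypothetical placeholders, and the screen deliberately ignores wash-sale rules and holding periods.

```python
from dataclasses import dataclass


@dataclass
class TaxLot:
    symbol: str
    quantity: float
    cost_basis: float  # purchase price per share
    price: float       # current price per share

    @property
    def unrealized_gain(self) -> float:
        return (self.price - self.cost_basis) * self.quantity


def harvest_candidates(lots, min_loss=100.0, tax_rate=0.20):
    """Return (lot, estimated_tax_saving) pairs, largest saving first.

    A lot qualifies when its unrealized loss is at least min_loss.
    tax_rate is an assumed flat rate, not a real tax schedule.
    """
    candidates = []
    for lot in lots:
        loss = -lot.unrealized_gain
        if loss >= min_loss:
            candidates.append((lot, loss * tax_rate))
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates


# Hypothetical holdings.
lots = [
    TaxLot("AAA", 10, 100.0, 80.0),  # unrealized loss of 200
    TaxLot("BBB", 5, 50.0, 60.0),    # unrealized gain of 50
    TaxLot("CCC", 20, 30.0, 27.0),   # unrealized loss of 60, below threshold
]

picks = harvest_candidates(lots)
for lot, saving in picks:
    print(f"{lot.symbol}: estimated tax saving {saving:.2f}")
```

Running a screen like this over existing client accounts gives a rough sense of how much harvestable loss is available before investing in a full optimizer.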

Who benefits

BFSI, FinTech, Wealth Management, Investment Banking, Personal Finance

Key takeaways

  • The system uses a ticker-identity-free encoder for broad asset generalization.
  • A Mixture of Experts architecture handles multiple investment goals, including tax-loss harvesting.
  • Personalization is achieved by inferring user objectives from transaction history.
  • Together, these components form a single deep RL system for personalized, tax-aware portfolio management.

Original post by Ramin Pishehvar

"arXiv:2606.30997v1 Announce Type: new Abstract: We present a three-phase deep reinforcement learning system for personalized portfolio management that addresses three limitations shared by all prior financial RL work: 1) ticker lock-in, 2) monolithic objectives , and 3) static us…"

Originally posted by Ramin Pishehvar on X