
DLR Boosts Low-Rank LLM Pre-Training with Zero Inference Cost

Dong Wang, Wenwu Tang, Yun Cheng, Olga Saukh · June 30, 2026

Summary

This paper introduces Duplicated Latent Residual (DLR), a training-only, parameter-free plug-in that enhances low-rank pre-training of large language models (LLMs). DLR augments low-rank outputs with a fixed structured residual that is absorbed into the up-projection after training, resulting in zero additional parameters, FLOPs, or memory during deployment while improving perplexity, especially for larger models.

Pre-training large language models (LLMs) at scale is exceptionally resource-intensive. Low-rank pre-training addresses this by factorizing each weight matrix into a rank-r product, which reduces both parameter count and compute, but it usually yields lower model quality than full-rank training. This research proposes Duplicated Latent Residual (DLR), a plug-in that improves low-rank pre-training at no additional inference cost. During training, DLR augments the standard low-rank output with a fixed, structured residual. Because the residual is fixed, it can be absorbed into the up-projection once training ends, so the deployed model has the same learnable parameters, FLOPs, and memory footprint as a plain low-rank model. Experiments across LLaMA models (60M to 7B parameters) show that DLR consistently strengthens low-rank pre-training, particularly for models of 130M parameters and above, and that the folded checkpoints transfer effectively to supervised fine-tuning.
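The absorption step can be illustrated with a small NumPy sketch. The summary does not give the exact form of the residual, so the code assumes one plausible reading of "duplicated latent": the rank-r latent is tiled up to the output width by a fixed matrix `D` and added to the usual low-rank output. The names `duplication_matrix`, `train_forward`, `fold`, `deploy_forward`, and the scale `alpha` are hypothetical and are not taken from the paper.

```python
import numpy as np

def duplication_matrix(d_out, r):
    """Fixed (d_out x r) matrix that tiles an r-dim latent up to d_out dims.

    ASSUMPTION: stacked identity blocks stand in for the paper's
    "fixed structured residual"; the real structure may differ.
    """
    reps = -(-d_out // r)  # ceiling division
    return np.tile(np.eye(r), (reps, 1))[:d_out]

def train_forward(x, A, B, D, alpha=1.0):
    """Training-time forward pass: low-rank output plus fixed residual."""
    z = x @ A.T                          # latent, shape (batch, r)
    return z @ B.T + alpha * (z @ D.T)   # D is never updated

def fold(B, D, alpha=1.0):
    """Absorb the fixed residual into the up-projection after training."""
    return B + alpha * D

def deploy_forward(x, A, B_folded):
    """Inference-time forward pass: a plain low-rank layer."""
    return (x @ A.T) @ B_folded.T

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 10, 4               # illustrative sizes only
A = rng.normal(size=(r, d_in))           # down-projection
B = rng.normal(size=(d_out, r))          # up-projection
D = duplication_matrix(d_out, r)
x = rng.normal(size=(3, d_in))

same = np.allclose(train_forward(x, A, B, D), deploy_forward(x, A, fold(B, D)))
print(same)
```

Under this assumption the fold is exact because both terms are linear in the same latent: `B z + alpha * D z = (B + alpha * D) z`. The folded up-projection has the same shape as `B`, which is why deployment cost is unchanged.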

Why it matters

For AI engineers and product developers, DLR makes low-rank pre-training a more practical way to cut training cost: it improves the quality of low-rank models while leaving their inference cost unchanged, which could lower the cost of building high-quality, smaller models.

How to implement this in your domain

1. Evaluate DLR for pre-training custom low-rank LLMs to reduce computational costs.
2. Integrate DLR into existing low-rank model training pipelines to improve quality.
3. Benchmark DLR-enhanced low-rank models against full-rank counterparts for performance and efficiency.
4. Consider DLR when developing LLMs for edge devices or resource-constrained environments.
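When benchmarking against full-rank counterparts, it helps to know the parameter budget each layer is working with. The sketch below compares a dense weight matrix with its rank-r factorization using the standard counts (d_in × d_out versus r × (d_in + d_out)). The layer sizes are illustrative assumptions, not values from the paper, and the helper names are hypothetical.

```python
def full_rank_params(d_in, d_out):
    """Parameters in a dense d_out x d_in weight matrix (no bias)."""
    return d_in * d_out

def low_rank_params(d_in, d_out, r):
    """Parameters in a rank-r factorization: (r x d_in) plus (d_out x r)."""
    return r * (d_in + d_out)

# Illustrative sizes only; not taken from the paper.
d_in, d_out, r = 4096, 4096, 512

full = full_rank_params(d_in, d_out)
low = low_rank_params(d_in, d_out, r)

print(full)        # 16777216
print(low)         # 4194304
print(low / full)  # 0.25
```

Since DLR is described as adding no learnable parameters, these counts would apply equally to a low-rank layer trained with or without it.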

Who benefits

AI/ML Development, Cloud Computing, Edge AI, Software Development, High-Tech

Key takeaways

DLR enhances low-rank LLM pre-training without adding inference cost.
It uses a training-only, parameter-free structured residual.
The residual is absorbed post-training, maintaining low-rank deployment efficiency.
DLR improves perplexity, especially for larger LLaMA models (130M+).

Original post by Dong Wang, Wenwu Tang, Yun Cheng, Olga Saukh

"arXiv:2606.28932v1 Announce Type: new Abstract: Large language models have driven recent progress in language and multimodal AI, yet pre-training them at scale is prohibitively expensive. Low-rank pre-training, which factorizes each weight matrix into a rank-r product to reduce b…"

