Diff3R: Feed-forward 3D Gaussian Splatting with Uncertainty-aware Differentiable Optimization

arXiv 2026

Yueh-Cheng Liu¹ Jozef Hladký² Matthias Nießner¹ Angela Dai¹

¹Technical University of Munich ²Computing Systems Lab, Huawei Technologies, Switzerland

Abstract

Recent advances in 3D Gaussian Splatting (3DGS) present two main directions: feed-forward models offer fast inference in sparse-view settings, while per-scene optimization yields high-quality renderings but is computationally expensive. To combine the benefits of both, we introduce Diff3R, a novel framework that explicitly bridges feed-forward prediction and test-time optimization. By incorporating a differentiable 3DGS optimization layer directly into the training loop, our network learns to predict an optimal initialization for test-time optimization rather than a conventional zero-shot result. To overcome the computational cost of backpropagating through the optimization steps, we propose computing gradients via the Implicit Function Theorem and a scalable, matrix-free PCG solver tailored for 3DGS optimization. Additionally, we incorporate a data-driven uncertainty model into the optimization process by adaptively controlling how much the parameters are allowed to change during optimization. This approach effectively mitigates overfitting in under-constrained regions and increases robustness against input outliers. Since our proposed optimization layer is model-agnostic, we show that it can be seamlessly integrated into existing feed-forward 3DGS architectures for both pose-given and pose-free methods, providing improvements for test-time optimization.

Overview

Given a sparse set of context views (with optional camera parameters), our feed-forward network predicts an initial set of 3D Gaussian parameters ($\Theta_0$). Our proposed differentiable optimization layer refines these parameters via gradient descent to yield the optimized Gaussians ($\Theta^*$). To train the network end-to-end, we introduce an efficient analytical solution for the backward pass using implicit gradients and a matrix-free PCG solver. Additionally, to make the optimization more robust in sparse-view settings, we predict learnable uncertainty weights ($\boldsymbol{\Lambda}$). These weights act as an adaptive proximal bound on the optimization trajectory, preventing the model from overfitting to the context views.
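To make the backward pass concrete, here is a minimal sketch on a toy problem. It is not the paper's implementation: the quadratic data term, the diagonal proximal penalty, the Jacobi preconditioner, and all names (`refine`, `pcg`, `implicit_grads`, `A`, `b`, `lam`, `target`) are assumptions chosen for illustration. The inner objective is $E(\Theta) = \tfrac{1}{2}\|A\Theta - b\|^2 + \tfrac{1}{2}\sum_i \Lambda_i (\Theta_i - \Theta_{0,i})^2$. At its optimum the gradient vanishes, so the Implicit Function Theorem gives $\partial L / \partial \Theta_0 = \Lambda \odot v$ and $\partial L / \partial \Lambda = -v \odot (\Theta^* - \Theta_0)$, where $v$ solves $H v = \nabla_{\Theta^*} L$ with $H = A^\top A + \mathrm{diag}(\Lambda)$. The solver only needs products $Hv$, so $H$ is never formed.

```python
# Toy stand-in for the optimization layer (assumed quadratic objective,
# not the paper's 3DGS rendering loss):
#   E(theta) = 0.5*||A theta - b||^2 + 0.5*sum_i lam[i]*(theta[i] - theta0[i])^2

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def rmatvec(M, v):
    # Computes M^T v without building the transpose.
    return [sum(M[r][c] * v[r] for r in range(len(M))) for c in range(len(M[0]))]

def hess_vec(A, lam, v):
    # Matrix-free product H v with H = A^T A + diag(lam); H is never formed.
    atav = rmatvec(A, matvec(A, v))
    return [h + l * x for h, l, x in zip(atav, lam, v)]

def refine(A, b, theta0, lam, steps=200, lr=0.2):
    # Forward pass: plain gradient descent on E, started at the prediction theta0.
    theta = list(theta0)
    for _ in range(steps):
        resid = [p - y for p, y in zip(matvec(A, theta), b)]
        grad = [g + l * (t - t0)
                for g, l, t, t0 in zip(rmatvec(A, resid), lam, theta, theta0)]
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

def pcg(apply_h, rhs, precond, iters=50, tol=1e-24):
    # Preconditioned conjugate gradient for H x = rhs (H symmetric positive definite).
    # `precond` holds the diagonal of the inverse preconditioner.
    x = [0.0] * len(rhs)
    r = list(rhs)
    z = [p * ri for p, ri in zip(precond, r)]
    d = list(z)
    rz = dot(r, z)
    for _ in range(iters):
        if dot(r, r) < tol:  # squared residual norm small enough
            break
        hd = apply_h(d)
        alpha = rz / dot(d, hd)
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * hi for ri, hi in zip(r, hd)]
        z = [p * ri for p, ri in zip(precond, r)]
        rz_new = dot(r, z)
        d = [zi + (rz_new / rz) * di for zi, di in zip(z, d)]
        rz = rz_new
    return x

def implicit_grads(A, theta_star, theta0, lam, grad_outer):
    # Backward pass via the Implicit Function Theorem: one linear solve,
    # no unrolling of the gradient-descent steps.
    diag_h = [sum(A[r][c] ** 2 for r in range(len(A))) + lam[c]
              for c in range(len(lam))]
    precond = [1.0 / h for h in diag_h]  # Jacobi preconditioner (assumed choice)
    v = pcg(lambda d: hess_vec(A, lam, d), grad_outer, precond)
    g_theta0 = [l * vi for l, vi in zip(lam, v)]
    g_lam = [-vi * (ts - t0) for vi, ts, t0 in zip(v, theta_star, theta0)]
    return g_theta0, g_lam

def outer_loss(A, b, theta0, lam, target):
    # Training loss evaluated on the refined parameters.
    theta_star = refine(A, b, theta0, lam)
    return 0.5 * sum((t - y) ** 2 for t, y in zip(theta_star, target))

# Hypothetical toy data.
A = [[1.0, 0.5], [0.0, 1.0], [1.0, -1.0]]
b = [1.0, 2.0, 0.0]
theta0 = [0.0, 0.0]   # network prediction (initialization)
lam = [0.5, 2.0]      # uncertainty weights
target = [1.0, 1.0]   # supervision for the outer loss

theta_star = refine(A, b, theta0, lam)
grad_outer = [t - y for t, y in zip(theta_star, target)]
g_theta0, g_lam = implicit_grads(A, theta_star, theta0, lam, grad_outer)
```

In this toy objective a larger `lam[i]` pulls the refined `theta_star[i]` closer to `theta0[i]`, which is one simple way to read the "adaptive proximal bound" described above. The implicit gradients are only exact when `refine` has converged, so the fixed step size `lr` must be small enough for the chosen `A` and `lam`.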

Comparisons

[Comparison figures: PixelSplat vs. Ours, MVSplat vs. Ours, AnySplat vs. Ours, DA3 vs. Ours, DepthSplat vs. Ours]

BibTeX

@article{liu2025diff3r,
      title={Diff3R: Feed-forward 3D Gaussian Splatting with Uncertainty-aware Differentiable Optimization}, 
      author={Yueh-Cheng Liu and Jozef Hladký and Matthias Nießner and Angela Dai},
      year={2026},
      eprint={2604.01030},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2604.01030}, 
}