Unified 4D Generation and Reconstruction via Decoupled LoRA Control
One4D is a unified framework for 4D generation and reconstruction. Through Unified Masked Conditioning (UMC), a single model transitions seamlessly between 4D generation from a single image, 4D reconstruction from a full video, and mixed generation and reconstruction from sparse frames. Decoupled LoRA Control (DLC) employs two modality-specific LoRA adapters to form decoupled computation branches for RGB frames and pointmaps; the branches are connected by lightweight, zero-initialized control links that gradually learn mutual pixel-level consistency. Together, these let One4D produce high-quality RGB frames and accurate pointmaps across both generation and reconstruction tasks.
Figure 1: The One4D Unified Framework architecture.
Enables seamless transitions between 4D generation from a single image, 4D reconstruction from a full video, and mixed generation and reconstruction from sparse frames using a single unified model.
Decouples RGB and XYZ computation to minimize interference while maintaining pixel-wise cross-modal control.
Figure 2: Comparison of Decoupled LoRA Control against other architectures.
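The decoupled-branch idea can be illustrated with a minimal NumPy sketch. It assumes one shared, frozen base weight, one LoRA adapter per modality (RGB and XYZ pointmaps), and a zero-initialized linear control link in each direction applied token by token. The names `LoRALinear`, `ControlLink`, and `dlc_block`, the `rank` and `alpha` values, and the exact placement of the links are hypothetical choices for illustration, not One4D's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class LoRALinear:
    """Frozen base weight plus a low-rank update (hypothetical helper)."""
    def __init__(self, base_weight, rank=4, alpha=4.0):
        out_dim, in_dim = base_weight.shape
        self.base_weight = base_weight                      # shared, frozen
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))  # trainable
        self.B = np.zeros((out_dim, rank))                  # standard LoRA init: zeros
        self.scale = alpha / rank

    def __call__(self, x):
        # x: (num_tokens, in_dim)
        return x @ self.base_weight.T + self.scale * (x @ self.A.T) @ self.B.T

class ControlLink:
    """Zero-initialized linear map from one branch to the other (assumed form)."""
    def __init__(self, dim):
        self.W = np.zeros((dim, dim))  # zero init: contributes nothing at the start

    def __call__(self, h):
        return h @ self.W.T

def dlc_block(rgb_tokens, xyz_tokens, rgb_layer, xyz_layer, rgb_to_xyz, xyz_to_rgb):
    # Each modality runs through its own adapter; there is no shared computation.
    h_rgb = rgb_layer(rgb_tokens)
    h_xyz = xyz_layer(xyz_tokens)
    # Token i of one branch only influences token i of the other (pixel-wise).
    out_rgb = h_rgb + xyz_to_rgb(h_xyz)
    out_xyz = h_xyz + rgb_to_xyz(h_rgb)
    return out_rgb, out_xyz

dim = 8
base = rng.normal(size=(dim, dim))
rgb_layer, xyz_layer = LoRALinear(base), LoRALinear(base)
rgb_to_xyz, xyz_to_rgb = ControlLink(dim), ControlLink(dim)
rgb_tokens = rng.normal(size=(16, dim))
xyz_tokens = rng.normal(size=(16, dim))
out_rgb, out_xyz = dlc_block(rgb_tokens, xyz_tokens, rgb_layer, xyz_layer,
                             rgb_to_xyz, xyz_to_rgb)
```

Because both the LoRA `B` matrices and the link weights start at zero in this sketch, each branch initially reproduces the frozen base model's output; cross-modal influence appears only as the links are trained.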
Generating a consistent 4D scene from a single input image, ensuring high-quality RGB frames and consistent geometry for the entire 4D output.
Reconstructing the 4D scene given only a few sparse frames. One4D fills in the missing information using Unified Masked Conditioning (UMC).
High-fidelity reconstruction from a full video input, ensuring temporal consistency and accurate geometry estimation.
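The three settings above differ only in which frames are observed, which is what makes a single masked-conditioning interface possible. The sketch below is one plausible way to express that, assuming the condition is the video with unobserved frames zeroed out plus a per-frame binary mask. The function name `build_masked_condition`, the zero-fill choice, and the pixel-space tensors are assumptions for illustration; the page does not specify how One4D encodes its condition.

```python
import numpy as np

def build_masked_condition(frames, observed_idx):
    """Return (condition, mask) for a clip of shape (T, H, W, 3).

    Hypothetical helper: observed frames are kept, all others are zeroed,
    and the mask marks which frames carry real input.
    """
    num_frames = frames.shape[0]
    mask = np.zeros(num_frames, dtype=bool)
    mask[list(observed_idx)] = True
    condition = np.where(mask[:, None, None, None], frames, 0.0)
    return condition, mask.astype(np.float32)

T, H, W = 8, 4, 4
video = np.random.default_rng(0).random((T, H, W, 3)).astype(np.float32)

# 4D generation from a single image: only the first frame is observed.
gen_cond, gen_mask = build_masked_condition(video, [0])
# 4D reconstruction from a full video: every frame is observed.
rec_cond, rec_mask = build_masked_condition(video, range(T))
# Mixed generation and reconstruction: a few sparse frames are observed.
mix_cond, mix_mask = build_masked_condition(video, [0, 3, 7])
```

Under this assumption, the model always receives inputs of the same shape, and the task is selected purely by the mask.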
@article{mione4d2025,
title={One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control},
author={Mi, Zhenxing and Wang, Yuxin and Xu, Dan},
journal={arXiv preprint arXiv:2511.18922},
year={2025}
}