One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control

Unified Framework

One4D is a unified framework for 4D generation and reconstruction. Through Unified Masked Conditioning (UMC), a single model moves seamlessly between four settings: 4D generation from a single image, 4D reconstruction from a full video, mixed generation and reconstruction from sparse frames, and 4D generation from a text prompt. Decoupled LoRA Control (DLC) uses two modality-specific LoRA adapters to form decoupled computation branches for RGB frames and pointmaps. The branches are connected by lightweight, zero-initialized control links that gradually learn mutual pixel-level consistency. Together, these let One4D produce high-quality RGB frames and accurate pointmaps across both generation and reconstruction tasks.

Methodology

Figure 1: The One4D Unified Framework architecture.

🎛️

Unified Masked Conditioning

Enables seamless transitions between 4D generation from a single image, 4D reconstruction from a full video, mixed generation and reconstruction from sparse frames, and 4D generation from a text prompt using a single unified model.
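
One way to picture how a single model can cover all four settings is a per-frame conditioning mask, where only the set of observed frames changes between tasks. The sketch below is illustrative only and is not One4D's actual implementation: the function names, the task labels, the 1/0 mask convention, and the choice to treat the single input image as frame 0 are all assumptions made for this example.

```python
from typing import List, Optional, Sequence


def conditioning_mask(num_frames: int,
                      observed: Optional[Sequence[int]] = None) -> List[int]:
    """Hypothetical helper: 1 = frame is given as a condition, 0 = frame is generated."""
    mask = [0] * num_frames
    for idx in observed or []:
        if not 0 <= idx < num_frames:
            raise ValueError(f"frame index {idx} out of range for {num_frames} frames")
        mask[idx] = 1
    return mask


def task_mask(task: str, num_frames: int,
              sparse_indices: Optional[Sequence[int]] = None) -> List[int]:
    """Map each task (labels are assumptions) to its conditioning mask."""
    if task == "text_to_4d":
        # No frames observed: every frame is generated.
        return conditioning_mask(num_frames)
    if task == "image_to_4d":
        # Assumption: the single input image plays the role of the first frame.
        return conditioning_mask(num_frames, [0])
    if task == "sparse_to_4d":
        # Only the listed frames are observed; the rest are generated.
        return conditioning_mask(num_frames, sparse_indices)
    if task == "video_to_4d":
        # All frames observed: pure reconstruction.
        return conditioning_mask(num_frames, range(num_frames))
    raise ValueError(f"unknown task: {task}")


if __name__ == "__main__":
    for name in ("text_to_4d", "image_to_4d", "video_to_4d"):
        print(name, task_mask(name, 6))
    print("sparse_to_4d", task_mask("sparse_to_4d", 6, [0, 3, 5]))
```

Under this reading, the four tasks differ only in which mask entries are set, which is why one model could serve all of them without task-specific heads.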

🧩

Decoupled LoRA Control

Decouples RGB and pointmap (XYZ) computation with modality-specific LoRA adapters to reduce interference while preserving pixel-wise cross-modal control. Lightweight zero-initialized links let the two branches share appearance and geometry cues.
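
The idea of two adapter branches joined by zero-initialized links can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the One4D architecture: the class names `LoRALinear` and `DecoupledLayer`, the single-vector "token", the placement of the links after each branch, and the random `up` weights (standing in for a trained adapter) are all hypothetical.

```python
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def add(a: Vector, b: Vector) -> Vector:
    return [x + y for x, y in zip(a, b)]


def zeros(rows: int, cols: int) -> Matrix:
    return [[0.0] * cols for _ in range(rows)]


def rand_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class LoRALinear:
    """Shared frozen base weight W plus a low-rank update: y = W x + B (A x)."""

    def __init__(self, base: Matrix, rank: int, rng: random.Random):
        dim_out, dim_in = len(base), len(base[0])
        self.base = base                             # shared between branches
        self.down = rand_matrix(rank, dim_in, rng)   # A
        # Real LoRA initialises B to zeros; random values here stand in for a
        # trained adapter so that the two branches actually differ.
        self.up = rand_matrix(dim_out, rank, rng)    # B

    def __call__(self, x: Vector) -> Vector:
        return add(matvec(self.base, x), matvec(self.up, matvec(self.down, x)))


class DecoupledLayer:
    """Two modality-specific branches joined by zero-initialised links (assumed layout)."""

    def __init__(self, base: Matrix, rank: int, rng: random.Random):
        dim = len(base)
        self.rgb = LoRALinear(base, rank, rng)
        self.xyz = LoRALinear(base, rank, rng)
        # Zero init: at the start each branch behaves as if the other were absent.
        self.link_xyz_to_rgb = zeros(dim, dim)
        self.link_rgb_to_xyz = zeros(dim, dim)

    def __call__(self, rgb_token: Vector, xyz_token: Vector) -> Tuple[Vector, Vector]:
        rgb_hidden = self.rgb(rgb_token)
        xyz_hidden = self.xyz(xyz_token)
        # Tokens at the same pixel location exchange information through the links.
        rgb_out = add(rgb_hidden, matvec(self.link_xyz_to_rgb, xyz_hidden))
        xyz_out = add(xyz_hidden, matvec(self.link_rgb_to_xyz, rgb_hidden))
        return rgb_out, xyz_out


if __name__ == "__main__":
    rng = random.Random(0)
    layer = DecoupledLayer(rand_matrix(4, 4, rng), rank=2, rng=rng)
    print(layer([1.0, 0.0, -1.0, 0.5], [0.2, 0.3, 0.1, -0.4]))
```

Because the links start at zero, the layer initially reproduces the two independent branch outputs exactly; cross-modal coupling appears only as the link weights move away from zero during training.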

Figure 2: Comparison of Decoupled LoRA Control against other architectures.

Results Showcase

Single image to 4D

Generating a consistent 4D scene from a single input image, ensuring high-quality RGB frames and consistent geometry for the entire 4D output.

Sparse frames to 4D

Mixed generation and reconstruction of a 4D scene from only a few sparse frames. One4D uses Unified Masked Conditioning to fill in the missing content between the observed frames.

Full video to 4D

High-fidelity reconstruction from a full video input, ensuring temporal consistency and accurate geometry estimation.

Text to 4D

Generating a consistent 4D scene from a text prompt alone, ensuring high-quality RGB frames and consistent geometry for the entire 4D output.

BibTeX

@inproceedings{mione4d2026,
  title={One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control},
  author={Mi, Zhenxing and Wang, Yuxin and Xu, Dan},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2026}
}