ResField layers incorporate time-dependent weights into MLPs to effectively represent complex temporal signals.
Neural fields, a category of neural networks trained to represent high-frequency signals, have gained significant attention in recent years due to their impressive performance in modeling complex 3D data, especially large neural signed distance functions (SDFs) or radiance fields (NeRFs), via a single multi-layer perceptron (MLP). However, despite the power and simplicity of representing signals with an MLP, these methods still face challenges when modeling large and complex temporal signals due to the limited capacity of MLPs. In this paper, we propose an effective approach to address this limitation by incorporating temporal residual layers into neural fields, dubbed ResFields, a novel class of networks specifically designed to effectively represent complex temporal signals. We conduct a comprehensive analysis of the properties of ResFields and propose a matrix factorization technique to reduce the number of trainable parameters and enhance generalization capabilities. Importantly, our formulation seamlessly integrates with existing techniques and consistently improves results across various challenging tasks: 2D video approximation, dynamic shape modeling via temporal SDFs, and dynamic NeRF reconstruction. Lastly, we demonstrate the practical utility of ResFields by showcasing their effectiveness in capturing dynamic 3D scenes from sparse sensory inputs of a lightweight capture system.
Our key idea is to substitute one or several MLP layers with time-dependent layers whose weights are modeled as trainable residual parameters added to the existing layer weights.
These residual weights are modeled as a learnable low-rank composition.
Increasing the model capacity in this way offers three key advantages:
1) Runtime: the underlying MLP does not increase in size and hence maintains the inference and training speed.
2) Generalizability: the model retains the implicit regularization and generalization properties of MLPs.
3) Universality: ResFields are versatile, easily extendable, and compatible with most MLP-based methods for spatiotemporal signals.
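
The idea of a time-dependent layer with low-rank residual weights can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class name `ResFieldLinear`, the discrete frame index `t`, the zero initialization of the coefficients, and the initialization scales are all assumptions made for illustration. The layer keeps one shared weight matrix `W` and adds a per-frame residual built as a weighted sum of `rank` shared basis matrices `M`, with per-frame coefficients `v`.

```python
import numpy as np


class ResFieldLinear:
    """Illustrative linear layer with weights W + sum_r v[t, r] * M[r]."""

    def __init__(self, in_dim, out_dim, num_frames, rank, seed=0):
        rng = np.random.default_rng(seed)
        # Shared, time-independent weights and bias of the underlying MLP layer.
        self.W = rng.normal(scale=in_dim ** -0.5, size=(out_dim, in_dim))
        self.b = np.zeros(out_dim)
        # Low-rank factors: per-frame coefficients v and shared basis matrices M.
        # Assumption: v starts at zero, so the layer initially equals the plain layer.
        self.v = np.zeros((num_frames, rank))
        self.M = rng.normal(scale=0.01, size=(rank, out_dim, in_dim))

    def weights_at(self, t):
        # Residual for frame t: weighted sum of the basis matrices, shape (out_dim, in_dim).
        residual = np.tensordot(self.v[t], self.M, axes=1)
        return self.W + residual

    def __call__(self, x, t):
        # x has shape (batch, in_dim); t is an integer frame index.
        return x @ self.weights_at(t).T + self.b


layer = ResFieldLinear(in_dim=3, out_dim=4, num_frames=10, rank=2)
x = np.ones((5, 3))
y = layer(x, t=0)  # output has shape (5, 4), the same as for a plain linear layer
```

In this sketch the matrix applied to the input has the same shape at every frame, so the per-sample matrix product is no larger than in the plain layer. The extra trainable parameters are `num_frames * rank + rank * out_dim * in_dim`, compared with `num_frames * out_dim * in_dim` if every frame stored its own full residual matrix.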