Neural Radiance Fields (NeRFs) have revolutionized the reconstruction of static scenes and objects in 3D, offering unprecedented quality. However, extending NeRFs to model dynamic objects or object articulations remains a challenging problem. Previous works have tackled this issue by focusing on part-level reconstruction and motion estimation for objects, but they often rely on heuristics regarding the number of moving parts or object categories, which can limit their practical use. In this work, we introduce LEIA, a novel approach for representing dynamic 3D objects. Our method involves observing the object at distinct time steps or "states" and conditioning a hypernetwork on the current state, using this to parameterize our NeRF. This approach allows us to learn a view-invariant latent representation for each state. We further demonstrate that by interpolating between these states, we can generate novel articulation configurations in 3D space that were previously unseen. Our experimental results highlight the effectiveness of our method in articulating objects in a manner that is independent of the viewing angle and joint configuration. Notably, our approach outperforms previous methods that rely on motion information for articulation registration.
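The state-conditioned hypernetwork described above can be illustrated with a minimal sketch. It is not the LEIA implementation: the layer sizes, the single linear hypernetwork layer, the tiny two-layer field standing in for a NeRF, the linear blend between latents, and all names (`state_codes`, `hypernetwork`, `field`, `interpolate_state`) are assumptions made for illustration, and the weights are random rather than trained. It only shows the data flow: a per-state latent code is mapped to the weights of the field, and a blended latent yields a field for an intermediate state.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the paper.
LATENT_DIM = 8   # size of each per-state latent code
IN_DIM = 3       # 3D point (x, y, z); view direction omitted for brevity
HIDDEN = 16      # hidden width of the tiny target field
OUT_DIM = 4      # RGB + density

# Shapes of the target field's parameters, which the hypernetwork must produce.
SHAPES = [(IN_DIM, HIDDEN), (HIDDEN,), (HIDDEN, OUT_DIM), (OUT_DIM,)]
N_PARAMS = sum(int(np.prod(s)) for s in SHAPES)

# One latent code per observed state (two states here). The code depends only
# on the state, never on the camera, which is what makes it view-invariant.
state_codes = rng.normal(size=(2, LATENT_DIM))

# Hypothetical hypernetwork: one linear layer from latent code to flat weights.
H_W = rng.normal(scale=0.1, size=(LATENT_DIM, N_PARAMS))
H_b = np.zeros(N_PARAMS)


def hypernetwork(z):
    """Map a state latent z to the parameter tensors of the target field."""
    flat = z @ H_W + H_b
    params, offset = [], 0
    for shape in SHAPES:
        size = int(np.prod(shape))
        params.append(flat[offset:offset + size].reshape(shape))
        offset += size
    return params


def field(points, params):
    """Tiny MLP standing in for a NeRF: (N, 3) points -> rgb (N, 3), density (N,)."""
    W1, b1, W2, b2 = params
    h = np.maximum(points @ W1 + b1, 0.0)        # ReLU
    out = h @ W2 + b2
    rgb = 1.0 / (1.0 + np.exp(-out[:, :3]))      # sigmoid keeps colors in [0, 1]
    density = np.maximum(out[:, 3], 0.0)         # non-negative density
    return rgb, density


def interpolate_state(alpha):
    """Linear blend between the two state latents (alpha = 0 or 1 gives an observed state)."""
    return (1.0 - alpha) * state_codes[0] + alpha * state_codes[1]


# Query the field for an unseen, halfway articulation state.
points = rng.uniform(-1.0, 1.0, size=(5, IN_DIM))
rgb_mid, density_mid = field(points, hypernetwork(interpolate_state(0.5)))
```

In a trained system the latent codes and hypernetwork weights would be optimized jointly against multi-view images of each state; here they are left random, so the outputs carry no meaning beyond their shapes and ranges.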
We evaluate our method on both synthetic and real-world data. We show that it generates novel states for articulated objects that were not seen during training and that it captures complex articulations. We also show that it is robust to single and multiple articulations, as well as combinations of motions.
The following results show LEIA working on a real-world storage object for which we collected the images ourselves. We show the start state and the motion generated by LEIA.
The following results show LEIA working for data from PartNet-Mobility, a synthetic dataset. We show the start and end states, the ground truth motion, and the motion generated by LEIA.
@misc{swaminathan2024leialatentviewinvariantembeddings,
title={LEIA: Latent View-invariant Embeddings for Implicit 3D Articulation},
author={Archana Swaminathan and Anubhav Gupta and Kamal Gupta and Shishira R. Maiya and Vatsal Agarwal and Abhinav Shrivastava},
year={2024},
eprint={2409.06703},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2409.06703},
}