SparseDFF: Sparse-View Feature Distillation for One-Shot Dexterous Manipulation

Published in ICLR, 2024

Humans excel at transferring manipulation skills across diverse objects because they understand semantic correspondences between them. To endow robots with a similar ability, we develop a method for acquiring view-consistent 3D Distilled Feature Fields (DFFs) from sparse RGBD observations. Our approach, SparseDFF, maps image features onto 3D point clouds, creating a dense feature field that supports one-shot learning of dexterous manipulations transferable to novel scenes. At the core of SparseDFF is a lightweight feature refinement network, optimized with a contrastive loss between pairwise views; we also use point-pruning to enhance feature continuity. Evaluations show that our method enables robust manipulation of both rigid and deformable objects, with strong generalization across object variations and scene contexts.
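
As a rough illustration of the core idea, the sketch below implements a pairwise-view contrastive objective for refining per-point features, in the spirit of the lightweight refinement network described above. This is not the authors' code: the `FeatureRefiner` module, the 3D nearest-neighbor matching used to form positive pairs, and the temperature value are all illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of refining distilled
# per-point features with a contrastive loss between two views.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureRefiner(nn.Module):
    """Hypothetical lightweight per-point MLP refining distilled 2D features."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # Residual refinement keeps outputs close to the distilled prior;
        # unit-normalized features make dot products cosine similarities.
        return F.normalize(feats + self.mlp(feats), dim=-1)

def pairwise_contrastive_loss(feat_a, feat_b, pts_a, pts_b, tau=0.07):
    """InfoNCE-style loss between two partial views of the same scene.

    Each point in view A is matched to its 3D nearest neighbor in view B;
    matched pairs act as positives, all other cross-view points as negatives.
    """
    dists = torch.cdist(pts_a, pts_b)       # (N, M) pairwise 3D distances
    nn_idx = dists.argmin(dim=1)            # positive match for each A point
    logits = feat_a @ feat_b.t() / tau      # cross-view feature similarities
    return F.cross_entropy(logits, nn_idx)

# Toy usage: two views with 512 points and 256-d distilled features each.
refiner = FeatureRefiner(dim=256)
pts_a, pts_b = torch.rand(512, 3), torch.rand(512, 3)
feat_a = refiner(torch.randn(512, 256))
feat_b = refiner(torch.randn(512, 256))
loss = pairwise_contrastive_loss(feat_a, feat_b, pts_a, pts_b)
loss.backward()
```

Pulling corresponding points together across views (and pushing non-corresponding ones apart) is one way to make the lifted features view-consistent, which is the property the abstract attributes to the refinement step.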

Recommended citation: Wang, Q., Zhang, H., Deng, C., You, Y., Dong, H., Zhu, Y., & Guibas, L. (2024). SparseDFF: Sparse-View Feature Distillation for One-Shot Dexterous Manipulation. ICLR 2024.