Img2CAD: Reverse Engineering 3D CAD Models from Images through VLM-Assisted Conditional Factorization

Published in arXiv, 2024

Img2CAD reconstructs 3D CAD models from single-view images. Our method uses a large vision-language model (VLM) such as GPT-4V for semantic guidance and TrAssembler, a transformer-based network, for continuous attribute prediction, producing accurate and editable CAD outputs from a single image. We also provide a newly curated dataset, CAD-ified from ShapeNet, covering diverse everyday objects.
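
One way to read the "conditional factorization" in the title, offered as a sketch rather than the paper's exact formulation: treat the CAD model as a discrete structure $S$ together with continuous attributes $A$, and let $I$ be the input image. The symbols $S$, $A$, and $I$ are assumed here for illustration only. The chain rule of probability then gives:

$$
p(S, A \mid I) = p(S \mid I)\, p(A \mid S, I)
$$

Under this reading, the VLM's semantic guidance corresponds to the first factor, and TrAssembler's continuous attribute prediction, conditioned on that guidance, corresponds to the second.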

Recommended citation: You, Y., Uy, M.A., Han, J., Thomas, R., Zhang, H., You, S., & Guibas, L. (2024). Img2CAD: Reverse Engineering 3D CAD Models from Images through VLM-Assisted Conditional Factorization. ECCV 2024.