Img2CAD: Reverse Engineering 3D CAD Models from Images through VLM-Assisted Conditional Factorization

Published in SIGGRAPH Asia 2025

Img2CAD reconstructs 3D CAD models from single-view images. Our method uses a large vision-language model (VLM) such as GPT-4V for semantic guidance and TrAssembler, a transformer-based network, for continuous attribute prediction, producing accurate and editable CAD outputs from common image inputs. We also provide a newly curated dataset of diverse everyday objects, CAD-ified from ShapeNet.

Recommended citation: You, Y., Uy, M.A., Han, J., Thomas, R., Zhang, H., Du, Y., Chen, H., Engelmann, F., You, S., & Guibas, L. (2025). Img2CAD: Reverse Engineering 3D CAD Models from Images through VLM-Assisted Conditional Factorization. SIGGRAPH Asia 2025.