TY - JOUR
T1 - 3D-RVP
T2 - A method for 3D object reconstruction from a single depth view using voxel and point
AU - Zhao, Meihua
AU - Xiong, Gang
AU - Zhou, Meng Chu
AU - Shen, Zhen
AU - Wang, Fei Yue
N1 - Author Information:
Fei-Yue Wang (Fellow, IEEE) received his Ph.D. degree in computer and systems engineering from the Rensselaer Polytechnic Institute, Troy, NY, USA, in 1990. He joined The University of Arizona in 1990 and became a Professor and the Director of the Robotics and Automation Laboratory and the Program in Advanced Research for Complex Systems. In 1999, he founded the Intelligent Control and Systems Engineering Center at the Institute of Automation, Chinese Academy of Sciences (CAS), Beijing, China, under the support of the Outstanding Chinese Talents Program from the State Planning Council, and in 2002, was appointed as the Director of the Key Laboratory of Complex Systems and Intelligence Science, CAS. In 2011, he became the State Specially Appointed Expert and the Director of the State Key Laboratory for Management and Control of Complex Systems.
Funding Information:
This work was supported in part by the National Natural Science Foundation of China under Grants 61773382, U1909218, U1909204, 61773381, U1811463, 61872365 & 61806198; CAS Key Technology Talent Program (Zhen Shen); Chinese Guangdong's S&T Project (2019B1515120030).
Publisher Copyright:
© 2020 Elsevier B.V.
PY - 2021/3/21
Y1 - 2021/3/21
N2 - Three-dimensional object reconstruction technology has a wide range of applications, such as augmented reality, virtual reality, industrial manufacturing and intelligent robotics. Although deep learning-based 3D object reconstruction technology has developed rapidly in recent years, important problems remain to be solved. One of them is that the resolution of reconstructed 3D models is hard to improve because of memory and computational-efficiency limitations when deployed on resource-limited devices. In this paper, we propose 3D-RVP to reconstruct a complete and accurate 3D geometry from a single depth view, where R, V and P represent Reconstruction, Voxel and Point, respectively. It is a novel two-stage method that combines a 3D encoder-decoder network with a point prediction network. In the first stage, we propose a 3D encoder-decoder network with residual learning to output coarse prediction results. In the second stage, we propose an iterative subdivision algorithm to predict the labels of adaptively selected points. The proposed method can output high-resolution 3D models while adding only a small number of parameters. Experiments are conducted on widely used benchmarks from the ShapeNet dataset, in which four categories of models are selected to test the performance of neural networks. Experimental results show that our proposed method outperforms state-of-the-art methods and achieves about 2.7% improvement in terms of the intersection-over-union metric.
AB - Three-dimensional object reconstruction technology has a wide range of applications, such as augmented reality, virtual reality, industrial manufacturing and intelligent robotics. Although deep learning-based 3D object reconstruction technology has developed rapidly in recent years, important problems remain to be solved. One of them is that the resolution of reconstructed 3D models is hard to improve because of memory and computational-efficiency limitations when deployed on resource-limited devices. In this paper, we propose 3D-RVP to reconstruct a complete and accurate 3D geometry from a single depth view, where R, V and P represent Reconstruction, Voxel and Point, respectively. It is a novel two-stage method that combines a 3D encoder-decoder network with a point prediction network. In the first stage, we propose a 3D encoder-decoder network with residual learning to output coarse prediction results. In the second stage, we propose an iterative subdivision algorithm to predict the labels of adaptively selected points. The proposed method can output high-resolution 3D models while adding only a small number of parameters. Experiments are conducted on widely used benchmarks from the ShapeNet dataset, in which four categories of models are selected to test the performance of neural networks. Experimental results show that our proposed method outperforms state-of-the-art methods and achieves about 2.7% improvement in terms of the intersection-over-union metric.
KW - 3D object reconstruction
KW - Encoder-decoder network
KW - Machine learning
KW - Point prediction network
UR - http://www.scopus.com/inward/record.url?scp=85097471582&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85097471582&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2020.10.097
DO - 10.1016/j.neucom.2020.10.097
M3 - Article
AN - SCOPUS:85097471582
SN - 0925-2312
VL - 430
SP - 94
EP - 103
JO - Neurocomputing
JF - Neurocomputing
ER -