TY - JOUR
T1 - A 3D atrous convolutional long short-term memory network for background subtraction
AU - Hu, Zhihang
AU - Turki, Turki
AU - Phan, Nhathai
AU - Wang, Jason T.L.
N1 - Funding Information:
This project was funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, Saudi Arabia under grant no. (KEP-3-611-39). The authors, therefore, acknowledge with thanks DSR technical and financial support.
Publisher Copyright:
© 2013 IEEE.
PY - 2018/7/27
Y1 - 2018/7/27
N2 - Background subtraction, or foreground detection, is a challenging problem in video processing. This problem is mainly concerned with a binary classification task, which designates each pixel in a video sequence as belonging to either the background or foreground scene. Traditional approaches for tackling this problem lack the power of capturing deep information in videos from a dynamic environment encountered in real-world applications, thus often achieving low accuracy and unsatisfactory performance. In this paper, we introduce a new 3-D atrous convolutional neural network, used as a deep visual feature extractor, and stack convolutional long short-term memory (ConvLSTM) networks on top of the feature extractor to capture long-term dependences in video data. This novel architecture is named a 3-D atrous ConvLSTM network. The new network can capture not only deep spatial information but also long-term temporal information in the video data. We train the proposed 3-D atrous ConvLSTM network with focal loss to tackle the class imbalance problem commonly seen in background subtraction. Experimental results on a wide range of videos demonstrate the effectiveness of our approach and its superiority over existing methods.
AB - Background subtraction, or foreground detection, is a challenging problem in video processing. This problem is mainly concerned with a binary classification task, which designates each pixel in a video sequence as belonging to either the background or foreground scene. Traditional approaches for tackling this problem lack the power of capturing deep information in videos from a dynamic environment encountered in real-world applications, thus often achieving low accuracy and unsatisfactory performance. In this paper, we introduce a new 3-D atrous convolutional neural network, used as a deep visual feature extractor, and stack convolutional long short-term memory (ConvLSTM) networks on top of the feature extractor to capture long-term dependences in video data. This novel architecture is named a 3-D atrous ConvLSTM network. The new network can capture not only deep spatial information but also long-term temporal information in the video data. We train the proposed 3-D atrous ConvLSTM network with focal loss to tackle the class imbalance problem commonly seen in background subtraction. Experimental results on a wide range of videos demonstrate the effectiveness of our approach and its superiority over existing methods.
KW - 3D atrous convolution
KW - Background subtraction
KW - convolutional LSTM network
KW - deep learning
KW - foreground segmentation
UR - http://www.scopus.com/inward/record.url?scp=85050731927&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050731927&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2018.2861223
DO - 10.1109/ACCESS.2018.2861223
M3 - Article
AN - SCOPUS:85050731927
SN - 2169-3536
VL - 6
SP - 43450
EP - 43459
JO - IEEE Access
JF - IEEE Access
M1 - 8423055
ER -