Pedestrian Detection in RGB-D Data Using Deep Autoencoders
Pavel Aleksandrovich Kazantsev and Pavel Vyacheslavovich Skribtsov
DOI: 10.3844/ajassp.2015.847.856
American Journal of Applied Sciences
Volume 12, Issue 11
The recent popularity of RGB-D sensors stems mostly from the fact that RGB images and depth maps complement each other in machine vision tasks such as object detection and recognition. This article addresses the problem of fusing RGB and depth data for pedestrian detection. We propose a pedestrian detection algorithm that fuses the outputs of 2D and 3D detectors based on deep autoencoders. The outputs are fused by a neural network classifier trained on a dataset whose entries are pairs of reconstruction errors from the 2D and 3D autoencoders. Experimental results show that fusing the outputs almost entirely eliminates false accepts (precision is 99.8%) and brings recall to 93.2% when tested on a combined dataset that includes many samples with significantly distorted human silhouettes. Although we use walking pedestrians as the objects of interest, the algorithm contains few pedestrian-specific processing blocks, so in general it can be applied to any type of object.
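The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a single logistic unit stands in for the neural network classifier, the reconstruction error is assumed to be a mean squared error, and the error pairs are synthetic values generated only for illustration (pedestrian samples are assumed to reconstruct well, so their errors are low). All function names are hypothetical.

```python
import numpy as np

def reconstruction_error(x, x_hat):
    """Mean squared error between an input and its reconstruction (assumed metric)."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return float(np.mean((x - x_hat) ** 2))

def train_fusion_classifier(errors, labels, lr=0.5, epochs=2000):
    """Fit a logistic unit on (2D error, 3D error) pairs by gradient descent.

    This is a simplified stand-in for the neural network classifier
    mentioned in the abstract, not the architecture used in the paper.
    """
    X = np.asarray(errors, dtype=float)
    y = np.asarray(labels, dtype=float)
    mean, std = X.mean(axis=0), X.std(axis=0) + 1e-12
    Xn = (X - mean) / std                      # standardize both error channels
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(Xn @ w + b)))  # predicted pedestrian probability
        grad = p - y                              # gradient of the log loss
        w -= lr * (Xn.T @ grad) / len(y)
        b -= lr * grad.mean()
    return {"w": w, "b": b, "mean": mean, "std": std}

def predict(model, errors):
    """Return True where the fused score indicates a pedestrian."""
    Xn = (np.asarray(errors, dtype=float) - model["mean"]) / model["std"]
    return (Xn @ model["w"] + model["b"]) > 0

# Synthetic error pairs (illustrative only, not data from the paper).
rng = np.random.default_rng(0)
pedestrians = rng.normal([0.1, 0.1], 0.03, size=(200, 2))  # low errors
background = rng.normal([0.5, 0.5], 0.10, size=(200, 2))   # high errors
errors = np.vstack([pedestrians, background])
labels = np.concatenate([np.ones(200), np.zeros(200)])

model = train_fusion_classifier(errors, labels)
accuracy = float(np.mean(predict(model, errors) == labels.astype(bool)))
print(f"training accuracy on synthetic pairs: {accuracy:.3f}")
```

The point of the sketch is the input representation: each candidate window is reduced to two numbers, one per autoencoder, and the classifier learns a decision boundary in that two-dimensional error space instead of applying a fixed threshold to each detector separately.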
© 2015 Pavel Aleksandrovich Kazantsev and Pavel Vyacheslavovich Skribtsov. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.