Monocular Visual Object 3D Localization in Road Scenes
This is a paper published at ACM Multimedia 2019 (Long Oral). [PDF Available Here]
Problems to Solve
- Accurately localize the 3D positions of objects in videos captured by a camera mounted on an autonomous vehicle.
- Adaptively estimate the ground plane in each frame for more robust 3D object localization.
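To make the two goals above concrete, here is a minimal geometric sketch, not the code from the paper. It assumes a standard pinhole camera with the y axis pointing down and a ground plane written as `a*x + b*y + c*z + d = 0` in camera coordinates. The function names, the intrinsics, and the 1.65 m camera height are illustrative values chosen for this example.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]
Plane = Tuple[float, float, float, float]  # (a, b, c, d) with a*x + b*y + c*z + d = 0


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Map pixel (u, v) with metric depth Z to camera coordinates (X, Y, Z)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def intersect_ground(u: float, v: float, plane: Plane,
                     fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Intersect the viewing ray through pixel (u, v) with the ground plane."""
    a, b, c, d = plane
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)  # ray direction with unit Z
    denom = a * ray[0] + b * ray[1] + c * ray[2]
    if abs(denom) < 1e-9:
        raise ValueError("ray is parallel to the ground plane")
    t = -d / denom
    if t <= 0:
        raise ValueError("intersection is behind the camera")
    return (t * ray[0], t * ray[1], t * ray[2])


# Illustrative intrinsics (made-up values, not a real calibration).
FX, FY, CX, CY = 700.0, 700.0, 600.0, 180.0

# Case 1: a depth estimate is available for the pixel.
point = backproject(700.0, 200.0, 10.0, FX, FY, CX, CY)

# Case 2: only the ground plane is known. A flat road 1.65 m below the
# camera (y axis pointing down) is the plane -y + 1.65 = 0.
flat_road: Plane = (0.0, -1.0, 0.0, 1.65)
foot = intersect_ground(600.0, 345.0, flat_road, FX, FY, CX, CY)
```

The second function shows why the ground plane matters: a small error in the plane parameters shifts the intersection point, and the shift grows with distance, so a per-frame plane estimate can help when the road is not flat or the vehicle pitches.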
Framework
- Monocular depth estimation or other 3D sensors to obtain depth information.
- Object depth histogram analysis or 3D point cloud clustering for object depth initialization.
- Adaptive ground plane estimation taking advantage of sparse and dense ground features.
- Tracklet smoothing using the results from multi-object tracking.
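The steps above can be sketched with simplified stand-ins. This is an illustration only and does not reproduce the method in the paper: depth initialization takes the most populated histogram bin, the ground plane is a plain least-squares fit of `y = a*x + b*z + c`, and tracklet smoothing is a centered moving average. All function names, the bin width, and the window size are assumptions made for this example.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def init_object_depth(depths: Sequence[float], bin_width: float = 0.5) -> float:
    """Return the mean depth of the most populated histogram bin."""
    valid = [d for d in depths if d > 0]
    if not valid:
        raise ValueError("no valid depth samples")
    bins = Counter(int(d // bin_width) for d in valid)
    best_bin, _ = bins.most_common(1)[0]
    in_bin = [d for d in valid if int(d // bin_width) == best_bin]
    return sum(in_bin) / len(in_bin)


def _det3(m: List[List[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_ground_plane(points: Sequence[Point3D]) -> Tuple[float, float, float]:
    """Least-squares fit of y = a*x + b*z + c to 3D ground points."""
    rows = [(x, z, 1.0) for x, _, z in points]
    ys = [y for _, y, _ in points]
    # Normal equations (A^T A) p = A^T y, solved with Cramer's rule.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    det = _det3(ata)
    if abs(det) < 1e-12:
        raise ValueError("ground points are degenerate")
    params = []
    for k in range(3):
        m = [row[:] for row in ata]
        for i in range(3):
            m[i][k] = aty[i]
        params.append(_det3(m) / det)
    return (params[0], params[1], params[2])


def smooth_tracklet(positions: Sequence[Point3D], window: int = 3) -> List[Point3D]:
    """Centered moving average over the 3D positions of one tracked object."""
    half = window // 2
    smoothed: List[Point3D] = []
    for i in range(len(positions)):
        seg = positions[max(0, i - half):min(len(positions), i + half + 1)]
        n = len(seg)
        smoothed.append((sum(p[0] for p in seg) / n,
                         sum(p[1] for p in seg) / n,
                         sum(p[2] for p in seg) / n))
    return smoothed
```

In this sketch, `init_object_depth` would be called on the depth values inside one object mask, `fit_ground_plane` on 3D points believed to lie on the road, and `smooth_tracklet` on the per-frame positions of a single track ID. A robust estimator such as RANSAC would be a natural replacement for the plain least-squares fit when the ground points contain outliers.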
Quantitative Results
Localization error and time complexity for pedestrian localization on the KITTI dataset.
Localization error for vehicle localization on the KITTI dataset.
Ground plane estimation results.
Qualitative Results
Example results for pedestrian and vehicle 3D localization.
For details, please refer to our paper published at ACM Multimedia 2019.