Multiple Descriptors for Visual Odometry Trajectory Estimation

Mohammed Salameh, Azizi Abdullah, Shahnorbanun Sahran

Abstract


Visual Simultaneous Localization and Mapping (VSLAM) systems are widely used in mobile robots for autonomous navigation. An important part of VSLAM is trajectory estimation, a component of the localisation task in which the robot estimates the camera pose in order to align the visited image locations precisely. The poses are estimated using Visual Odometry Trajectory Estimation (VOTE), which extracts distinctive and trackable keypoints from the sequence of images captured at the locations the robot has visited. One of the most popular solutions for visual trajectory estimation is arguably PnP-RANSAC, which uses a feature descriptor such as SURF to extract keypoints and match them in pairs based on their descriptors. However, sensor noise and highly fluctuating scenes are an inevitable shortcoming that reduces the performance of a single visual descriptor in extracting distinctive and trackable keypoints. This paper therefore proposes a method that uses a random sampling scheme to combine the results of multiple keypoint descriptors. The scheme extracts the best keypoints from the SIFT, SURF and ORB keypoint detectors based on their keypoint response values. These keypoints are combined and refined based on Euclidean distances. The combined keypoints, with their corresponding visual descriptors, are then used in VOTE, which reduces the trajectory estimation errors. The proposed algorithm is evaluated on the widely used KITTI benchmark dataset, from which the three longest sequences are selected: 00 with 4541 images, 02 with 2761 images and 05 with 1101 images. In the trajectory estimation experiment, the proposed algorithm reduces the trajectory error by 44%, 8% and 13% on KITTI sequences 00, 02 and 05 respectively, based on translational and rotational errors.
The proposed algorithm also reduces the number of keypoints used in VOTE compared with the state-of-the-art RTAB-Map.
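
The selection step described in the abstract (keep the strongest keypoints per detector by response value, then merge and refine them by Euclidean distance) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Keypoint` class, the function names, and the parameters `k` and `min_dist` are hypothetical, and the abstract does not specify how the random sampling scheme, descriptor matching or PnP-RANSAC stages are configured, so those stages are omitted here. Because response values from different detectors are not on a common scale, the sketch ranks keypoints only within each detector.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Keypoint:
    """Hypothetical minimal keypoint record (not an OpenCV type)."""
    x: float
    y: float
    response: float  # detector-specific strength score
    detector: str    # e.g. "SIFT", "SURF" or "ORB"


def top_by_response(keypoints, k):
    """Return the k keypoints with the highest response value."""
    return sorted(keypoints, key=lambda kp: kp.response, reverse=True)[:k]


def combine_keypoints(per_detector, k, min_dist):
    """Merge the top-k keypoints of each detector into one list.

    per_detector maps a detector name to its list of keypoints.
    A candidate is dropped when an already accepted keypoint lies
    closer than min_dist pixels (Euclidean distance), which is an
    assumed form of the distance-based refinement.
    """
    kept = []
    for keypoints in per_detector.values():
        for cand in top_by_response(keypoints, k):
            too_close = any(
                hypot(cand.x - other.x, cand.y - other.y) < min_dist
                for other in kept
            )
            if not too_close:
                kept.append(cand)
    return kept
```

In a full pipeline, the merged keypoints and their descriptors would then be matched across frames and passed to a PnP-RANSAC pose solver; the detector order used here to resolve near-duplicates is an arbitrary choice of this sketch.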


Keywords


Visual Odometry, Trajectory Estimation, Structure from Motion, RANSAC, Selection scheme, Feature Matching.


DOI: http://dx.doi.org/10.18517/ijaseit.8.4-2.6834




Published by INSIGHT - Indonesian Society for Knowledge and Human Development