Object Detection in X-ray Images Using Transfer Learning with Data Augmentation

Reagan L. Galvez, Elmer P. Dadios, Argel A. Bandala, Ryan Rhay P. Vicerra


Object detection in X-ray images is an interesting problem in the field of machine vision. Objects in an X-ray image are usually occluded by other objects or overlap with themselves, which makes classification and localization challenging. Furthermore, X-ray data are difficult to obtain, and the available datasets are small compared with photographic images from digital cameras. Reliable object detection in X-ray images is vital because it can serve as decision support for detecting threat items such as improvised explosive devices (IEDs) in airports, train stations, and other public places. Accurate detection of IED components requires an expert and can only be achieved through extensive training. Manual inspection is also tedious, and the probability of missed detections increases because many pieces of baggage are scanned in a short period of time. As a solution, this paper applies different object detection techniques (Faster R-CNN, SSD, R-FCN) and feature extractors (ResNet, MobileNet, Inception, Inception-ResNet) based on convolutional neural networks (CNNs) to a novel IEDXray dataset for the detection of IED components. The IEDXray dataset consists of X-ray images of IED replicas without the explosive material. Transfer learning with data augmentation was performed because the available X-ray data were too limited to train the whole network from scratch. Evaluation results showed that individual detection achieved 99.08% average precision (AP) in mortar detection and 77.29% mean average precision (mAP) on three IED components.
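One practical detail of data augmentation for detection (as opposed to plain classification) is that the ground-truth bounding boxes must be transformed together with the pixels, or the labels no longer match the image. The abstract does not list the specific transforms used, so as a hedged illustration only, here is a minimal sketch of a horizontal flip, assuming boxes are given as `[x_min, y_min, x_max, y_max]` in pixel coordinates:

```python
def hflip_image_and_boxes(image, boxes):
    """Horizontally flip an image (a list of pixel rows) and its boxes.

    For a flip about the vertical axis, a pixel at column x moves to
    column (width - 1 - x); for box *edges* measured in continuous
    pixel coordinates, x maps to (width - x), and the min/max roles of
    the two x-coordinates swap.
    """
    width = len(image[0])
    flipped_image = [row[::-1] for row in image]
    flipped_boxes = [
        [width - x_max, y_min, width - x_min, y_max]
        for x_min, y_min, x_max, y_max in boxes
    ]
    return flipped_image, flipped_boxes


# Example: a 4-pixel-wide image with one box in its left half.
img = [[1, 2, 3, 4],
       [5, 6, 7, 8]]
flipped_img, flipped_boxes = hflip_image_and_boxes(img, [[0, 0, 1, 2]])
# The box moves to the mirrored position on the right: [3, 0, 4, 2]
```

Libraries such as `imgaug` or `albumentations` perform this bookkeeping automatically, but the sketch shows why augmentation pipelines for detection need box-aware transforms rather than image-only ones.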


Keywords: Convolutional neural networks; data augmentation; object detection; transfer learning; X-ray image





DOI: http://dx.doi.org/10.18517/ijaseit.9.6.9960



Published by INSIGHT - Indonesian Society for Knowledge and Human Development