Exploration of The Impact of Kernel Size for YOLOv5-based Object Detection on Quadcopter

Rissa Rahmania; Felix Corputty; Suryo Adhi Wibowo; Dany Eka Saputra; Annisa Istiqomah

doi:10.30630/joiv.6.3.898

Rissa Rahmania - Bina Nusantara University, Bandung Campus, Jakarta, Indonesia
Felix Corputty - Telkom University, Bandung, Indonesia
Suryo Wibowo - Telkom University, Bandung, Indonesia
Dany Saputra - Bina Nusantara University, Bandung Campus, Jakarta, Indonesia
Annisa Istiqomah - Bina Nusantara University, Bandung Campus, Jakarta, Indonesia

DOI: http://dx.doi.org/10.30630/joiv.6.3.898

Abstract

Drones, or quadcopters, are widely used in many fields in combination with deep learning, especially for object detection. However, characteristics of drone vision such as occlusion and small objects still limit detection accuracy and speed, and they remain under study. The YOLO architecture is commonly used where high-speed detection is required. To address the limitations of drone vision, this paper explores the kernel size of the shallowest convolutional layer in the YOLOv5s backbone to achieve better performance. The kernel is the filter that produces the feature map, and its size defines the dimensions of the convolution matrix; the features produced by the shallowest convolutional layer are more representative for object detection and recognition. The work is divided into three stages: (1) data preprocessing, which involves augmentation and normalization of the data; (2) kernel size exploration in the shallowest convolutional layer of YOLOv5s; and (3) model implementation in a real environment using the quadcopter. The dataset consists of four classes (dragon fruit, snake fruit, banana, and pineapple) with a total of 8000 samples. The kernel size exploration gives promising results: kernel sizes 5 and 7 give an mAP of 0.988. These results suggest that kernel size modification merits more in-depth investigation, for example together with the number of epochs, the padding scheme, and other optimization techniques.
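
To make the role of the kernel size concrete, the following minimal sketch compares kernel sizes 3, 5, and 7 for a single convolutional layer using the standard output-size formula, floor((in + 2p - k) / s) + 1. The input resolution (640), channel counts (3 in, 32 out), stride (2), and padding rule (k // 2) are illustrative assumptions and are not taken from the paper; the function names are hypothetical.

```python
def conv_output_size(in_size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution: floor((in + 2p - k) / s) + 1."""
    return (in_size + 2 * padding - kernel) // stride + 1


def conv_weight_count(kernel, c_in, c_out):
    """Number of weights in a square k x k convolution (bias excluded)."""
    return kernel * kernel * c_in * c_out


def summarize(kernels=(3, 5, 7), in_size=640, c_in=3, c_out=32, stride=2):
    """Compare kernel sizes for one layer.

    All default values are assumptions chosen for illustration only;
    they are not the configuration reported by the authors.
    """
    rows = []
    for k in kernels:
        padding = k // 2  # assumed padding rule that keeps the output size fixed
        rows.append({
            "kernel": k,
            "padding": padding,
            "out_size": conv_output_size(in_size, k, stride, padding),
            "weights": conv_weight_count(k, c_in, c_out),
        })
    return rows


if __name__ == "__main__":
    for row in summarize():
        print(row)
```

Under these assumptions the output resolution stays the same for all three kernel sizes, while the number of weights in the layer grows with the square of the kernel size. A larger kernel in the first layer therefore widens the receptive field at a modest parameter cost.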

Keywords

YOLOv5; object detection; kernel size; quadcopter; deep learning.
