Improving Badminton Player Detection Using YOLOv3 with Different Training Heuristic

Muhammad Haq - Tokyo Metropolitan University, Japan
Norio Tagawa - Tokyo Metropolitan University, Japan


Research and development in computer vision have grown considerably over the past two decades. One of the most critical processes in computer vision is visual tracking, the practice of following an individual moving object, or a group of moving objects, over time with a camera. In a badminton match, identifying and associating target elements across consecutive video frames requires visual object tracking. This study aims to detect badminton players using the You Only Look Once (YOLO) technique in conjunction with a variety of training heuristics. YOLO has several advantages over other object detection approaches, of which the convolutional neural network and the Fast convolutional neural network are two examples. In this study, a neural network predicts the bounding boxes and the class probabilities for those boxes. The results showed that it recognized images far faster than the other methods. Implementing a variety of training strategies for object detection significantly improved the performance of the image classification networks. As a direct result, the mean average precision (mAP) of YOLOv3 with the training heuristics increased from 32.0 to 36.0. Future work might compare YOLOv3 with alternative models such as Faster R-CNN or RetinaNet.
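As a rough illustration of the metric quoted in the abstract, the sketch below computes intersection over union (IoU) between boxes and a simplified, non-interpolated average precision for a single class on a single image. It is not the authors' evaluation code: the function names, the `(x1, y1, x2, y2)` box format, and the 0.5 IoU threshold are assumptions made for this example, and benchmark mAP protocols typically use interpolated precision averaged over classes and, in some cases, over several IoU thresholds.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """Simplified AP for one class (assumed here: 'player').

    detections: list of (confidence, box) pairs.
    ground_truth: list of boxes.
    Each ground-truth box can be matched by at most one detection.
    """
    if not ground_truth:
        return 0.0
    matched = set()
    hits = []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(True)
        else:
            hits.append(False)
    # Average the precision measured at each true positive.
    true_positives = 0
    precision_sum = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / len(ground_truth)


if __name__ == "__main__":
    players = [(0, 0, 10, 10), (20, 20, 30, 30)]
    predicted = [
        (0.9, (0, 0, 10, 10)),
        (0.8, (50, 50, 60, 60)),
        (0.7, (21, 21, 30, 30)),
    ]
    print(average_precision(predicted, players))
```

In this toy case the first and third detections match the two players and the second is a false positive, so precision is averaged over the two matched ranks. A mean average precision would then average such per-class values over all classes in the dataset.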


Keywords: multiple object tracking; convolutional neural network; different training heuristic.





Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

JOIV : International Journal on Informatics Visualization
ISSN 2549-9610  (print) | 2549-9904 (online)
Organized by Department of Information Technology - Politeknik Negeri Padang, and Institute of Visual Informatics - UKM and Soft Computing and Data Mining Centre - UTHM