Improved Image Classification Task Using Enhanced Visual Geometry Group of Convolution Neural Networks

Nurzarinah Zakaria - Universiti Tun Hussein Onn Malaysia, Batu Pahat, Johor, 86400, Malaysia
Yana Mazwin Mohmad Hassim - Universiti Tun Hussein Onn Malaysia, Batu Pahat, Johor, 86400, Malaysia


DOI: http://dx.doi.org/10.30630/joiv.7.4.01752

Abstract


Convolutional Neural Networks (CNNs) have become essential for solving image classification tasks. One of the most frequently used CNN models for image classification is the Visual Geometry Group (VGG) network. The VGG architecture consists of multiple convolution and pooling layers followed by fully connected layers. Among the various VGG models, the VGG16 architecture has gained great attention due to its remarkable performance and simplicity. However, VGG16 still has a very large number of parameters, which makes the architecture complex, lengthens execution time, and increases the risk of overfitting, which in turn can degrade classification accuracy. This study proposes an enhancement of the VGG16 architecture to overcome these drawbacks. The enhancement involves reducing the number of convolution blocks, adding batch normalization (BN) layers, and integrating a global average pooling (GAP) layer together with additional dense and dropout layers. Experiments were carried out on six benchmark image classification datasets. The results show that the proposed network has 79% fewer parameters than the standard VGG16, and the proposed model also yields better classification accuracy and shorter execution time. Reducing the number of parameters allows more efficient computation and memory usage. Overall, the proposed improved VGG architecture offers a promising solution to the long execution times and excessive memory usage of the standard VGG16 architecture.
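The abstract only names the architectural changes, so the following is a minimal, illustrative Keras sketch of that kind of modification, not the authors' implementation: fewer VGG-style convolution blocks, batch normalization after each convolution, and a global-average-pooling head with dense and dropout layers. The block count (three), filter sizes, dense-layer width, dropout rate, input shape, and number of classes are assumptions for illustration, not values taken from the paper.

import tensorflow as tf
from tensorflow.keras import layers, models


def build_reduced_vgg(input_shape=(224, 224, 3), num_classes=10):
    """VGG-style network with fewer blocks, BN, and a GAP head (illustrative only)."""
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    # Three convolution blocks instead of VGG16's five (assumed reduction).
    for filters in (64, 128, 256):
        model.add(layers.Conv2D(filters, (3, 3), padding="same"))
        model.add(layers.BatchNormalization())  # BN layers described in the abstract
        model.add(layers.Activation("relu"))
        model.add(layers.Conv2D(filters, (3, 3), padding="same"))
        model.add(layers.BatchNormalization())
        model.add(layers.Activation("relu"))
        model.add(layers.MaxPooling2D((2, 2)))
    # GAP head with additional dense and dropout layers, as described in the
    # abstract; avoiding VGG16's large fully connected layers is a plausible
    # source of the reported parameter reduction.
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dense(256, activation="relu"))  # assumed width
    model.add(layers.Dropout(0.5))                   # assumed rate
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model


model = build_reduced_vgg()
model.summary()

Comparing this summary's parameter count against tf.keras.applications.VGG16(weights=None, classes=10).summary() gives a rough sense of the kind of reduction the abstract reports, although the exact 79% figure depends on the paper's actual configuration.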


Keywords


Convolutional Neural Networks; computer vision; deep learning; image classification; Visual Geometry Group
