Intra-frame Based Video Compression Using Deep Convolutional Neural Network (DCNN)

Arief Putra - Politeknik Negeri Samarinda, Samarinda, 75131, Indonesia
Achmad Gaffar - Politeknik Negeri Samarinda, Samarinda, 75131, Indonesia
Muhammad Sumadi - Universitas Muhammadiyah Kalimantan Timur, Samarinda, Indonesia
Lisa Setiawati - Politeknik Negeri Samarinda, Samarinda, 75131, Indonesia

In principle, a video codec is built from a combination of algorithms and their refinements. The next generation of codecs involves more applications of artificial intelligence. A DCNN (Deep Convolutional Neural Network) is a multi-layer neural network that follows the deep learning approach. This study proposes a DCNN with three hidden layers for intra-frame-based video compression; DCT and fractal methods serve as baselines for comparing its performance. The training image, obtained by averaging all down-sampled frames, is divided into square blocks by shifting a square block across the image until the whole image is covered. All pixels in each block act as one input data pattern. After training, the DCNN is used to construct the feature and sub-feature images, which are obtained by applying a max function over the feature bank and sub-feature bank. With specific manipulation techniques, these feature and sub-feature images act both as a spatial redundancy minimizer and as a quantizer, without converting the frame's pixels to a bit-stream. The result of this process is a compressed image. Experiments on the entire dataset yielded an AAPR (Average Approximate Performance Ratio) of 147.71%, that is, on average about 1.5 times better than the other methods. In further studies, the proposed DCNN could be improved by modifying its structure so that it outputs the feature and sub-feature images directly, or by combining it with the DCT or fractal method.


Keywords: video codecs; intra frame; video compression; DCNN.
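
The training-data preparation described in the abstract (averaging the down-sampled frames, then shifting a square block across the result) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `training_image` and `extract_blocks`, the down-sampling by plain decimation, the block size, and the shift step are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np


def training_image(frames, factor=2):
    """Average all down-sampled frames into one training image.

    Assumption: down-sampling is plain decimation (keep every
    `factor`-th pixel); the paper may use a different method.
    """
    small = [f[::factor, ::factor].astype(np.float64) for f in frames]
    return np.mean(small, axis=0)


def extract_blocks(image, block=8, shift=8):
    """Shift a block x block window over the image until it is covered.

    Each block is flattened into one input pattern (one row of the
    returned array). The last window in each direction is clamped to
    the border so that every pixel belongs to at least one block.
    Assumption: the image is at least `block` pixels in each dimension.
    """
    h, w = image.shape
    rows = sorted(set(range(0, h - block + 1, shift)) | {h - block})
    cols = sorted(set(range(0, w - block + 1, shift)) | {w - block})
    patterns = [
        image[r:r + block, c:c + block].ravel()
        for r in rows
        for c in cols
    ]
    return np.stack(patterns)


if __name__ == "__main__":
    # Hypothetical grayscale video: 4 frames of 20 x 20 pixels.
    rng = np.random.default_rng(0)
    frames = [rng.integers(0, 256, size=(20, 20)) for _ in range(4)]

    image = training_image(frames, factor=2)          # 10 x 10 average image
    patterns = extract_blocks(image, block=4, shift=4)
    print(image.shape, patterns.shape)
```

Each row of `patterns` holds the pixels of one block and would serve as one input pattern for training. Clamping the last window to the image border is one possible reading of "until the whole image is covered"; padding the image would be another.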
