Big Mart Sales Data Visualization and Correlation

Artika Arista - Universitas Pembangunan Nasional Veteran Jakarta, Indonesia
Theresiawati Theresiawati - Universitas Pembangunan Nasional Veteran Jakarta, Indonesia
Henki Bayu Seta - Universitas Pembangunan Nasional Veteran Jakarta, Indonesia



DOI: http://dx.doi.org/10.30630/joiv.8.1.1780

Abstract


The volume of unprocessed data generated every day keeps growing, and it must be assessed effectively to yield useful results. Collecting sales data for commodities, together with their many dependent and independent attributes, is now crucial for inventory management and demand forecasting. At a Big Mart company, sales forecasting is used to estimate sales of the numerous goods stocked and supplied at multiple retail outlets in different towns. As the number of products and outlets increased drastically, forecasting them manually became increasingly difficult. It is therefore necessary to examine to what extent the relationships among several variables, including price, popularity, time of day, outlet type, and outlet location, affect the appeal of a product. In this research, the data were cleaned, visualized with scatter plots, and analyzed using Pearson correlations. The raw data for this case study, the Big Mart sales data, were taken from the Kaggle website [https://www.kaggle.com/datasets/sandeepgauti/bigmart-sales]. The Pearson correlation test indicates no meaningful relationship between Item_Weight and Item_Outlet_Sales. There is a strong negative correlation between Item_Visibility and Item_Outlet_Sales: as Item_Visibility increases, Item_Outlet_Sales tends to decrease. A positive relationship exists between Item_MRP and Item_Outlet_Sales. In addition to the correlation test, descriptive statistical analysis is performed. With this simple data processing, the raw data become better organized and easier to analyze, read, and use.
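The correlation step described above can be illustrated with a minimal sketch. It is not the authors' code: the rows below are made-up values rather than records from the Kaggle dataset, the helper names `pearson_r` and `fill_missing_with_mean` are hypothetical, and mean imputation of missing Item_Weight values is only an assumed cleaning rule. Only the column names come from the abstract.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries
    (an assumed cleaning rule, not necessarily the one used in the paper)."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


# Made-up illustrative rows, NOT taken from the Kaggle dataset.
rows = {
    "Item_Weight": [9.3, None, 17.5, 19.2, 8.9, 10.4],
    "Item_Visibility": [0.016, 0.019, 0.017, 0.070, 0.050, 0.120],
    "Item_MRP": [249.8, 48.3, 141.6, 182.1, 53.9, 51.4],
    "Item_Outlet_Sales": [3735.1, 443.4, 2097.3, 2732.4, 994.7, 556.6],
}

# Cleaning step: fill the missing weight before computing correlations.
rows["Item_Weight"] = fill_missing_with_mean(rows["Item_Weight"])

# Correlate each predictor with the sales column.
correlations = {
    name: pearson_r(rows[name], rows["Item_Outlet_Sales"])
    for name in ("Item_Weight", "Item_Visibility", "Item_MRP")
}
for name, r in correlations.items():
    print(f"{name} vs Item_Outlet_Sales: r = {r:+.3f}")
```

Values of r near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little or none. The coefficients printed for these toy rows say nothing about the real dataset.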

Keywords


Visualization; correlation; big mart sales data; Kaggle; Pearson correlation
