Evaluation of the Performance of Kernel Non-parametric Regression and Ordinary Least Squares Regression
DOI: http://dx.doi.org/10.62527/joiv.8.3.2430
Abstract
Researchers need to understand the differences between parametric and nonparametric regression models, and how each makes use of the available information about the relationship between the response and explanatory variables and about the distribution of the random errors. This paper proposes a new kernel function for nonparametric regression and employs it within the Nadaraya-Watson kernel estimator alongside the Gaussian kernel function. The proposed kernel function (AMS) is then compared with the Gaussian kernel and with the traditional parametric method, ordinary least squares (OLS). The objective of this study is to examine the effectiveness of nonparametric regression and to identify the best-performing model among the Nadaraya-Watson kernel estimator with the proposed kernel function (AMS), the Nadaraya-Watson estimator with the Gaussian kernel, and the OLS method. The study also determines which method yields the most accurate results when analyzing nonparametric regression models and provides practical guidance for practitioners applying these techniques to real-world data. Criteria of generalized cross-validation (GCV), mean square error (MSE), and the coefficient of determination are used to select the most efficient estimated model. Simulated data with different sample sizes were used to evaluate the performance and efficiency of the estimators. The simulation results show that the Nadaraya-Watson kernel estimator with the proposed kernel function (AMS) performed better than the other methods: the highest coefficients of determination attained were 98%, 99%, and 99%, and the proposed function (AMS) yielded the lowest MSE and GCV values across all sample sizes. This suggests that the model can generate precise predictions and improve performance on the data under study.
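For illustration, the following is a minimal Python sketch of the Nadaraya-Watson estimator with the Gaussian kernel, together with the three selection criteria named above (MSE, the coefficient of determination, and GCV) and a simple OLS baseline. The simulated regression function, noise level, sample size, and bandwidth below are hypothetical placeholders, and the proposed AMS kernel is not reproduced here because its definition appears only in the full text.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2*pi)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def nadaraya_watson(x_grid, x, y, h):
    """Nadaraya-Watson estimate: weighted average of y with kernel weights."""
    weights = gaussian_kernel((x_grid[:, None] - x[None, :]) / h)
    return (weights @ y) / weights.sum(axis=1)

# --- simulated data (illustrative only; the paper's simulation design differs) ---
rng = np.random.default_rng(0)
n = 100
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)  # hypothetical true function + noise

h = 0.05                                  # bandwidth; in practice chosen to minimize GCV
y_hat = nadaraya_watson(x, x, y, h)

# Evaluation criteria: MSE and coefficient of determination (R^2).
mse = np.mean((y - y_hat)**2)
r2 = 1.0 - np.sum((y - y_hat)**2) / np.sum((y - y.mean())**2)

# GCV = n * RSS / (n - tr(S))^2, where S is the smoother ("hat") matrix
# whose rows are the normalized kernel weights.
S = gaussian_kernel((x[:, None] - x[None, :]) / h)
S = S / S.sum(axis=1, keepdims=True)
gcv = n * np.sum((y - y_hat)**2) / (n - np.trace(S))**2

# OLS baseline for comparison: simple linear fit.
b1, b0 = np.polyfit(x, y, 1)
mse_ols = np.mean((y - (b0 + b1 * x))**2)

print(f"NW:  MSE = {mse:.4f}, R^2 = {r2:.4f}, GCV = {gcv:.4f}")
print(f"OLS: MSE = {mse_ols:.4f}")
```

In practice, the bandwidth h would be selected by evaluating GCV over a grid of candidate values and keeping the minimizer, which is how the criterion serves model selection in comparisons of this kind.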