
Abstract: A pixel is the smallest element of an image produced by a digital camera and serves as the data source in digital image processing. In this paper, two data collection processes are carried out: measuring actual height with a standard stature meter, and taking sample photos with a camera placed 160 cm from the sample at a height of 100 cm. The sample photos are then processed to segment the sample body from the surrounding environment using digital image processing techniques, namely grayscale, blur, edge detection, and bounding box, to obtain a pixel value representing the height of the sample. Next, regression analysis correlates actual height with pixel height using five regression methods: least squares, logarithmic power, exponential, quadratic polynomial, and cubic polynomial. This study analyzes the differences between these methods in terms of the correlation coefficient, Root Mean Squared Error (RMSE), average error, and accuracy between the heights calculated by digital image processing and the actual measured heights. The logarithmic power method produces the best results, with a correlation coefficient, RMSE, average error, and accuracy of 0.976, 1.3, 0.58%, and 99.42%, respectively. The cubic polynomial ranks last, with a correlation coefficient, RMSE, average error, and accuracy of 0.978, 1.41, 0.64%, and 99.36%, respectively.


I. INTRODUCTION
Height measurement is routinely applied to children, adolescents, and adults to monitor growth. It is also used as an admission threshold for rides in recreational and amusement parks. In practice, height can be measured with a manual instrument such as a stature meter. However, this becomes a problem when the person is taller than the measuring device. The process also requires close contact between the operator and the person being measured, which can be risky, especially under the social distancing conditions of the COVID-19 pandemic.
Several previous studies have used digital techniques based on sensors. In [1], an ultrasonic sensor of the HC-SR04 type was used with a very good level of accuracy. The Kinect XBOX 360 camera sensor was used to measure height with a smallest error of 3.41% at a distance of 2 m [2]. Height measurement using a camera sensor [3], [4], [5] achieved system accuracies of 99.3%, 95.97%, and 98.42%, respectively. Although the camera sensor offers high accuracy, the conversion from the image scale in pixels to the actual scale in centimeters (cm) in these studies relies on self-derived formulas and specific multipliers.
Regression analysis is a statistical approximation technique that models observations as mathematical equations in order to find the relationship between variables, based on characteristic curve analysis of data obtained in previous real measurements. It can be used in sensor calibration, which relates the sensor output to the output of a standard measuring instrument and usually employs the least squares method, as in [6], where a Near Infrared (NIR) sensor was calibrated for monitoring blood hemoglobin with an error of 1%-6%. However, although this method is simple, it produces an equation whose output characteristic is a straight line, which is too rigid for some systems. For example, the study by Rifqi et al. [7], which looked for the relationship between the volume and weight of star fruit, achieved an accuracy of only 86.43%.
In reality, not all sensors have the same output characteristics, so other regression methods are needed for comparison. Several types of regression analysis are more complex and use a non-linear approach to produce an equation with more flexible output characteristics, namely the logarithmic power, exponential, and polynomial methods. These methods are often compared to find the equation that best represents the actual data, e.g., in determining the quality of dragon fruit [8], where the cubic polynomial regression model was found to be the best for predicting texture values, water content, total dissolved solids, and total acid, with correlation coefficients of 0.92, 0.85, 0.99, and 0.32, respectively. Zainab et al. [9] studied the relationship between RGB color and chlorophyll-a content and found the best fit to be an exponential equation with a correlation coefficient of 0.767. Other studies [10], [11] examined the relationship between RGB colors and the distribution of solid materials; the best equation in both cases was exponential, with correlation coefficients of 0.9336 and 0.9075, respectively.
In this paper, an approach not taken in previous studies in the field is proposed. It analyzes the differences in RMSE, error, accuracy, and correlation coefficient among regression methods, namely the least squares, logarithmic power, exponential, and polynomial (quadratic and cubic) methods, for a height calculation system based on digital image processing (DIP), compared against real values. The DIP starts with a camera sensor for sample data collection, and the images are processed to obtain a segmentation of the sample body. Segmentation is generally based on one of two basic properties of intensity values, i.e., discontinuity and similarity with respect to the surrounding environment [12], and yields the pixel value representing the sample height. This value is then compared with the actual height using the regression methods to obtain the equation that best represents the actual height. Hopefully, this research can be an alternative for developing an automatic height calculation system based on camera sensors that is effective, efficient, and able to reduce direct contact, especially during the COVID-19 pandemic, and can be a reference for further research in the field.

A. Materials and Tools
The hardware design stage uses the following equipment:
• A camera, used as a sensor to capture digital images.
• A tripod, to place the camera at the desired height and distance.
• A fabric background, to facilitate segmentation of the sample body from its environment.
• A height measuring instrument in the form of a stature meter, to measure the actual height of the sample.
• Illumination, to provide sufficient lighting and help eliminate shadows cast by other light sources, which can introduce noise in the DIP process.
• A PC/laptop, for software design and data analysis.

B. Research Procedure
Real data are collected from each sample in the form of a height measurement (cm) using a stature meter and a photo of the front view. The photo is then processed by DIP through a Graphical User Interface (GUI) written in the Python programming language. Python is a rapidly growing language that is widely known among programmers because its short and simple syntax is easy to learn and understand. Python is open source, so anyone can create, extend, and use libraries for various purposes, and it is now used in many fields. The Open Source Computer Vision Library (OpenCV) is commonly used in digital image processing.
According to Andrekha and Huda [13], OpenCV can be used in image or video processing to extract information from an image by processing a region of interest (ROI), i.e., taking a certain image area, such as the body area, as input. The area is then labeled and compared with labeled training data of body areas of several humans in various positions that have been learned and stored. One of the DIP techniques [14] is grayscale conversion, which converts an RGB image, made up of three overlapping pixel matrices, into a grayscale image with a single pixel matrix. The next technique is blur, which convolves the original image matrix with a filter kernel matrix to obtain a new image [15].
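The grayscale and blur steps described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the luma weights 0.299, 0.587, and 0.114 are the common ITU-R BT.601 values, the blur is a simple k x k averaging kernel, and the function names `to_grayscale` and `box_blur` are hypothetical. The source does not specify which kernel or weights were used.

```python
def to_grayscale(rgb):
    """Collapse an RGB image (rows of (r, g, b) tuples) into one intensity matrix.

    The weights are the common BT.601 luma weights (an assumption here).
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def box_blur(gray, k=3):
    """Convolve a grayscale matrix with a k x k averaging kernel.

    Near the borders, the average is taken over the neighbors that exist.
    """
    h, w = len(gray), len(gray[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [gray[y][x]
                    for y in range(max(0, i - r), min(h, i + r + 1))
                    for x in range(max(0, j - r), min(w, j + r + 1))]
            out[i][j] = sum(vals) / len(vals)
    return out


# Usage: a 3 x 3 black image with one white pixel in the center.
black, white = (0, 0, 0), (255, 255, 255)
image = [[black, black, black], [black, white, black], [black, black, black]]
blurred = box_blur(to_grayscale(image))
```

A larger kernel size `k` smooths more strongly, which is one plausible way to read the "blur levels" discussed later in the text.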
The edge detection technique determines the edges of an object by identifying boundary lines at pixels whose intensity values differ sharply from those of neighboring pixels, as reported by Utari et al. [16]. The last technique is the bounding box, a rectangle that surrounds an object and from which pixel value information is obtained. Sample photos are taken as in the schematic in Fig. 1. In this design, the camera is placed at a distance of 2 m from the background and at a height of 1 m from the floor, while the sample stands 40 cm from the background.
This research is quantitative empirical research based on real observations, comparing the height obtained from the proposed system with the results of a standard measuring instrument. System testing refers to the collected data. The sample population is divided into two groups, i.e., test data and training data. The training data group was composed so that it contains an equal number of women and men.

C. Regression Methods
The training data, with an equal ratio of men and women, consist of heights determined by DIP in pixels from images taken with the camera sensor, together with the actual heights in cm measured with the standard instrument. These data are used for regression analysis, which aims to determine a mathematical equation representing the relationship between height in pixels and height in cm. This study uses five regression analysis methods: least squares, logarithmic power, exponential, and polynomial of both second and third order. The five methods are explained as follows:
1) Least Squares Method: This method is usually known as linear regression because the resulting curve is a straight line with an equation of the form (1) [18], where $x$ denotes the height in pixels of the sample data, $y$ is the height in cm, $n$ is the number of sample data, $y'$ is the height data from the system, and $\bar{y}$ and $\bar{x}$ are the average real height from measurements and the average height in pixels, respectively.

$$y' = a + bx \qquad (1)$$
2) Logarithmic Power Method: This method has the final equation form as in (6), where the constants a and b are determined using the least squares method after a transformation with the logarithmic function in (7) [19].
$$a = 10^{A} \qquad (10)$$
where $A$ denotes the intercept of the log-transformed linear fit.
3) Exponential Method: This method has a final form as in (11). As in the logarithmic power method, the constants a and b are found using the least squares method, but the data are first transformed using the ln function, i.e., the natural logarithm [19].
4) Polynomial Method: This method is a form of nonlinear analysis whose resulting curve is either quadratic, i.e., second order as in (16), or cubic, i.e., third order as in (17) [19].
The constants a, b, c, d, and so on are found using the following equation rules:
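As a hedged illustration of the five fits, the sketch below assumes the common textbook forms y' = a + bx, y' = a·x^b, y' = a·e^(bx), and polynomials c0 + c1·x + c2·x² (+ c3·x³). These may differ in notation from (1), (6), (11), (16), and (17). The function names are hypothetical, and the polynomial constants are obtained by solving the least squares normal equations with Gauss-Jordan elimination, which is one standard choice rather than necessarily the procedure used in the study.

```python
import math


def fit_line(xs, ys):
    """Least squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b


def fit_power(xs, ys):
    """y = a * x**b, fitted as a line on (log10 x, log10 y); returns (a, b)."""
    A, b = fit_line([math.log10(x) for x in xs], [math.log10(y) for y in ys])
    return 10 ** A, b  # back-transform the intercept


def fit_exponential(xs, ys):
    """y = a * exp(b*x), fitted as a line on (x, ln y); returns (a, b)."""
    A, b = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(A), b


def fit_polynomial(xs, ys, order):
    """Coefficients [c0, c1, ...] of c0 + c1*x + ... from the normal equations."""
    m = order + 1
    # Augmented matrix: sum_j (sum x^(i+j)) * c_j = sum x^i * y
    aug = [[sum(x ** (i + j) for x in xs) for j in range(m)]
           + [sum((x ** i) * y for x, y in zip(xs, ys))] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]


# Usage with made-up data (not the study's measurements)
xs = [1, 2, 3, 4, 5]
a, b = fit_line(xs, [3, 5, 7, 9, 11])
c0, c1, c2 = fit_polynomial(xs, [1 + 2 * x + 3 * x * x for x in xs], order=2)
```

For pixel heights in the hundreds, the normal equations of the cubic fit become ill-conditioned, so centering or scaling x before fitting is advisable.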

D. Evaluation Metrics
In this evaluation, the five regression expressions based on (1), (6), (11), (16), and (17) are used to compute the output y from input x taken from the test data. Each equation yields its own set of system output heights (cm). Each set is compared with the actual data to obtain the correlation coefficient, RMSE, mean error, and accuracy for each equation.

1) Correlation Coefficient:
This coefficient (R) indicates the strength of the relationship between two data sets. Its magnitude lies in the range of 0 to 1, where 0 indicates no linear relationship between the two data sets and 1 indicates a very close relationship.
2) RMSE, Error, and Accuracy: This evaluation determines the level of accuracy of the system output by comparing the real measurement data obtained with standard tools against the output data of the system. The equations that determine RMSE, error, and accuracy are stated in (23), (24), and (26) [20], [21].
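The metrics above can be sketched as follows, assuming the usual definitions: the Pearson correlation coefficient, RMSE as the square root of the mean squared difference, error as the mean absolute percentage difference relative to the actual height, and accuracy as 100% minus that error. The last assumption is consistent with the reported pairs such as 0.58% and 99.42%, but the exact forms of (23), (24), and (26) are not reproduced here, and the function names are hypothetical.

```python
import math


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu) ** 2 for a in u)
                           * sum((b - mv) ** 2 for b in v))


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the data (cm here)."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted))
                     / len(actual))


def mean_error_percent(actual, predicted):
    """Mean absolute error relative to the actual value, in percent."""
    return 100 * sum(abs(y - p) / y for y, p in zip(actual, predicted)) / len(actual)


def accuracy_percent(actual, predicted):
    """Accuracy taken as 100% minus the mean percentage error (assumed form)."""
    return 100 - mean_error_percent(actual, predicted)


# Usage with made-up heights in cm (not the study's data)
actual = [150, 160, 170]
predicted = [151, 159, 171]
print(rmse(actual, predicted), accuracy_percent(actual, predicted))
```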

A. Hardware and GUI Development
In the hardware installation shown in Fig. 2, the camera is that of an ASUS mobile phone, accessed directly by a PC through a USB cable and additional software such as Droid Cam. Room lighting is supplemented by two additional lamp units placed on the left and right sides of the camera so that the lighting is evenly distributed, since too little or too much lighting can cause considerable noise and spoil the captured image [22]. A green background cloth measuring 2.5 m × 2.6 m is used, following the green screen concept: digital video camera sensors have the highest sensitivity to green, resulting in less noise and less need for lighting [23]. The camera is placed at a distance of 2 m from the center point of the background at a height of 1 m with the help of a tripod. This placement ensures that the captured images cover only the body of the sample and the background cloth behind it, which minimizes noise from other objects and simplifies segmentation and sample identification.
The GUI is built in the Python programming language using the "Tkinter" library. It facilitates taking photos of the samples, because the data can be captured and stored directly in the PC memory, and it also serves as the GUI design for measuring height. The GUI, shown in Fig. 3, has two main menus: "Camera," which accesses the camera directly, and "Open File," which opens photos stored in a directory. Both menus display the photo directly in the GUI, and the photo can then be stored through the save image menu.

B. Data Retrieval
Data collection is carried out in two stages: measuring the real height of each sample in cm and taking sample photos using the hardware and GUI that have been designed. The research sample comprises 70 subjects, all students aged 18-21 years with heights ranging from 140 cm to 180 cm. The 70 samples were divided into 40 training data and 30 test data. The 40 training data, consisting of 20 women and 20 men, are used for the regression analysis and the calculation of the correlation coefficient, while the other 30 data are used as test data for the calculation of RMSE, average error, and accuracy.
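One way to produce such a split is sketched below. It is purely illustrative: the text does not say how individual samples were assigned to each group, so the random shuffle, the `sex` key, and the function name `split_balanced` are all assumptions.

```python
import random


def split_balanced(samples, per_group=20, seed=0):
    """Split samples into (train, test) with `per_group` women and men in train.

    `samples` is a list of dicts with a hypothetical 'sex' key ('F' or 'M').
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    train, test = [], []
    counts = {"F": 0, "M": 0}
    for s in shuffled:
        if counts[s["sex"]] < per_group:
            train.append(s)
            counts[s["sex"]] += 1
        else:
            test.append(s)  # everything beyond the quota becomes test data
    return train, test


# Usage with 70 made-up records (the real gender mix of the 70 samples is not given)
samples = [{"id": i, "sex": "F" if i % 2 else "M"} for i in range(70)]
train, test = split_balanced(samples)
```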
The real data collection process is presented in Fig. 4. Height is measured with a standard instrument in the form of a stature meter attached to a wall at a set height. The sample stands under the instrument, and the tip of the instrument is lowered until it touches the head of the sample, giving the actual height.
Fig. 4 The process of measuring real data from samples
For the sample photos, as shown in Fig. 5, the sample stands 40 cm in front of the background without footwear. The captured images are used in the next stage, namely the DIP configuration.

C. Digital Image Processing Configuration
In this stage, the sample photos taken with the camera during data acquisition are passed through the DIP techniques. This stage is a trial-and-error process that aims to detect the body shape of the sample through segmentation and the formation of an ROI in order to obtain a pixel value representing the height of the sample. In Fig. 6, the RGB color image is converted into a gray image using the grayscale technique, which speeds up image processing in the following stages. The next step is to blur the grayscale image and apply edge detection. These two processes are closely related: blurring smooths the image and thereby reduces noise, and edge detection then reveals any noise remaining in the image so that its influence can be minimized. Fig. 7 gives an example in which each blur level is tested, from the grayscale image without blur up to a blur level that affects the edge detection result. The results are shown in Figs. 7(a)-(d). In Fig. 7(a), noise is still visible, so the next blur level, blur level 1, is applied, as shown in Fig. 7(b), where the noise is reduced. At blur level 2, shown in Fig. 7(c), all noise in the image has disappeared; although the body outline of the sample is partly cut off, it still provides the height information. At blur level 3, shown in Fig. 7(d), all noise has been removed, but the body outline is cut off more than at the previous level. Based on these tests, blur level 2 was chosen for this study, because the noise has been removed at this level while the body outline is better preserved than at blur level 3.
At the edge detection stage, an image of the human body without surrounding noise is thus obtained and used as input for the next stage, in particular to determine the ROI. To determine the ROI, the bounding box technique detects the top, bottom, right, and left edge values of the sample body in pixels from the edge detection image. These edge values are then drawn onto the initial image as line coordinates, as shown in Fig. 8, forming a box that covers the entire body of the sample. The same values give the height and width of the sample body in pixels. The pixel height is then compared with the actual height in the next stage, a regression analysis, to determine the RMSE, average error, and accuracy.
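The edge detection and bounding box steps can be illustrated with the minimal sketch below. It assumes a simple intensity-difference edge detector with a hypothetical `threshold` parameter. The actual study used OpenCV, and the specific detector and its settings are not stated, so this is a stand-in for the idea rather than the authors' implementation.

```python
def edge_map(gray, threshold=50):
    """Mark pixels where the intensity jump to the right or downward neighbor
    exceeds `threshold` (forward differences, so the mark sits just before a jump)."""
    h, w = len(gray), len(gray[0])
    edges = [[False] * w for _ in range(h)]
    for i in range(h - 1):
        for j in range(w - 1):
            gx = abs(gray[i][j + 1] - gray[i][j])
            gy = abs(gray[i + 1][j] - gray[i][j])
            edges[i][j] = max(gx, gy) > threshold
    return edges


def bounding_box(edges):
    """Return (top, bottom, left, right) enclosing all edge pixels, or None."""
    rows = [i for i, row in enumerate(edges) if any(row)]
    cols = [j for j in range(len(edges[0])) if any(row[j] for row in edges)]
    if not rows:
        return None
    return rows[0], rows[-1], cols[0], cols[-1]


def pixel_height(box):
    """Height of the bounding box in pixels (inclusive of both edge rows)."""
    top, bottom, _, _ = box
    return bottom - top + 1


# Usage: a bright 4 x 3 block on a dark 8 x 6 background stands in for the body.
gray = [[200 if 2 <= i <= 5 and 1 <= j <= 3 else 0 for j in range(6)] for i in range(8)]
box = bounding_box(edge_map(gray))
height_px = pixel_height(box)
```

The pixel height returned here is the quantity x that is fed into the regression equations.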

D. Regression Analysis
The sample data acquired in the previous process are the height in cm, taken as the real body height, and the height in pixels, taken as the image height. Of the 70 data, 40 were used as training data, consisting of 20 women and 20 men, so that the data are well distributed and represent both genders. These data are input to the regression analysis for the five methods (least squares, logarithmic power, exponential, quadratic polynomial, and cubic polynomial), where the pixel height is the independent variable on the x-axis and the real height is the dependent variable on the y-axis. The analysis for all five methods was carried out with a data processing application, namely Microsoft Excel 2013. This stage yields five regression equations and the correlation coefficient of each method with respect to the actual data.
The least squares method yields an equation with a correlation coefficient of 0.976. The logarithmic power method also yields a correlation coefficient of 0.976, and the exponential method yields 0.975. For the quadratic polynomial method, the equation obtained is
$$y' = -0.0003468x^{2} + 0.653186872x - 66.459643 \qquad (30)$$
which has a correlation coefficient of 0.976.
The cubic polynomial method gives
$$y' = 0.0000241x^{3} - 0.03425x^{2} + 16.5x - 2532.66 \qquad (31)$$
with a correlation coefficient of 0.978. All of the methods therefore have a very good correlation coefficient, of at least 0.975. When sorted, the cubic polynomial method has the best correlation coefficient, 0.978, followed by the least squares, logarithmic power, and quadratic polynomial methods with 0.976 each, and the exponential method with 0.975. This shows that the equation of each method can be said to represent the actual height data.
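As a worked illustration of how a fitted equation converts a pixel height into centimeters, the quadratic polynomial (30) can be evaluated directly. The input of 450 pixels is a made-up value chosen only for the arithmetic, and the function name is hypothetical.

```python
def height_quadratic_cm(x_px):
    """Evaluate the quadratic fit (30) for a body height of x_px pixels."""
    return -0.0003468 * x_px ** 2 + 0.653186872 * x_px - 66.459643


# -0.0003468 * 202500 + 0.653186872 * 450 - 66.459643
# = -70.227 + 293.9340924 - 66.459643
print(round(height_quadratic_cm(450), 2))  # 157.25
```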

E. RMSE, Average Error, and Accuracy
In this stage, the five equations are tested using the 30 test data; each equation produces its own output, which is then analyzed to obtain the RMSE, average error, and accuracy. The outputs of each equation for the entire test data are plotted in Figs. 9(a)-(e), which show the trends of the least squares, logarithmic power, cubic polynomial, quadratic polynomial, and exponential equations, respectively. At first glance, Figs. 9(a)-(e) show similar trends between the test data and height (cm). Based on Table I, the outputs of the five methods do not differ significantly in RMSE, average error, or accuracy, and all values are within the range considered good. When the methods are sorted by RMSE, average error, and accuracy, the best results are obtained by the logarithmic power method, with an RMSE, average error, and accuracy of 1.3, 0.58%, and 99.42%, respectively. It is followed by the least squares, quadratic polynomial, exponential, and cubic polynomial methods. The cubic polynomial method ranks last, with an RMSE, average error, and accuracy of 1.41, 0.64%, and 99.36%, respectively. Despite ranking last, the cubic polynomial method still produces better results than previous studies [4], [5], [17].
Based on the correlation coefficient, RMSE, average error, and accuracy obtained, the logarithmic power method has the best results, so this method is fed into the GUI as the method for measuring human height, as shown in Fig. 10.
The system is able to segment and identify the body of the sample properly using digital image processing techniques, namely grayscale, blur, edge detection, and bounding box, so that the edge values enclosing the entire body of the sample are obtained and represent the height and width of the sample body in pixels. The height in pixels obtained from this digital image processing step is compared with the actual height through regression analysis using the least squares, logarithmic power, cubic polynomial, quadratic polynomial, and exponential methods. Based on the analysis, the logarithmic power method is superior to the other methods, with a correlation coefficient, RMSE, average error, and accuracy of 0.976, 1.3, 0.58%, and 99.42%, respectively. The cubic polynomial method ranks last, with a correlation coefficient, RMSE, average error, and accuracy of 0.978, 1.41, 0.64%, and 99.36%, respectively. Despite ranking last, its results do not differ significantly from those of the other methods. It is hoped that this study can serve as a basis for developing an automatic height calculation system based on camera sensors that is effective and efficient. In addition, such a system could reduce direct contact, especially during the COVID-19 pandemic, and this study can also be used as a reference for further research.

Fig. 1 Taking photos of samples

Fig. 2 Hardware setup for sampling data

Fig. 3 GUI display for height measurement

Fig. 5 Photo data collection and results from a sample

Fig. 7 Various images from the blur and edge detection process

Fig. 8 Image with bounding box process

Fig. 9 The graph of the output of height against the test data sample

Fig. 10 Display of GUI for height measurement

IV. CONCLUSION

TABLE I RMSE, AVERAGE ERROR, AND ACCURACY OF EACH METHOD