Computational of Concrete Slump Model Based on H2O Deep Learning Framework and Bagging to Reduce Effects of Noise and Overfitting

Abstract: Concrete mixture design data for the concrete slump test has many characteristics and is mostly noisy. Such data affects the predictions of machine learning models. This study experiments with the H2O Deep Learning framework and Bagging to handle noisy data and avoid overfitting when creating the Concrete Slump Model. The data includes cement, blast furnace slag, fly ash, water, superplasticizer, coarse aggregate, fine aggregate, age, slump, and compressive strength. The primary data for the concrete mix design used fine aggregate from Merapi Volcano, the hills of Muntilan, and Kalioro. The coarse aggregate was obtained from Pamotan, Jepara, Semarang, Ungaran, and Mojosongo Boyolali, Central Java. The cement was from Gresik and Holcim products, and the water was from Tembalang, Semarang. The experimental model, with one input layer of 7 neurons, one hidden layer of 20 neurons, and one output layer of 1 neuron, using the TanH activation function with parameters L1=1.0E-5, L2=0.0, max weight=10.0, epsilon=1.0E-8, rho=0.99, and epoch=800, achieves an RMSE of 2.272. The results show that after introducing Bagging, the error can be reduced to approximately 2.5 RMSE (about 50% lower) compared to the model without Bagging. The manually tested mixture data was used for model evaluation, where the model achieved an RMSE of 0.568. Following this study, the model can be used for further research, such as creating slump design practicum equipment or application software.


I. INTRODUCTION
Concrete is a mixture of complex materials such as cement, water, coarse aggregates, and fine aggregates, whose behavior is affected by the characteristics of river sand, dune sand, and crushed sand [1], and by chemical components and mineral admixtures that are mixed according to a formula to improve quality. These complex materials make the concrete slump difficult to predict [2], [3]. The slump value is very important: Vakhshouri and Nejadi [4] state that slump also affects the compressive strength of concrete.
Concrete slump prediction research using conventional methods based on the British Standard, the American Concrete Institute (ACI) method, etc. is still popular [5]. Conventional methods tend to rely on an analytical model that requires ideal conditions and high-precision measurements, but those requirements are often hard to fulfill. Thus, a more advanced slump testing method must be developed. In the construction industry, digitization, new materials and technologies, and advanced automation are now becoming new trends [6], [7].
The most reliable approach to solving the complex problem of concrete slump prediction is to use computational intelligence for data mining. The model can also include machine learning in data mining, similar to how humans solve problems by training and implementing, or it can use evolutionary learning. These approaches can also be called soft computing approaches. Soft computing deals with imprecision, uncertainty, partial truth, and approximation, which often occur when designing concrete slump, to achieve practicability, robustness, and low solution cost [8]. Soft computing approaches are more accurate than statistical approaches. Chine concludes that an Artificial Neural Network (ANN) is more accurate than multiple linear regression in predicting the concrete slump value in concrete mixture design [9]. Chopra reached a similar conclusion, finding that ANN produces more accurate estimates than Decision Tree (DT) and Random Forest (RF) [10], but Feng et al. [11] reported that DT is more accurate than ANN and Support Vector Machine (SVM). These results, including [9], show that the Backpropagation Neural Network (BPNN) is practical for concrete compressive strength prediction [12]. From this preliminary research, ANN shows good potential for solving concrete mix design problems. Hence, ANN was used as the baseline model to further improve the robustness of the concrete slump model.
State-of-the-art computation of concrete mix design models for concrete slump and compressive strength prediction uses evolutionary and deep learning approaches. Evolutionary and deep learning models are more advanced in solving computational problems than other approaches. Wang proposed that a Wavelet Neural Network (WNN) estimation algorithm could analyze and estimate the concrete compressive test data [13]. Deng et al. [14] used a Convolutional Neural Network (CNN) and obtained a lower error than BPNN and SVM. However, a CNN or a modified CNN [15] is useful only when the positional and spatial information of a certain feature in the data is important, such as in image processing [16], [17] or signal processing cases [18], [19]. Another type of neural network, the Recurrent Neural Network (RNN), is very suitable for dealing with time series data [20].
The H2O Deep Learning framework (H2O) will be used to create the model [21], without the convolutional and max-pooling layers of a CNN and without the recurrent structure of an RNN. H2O works better than an RNN for transactional data because an RNN is strong on sequential or time series data [22]. To improve the performance of the model, ensemble approaches such as Random Forests and Gradient Boosting Regression Trees [23], Bagging [18], or Smoothed Bootstrap Resampling [26], [27] could be used to reduce the negative effect of inherent noise [28] and overfitting. Li states that Random Forests and Gradient Boosting Regression Trees do not improve the prediction quality [23]. Dahiya et al. [24] state that their feature selection-enabled hybrid Bagging algorithm (FS-HB) performed best with fewer features and a tree-based classifier for qualitative datasets. Its performance on numeric data was also better than that of other standalone classifiers.
Concrete mixture design data for concrete slump tests has many characteristics and is very noisy because of the diverse origins of the concrete materials. This data affects the accuracy of the model prediction. H2O Deep Learning without an ensemble method is presumed to be unable to overcome the data noise and avoid overfitting. This study aims to experiment with Bagging in the case of noisy data and an overfitted model. To do so, two models, with and without Bagging, are tested both where overfitting exists and where it does not. We used Cross Validation to reduce the prevalence of robust overfitting in adversarial training [29], [30]. All of the produced models are evaluated, and the model with the least error (lowest Root Mean Square Error, RMSE) is used as a baseline to create virtual slump test tools, practicum equipment, or application software.

II. MATERIALS AND METHOD
In order to achieve the optimum computational Concrete Slump Model, this research was conducted as follows.

A. Data Collection
Secondary data was collected from the UCI Data Repository dataset created by I-Cheng Yeh [31], and primary data was collected from the Material Laboratory of Politeknik Negeri Semarang. The data consists of the concrete mix design variables: cement, blast furnace slag, fly ash, water, superplasticizer, coarse aggregate, fine aggregate, and age.

B. Data Pre-processing and Splitting
Because the variables in the data are measured on different scales, the input data is pre-processed with the z-transform (see Fig. 1). Note that the output is not pre-processed, so that the model outputs values in the same unit of measurement as the data. The mean μ and the standard deviation σ of each variable are first calculated. Afterward, every input value x is transformed with the following function:
$$z = \frac{x - \mu}{\sigma}$$
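As an illustration, the z-transform of the input columns can be sketched as follows. This is a minimal sketch rather than the pre-processing code used in the study: the function names and the sample rows are hypothetical, and the population standard deviation is an assumption, since the text does not say which variant was used.

```python
import math

def fit_z_transform(rows):
    """Return the per-column means and standard deviations of the input rows."""
    n = len(rows)
    n_cols = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_cols)]
    stds = []
    for j in range(n_cols):
        # Assumption: population variance (divide by n, not n - 1).
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        stds.append(math.sqrt(var))
    return means, stds

def apply_z_transform(rows, means, stds):
    """Standardize every input value; a constant column is mapped to 0.0."""
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(r, means, stds)]
        for r in rows
    ]

# Hypothetical rows (cement, water, age); the values are illustrative only.
inputs = [[300.0, 180.0, 28.0], [350.0, 200.0, 28.0], [400.0, 190.0, 28.0]]
means, stds = fit_z_transform(inputs)
scaled = apply_z_transform(inputs, means, stds)
```

Only the input columns are passed through the transform; the slump target would be left in its original unit, as stated above.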
Before the experiment, the dataset was shuffled and partitioned into k partitions (k subsets of the data). The data was then split into a training set and a validation set using k-fold Cross Validation [32]: one partition is used as the validation set, and the remaining (k-1) partitions form the training set.
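The shuffle-and-partition procedure can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `k_fold_splits`, the seed, and the sample count of 50 are hypothetical and are not taken from the study.

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, valid_idx) index lists for k-fold Cross Validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]   # k partitions of near-equal size
    for i in range(k):
        valid_idx = folds[i]                    # one partition for validation
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, valid_idx              # the other k-1 partitions for training

# Hypothetical dataset of 50 samples split with k = 10.
splits = list(k_fold_splits(n_samples=50, k=10))
```

Each sample appears in exactly one validation partition, so every row is used for validation once and for training k-1 times.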

C. Experiment
The H2O architecture model was designed by testing the number of hidden layers and the number of neurons to obtain the lowest possible error. First, several hyperparameters were defined, while the weights and biases were randomly generated. The hyperparameters are epoch, L1, L2, epsilon, and rho. Second, the model was trained on the training dataset using the AdaDelta learning algorithm. The ReLU and TanH activation functions were used as the transfer function of each layer to avoid vanishing gradients. The prediction was then used to calculate the loss, which was backpropagated to optimize the biases and weights.
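To show where the hyperparameters rho and epsilon enter the training, the standard AdaDelta update rule can be sketched for a single scalar weight. This is a minimal sketch of the published update rule, not H2O's internal implementation; the function name `adadelta_step` and the toy quadratic loss are hypothetical.

```python
import math

def adadelta_step(w, grad, state, rho=0.99, epsilon=1e-8):
    """Apply one AdaDelta update to a scalar weight and return the new state."""
    avg_sq_grad, avg_sq_delta = state
    # Decaying average of squared gradients, controlled by rho.
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad ** 2
    # Step scaled by the ratio of the two running RMS values; epsilon keeps it finite.
    delta = -math.sqrt(avg_sq_delta + epsilon) / math.sqrt(avg_sq_grad + epsilon) * grad
    # Decaying average of squared updates.
    avg_sq_delta = rho * avg_sq_delta + (1.0 - rho) * delta ** 2
    return w + delta, (avg_sq_grad, avg_sq_delta)

# Toy problem (illustrative only): minimise the loss (w - 3)^2 starting from w = 0.
first_w, first_state = adadelta_step(0.0, 2.0 * (0.0 - 3.0), (0.0, 0.0))

w, state = 0.0, (0.0, 0.0)
for _ in range(1000):
    grad = 2.0 * (w - 3.0)
    w, state = adadelta_step(w, grad, state)
```

No learning rate appears in the rule: the step size is derived from the running averages, which is why only rho and epsilon need to be set.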
The H2O neural network model used in this research contains one input layer, several hidden layers, and one output layer. Given the input X with n rows of data (a batch of data) represented as a row matrix, and the weights as a matrix W of size (next_layer × prev_layer), each layer is calculated with a simple matrix multiplication as follows,
$$z = X \times W + b$$
$$h = f(z)$$
where $z$ is the neuron summation and $h$ is the output of the neuron summation with the activation function applied to it. The activation function $f(\cdot)$ is a non-linear activation function, in this case ReLU [33] and TanH, which are calculated as follows [34],
$$\mathrm{ReLU}(z) = \max(0, z)$$
$$\mathrm{TanH}(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$$
With our configuration of, for instance, 2 hidden layers, the calculation is as follows:
$$z_1 = X \times W_{(i,j)} + b_1$$
$$h_1 = f(z_1)$$
$$z_2 = h_1 \times W_{(j,k)} + b_2$$
$$h_2 = f(z_2)$$
$$o = h_2 \times W_{(k,o)} + b_3$$
$$y = f(o)$$
Bagging (bootstrap aggregating) was used to improve the performance and robustness of the model over noisy data. The model was replicated $b$ times, and each replica was trained using data randomly selected with replacement [35]. By bootstrapping the dataset, each model adapts to its own set of data. Given a dataset $D$, for example, we can generate $b$ new datasets from $D$.
We assume that there is a dataset $D_c \subset D$ in which no noisy data occur. Because each dataset $d$ is selected randomly with replacement, with a large enough number of different datasets we eventually obtain a dataset $D_{i \in |b|} \approx D_c$. If $D_c$ is used to train a model, a generalization effect should occur, reducing the chance of overfitting to noisy data. This step is important because, in general, it is hard to single out noise in the data, especially with high-dimensional data.
After Bagging, the results of the models were aggregated with min-pooling based on the RMSE; that is, we singled out the model trained on the better set of data, which achieves the lowest error. The validation dataset is used to test the current parameters for overfitting. The effect of Bagging was tested by training two models, with and without Bagging, and comparing their RMSE (see Fig. 1).
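The Bagging procedure with min-pooling can be sketched as follows. This is a minimal sketch under several assumptions: a least-squares line stands in for the neural network so that the example stays self-contained, the RMSE used for min-pooling is taken on the validation set (the text does not say which set is used), and all function names and data values are hypothetical.

```python
import math
import random

def rmse(targets, predictions):
    """Root Mean Square Error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets))

def fit_line(points):
    """Least-squares line y = a*x + c; a stand-in for training the network."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:                        # degenerate resample: predict the mean
        return 0.0, mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return a, mean_y - a * mean_x

def bagging_min_pool(train, valid, b, seed=0):
    """Train b models on bootstrap resamples and keep the lowest-RMSE model."""
    rng = random.Random(seed)
    best_model, best_error = None, float("inf")
    for _ in range(b):
        resample = [rng.choice(train) for _ in train]    # sampling with replacement
        a, c = fit_line(resample)
        error = rmse([y for _, y in valid], [a * x + c for x, _ in valid])
        if error < best_error:                           # min-pooling on RMSE
            best_model, best_error = (a, c), error
    return best_model, best_error

# Hypothetical data: y = 2x + 1 with two corrupted (noisy) training points.
train = [(float(x), 2.0 * x + 1.0) for x in range(10)]
train[3] = (3.0, 20.0)
train[7] = (7.0, -5.0)
valid = [(x + 0.5, 2.0 * (x + 0.5) + 1.0) for x in range(10)]

bagged_model, bagged_error = bagging_min_pool(train, valid, b=25)
```

A resample that happens to omit the corrupted points plays the role of the noise-free subset described above, and min-pooling is what selects the model trained on it.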
The error was calculated using the RMSE function:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2}$$
where $\hat{y}$ is the target, $y$ is the predicted output, and N is the size of the minibatch.
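The layer calculation and the RMSE described in this subsection can be combined in a small sketch of a network with one hidden layer. This is a minimal sketch, not the H2O implementation. The function names and weight values are hypothetical, `weights[i][j]` is stored as (prev_layer × next_layer), which is an assumed layout, and the output neuron defaults to the identity function, a common choice for regression that can be replaced through `f_out`.

```python
import math

def rmse(targets, predictions):
    """Root Mean Square Error over N samples."""
    n = len(targets)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(targets, predictions)) / n)

def dense(x, weights, biases):
    """Neuron summation z = x W + b; weights[i][j] links input i to neuron j."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        for j in range(len(biases))
    ]

def forward(x, w_hidden, b_hidden, w_out, b_out, f=math.tanh, f_out=lambda z: z):
    """One hidden layer with activation f, then one output neuron with f_out."""
    h = [f(z) for z in dense(x, w_hidden, b_hidden)]
    return f_out(dense(h, w_out, b_out)[0])

# Hypothetical network with 2 inputs and 2 hidden neurons, hand-picked weights.
w_hidden = [[0.5, -0.5], [0.25, 0.75]]
b_hidden = [0.0, 0.1]
w_out = [[2.0], [-1.0]]
b_out = [10.0]

samples = [[1.0, 2.0], [0.0, 0.0]]
targets = [10.0, 10.0]
predictions = [forward(x, w_hidden, b_hidden, w_out, b_out) for x in samples]
error = rmse(targets, predictions)
```

The same functions scale to the architectures tested later by changing only the shapes of the weight lists.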

D. Model Evaluation
The evaluation phase tests the best model from the training and validation results, using the hyperparameters obtained with and without Bagging. The model with the lowest RMSE in this evaluation stage is the Concrete Slump Model, which is ready for subsequent research on virtual machine development for concrete slump design.

III. RESULTS AND DISCUSSION

A. Initial Architecture of Concrete Slump Model
The number of hidden layers (HL) and the number of neurons (N) in each hidden layer are the parameters that determine the architecture of the expected model. Following the preliminary experiment stage, this experiment searches for the optimal number of neurons with the TanH, ReLU, and Maxout activation functions, with and without Bagging. For the other parameters, the default values from the library were used. This experiment shows that the best training RMSE of the H2O Deep Learning model without Bagging is 6.0807934 with the TanH activation function, 6.3655844 with ReLU, and 5.798217 with Maxout (Fig. 2, Fig. 3). The results show that the RMSE tends to decrease when the number of neurons reaches 40-60. Based on these experiments, the next step is to find the optimum number of hidden layers for the proposed model with a neuron count in the 40-60 range. In other words, 50 neurons are used, with ReLU, Maxout, and TanH, with and without Bagging. The hyperparameters used are L1=1.0E-5, L2=0, epsilon=1.0E-8, and rho=0.99. The test results from H2O with the 50-neuron architecture (Fig. 4) and H2O+Bagging with the 50-neuron architecture (Fig. 5) show that the RMSE values vary with the number of hidden layers. These results depend on the activation function. The model with the Maxout activation function tends to produce a lower RMSE with 1-2 hidden layers but is unstable, while ReLU does so with 6-9 hidden layers and TanH with 4-6 hidden layers, each layer using 50 neurons. In order to achieve a non-complex architecture with a low computation cost, the experiments with 6 hidden layers and 50 neurons were not continued further. The next study was done on the architecture with one hidden layer of 50 neurons using the ReLU and TanH activation functions. This experiment shows that the TanH activation function performs better than ReLU, although the difference is not significant. The error rate decreases from epoch 100, starts to flatten at around epoch 100-500, and is relatively stable at epoch 500-800. This shows that the learning outcomes begin to be effective at epoch 500 and reach their optimal value at epoch 750-800 (Fig. 6).

B. Optimize Architecture of Concrete Slump Model
The next experiment was carried out to obtain the optimal architecture of the Concrete Slump Model. This experiment was performed on the H2O framework with and without Bagging, with 10, 20, 30, 40, 50, 60, 80, and 100 neurons tested at epoch 800. The first test was carried out on the 1HL10N architecture, that is, 1 hidden layer with 10 neurons. Based on the Cross Validation algorithm with a partition value of k = 10, training was done with data from (k-1), or 9, partitions. Training with a maximum of 800 epochs gave a lowest error value of 6.542. Bagging can increase stability and avoid overfitting by reducing variance and noise. The role of Bagging was confirmed across all architectures, which had lower RMSE values with Bagging than without.

1) The effects of TanH vs ReLU
Both activation functions show a trend of error reduction as the number of neurons is increased on the H2O architecture (Fig. 7) and the H2O+Bagging architecture (Fig. 8). In the ReLU case, a sudden jump can be observed at a neuron size of 30. This means that ReLU requires a larger number of neurons to achieve smaller errors. Although the training process of both models is promising, the error for the ReLU case increases as the number of neurons increases (see Fig. 9 and Fig. 10). This is a sign of overfitting.

2) The effects of TanH with Bagging
In the TanH case, the model does not show an indication of overfitting. With Bagging, the error variance becomes smaller, allowing us to use a lower number of neurons for training (Fig. 11). This also means that we can reduce the computational cost further. By evaluating the validation data (Fig. 12), we can conclude that a lower number of neurons can be used if Bagging is used for training, which might not be obvious from the training process alone.

3) The effects of ReLU on Bagging
Without Bagging, ReLU requires a larger number of neurons to reach the optimum value (see Fig. 13). The biggest improvement is seen when Bagging is used to train the model: the training results improve, especially for lower numbers of neurons. Without Bagging, the difference between neuron counts is more prominent than with Bagging. This shows that Bagging can help avoid overfitting, as shown in the comparison of the ReLU models (Fig. 14). Even though both models show signs of overfitting, the model with Bagging successfully dampened the effects.

C. Model Evaluation Result
The best-performing concrete slump model was obtained by using 7 input neurons, 1 hidden layer with 20 neurons, and 1 output neuron, using the H2O Deep Learning framework with Bagging and the TanH activation function. The training parameters were L1 = 1.0E-5, L2 = 0.0, max weight = 10.0, epsilon = 1.0E-8, rho = 0.99, and epoch = 800, resulting in a training RMSE of 2.272.
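For readers who want to reproduce this configuration, the reported parameters can be written as keyword arguments for the H2O Python client. This is a hedged sketch, not the configuration file of the study: it assumes the `H2ODeepLearningEstimator` class of the `h2o` package is used, and that the reported "max weight" corresponds to H2O's `max_w2` constraint. The dictionary name is hypothetical.

```python
# Keyword arguments intended for h2o.estimators.H2ODeepLearningEstimator.
# Assumption: "max weight" in the text maps to H2O's max_w2 parameter.
slump_model_params = {
    "hidden": [20],          # one hidden layer with 20 neurons
    "activation": "Tanh",    # TanH activation function
    "epochs": 800,
    "l1": 1.0e-5,
    "l2": 0.0,
    "max_w2": 10.0,
    "adaptive_rate": True,   # AdaDelta, which is what uses rho and epsilon
    "rho": 0.99,
    "epsilon": 1.0e-8,
    "nfolds": 10,            # k = 10 Cross Validation, as in the experiments
}
```

With the `h2o` package installed and a cluster started through `h2o.init()`, these arguments could be passed as `H2ODeepLearningEstimator(**slump_model_params)`. Bagging is not part of this estimator and would have to be applied around it.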
The concrete mix design data used as testing data to evaluate the model had not been seen by the model. To collect the data, we created new mix designs using fine aggregate from Merapi Volcano, the hills of Muntilan, and Kalioro, Central Java. The coarse aggregate was obtained from Pamotan, Jepara, Semarang, Ungaran, and Mojosongo Boyolali, Central Java. The cement was from Gresik and Holcim products, and the water was from Tembalang, Semarang, Central Java. All of these materials were used to produce several mix designs. The test mixture was molded, cured for 28 days, and then manually tested using a compressive test machine, resulting in a compressive strength of 25 MPa. The manually tested mixture data was used to test the computational Concrete Slump Model. The evaluation shows that the model achieved an RMSE of 0.568. This small error shows that the model performs well enough. Hence, this model can be further developed into a virtual concrete slump test or application software.

D. Discussion
From the test series above, the models with TanH and ReLU both give acceptable results for the Concrete Slump Model. Both are able to predict with an RMSE lower than 3 MPa. With TanH, no signs of overfitting were detected, and when Bagging was introduced to the model, the RMSE continued to drop, giving better performance across the different neuron counts. On the other hand, ReLU shows signs of overfitting. By increasing the number of neurons in the 1-hidden-layer architecture, we introduce a more complex model, which increases the model variance. When the model variance exceeds what the data requires, the model fails to predict new data, and overfitting occurs. The results show that as the number of neurons in the ReLU model increases, the model produces a higher and higher RMSE. In conclusion, TanH performs better than ReLU on the slump dataset because the ReLU model overfits.
To further study the effects of Bagging on an overfitted model, we look at the results of the ReLU model. We have established that the ReLU model suffers from an overfitting problem. After introducing Bagging to the model, the error was reduced to approximately 2.5 RMSE (about 50% lower) compared to the model without Bagging. Based on this experiment, we conclude that Bagging can significantly reduce the effects of overfitting.

IV. CONCLUSION
After a series of tests performed on the concrete slump dataset, each model can predict new data with an error of up to 7 RMSE. The TanH activation function performed better than ReLU in predicting the concrete slump value. Furthermore, signs of overfitting can be observed with the ReLU activation function. As the results suggest, the overfitting effects can be reduced significantly by using Bagging. In future research, this Concrete Slump Model can be applied to build slump design practicum equipment or application software in a virtual laboratory for civil engineering vocational students.