
Abstract: Handwriting recognition is a branch of Optical Character Recognition (OCR) with a high level of complexity. Everyone has a unique and inconsistent handwriting style, and characters are often not written upright, which affects recognition success. Proper pre-processing and classification algorithms therefore strongly affect the success of pattern recognition systems. This paper proposes a pre-processing method for handwriting image recognition using a convolutional neural network (CNN). The study uses a public dataset for training and a private dataset for testing. The pre-processing consists of three processes: image cleaning, skew correction, and segmentation. These processes aim to clean the image of unnecessary ink streaks and to correct the angle of slanted characters. The model is tested on images of handwriting that is not straight, grouped into three classes by inclination angle: less than 45 degrees, equal to 45 degrees, and more than 45 degrees. Image cleaning removes unnecessary strokes (noise) from the image using a layer mask, whereas skew correction rotates the handwriting to an upright position based on the detected angle. The proposed pre-processing model worked best on handwriting with a skew angle of less than or equal to 45 degrees, and reached an accuracy of 88.96% for handwriting with a skew of less than 45 degrees. Future research with a similar scope can improve the optimization by focusing on algorithms for layout analysis and on automating the segmentation of each character.


I. INTRODUCTION
Automation processes are implemented in many fields, especially for document digitization and the recognition of printed or handwritten text, commonly known as OCR. OCR is a mechanical or electronic conversion of handwritten or printed images into machine-encoded text [1], [2]. OCR is one of the most challenging research topics in pattern recognition and computer vision. An OCR system can enhance and automate several applications involving interaction between humans and computers [3], [4]. There are two types of text writing, namely printed and handwritten [5], [3], [6]. Handwriting is the more complicated of the two because writing varies more from person to person. Therefore, the role of pre-processing in the OCR system, especially for handwriting from a private database, determines success in recognizing the typeface [7]. Beier and Oderkerk [8] discuss the effect of stroke contrast on three fonts with different thicknesses. The difference in strokes between fonts influences letter recognition: fonts written in bold have a lower recognition rate than fonts written in medium and thin strokes. Recognizing handwriting is challenging because it is more complex than printed writing. These challenges have several causes: the variety of handwriting images influenced by the inclination of the writing [9], the tilt of the capture device (camera), writing imperfections (characters that are not perfectly continuous) [10], writing media that are not always clean, and a capture device that is highly dependent on conditions (lighting, stability, and camera resolution).
In general, the OCR system consists of six main stages: image acquisition, pre-processing, layout analysis and segmentation, feature extraction, recognition, and post-processing. The image acquisition process can use online data (public database) or offline data (private database). Handwritten images from a public database are ready to be processed in the next stage because researchers have already performed image processing on the public dataset. In practice, however, we often find handwriting with many bold and thin ink streaks on the text's outline and characters that are not upright (they tend to slant to the right or left). Offline image acquisition in the OCR system therefore poses more complex challenges than online acquisition [11]. The pre-processing techniques used differ from case to case because each case is unique, so the selection of the proper technique cannot be generalized. Some researchers have examined skew detection in several types of writing, including Latin, Arabic [12], [13], [14], and Indian [15]. For example, research from Aachen University explains various levels of pre-processing, including contrast normalization, median filtering, slant correction, and size normalization [15]. The results indicated that combining contrast normalization, median filtering, slant correction, and size normalization produced significantly lower errors than the other combinations. Similar studies demonstrated efforts to correct the skew of handwritten images [16]. The method begins by establishing a baseline and a top line. These lines are then used as a reference to correct the slope of the writing.
Pande et al. [5] used six pre-processing techniques on handwritten Devanagari images: boundary identification, cropping, transformation application, noise removal, angle correction, size normalization, and image sharpening. Meanwhile, Qaroush et al. [2] focused their study on layout analysis by applying segmentation algorithms to extract text lines and segment words in the recognition of Arabic print. The pre-processing includes contrast enhancement, binarization, noise removal, skew correction, thinning, and orientation detection. Therefore, it is necessary to optimize the OCR system's performance, especially the techniques for cleaning handwritten images with a tilt angle.
In addition to the pre-processing stage, classification methods also play an essential role. Previous research has widely used CNN as a classification algorithm [17], [1], [18], [19], including the hybrid use of a support vector machine (SVM) on a CNN architecture [14]. A CNN has various components: convolution, stride, padding, CNN characteristics, and the convolutional formula. A pre-processing algorithm is essential to clean the text strokes of noise and to detect and correct the skew of handwriting, because correct pre-processing affects the handwriting recognition results. Therefore, this study focuses on proposing a method at the pre-processing stage for handwriting recognition. The pre-processing stage consists of three algorithms: image cleaning, skew correction, and segmentation. In addition, the study seeks the best handwriting recognition accuracy using a CNN across three letter slopes. This paper is structured as follows: Section II describes the proposed model, Section III contains the results of the analysis and the discussion of the proposed model, and Section IV concludes the paper.

II. MATERIAL AND METHOD
This research process includes two main steps: training and testing. Both stages apply pre-processing to the dataset, but the techniques used differ. The training stage in this study uses public data, while testing uses private data. As a result, the data quality from the two sources is different. Therefore, this research proposes a method for the pre-processing step of the testing stage, as shown in Fig. 1.

A. Dataset
This research used a public dataset from Kaggle with 150,000 handwriting images as the training data (the Vaibhav dataset for training and the landlord dataset for testing). The dataset contains handwriting images of the letters A-Z and a-z. The number of images varies per category, with a minimum of 2000 images per category. Each image in the dataset is an RGB image in jpg format with a size of 32x32 pixels. Image data are grouped by category, a step called the labeling process. As for the testing dataset, this research uses private data written by the researcher on paper with a standard black pen. The handwriting consists of upper- and lower-case letters in 550 images. The testing images are transformed into three variations of the skew angle: (1) images with a skew angle of less than 45 degrees, clockwise (CW) or counterclockwise (CCW), written as (< 45 CW/CCW); (2) images with a skew of 45 degrees, written as (= 45 CW/CCW); and (3) images with a skew greater than 45 degrees, written as (> 45 CW/CCW).

B. Pre-processing
Because the training and testing data differ, the pre-processing phase must apply different techniques to each. The training data come from standardized public datasets under normal conditions, so their pre-processing only requires conversion from RGB to grayscale images [16], [20]. For the pre-processing of the testing data, we propose a model to normalize the skewed handwriting image. The model uses image cleaning, skew correction, and segmentation. Fig. 1 shows the details of these three processes.
1) Image Cleaning: Image cleaning removes objects other than the handwritten characters, using a layer mask, so that the alphabet strokes are free of noise. The layer mask is obtained in three stages: smoothing, adaptive thresholding, and dilation. The original image is then masked with the layer mask to obtain an image that is clean of noise.
• Smoothing. Smoothing is a technique that averages a set of pixels of an image to reduce noise and sharpness at the edges. Gaussian blur is one of the algorithms used for image smoothing [21]. Equation (1) describes the Gaussian blur algorithm.
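As an illustration of the smoothing step, the sketch below builds the standard two-dimensional Gaussian kernel, whose weights are proportional to exp(-(x^2 + y^2) / (2σ^2)) for standard deviation σ. This is the textbook form, not necessarily the exact expression used in Equation (1); the function name `gaussian_kernel` and the kernel size and σ values are assumptions chosen for illustration, and how the cutoff frequency and sampling rate determine σ is left open.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a normalized size x size Gaussian kernel (size must be odd)."""
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    half = size // 2
    # Unnormalized weights exp(-(x^2 + y^2) / (2 sigma^2)) around the center.
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    # Dividing by the sum makes the weights add up to 1, so blurring
    # does not change the overall brightness of the image.
    return [[w / total for w in row] for row in raw]

# Hypothetical parameters: a 5 x 5 kernel with sigma = 1.0.
kernel = gaussian_kernel(5, 1.0)
```

Blurring then replaces each pixel with the kernel-weighted average of its neighborhood; a larger σ spreads the weights further and smooths more strongly.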
In Equation (1), the symbols denote the cutoff frequency, the sampling rate, and the standard deviation.
Fig. 1 The proposed model for handwriting recognition
• Adaptive threshold. Thresholding is a good segmentation technique for images with a significant difference in intensity values between the background and the main object [22]. Thresholding requires a limiting value between the main object and the background, called the threshold [23]. There are two classes of thresholding techniques: global thresholding and local (adaptive) thresholding [24], [25]. When several parts of the image object must be separated, multiple levels or threshold values can be used [26], [27]. Local thresholding deals with variations in intensity by determining a threshold value for each pixel based on local grayscale values. This approach is therefore called an adaptive thresholding algorithm.
• Dilation. Dilation is a morphological transformation that combines two sets using the vector summation of the set elements. Dilation merges background points into object parts based on the structuring element used. Two ways to perform this operation are as follows [28].
• Masking. Masking removes part of the object in the original image using a layer mask by changing the pixel values. The layer mask is produced using the smoothing, adaptive thresholding, and dilation techniques, and has the same dimensions as the original image.
A layer mask is a grayscale image consisting of dark and light areas. Dark areas mark the parts of the image to be deleted, while light areas mark the parts to be kept.
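The thresholding and masking stages can be sketched in plain Python as follows. This is a minimal illustration rather than the implementation used in this study: the mean-based threshold rule, the helper names `adaptive_threshold` and `apply_mask`, and the parameters `block` and `c` are assumptions, and the smoothing and dilation stages are omitted for brevity.

```python
def adaptive_threshold(gray, block=3, c=10):
    """Mark a pixel as ink (1) when it is darker than its local mean minus c."""
    h, w = len(gray), len(gray[0])
    r = block // 2
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Local mean over the block x block window, clipped at the borders.
            window = [gray[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            if gray[y][x] < sum(window) / len(window) - c:
                mask[y][x] = 1
    return mask

def apply_mask(gray, mask, background=255):
    """Keep pixels where the mask is light (1); set the rest to background."""
    return [[p if m else background for p, m in zip(row, mrow)]
            for row, mrow in zip(gray, mask)]

# Hypothetical 5 x 5 grayscale patch: gray paper (200) with a dark stroke (40).
patch = [[200, 200, 200, 200, 200],
         [200,  40, 200, 200, 200],
         [200,  40, 200, 200, 200],
         [200,  40, 200, 200, 200],
         [200, 200, 200, 200, 200]]
mask = adaptive_threshold(patch)
cleaned = apply_mask(patch, mask)
```

In the full pipeline the mask would be dilated before it is applied, so that pixels just outside the detected strokes are also kept.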
2) Skew Correction: Handwritten text is often not in standard condition. For example, most handwriting tends to lean right or left. Such a slant needs to be corrected so that the writing returns to the standard (upright) condition. This research proposes two stages to correct handwriting angles: histogram and transformation [12], [29].
3) Segmentation: The segmentation consists of two processes: determining the region of interest (RoI) and cropping. Assuming that the image has already been pre-processed, so that it is clean and contains minimal noise, the RoI determination process consists of four stages: (1) scan for areas that are more than 10 pixels wide (areas under 10 pixels are considered noise); (2) draw a rectangle around each area; (3) produce the RoI by cropping the rectangle; and (4) resize the crop to 32x32 pixels.
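The four segmentation stages can be sketched as below. Note the substitution: this sketch finds candidate areas with a column projection (runs of columns that contain ink) instead of a contour search, and it resizes by nearest-neighbor sampling. The helper names, the treatment of areas exactly 10 pixels wide as noise, and the synthetic test image are assumptions for illustration.

```python
def find_rois(binary, min_width=10):
    """Return (left, top, right, bottom) boxes; right and bottom are exclusive."""
    h, w = len(binary), len(binary[0])
    has_ink = [any(binary[y][x] for y in range(h)) for x in range(w)]
    boxes, start = [], None
    for x in range(w + 1):
        ink = x < w and has_ink[x]
        if ink and start is None:
            start = x                       # a run of inked columns begins
        elif not ink and start is not None:
            if x - start > min_width:       # narrower runs are treated as noise
                rows = [y for y in range(h) if any(binary[y][start:x])]
                boxes.append((start, rows[0], x, rows[-1] + 1))
            start = None
    return boxes

def crop_and_resize(binary, box, size=32):
    """Crop the box and rescale it to size x size by nearest-neighbor sampling."""
    left, top, right, bottom = box
    bw, bh = right - left, bottom - top
    return [[binary[top + (y * bh) // size][left + (x * bw) // size]
             for x in range(size)]
            for y in range(size)]

# Hypothetical binary image: two solid "characters" and a 2-pixel speck.
image = [[0] * 40 for _ in range(20)]
for y in range(3, 17):
    for x in range(2, 16):
        image[y][x] = 1
image[5][20] = image[5][21] = 1
for y in range(4, 19):
    for x in range(25, 38):
        image[y][x] = 1

boxes = find_rois(image)
characters = [crop_and_resize(image, box) for box in boxes]
```

The speck is discarded because it is only 2 pixels wide, and each remaining region becomes one 32x32 character image, the input size of the training data.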

C. Classification
At the classification stage, the character in each testing image is identified using the model produced by the previous training process. This research uses a deep learning algorithm, specifically a convolutional neural network [12], [22]. CNN is a development of the multilayer perceptron (MLP) designed to process two-dimensional data. It is a deep neural network that can handle learning on image data. For image classification, an MLP is less suitable because it does not retain spatial information from the image data and treats each pixel as an independent feature, which leads to poor results. A conventional neural network accepts the input data as a single vector at the network layer.
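To illustrate why a convolution keeps spatial information that a flattened vector loses, the sketch below applies a small filter to a toy image. It is a generic example, not the architecture used in this study; the function name `conv2d_valid`, the toy image, and the filter values are assumptions. As in most deep learning layers, the operation shown is a cross-correlation without padding and with a stride of 1.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[y + j][x + i] * kernel[j][i]
                 for j in range(kh) for i in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]

# Toy 4 x 4 image: dark left half, bright right half.
toy = [[0, 0, 1, 1]] * 4
# Hypothetical 2 x 2 filter that responds to a dark-to-bright vertical edge.
edge_filter = [[-1, 1],
               [-1, 1]]
response = conv2d_valid(toy, edge_filter)
```

The response is non-zero only in the middle column, where the edge lies, so the output records where the feature occurs as well as whether it occurs.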

D. Evaluation
The model is assessed through testing and evaluation. The performance of a classification model describes how well it assigns the data to the correct classes. The confusion matrix is one method to measure model performance, and accuracy is a commonly used measure of classification performance. Accuracy estimates how effective the algorithm is as the proportion of correctly predicted labels among all class labels. In other words, accuracy examines the algorithm's overall effectiveness [31].
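Accuracy can be read directly from a confusion matrix: the diagonal holds the correctly classified samples, so accuracy is the diagonal sum divided by the total number of samples. The sketch below shows the calculation; the three-class matrix is hypothetical and is not taken from the experiments in this paper.

```python
def accuracy(confusion):
    """Accuracy = correctly classified samples / all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Hypothetical confusion matrix: rows are true classes, columns are predictions.
confusion = [[8, 1, 1],
             [2, 7, 1],
             [0, 1, 9]]
score = accuracy(confusion)  # 24 correct out of 30 samples
```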

III. RESULTS AND DISCUSSION
The testing images used in the testing stage contain whole words, not single alphabetic characters. Each image is pre-processed using the proposed model. First, image cleaning removes marks on the image other than the handwritten letter characters. Image cleaning works by creating a layer mask that acts as a filter to delete the noise and the parts of the image that are not required. The layer mask is made by smoothing and removing noise in the image using the Gaussian blur method. Dilation then connects two or more strokes (objects) that are close to each other in the image. The next step is masking, in which the contours in each image are selected and their areas are calculated. If a contour's area is considerably smaller than the average area of the other contours, it is removed. The result is an area mask that is used to crop the original image. Fig. 2 shows the stages of the image-cleaning process.
Second, skew correction corrects the position of an image that is not perpendicular (inclined) by rotating it automatically. An upright binary image of handwritten characters has a distinctive feature: the difference between the upper and lower limits of its histogram is relatively high compared with writing that is not upright (skewed). The function detects the inclination by reading the histogram of the image. The image with the least skew has the largest difference between the lower and upper limits of the histogram. This difference is called the skew score. The skew score is calculated using Equation (3).
The third stage is the segmentation process, which consists of two processes: determining the region of interest in the image and cropping it. Cropping is the last stage before the image is tested. The characters in the image are detected, separated, and cropped one by one. The segmented result is a set of separate images, each containing one character, as Fig. 4 shows. Fig. 4(a) is a single image that has passed the RoI search stage (marked with green boxes), while Fig. 4(b) shows five separate images cropped according to the green boxes and resized to 32x32 pixels. The pre-processed result is a collection of images, each of which contains one handwritten character. Each image is then processed by the CNN model and classified as one letter. This research uses a CNN architecture with the detailed specifications shown in Fig. 5.
Many previous studies have examined OCR with different goals and backgrounds. In general, there are two types of writing in the scope of this research, namely print [9], [2], [12], [32] and handwritten [33], [5], [26], [34]. One of the interesting issues in handwriting recognition is how the system can recognize and correct the angle of inclination of characters and text [35], [5] or the inclination of scanned documents [9], [32]. OCR research on correcting document skew reports higher accuracy than research on characters with different slant angles. Document skew is caused by paper that is not positioned upright during the digitization process. Algorithms that focus on overcoming the slope of the text or its characters [35] have a higher complexity. In addition, there are limitations in correcting the character's position based on the tilt angle. The system can determine the tilt angle of the object, but the proposed model still has an accuracy of less than 90%. One of the reasons is the difficulty of correcting the angle of inclination of each handwritten character. The experimental results show that the proposed algorithm works best on characters with a slope angle less than or equal to 45 degrees CW/CCW. Details of the differences between the proposed model and previous research are in Table 1. The algorithm can be optimized further by focusing on layout analysis.

IV. CONCLUSION
This paper proposed a modified pre-processing method for the testing data to enhance the performance of CNN-based recognition of slanted handwriting. The pre-processing stage has three algorithms: an algorithm to clean the image of noise in the form of unnecessary ink streaks, an algorithm to detect and correct slanted letters, and an algorithm to segment words into individual letters or characters. The algorithm works well on handwritten characters with a slope of less than or equal to 45 degrees CW/CCW. The handwriting recognition model proposed in this work has an accuracy rate of 88.96 percent. For images skewed at 45 degrees, accuracy increases from 23.89% to 44.11%, and for images skewed by less than 45 degrees, accuracy increases from 84.56% to 88.96%. Therefore, the proposed model works well for handwriting with a skew of less than 45 degrees CW/CCW. Research with a similar scope can continue to improve optimization with a focus on algorithms related to layout analysis.

(missing) | 95% | Printed Arabic. Proposes a segmentation algorithm to overcome overlapping characters
Pal et al. [34] | 98.3% | Devanagari and Bangla handwritten. Proposes an algorithm to recognize the handwriting on form fields
Abuhaiba [12] | 94% | Arabic printed. Fails to detect skew if the angle is greater than 10 in the printed document
Alaei et al. [32] | 93.2% | (missing)

The two ways to perform dilation are:
a) Changing all background points adjacent to the boundary point into object points.
b) Changing all points around the boundary point into object points.
The essential morphological operation of dilation is based on Minkowski algebra, as shown by Equation (2).
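In its standard set form, Minkowski addition defines the dilation of an object A by a structuring element B as A ⊕ B = {a + b : a ∈ A, b ∈ B}. The sketch below applies this definition to sets of pixel coordinates. It is an illustration under stated assumptions: the function name `dilate`, the two structuring elements, and the two-pixel stroke are hypothetical and are not taken from this study.

```python
def dilate(points, structuring_element):
    """Minkowski sum: translate every object point by every element offset."""
    return {(ay + by, ax + bx)
            for ay, ax in points
            for by, bx in structuring_element}

# Hypothetical structuring elements, given as (row, column) offsets.
cross = {(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)}              # 4-neighborhood
square = {(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)}   # 8-neighborhood

stroke = {(2, 2), (2, 3)}            # a two-pixel horizontal stroke
grown_cross = dilate(stroke, cross)
grown_square = dilate(stroke, square)
```

Because both elements contain the origin offset (0, 0), the dilated set always includes the original stroke; the square element also fills the diagonal neighbors and therefore grows the stroke more.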


• Histogram. An image histogram is a graphic display of the pixel intensities in an image, grouped by intensity level. The histogram has two dimensions: the x-axis represents the pixel intensity value, and the y-axis represents the frequency of that value [30]. The histogram is formed by checking each pixel value in the image, counting how often each value occurs, and storing the counts in memory.
• Transformation. The transformation process produces an upright image using the correction angle obtained from the previous process. The correction angle is the angle with the best histogram score among all the angles tried in the rotation process. The pseudocode below shows the procedure.
Initialize Image
For Angle in range -44 to 44, step = 1:
    Rotated = Rotate(Image, Angle)
    Hist = histogram of Rotated
    Score = ([top value of Hist] - [bottom value of Hist])**2
    Insert Angle value to ListAngles
    Insert Score value to ListScore
MaxScore = MAX value in ListScore
MaxAngle = Angle value of MaxScore
CorrectedImage = Rotate(Image, MaxAngle)
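The pseudocode can be turned into a small runnable sketch. The version below makes several assumptions that are not stated in the paper: the image is represented as a list of (x, y) ink coordinates centered on the origin, the histogram is the row-wise ink count (projection profile) over a fixed frame of rows, and the helper names are hypothetical. The sign of the returned angle depends on the axis orientation chosen.

```python
import math
from collections import Counter

def rotate(points, degrees):
    """Rotate (x, y) points about the origin by the given angle."""
    a = math.radians(degrees)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [(x * cos_a - y * sin_a, x * sin_a + y * cos_a) for x, y in points]

def skew_score(points, half_height):
    """Squared gap between the fullest and the emptiest row of the histogram."""
    counts = Counter(round(y) for _, y in points)
    hist = [counts.get(row, 0) for row in range(-half_height, half_height + 1)]
    return (max(hist) - min(hist)) ** 2

def estimate_correction(points, half_height, angles=range(-44, 45)):
    """Return the candidate angle whose rotation gives the highest score."""
    return max(angles, key=lambda a: skew_score(rotate(points, a), half_height))

# Hypothetical input: a 120-pixel text baseline skewed by 10 degrees.
baseline = [(x, 0) for x in range(-60, 60)]
skewed = rotate(baseline, 10)
best_angle = estimate_correction(skewed, half_height=60)
upright = rotate(skewed, best_angle)
```

For this input the search selects -10, the rotation that undoes the 10-degree skew, because only then do all 120 ink points fall into a single row of the histogram. The candidate range of -44 to 44 degrees follows the pseudocode, which is consistent with the reported drop in performance for skews above 45 degrees.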

Fig. 2 Some stages conducted in the image-cleaning process
The test results show that images of characters inclined within 45 degrees CW/CCW of upright are the best data source, because their scores do not show ambiguous (double) results, which avoids errors in correcting the handwritten character image. Fig. 3 shows a character image before and after the slant is corrected.

Fig. 3 Example of image display before and after the process of skew correction

Fig. 4 (a) Image detected from the region of interest; (b) images cropped and resized based on the previous RoI

Fig. 5 Architectural specification of the convolutional neural network
There are three types of test data: (1) test data with a skew of 0 degrees; (2) test data with a skew of 45 degrees; and (3) test data with a skew of 90 degrees. We classified all the test data using the selected CNN model and compared the accuracy obtained with and without the pre-processing algorithms (image cleaning and skew correction). Fig. 6 compares the accuracy for the three types of test data. Based on the test results, the proposed pre-processing model works best on handwriting with a skew angle of less than or equal to 45 degrees. For images skewed at 45 degrees, accuracy increases from 23.89% to 44.11%, and for images skewed by less than 45 degrees, accuracy increases from 84.56% to 88.96%. Therefore, the proposed model works well for handwriting with a skew of less than 45 degrees CW/CCW.

Fig. 6 Test results with and without pre-processing, by skew angle

TABLE I
COMPARISON OF PROPOSED ALGORITHMS WITH PREVIOUS LITERATURE