ON INFORMATICS VISUALIZATION

Abstract: Welfare robots, as a category of robotics, seek to improve the quality of life of the elderly and patients by providing a control mechanism that enables users to be self-dependent. This is achieved through man-machine interfaces that manipulate certain external processes, such as feeding or communicating. This research aims to realize a man-machine interface using brainwaves combined with object recognition, applicable to patients with locked-in syndrome. The system utilizes a camera with a pretrained object-detection system that recognizes the environment and displays the contents in an interface to solicit a choice using P300 signals. Because the system is camera-based, field of view and luminance level were identified as possible influences. We designed six experiments by adapting the arrangement of stimuli (triangular or horizontal) and the brightness/colour levels. The results showed that the horizontal arrangement had better accuracy than the triangular arrangement. Further, colour was identified as a key parameter for the successful discrimination of target stimuli. These results indicate that the precision of discrimination can be improved by adopting a harmonized arrangement and selecting the appropriate saturation/brightness of the interface.


I. INTRODUCTION
In the recent past, population aging has become a global phenomenon as fertility reduces and life expectancy increases [1], [2]. According to a report by the United Nations, the old-age dependency ratio (the number of people aged 65 years and above relative to persons aged 20 to 64 years) will double globally in the next decade [3]. The aging population inevitably requires more medical attention and support for daily living through caregivers. It is clear that the demand for long-term care will increase further and that the lack of caregivers will become a significant problem in society. Thus, to reduce the burden on not only long-term care recipients but also caregivers, support for the independence of long-term care recipients is required. The development of welfare robots to support individuals who have difficulty in daily living is attracting attention, especially among those needing long-term care [4]-[6]. Welfare robots are a prospective solution that seeks to restore the individualism and self-reliance of care receivers through mobility, feeding, and environment control, among others. Control mechanisms paired with support equipment and/or communication schemes, i.e., a human-machine interface (HMI), can handle or lessen the severity of the challenges experienced by the elderly and disabled [7]-[12].
Recently, research on human-machine interfaces using biopotential signals targeting people with disabilities has gained traction for enhancing the users' quality of life and self-reliance and reducing the burden on the caregiver. Biosignals are present in any human being in varying forms. The biosignals commonly used as user input signals in HMI are electromyography (EMG) [13], electroencephalography (EEG) [14], [15], and electrooculography (EOG) [16]. EOG results from the potential difference produced by the movement of the eyes, EEG from the brain's electrical activity, and EMG from the contraction of muscles. EOG signals have been implemented in various research areas. EOG was applied to control mouse functions [17]. Further, the EOG signal was used in wheelchair control to help disabled people in [18] and [19]. On the other hand, EMG has been used in various areas of welfare and wellness studies [20]-[22].
In this study, we focus on EEG so that even people with severe mobility impairments, including those in a locked-in state, can operate assistive devices. EEG is measured as the slight potential difference between two electrodes attached to the human head. Background EEG has different characteristics depending on the frequency range and can be classified into four major categories: delta, theta, alpha, and beta waves [23]. Delta waves have a frequency of 1-3 Hz, theta waves 4-7 Hz, and alpha waves 8-13 Hz, while beta waves generally refer to all waves with a frequency above 13 Hz.
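As a minimal illustration of the band limits above, the following Python sketch maps a frequency to its band name. The function name `eeg_band` and the handling of the gaps between the listed integer ranges (for example, between 3 Hz and 4 Hz) are assumptions made for this example; the text only gives the integer ranges.

```python
def eeg_band(freq_hz: float) -> str:
    """Return the background-EEG band for a frequency in Hz.

    Band limits follow the ranges quoted in the text. Assumption: values
    that fall between the listed integer ranges (e.g. 3.5 Hz) are assigned
    to the lower band, and 13 Hz itself counts as alpha.
    """
    if freq_hz < 1:
        return "below delta"
    if freq_hz < 4:
        return "delta"   # 1-3 Hz
    if freq_hz < 8:
        return "theta"   # 4-7 Hz
    if freq_hz <= 13:
        return "alpha"   # 8-13 Hz
    return "beta"        # above 13 Hz


for f in (2, 5, 10, 20):
    print(f, "Hz ->", eeg_band(f))
```

Other conventions for the band edges exist, so the cut-offs here should be read as one possible choice.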
Naturally, electrical activity in the brain occurs in response to stimuli such as light and sound and to movements such as bending and stretching of the fingers. These are called event-related potentials (ERPs) [23]. The amplitude of ERPs is smaller than that of the background EEG, and to confirm the waveform, it is necessary to additively average the EEG data obtained from many trials, aligning them to the event's occurrence time. As the number of additive averages increases, the background EEG flattens, and the ERPs become apparent.
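The effect of additive averaging can be illustrated with synthetic data. The sketch below is not the recording pipeline of this study: the ERP shape (a 5 uV Gaussian bump peaking at 300 ms), the 20 uV noise level, and all function names are assumptions chosen for illustration. The 500 Hz sampling rate and the window from -300 ms to 500 ms mirror the values reported in Section II. The example only shows that the residual background activity shrinks as more stimulus-aligned trials are averaged.

```python
import math
import random

FS = 500          # sampling rate in Hz (value reported in Section II)
N_SAMPLES = 400   # 800 ms window (-300 ms to 500 ms) at 500 Hz


def synthetic_erp(i):
    """Hypothetical ERP: a 5 uV Gaussian bump centred at 300 ms post-stimulus."""
    t_ms = i * 1000 / FS - 300
    return 5.0 * math.exp(-((t_ms - 300) ** 2) / (2 * 50 ** 2))


def simulate_trial(rng, noise_sd=20.0):
    """One stimulus-aligned trial: the ERP buried in Gaussian 'background EEG'."""
    return [synthetic_erp(i) + rng.gauss(0, noise_sd) for i in range(N_SAMPLES)]


def additive_average(trials):
    """Sample-by-sample mean over trials aligned to stimulus onset."""
    n = len(trials)
    return [sum(column) / n for column in zip(*trials)]


def rms_error(average):
    """RMS difference between an average and the noise-free ERP."""
    squared = [(a - synthetic_erp(i)) ** 2 for i, a in enumerate(average)]
    return math.sqrt(sum(squared) / len(squared))


rng = random.Random(0)
trials = [simulate_trial(rng) for _ in range(200)]
for n in (5, 50, 200):
    print(n, "trials -> residual RMS:", round(rms_error(additive_average(trials[:n])), 2))
```

For independent noise, the residual is expected to fall roughly with the square root of the number of trials, which is why many repetitions per stimulus are needed.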
ERPs have different components. Negative-going evoked potentials are denoted with "N (negative)", e.g., N100 and N170, and positive-going potentials are denoted with "P (positive)", e.g., P100 and P300. They are distinguished by numbering them in the order of appearance or by attaching a standard vertex latency (in milliseconds). In this article, we will particularly focus on the P300 ERP, which was first discovered by Sutton et al. [23]. P300 is often observed in the oddball task, i.e., tasks requiring mental judgment, such as selection and understanding. P300 is maximal in the centre of the parietal region, and the latency may extend from 300 to 900 ms. In general, the latency is shorter if the task is simple and longer if the task is difficult and the decision takes time [24], [25].
Research utilizing EEG and P300 has targeted many use cases. The P300 has shown promise as an alternative way to build communication between humans and machines. Although the P300 is difficult to learn compared to the Thought Translation Device (TTD) [26], it has better accuracy than Language Support Systems (LSP) [27]. Its usability has been rated highly for effectiveness and satisfaction [28], [29]. For people with quadriplegia, an interface that uses brain waves as operation input is one of the few means of conveying one's intention. A keyboard input system, the "P300 Speller," which uses the P300 event-related potential, has been put into practical use [30]. This system blinks the characters shown on the display in sequence and determines the character the user is paying attention to from the timing of the P300 appearance. In addition, as an advanced form of the P300 Speller, research is underway to present life-related conceptual diagrams instead of letters and control the robot environment according to the results selected by the user [31], [32].
However, a P300 interface for environment control that can be customized and used for the different living environments of each user has not yet been realized. Therefore, in this research, we aim to realize a P300 interface that corresponds to the living environment of each user by combining camera image analysis with deep learning. It is a system that recognizes a plurality of objects in the live environment, presents the images on a display, and discriminates the object being watched based on the P300 generation timing. Since the image presentation method greatly influences P300 induction, we compare and verify image presentation methods as a performance verification.
Deep learning is a general term for machine learning (ML) that uses a neural network model with many layers. Machine learning refers to technology that has evolved as a field of artificial intelligence since the late 1950s [33]. This study investigates a man-machine interface using brainwaves combined with object recognition. A pre-trained convolutional neural network (CNN), AlexNet, is used to recognize surrounding objects and present them in a visual interface in MATLAB. Starting from the pretrained network trained with over 1000 objects, it is possible to customize the network for accurate object discrimination with a small amount of data and in a short time.

II. MATERIAL AND METHOD
Figure 1 shows the outline of the proposed system. The system comprises the brain-machine interface, which entails an EEG recording device and wireless connectivity to a personal computer interface.

A. EEG Recording
We created a visual stimulus presentation system to induce EEG P300 by flashing the three recognized objects in different positions and representations. For EEG measurement, we used a wireless biometric device, Polymate Mini AP108mB, as shown in Fig. 2(b), which has eight electrodes and two channels of external input and was operated at a sampling frequency of 500 Hz with a bandpass filter of 0.15-30 Hz. The target locations for electrode placement were Cz and Pz according to the international 10-20 method, and the reference electrodes were A1 and A2 on both earlobes. The electrode placement and setup of the sensors are shown in Fig. 2(a).

B. Brain-Computer Interface
The user is asked to choose a desired object from among three, and the EEG signal elicited for each object is recorded. The visual interface comprises a camera (HYUNDAI-DEGITAL-V33) connected to a PC (NEXTGEAR-NOTE i7941PA1, Core i7, NVIDIA GeForce RTX) with MATLAB software (R2020a) installed. The camera passes the experiment target objects to a trained ML model for classification in MATLAB. In the experiment, three types of objects were used for testing: a mouse, glasses, and a key chain, as shown in Fig. 3. The ML model detected the objects, drew a bounding box around each, and displayed the classification accuracy.

C. Experimental Protocol
In total, one experiment takes 150 seconds with 10 seconds of waiting time. Data are recorded from 300 ms before the stimulus presentation to 500 ms after, generating a total of 800 ms of data per frame, as shown in Fig. 4. The 300 ms baseline is subtracted from the target data to obtain artifact- and baseline-free data. The resulting data are averaged over the 50 repetitions of the same object.
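The epoching, baseline subtraction, and averaging steps described above can be sketched as follows. This is an illustrative Python sketch rather than the MATLAB processing used in the study; the function names are hypothetical, and the baseline is assumed to be the mean of the 300 ms pre-stimulus samples. At the 500 Hz sampling frequency, the 800 ms frame corresponds to 150 pre-stimulus and 250 post-stimulus samples.

```python
FS = 500                      # sampling frequency in Hz
PRE_MS, POST_MS = 300, 500    # window around each stimulus onset
PRE = PRE_MS * FS // 1000     # 150 samples before onset
POST = POST_MS * FS // 1000   # 250 samples after onset


def extract_epoch(signal, onset):
    """Cut the 800 ms frame around a stimulus onset (given as a sample index)."""
    start, stop = onset - PRE, onset + POST
    if start < 0 or stop > len(signal):
        raise ValueError("epoch window falls outside the recording")
    return signal[start:stop]


def baseline_correct(epoch):
    """Subtract the mean of the pre-stimulus samples (assumed baseline definition)."""
    baseline = sum(epoch[:PRE]) / PRE
    return [value - baseline for value in epoch]


def average_epochs(signal, onsets):
    """Baseline-correct every epoch of one stimulus and average them sample by sample."""
    epochs = [baseline_correct(extract_epoch(signal, o)) for o in onsets]
    return [sum(column) / len(epochs) for column in zip(*epochs)]


# Toy recording: a constant 10 uV offset plus a 4 uV deflection about 300 ms after each onset.
signal = [10.0] * 5000
onsets = [500, 1000, 1500]
for onset in onsets:
    for k in range(140, 160):
        signal[onset + k] += 4.0

average = average_epochs(signal, onsets)
print(len(average), average[0], average[PRE + 150])
```

In the toy recording, the constant offset disappears after baseline subtraction while the deflection near 300 ms is preserved, which is the purpose of the correction.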
As shown in Fig. 5, we conducted six variations of experiments (Expt. A to F) with three stimuli (Stimuli I to Stimuli III). The experiment details are described below.
2) Experiment B: In this case, the camera input is fed to the ML model for classification. The model detects the objects in the scene, places a bounding box over each detection, and places a corresponding blinking pink cursor (dot) at the object's center point. The blinking stimulus is thereby overlaid on the detected object, as shown in Fig. 5. As before, the blinking order is random, and the subject's EEG data are recorded for further analysis. This setup was selected to identify any effects on the detection of P300 with multiple target objects in view.
3) Experiment C: In this case, a plain background and a single coloured cursor (dot) are utilized. The cursor appears in random order in the middle of the screen, as shown in Fig. 5.

5) Experiment E: In experiment E, we displayed all three target objects as an array with a blinking bounding box for each object, as shown in Fig. 5. This arrangement is intended to investigate whether there are any differences arising from how the objects are ordered, in contrast to Expt. B.

6) Experiment F: In this case, we altered the camera lighting to explore if there would be any effects on the image quality. The results are contrasted with those of Expt. D.
Five participants (4 males and 1 female) took part in the experiment as described. In every session, a target stimulus (the object to be selected) was indicated. The recorded data were analysed to determine how well the target stimulus is discernible from the three stimuli presented. The analysis tabulates the results as either failed, uncertain, or successful discrimination.
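The blinking routine used in the experiments (a 1 Hz blink rate, 100 ms on-time, random order, and 50 repetitions per stimulus) can be sketched as a presentation schedule. The function name `build_schedule` and its parameters are hypothetical, and the randomization is assumed to be a plain shuffle of all flashes, since the text does not state any ordering constraint. The 10 seconds of waiting time is not modelled.

```python
import random
from collections import Counter


def build_schedule(n_stimuli=3, repetitions=50, period_s=1.0, on_time_s=0.1, seed=None):
    """Return a list of (onset_s, offset_s, stimulus) tuples.

    One stimulus flashes per period; the order is a shuffle of all flashes,
    which is an assumption about what 'random order' means here.
    """
    order = [stim for stim in range(1, n_stimuli + 1) for _ in range(repetitions)]
    random.Random(seed).shuffle(order)
    return [(i * period_s, i * period_s + on_time_s, stim) for i, stim in enumerate(order)]


schedule = build_schedule(seed=1)
counts = Counter(stim for _, _, stim in schedule)
print(len(schedule), "flashes;", dict(counts))
print("last flash starts at", schedule[-1][0], "s")
```

With three stimuli and 50 repetitions each, the schedule contains 150 flashes at one per second, which matches the 150 seconds stated for one experiment.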

A. P300 Component
As an analysis method, the P300 component is confirmed from the EEG data by additive averaging to remove background EEG and other noise from the measured signal. The averaged data for each of the 50 visual stimuli are displayed as a 3D plot in Fig. 6. The plot shows the data after subtraction of the baseline taken from -300 ms to 0 ms for each stimulus, which enables us to confirm the voltage peak independently of the baseline position. Positive discrimination is visually confirmed based on the amplitude of the resultant EEG signal around 300 ms, i.e., P300.
We further confirmed the presence of P300 from the stimuli using the EEG derived from the Pz and Cz electrode locations. In this case, the subject was instructed to select stimuli 2 in experiment 1. As the figures show, the two electrode locations, Cz and Pz, gave similar results. In cases with differences, Pz had higher resolution than Cz in the generated scatter plots shown in Fig. 7(a) and (b). As such, the reported results are based on the Pz electrode.

B. Signal Discrimination
Discrimination was confirmed visually from the resultant plots. Fig. 8(a) to (c) shows the discrimination of different stimuli for subject 1. As the figure shows, the target stimuli produce discernible amplitudes compared with the remaining stimuli. The results of the visual inspection were grouped as either successful or failed discrimination. In a failed discrimination, the amplitudes of the elicited signals were indistinguishable, as shown in Fig. 9(a) and (b). Overall, we confirmed the detection of P300 with varying setups, with and without background images. Thus, presenting single versus multiple objects per scene did not affect the detection of P300. Colour, on the other hand, was found to affect successful discrimination, and its effect shows up in the failed discriminations. The failed discriminations occurred when target objects were used as the background. The failure was attributed to difficulty in recognizing the blinking cursor due to colour mismatch. The colours of stimuli 2 and 3 are closely related to that of the blinking cursor (red dot), which introduced some difficulty for subject 2. This can be remedied by using objects that have sufficient colour contrasts and shapes.
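In this study, discrimination was judged visually. Purely as an illustration of how such a judgment could be approximated numerically, the sketch below compares the mean amplitude of each stimulus's averaged epoch in a window around 300 ms. It is not the authors' method: the 250-400 ms window, the 1 uV margin, and the function names are all hypothetical choices, and the epoch layout (500 Hz, starting 300 ms before onset) follows Section II.

```python
FS = 500       # sampling frequency in Hz
PRE_MS = 300   # each averaged epoch starts 300 ms before stimulus onset


def window_mean(avg_epoch, start_ms=250, stop_ms=400):
    """Mean amplitude in a post-stimulus window (hypothetical P300 window)."""
    i0 = (start_ms + PRE_MS) * FS // 1000
    i1 = (stop_ms + PRE_MS) * FS // 1000
    segment = avg_epoch[i0:i1]
    return sum(segment) / len(segment)


def discriminate(avg_epochs, target, margin_uv=1.0):
    """Label one session as 'successful', 'uncertain', or 'failed'.

    avg_epochs maps stimulus id -> averaged, baseline-corrected epoch
    (at least two stimuli). The margin threshold is an assumed value.
    """
    scores = {stim: window_mean(epoch) for stim, epoch in avg_epochs.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if best != target:
        return "failed"
    if scores[best] - scores[runner_up] < margin_uv:
        return "uncertain"
    return "successful"


def toy_epoch(peak_uv):
    """Toy 800 ms averaged epoch: flat except for a plateau from 250 to 400 ms."""
    return [peak_uv if 275 <= i < 350 else 0.0 for i in range(400)]


averages = {1: toy_epoch(0.5), 2: toy_epoch(5.0), 3: toy_epoch(0.0)}
print(discriminate(averages, target=2))
```

Any such threshold would need to be tuned per subject and electrode before it could stand in for visual inspection.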

Table 1 columns: Subject, Experiment, Target, Cz, Pz.
If the position of the stimulus is fixed, P300 discrimination using the camera image can be said to be possible with high probability. The problem is that, depending on the arrangement of the objects, there are individual differences in the occurrence of P300 due to the human visual field, and when visual stimuli are presented at the same location, the target may be confused with non-target stimuli. We conducted experiments E and F to investigate the effects of object positioning and colour variation. In E, we placed three objects in a horizontal row and contrasted this with experiment A, in which the objects were placed triangularly. In F, we investigated variation in colour by suppressing the brightness of the colours of conspicuous objects. As Fig. 10 shows, the stimuli presented in the second instance, Fig. 10(b), were more discernible than those in the first experiment, Fig. 10(a).
Table 2 summarizes the results of experiments E and F for the five subjects. Each subject had two random stimuli in each experiment. The results show that Pz is sensitive to the layout of the stimuli, while detection at Cz is not affected by the light contrast.
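The idea of suppressing the saturation and brightness of conspicuous objects can be illustrated with a small colour-space sketch. The study does not report the scaling factors or the image-processing routine it used, so the function names and the 0.5 scale factors below are assumptions for illustration only. The sketch converts each RGB pixel to HSV with Python's standard `colorsys` module, scales saturation and value, and converts back.

```python
import colorsys


def tone_down(rgb, sat_scale=0.5, val_scale=0.5):
    """Scale the saturation and brightness of one 8-bit RGB pixel.

    The scale factors are hypothetical; the paper does not give values.
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s * sat_scale, v * val_scale)
    return tuple(int(channel * 255 + 0.5) for channel in (r2, g2, b2))


def tone_down_image(pixels, **kwargs):
    """Apply tone_down to an image given as rows of RGB tuples."""
    return [[tone_down(pixel, **kwargs) for pixel in row] for row in pixels]


image = [[(255, 0, 0), (255, 255, 255)], [(0, 0, 0), (30, 200, 90)]]
for row in tone_down_image(image):
    print(row)
```

Hue is left unchanged, so objects remain recognizable while bright, saturated regions become less dominant.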

IV. CONCLUSIONS
This research recognizes the surrounding environment using a USB camera and object recognition by deep learning. We created a visual stimulus presentation system that induces P300 from the camera image so that a P300 interface can be customized to each user's life. In Experiment A, we verified whether P300 could be elicited both by presenting visual stimuli using camera images of objects existing in the living environment and by presenting visual stimuli at the same place, the latter for an interface that can be used even by people who cannot move their eyes. As a result, both were possible. However, it was difficult to distinguish objects placed above and below each other because of the range of the human field of view, and when stimuli were presented at the same place, a remarkably bright object drew attention and affected discrimination. Therefore, in Experiment B, we tested a presentation method in which the objects are rearranged in a horizontal row to solve the visual field problem, and a presentation method in which the saturation and brightness of all objects are reduced to remove the influence of bright objects. As a result, we succeeded in improving the object discrimination accuracy by P300 compared with the discrimination result of Experiment A. In other words, for a customizable interface that combines object recognition with deep learning, P300 discrimination with high accuracy was shown to be possible when objects are presented in a horizontal row or when object images are presented with uniform saturation and brightness. In the future, robot control will be considered as the interface's output. A robot that can select the necessary object from multiple objects and then grasp and move the identified object would help people with quadriplegia to lead independent lives.

Fig. 1 Experimental setup showing brain-computer interface visualization system

Fig. 3 Target objects with ML labels utilized in the experiment.

Fig. 4 Data recording and blinking routine

Fig. 5 Visual presentation of stimuli for Experiments A to F
1) Experiment A: In the first experiment, the visual stimuli shown in Fig. 5 are displayed. In this case, the coloured dot blinks at a rate of 1 Hz with an on-time of 100 ms (a 10% duty cycle) against a plain background in random order for 50 repetitions in each location. The brain waves of the subject watching the visual stimulus are recorded for further analysis.
4) Experiment D: In this case, the program displays the images of the target objects with the same blink rate and duty cycle as indicated in Fig. 4. The images appear in random order, each in the middle of the screen, as shown in Fig. 5.

Fig. 8 Successful discrimination of different stimuli from Subject A. (a) Stimuli 1, (b) Stimuli 2 and (c) Stimuli 3
In the failed-discrimination case shown in Fig. 9, the subject was presented with stimuli and requested to select stimulus 1. In Fig. 9(a), there was no discernible difference between stimulus 1, the target stimulus, and stimulus 3, a non-target stimulus. In Fig. 9(b), the signal is completely missing, with mild traces for stimuli 1 and 2. The overall discrimination is described in Table 1.

Fig. 10 Comparison of the discrimination of stimulus 3 for Subject 5: (a) Experiment B and (b) Experiment E.

TABLE I DISCRIMINATION OF EXPERIMENTS A TO D FOR THREE SUBJECTS