ON INFORMATICS

Immersive technologies (Virtual, Augmented, and Mixed Reality) are widely used in cultural heritage for communication, enhancing the visiting experience, and improving learning and understanding. They have found their way into museums and other cultural spaces in various forms. This work aims to identify the main forms of immersive technologies and applications in museums and other cultural spaces and to provide information on the employed methods, technologies, equipment, and software solutions by conducting a systematic literature review aligned with the PRISMA guidelines. The analyzed literature was collected through a focused search in scientific databases (Scopus, ACM, and IEEE). Relevance to the subject was assessed based on the main technological focus (VR/AR/MR or XR) and the employed technologies. Methods and approaches for realizing the applications were studied and discussed. Thirteen articles were found to meet the selection criteria, of which two focus on VR, six on AR, two on Audio-AR, and three on MR. The results showed that Augmented Reality solutions are preferred for on-site use; Mixed Reality applications started to emerge as Mixed Reality hardware became available; and Virtual Reality, despite being criticized for isolating visitors, is still employed. The findings address a gap in recent literature and can reveal a set of good practices and innovative ideas for future applications.


I. INTRODUCTION
Digital technologies are adopted in museums and other cultural spaces in order to enhance cultural heritage communication. They take various forms, from websites and virtual tours [1]-[4] to specialized applications and installations used on-site [5]-[9]. Digital technologies are used for cultural heritage communication during pre-visit planning and exploration, during the visit for guiding and enhanced understanding, and even post-visit for recalling memories [10]-[12]. In this work, the focus is set on using digital technologies while visiting a site. Among these technologies, Virtual Reality (VR), Augmented Reality (AR), and, more recently, Mixed Reality (MR), hereinafter also referred to as immersive technologies (and covered by the initials XR), are considered significantly beneficial [13], [14], since they present cultural heritage in an engaging way and their content is accessed on demand. The supported interactions allow users to explore and learn in their own way and create personalized experiences. Virtual Reality is probably the oldest immersive technology employed in cultural heritage. One of the first categories of Virtual Reality cultural heritage applications is the Virtual Museum. Game engines have a significant impact on the development of virtual museums [15], [16], as they offer the functionality needed for a virtual environment and avoid its time-consuming development from scratch. Virtual museums are excellent platforms for digital representations of physical museums, and they are either built as exact virtual replicas or creatively designed to present a new identity. Another category of Virtual Reality applications in cultural heritage is the creation of virtual worlds integrating places of particular interest [17]-[20], e.g., archaeological, historical, religious, or architectural.
Despite the obvious advantages of virtual museums and cultural heritage virtual worlds in cultural heritage communication, these applications have a noticeable constraint: their usefulness and value are dramatically reduced for on-site experiences. Not surprisingly, visitors and curators prefer to observe and explore the real rather than the virtual when both are available. This work is driven by the following research questions:
[Q1] Which immersive technologies (as well as the approaches, the hardware, and software solutions) are currently employed in museums?
[Q2] Do these works reveal good practices or innovative ideas for future applications?
To answer these questions, recent papers presenting developed projects are collected and examined against a set of eligibility criteria. The paper intentionally focuses on the latest works addressing the issues above. Specifically, works published between 1/1/2019 and 31/12/2020 are reviewed. Articles published in 2021 are not included because the article search and analysis were performed during 2021, and the coverage of that year would remain incomplete. Section II presents the technological and research background relevant to this work, the research questions driving this work, and the research methodology. Section III presents the list of reviewed papers followed by the respective analysis, the results, and their discussion to answer the defined research questions. Section IV summarizes this work and points towards directions for future research.

A. Research Background
Some works study the use of immersive technologies in cultural heritage [13], while others [28]-[30] study the use of these technologies in museum settings, but a broad examination of the deployed technologies appears to be beyond their scope. Bekele et al. [31] provide a thorough discussion of the immersive technologies used in the cultural heritage domain and present Augmented, Virtual, and Mixed Reality applications for varying purposes, e.g., virtual museums, site reconstructions, education, exploration, and exhibition enhancement. Their work includes 21 papers published between 2001 and 2016. This is a sufficient number of works covering a relatively long period, and it gives important information on the technologies used during that period, but it leaves open questions regarding the latest advancements and approaches on the subject.
Carrozzino and Bergamasco [26] proposed a classification of Virtual Reality installations for cultural heritage applications based on their features in terms of interaction and immersion. Additionally, they present four projects in which VR technology is used in museum settings: (i) The Museum of Pure Form, in which, among other solutions, a fully immersive CAVE system accompanied by an exoskeleton is employed; (ii) The Virtual Museum of Sculpture, which consists of a panoramic stereo screen accompanied by a trackball for user interaction; (iii) The Virtual Exploration of Turandot stage, which is a multi-user, non-interactive system comprising a panoramic visualization system; and (iv) The Virtual Livorno, which is also a multi-user, non-interactive stereoscopic installation in which viewers are required to use stereoscopic glasses. According to their article, immersive systems provide an enhanced experience, as they induce deeper participation and involvement of users, but they are usually more expensive, more complex to set up, maintain, and master, and require large, dedicated spaces. Shah and Ghazali [32] presented a systematic review of digital technology for enhancing user experience in museums. Their work is based on the following research questions: experience?" Their search results in twenty-two articles, published between 2013 and 2017, in which the subject of digital technology in museums is studied. According to their analysis, ten of the twenty-two works use mobile devices for various types of applications "such as virtual reality, augmented reality, QR code, eye tracking, and 3D display," three works use the Kinect sensor, and one work uses the Oculus Rift VR headset and the Samsung Gear VR headset. One work uses LeapMotion and employs a tangible tabletop, respectively. Two works use Bluetooth beacons, and other works combine various elements such as Arduino boards, UHF RFID readers, projectors, 3D printed exhibits, touch sensors, audio haptics, and RFID. Regarding the software used, two of the works report the use of the Unity game engine.
Efstratios et al. [33] discuss the creation of Cross/Augmented Reality applications for the Industrial Museum and Cultural Center in the region of Thessaloniki. Their application is designed for mobile devices, and they implement two different versions: one for iOS devices using Apple's ARKit AR framework and one for Android devices using Google's ARCore framework. According to the article, the applications integrate storytelling and gamification elements to improve the overall quality of the experience. Kidd and McAvoy [34] discuss topics related to immersive experiences in cultural spaces, namely their potential, the storytelling aspects, the social experiences, the visitors' engagement, their learning mission, and the related challenges, but the technological aspects remain out of their scope.

B. Data Collection
The data collection process is designed to provide a set of works that have the potential to answer the considered research questions. The present survey is aligned with the PRISMA guidelines for systematic reviews and meta-analyses [35]. The guidelines for the literature search and the analysis were specified and documented in a protocol [36]. Additionally, a search for other protocols describing similar research efforts was conducted, but it returned no relevant results.
The scientific databases Scopus, ACM, and IEEE are searched. The literature search for the ACM and Scopus databases was performed from 20 March 2021 to 1 April 2021, and for IEEE on 8 May 2021. In order to get the most relevant works, the following keywords are selected: "museum", "immersive", "virtual", "augmented", and "mixed". The advanced search of each database is used to retrieve the papers that contain the keyword "museum" and at least one of the remaining keywords in their title. Only the document titles are searched because these keywords are commonly found in other parts of papers that are not focused on this subject.
In order to focus on the latest advancements on the subject, the search is limited by publication year, which is set to cover the years 2019-2020. Additional filters are added in order to retrieve completed works that provide adequate information on the development of immersive applications and the deployment of immersive technologies in museums. For the IEEE database, the filter "Journals" is applied; the other two options, i.e., "Conferences" and "Magazines", are excluded because short papers and posters are commonly found among their results. Equivalent filtering is applied to the ACM results by selecting "Content-Type: Research Article". This option is considered more inclusive, as it covers works published in both proceedings and journals while excluding the other available content types, i.e., Abstracts, Posters, Short Papers, Extended Abstracts, and Invited Talks, which due to their size do not provide adequate information. Equivalent filtering is applied on Scopus by adding a parameter that limits the results to those tagged as articles. In Scopus, an additional filter is applied: since Scopus indexes numerous scientific areas, the "Computer Science" filter is enabled. The exact queries are given in Table I.
Scopus query: ( TITLE ( immersive OR virtual OR augmented OR mixed AND museum ) ) AND PUBYEAR > 2018 AND PUBYEAR < 2021 AND ( LIMIT-TO ( PUBSTAGE , "final" ) ) AND ( LIMIT-TO ( DOCTYPE , "ar" ) ) AND ( LIMIT-TO ( SUBJAREA , "COMP" ) )
Using the above search queries and filtering, thirty-eight works resulted from Scopus, ten from ACM, and two from IEEE, i.e., 50 works in total, from which the duplicates have to be removed. After removing the duplicate entries, forty-six works remain. Three works whose content is not available in any of the searched databases, even though they are indexed, are also removed from the dataset.
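The title and year filters described above can be illustrated with a minimal sketch. It follows the prose description (the title must contain "museum" and at least one of the other keywords, and the publication year must be 2019 or 2020); it is not the databases' own search engine, the function name `matches_search` is hypothetical, and the sample titles are invented purely for illustration.

```python
# Keywords of which at least one must appear in the title, next to "museum".
KEYWORDS_ANY = ("immersive", "virtual", "augmented", "mixed")


def matches_search(title: str, year: int) -> bool:
    """Return True if a record passes the title and publication-year filters."""
    text = title.lower()
    in_years = 2019 <= year <= 2020          # PUBYEAR > 2018 AND PUBYEAR < 2021
    has_museum = "museum" in text            # simple substring match (assumption)
    has_other = any(keyword in text for keyword in KEYWORDS_ANY)
    return in_years and has_museum and has_other


# Invented sample records: (title, publication year).
records = [
    ("An augmented reality guide for a science museum", 2019),
    ("Virtual tours of historic houses", 2020),
    ("Mixed reality in the museum", 2017),
]
selected = [title for title, year in records if matches_search(title, year)]
print(selected)
```

Substring matching is a simplification: a real database search tokenizes titles and may apply stemming, so the result sets are not guaranteed to coincide.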

C. Screening Process -Eligibility Criteria
In order to examine a paper, its content must be comprehensible. Two works whose content is not written in English are excluded, as their relevance and findings could not be judged with confidence. After removing duplicate, inaccessible, and incomprehensible papers, 41 papers remain to be examined (Table II and Fig. 1). These 41 works are examined to ascertain whether they are relevant to this review's subject.
In order for a paper to be selected for review, a minimum set of criteria should be met. Criterion No1 (C1) is that the paper should be focused on the "Use of immersive technology in a museum setting". Museum-like settings and other cultural spaces, e.g., galleries and archaeological sites, also fall under this criterion. This criterion is further divided into C1.1 "Use of Virtual Reality", C1.2 "Use of Augmented Reality", and C1.3 "Use of Mixed Reality". Criterion No2 (C2) is that the article should present a specific project or implementation and provide adequate information on the type of technology and the software and hardware products used, as well as a description of the application(s) developed and used in this setting.
Twenty-eight works have been excluded as their primary focus is not on the use of immersive technologies within museums or cultural spaces. Many of the excluded works (9) focus on virtual museums enabling users to navigate in a virtual setting without visiting the physical site. Given the aforementioned, 13 works are selected to be reviewed. It is emphasized that the final number of works to be analyzed may appear small (13), but in comparison with the number of works studied in other reviews [31], [32], [34] and given the time span that it aims to cover, the number of discussed works is relatively large.  Table III shows the selected articles and the main technology category they belong to (their proper citation is given in the References section). Articles not meeting the inclusion criteria were eliminated. The article analysis includes the equipment, software, and employed methods in the following subsections.
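The counts reported in the data collection and screening steps can be checked with a few lines of arithmetic. The sketch below only restates numbers given in the text; the variable names are illustrative.

```python
# Records identified per database, as reported in the text.
identified = {"Scopus": 38, "ACM": 10, "IEEE": 2}
total = sum(identified.values())                 # 50 works in total

after_duplicates = 46                            # reported after de-duplication
duplicates_removed = total - after_duplicates    # implies 4 duplicates

after_access = after_duplicates - 3              # 3 works not accessible
screened = after_access - 2                      # 2 works not written in English
included = screened - 28                         # 28 works excluded as not relevant

# Included works per technology category, as reported in the abstract.
by_category = {"AR": 6, "Audio AR": 2, "MR": 3, "VR": 2}

print(total, duplicates_removed, screened, included)
```

The category counts sum to the number of included works, which confirms that the reported flow is internally consistent.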

A. Articles Analysis
At this point, the included articles are analyzed in order to answer our first research question. The research question is coded as [Q1], Which immersive technologies (as well as the approaches, the hardware, and software solutions) are currently employed in museums?

1) Augmented Reality Applications:
The process has resulted in six articles (S2, S5, S6, S7, S8, and S12) focusing on Visual Augmented Reality, which are discussed in the following paragraphs.
S2 presents an Augmented Reality application for a permanent exhibition hosting Tito's Rossini painting. According to the article, the application's goal is to support the visitors' learning experience within the exhibition. It is an Android application created with the Unity game engine and the Vuforia AR engine. The application can recognize a painting when the user targets it with the smartphone camera and then projects the designed augmentations on the smartphone screen. Designers have incorporated a virtual 3D assistant who asks users questions regarding the selected paintings. Information regarding the developed interaction mechanisms is not provided, but given the technologies used, we can safely deduce that all the interactions are performed through the smartphone's touch display. Finally, the authors report that the users who took part in the evaluation expressed positive feedback.
In S5, a problem-based Augmented Reality learning platform is proposed for museum use. The platform consists of an Android application for mobile devices, developed with the Unity game engine and the Vuforia Augmented Reality engine, and a server-side implementation hosting the exhibition manager and the database. According to the article, markers (QR codes) are placed in the environment and scanned by the application through the device's camera to trigger the defined actions. Additionally, the use of Bluetooth beacons is reported, but no further information is provided. According to the article, the application allows users to scan markers either to get information about a project or to get a problem that they then have to solve, interacting through the mobile device's touch screen. The article does not provide sufficient information to conclude whether the application is completed, and no tests or evaluations are reported.
S6, in contrast to the majority of the included works, does not propose a system for enhanced exhibit presentation to be used by visitors, but a system to be used by the museum staff in order to support artifact management. The solution is called "AR-enhanced museum" (AREM) and consists of two main parts: the user application, which is designed for mobile devices, and the server-side platform that supports its functionality. The rooms of the museum are equipped with Bluetooth beacons so that the application can recognize which room the user is in, while manual selection of the position is provided for rooms where no beacon is available. When the room in which the user is located is detected, a room model is retrieved from the platform. The user is then asked to move to the nearest marker (visual reference marker) placed in the room.
According to this marker, the user's position and orientation are determined in order to display the augmentations in alignment with the real environment. The augmentations are User Interface elements designed to provide access to the platform's artifact management functionality, and the user interacts with the application through the touch screen. The focus of this work is primarily set on the accuracy of calculating the 3D coordinates of a point that is targeted through a selection on the touch screen, as managers should be able to add or edit an artifact through the AR system. The tests provide positive results, and the limitations of the current implementation are also discussed. An important limitation is that the system allows the user to rotate in the initial position, but a change of position, i.e., the user walking to another point, is not tracked. This limitation is reasonable for a mobile application, as 6 DoF (Degrees of Freedom) tracking on mobile devices is not yet fully deployed and requires more information than that provided by the device's inertial sensors. Regarding the development of the application, it is reported that the rooms are modeled using WebGL, while a MySQL database and a PHP application are used to retrieve the rooms' models and send them to the user's device.
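To make the idea of mapping a touch-screen selection to a 3D point more concrete, the following minimal sketch casts a ray through the touched pixel and intersects it with a wall plane. This is not the method of S6, which is not detailed here; it assumes a pinhole camera, a device position known from the marker, rotation around the vertical axis only (matching the rotation-only limitation), and a wall at a fixed depth. The function `touch_to_point` and all parameter names are hypothetical.

```python
import math


def touch_to_point(u, v, width, height, focal_px, cam_pos, yaw_deg, wall_z):
    """Intersect the ray through pixel (u, v) with the plane z = wall_z.

    Assumed conventions: x points right, y up, z forward; yaw rotates the
    device around the vertical (y) axis, positive towards +x.
    Returns an (x, y, z) tuple, or None if the ray does not hit the wall.
    """
    # Ray direction in camera coordinates under a pinhole model.
    dx = (u - width / 2) / focal_px
    dy = (height / 2 - v) / focal_px
    dz = 1.0
    # Rotate the direction by the device yaw (rotation-only tracking).
    yaw = math.radians(yaw_deg)
    rx = dx * math.cos(yaw) + dz * math.sin(yaw)
    rz = -dx * math.sin(yaw) + dz * math.cos(yaw)
    if rz <= 0:
        return None  # the ray is parallel to the wall or points away from it
    t = (wall_z - cam_pos[2]) / rz
    if t <= 0:
        return None  # the wall is behind the camera
    return (cam_pos[0] + t * rx, cam_pos[1] + t * dy, wall_z)


# Device held 1.5 m above the floor, 3 m in front of the wall, screen center tapped.
print(touch_to_point(540, 960, 1080, 1920, 1500.0, (0.0, 1.5, 0.0), 0.0, 3.0))
```

Under these assumptions, any error in the estimated yaw or device position translates directly into an error in the computed point, which is consistent with the article's emphasis on accuracy.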
S7 presents an Augmented Reality system with natural interaction capabilities intended to "solve the problem of inaccessibility and non-interaction with the cultural heritage artifacts that are unavailable due to their fragile nature or other reasons". The system consists of an AR application used with a low-cost Head Mounted Display (Google Cardboard) and a server-side service supporting the application's functionality. The application is developed using the Unity game engine and the Vuforia Augmented Reality engine. In order to implement the hand tracking that provides natural interaction, a hand-tracking sensor (Leap Motion) able to recognize detailed gestures is used. According to the article, the Leap Motion sensor is connected to the server and placed on a table surface, so the user has to stand in a specific position to control the presented objects with hand gestures. A visual marker is also placed on the table to provide a tracking reference. In order to avoid losing track of the tracking image when the user rotates, Vuforia's "extended tracking" feature is used. The system was initially tested by users at a science festival under conditions resembling a museum environment. Specifically, the system was used in a booth where the users were children who wore the provided HMD and were asked to observe and interact with the virtual object projected on the table's surface. In addition, three museum curators are reported to have used the system. In these initial tests, the authors report positive feedback, which was used for improvements. A set of tests in a gallery with 60 participants is also reported, in which the application was received positively: the participants' answers show that the experience was realistic and immersive, and the system is generally considered usable.
As described, the system appears to be more of an installation than a system that users can freely use in a museum. Having the Leap Motion controller in a fixed position requires users to stand in that position. Another disadvantage is that every exhibit would need its own Leap Motion controller connected to the server, which increases the implementation cost, and that no more than one user can interact with the exhibit at a time. We consider that connecting the Leap Motion to the HMD would provide a satisfactory solution. The authors report that they tried this solution in the initial design, but the result was not satisfying, as the connection of the Leap Motion with an Android device was not officially supported and the application performed poorly.
In S8, four AR headset interaction techniques are presented and tested for their efficiency in reducing the clutter caused by the density of AR augmentations. The solution is specifically targeted at MR headsets (Microsoft HoloLens), and clutter is reduced by arranging the augmentations in response to the user's attention. The proposed techniques are based on two different methods for overlaying the augmentations, namely: (i) "scale" and (ii) "frame". The first method provides a full view of the objects that users look at while minimizing or hiding the overlays of objects that are not targeted. The second method displays a framed augmentation that appears to exist in front of the targeted object. These two methods can be combined with two interaction techniques, (i) "gaze" and (ii) "walk", thus providing four different approaches. In gaze, the user's viewpoint direction is used to determine the targeted object, while walk is based on the user's proximity to the available objects. The described methods are tested, and the authors conclude that "Scale is a more effective way to reduce cluttering effects than Frame", while the gaze and walk approaches are proposed for different occasions, as the authors state that "For seeking tasks, the looking behavior or gaze movements are more efficient to interpret users' attentions" and "For counting tasks, the walking behavior or body movements are more accurate to interpret users' attentions".
S12 presents an Augmented Reality application for mobile devices called AR-Sandi, for the Sandi Museum, Yogyakarta, Indonesia. The article provides information on the methodology and the development process that is used. Regarding the implementation, the Unity game engine and the Vuforia AR engine are used, while the authors describe the development of two interactive features for rotating and zooming the displayed augmentations. The authors provide information about system testing regarding its acceptance by the test users, the evaluation of its functionality, and the application performance, and report positive results. Unfortunately, the article does not provide further information about use case scenarios, user interaction mechanisms, technological issues, or other challenges that had to be addressed and that would be of particular interest for our review.
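The difference between the "gaze" and "walk" techniques described for S8 can be sketched as two target-selection rules: gaze picks the object best aligned with the viewing direction, while walk picks the nearest object. The sketch below is a simplified 2D illustration under that reading, not the implementation of S8; the function names and the sample objects are hypothetical.

```python
import math


def select_by_walk(user_pos, objects):
    """Return the name of the object closest to the user (2D floor positions)."""
    return min(objects, key=lambda name: math.dist(user_pos, objects[name]))


def select_by_gaze(user_pos, gaze_dir, objects):
    """Return the object whose direction makes the smallest angle with the gaze.

    Assumes no object is located exactly at the user's position.
    """
    def angle(name):
        ox, oy = objects[name]
        vx, vy = ox - user_pos[0], oy - user_pos[1]
        dot = vx * gaze_dir[0] + vy * gaze_dir[1]
        norm = math.hypot(vx, vy) * math.hypot(gaze_dir[0], gaze_dir[1])
        # Clamp to guard against rounding slightly outside [-1, 1].
        return math.acos(max(-1.0, min(1.0, dot / norm)))
    return min(objects, key=angle)


# Invented layout: a vase close to the user's side, a statue far ahead.
objects = {"vase": (1.0, 0.0), "statue": (0.0, 5.0)}
user = (0.0, 0.0)
print(select_by_walk(user, objects))               # nearest object
print(select_by_gaze(user, (0.0, 1.0), objects))   # object being looked at
```

In this layout the two rules disagree, which illustrates why the choice between them can matter for different tasks.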
2) Audio Augmented Reality Applications:
There are two works (S1 and S4) focusing on Audio Augmented Reality, and these are discussed in the following paragraphs. S1 proposes a system titled "Sound Augmented Reality Interface for visiting Museum" (SARIM). According to its authors, "SARIM is a system allowing the visitor to have emerged in an audio soundscape that consists of ambient sounds and comments associated with artifacts," and the considered article focuses on the model that realizes this goal. The SARIM model is composed of three parts, namely, (i) Scene model; (ii) Visitor model; and (iii) Navigation model. The Scene model is used to allocate the Audibility Zones, which are configurable areas in which users should receive an audio augmentation. The SARIM model design allows users to interact with the soundscape by using head-based gestures. According to the provided information, five gestures are recognized by the system: (i) the positive user response gesture "sayYes"; (ii) the negative user response gesture "sayNo"; (iii) the "extendedStop" gesture; (iv) the increase audio volume gesture; and (v) the decrease audio volume gesture. The first three gestures are used to enable and disable the Audibility Zones in specific visiting contexts, while the volume control gestures appear to apply in all situations. Information on the motions that users have to perform in order to complete a gesture is given for three of the gestures: the user's negative response is recognized when the user turns his/her head from left to right, the increase audio volume gesture when the user leans right, and the decrease audio volume gesture when the user leans left. To use SARIM, visitors have to wear the SARIM device. The SARIM device is described as a helmet comprising a Bluetooth stereo headset and an orientation sensor, while a computer performing the processing is also needed, which in this case was a portable computer carried by the experimenter. The device does not perform position tracking, and it is reported that the user's position is simulated using the Wizard of Oz technique [50]. The authors present a subjective user evaluation in which SARIM is compared to a conventional audio guide regarding ease of use, usefulness, object location, and enjoyment, with positive results for SARIM. However, it would also be interesting if the evaluation included aspects such as the acceptance of head-based gestures, as they may be considered unpleasant to perform [51].
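The three head gestures whose motions are described for S1, together with the notion of an Audibility Zone, can be illustrated with a small sketch. It is not the SARIM model: the circular zone shape, the angle thresholds, the sign conventions, and the names `in_audibility_zone` and `classify_gesture` are all assumptions made for illustration.

```python
import math


def in_audibility_zone(visitor_xy, zone_center, zone_radius):
    """Return True if the visitor stands inside a zone (assumed circular)."""
    return math.dist(visitor_xy, zone_center) <= zone_radius


def classify_gesture(yaw_samples, roll_deg, yaw_thresh=20.0, roll_thresh=15.0):
    """Classify a head gesture from orientation-sensor readings.

    Assumed conventions: negative yaw = head turned left, positive = right;
    positive roll = leaning right. Thresholds (in degrees) are hypothetical.
    """
    if roll_deg >= roll_thresh:
        return "volumeUp"       # user leans right
    if roll_deg <= -roll_thresh:
        return "volumeDown"     # user leans left
    lowest, highest = min(yaw_samples), max(yaw_samples)
    turned_both_ways = lowest <= -yaw_thresh and highest >= yaw_thresh
    # "sayNo": the head goes to the left first and then to the right.
    if turned_both_ways and yaw_samples.index(lowest) < yaw_samples.index(highest):
        return "sayNo"
    return None


print(in_audibility_zone((1.0, 1.0), (0.0, 0.0), 2.0))
print(classify_gesture([0.0, -25.0, 0.0, 25.0, 0.0], roll_deg=0.0))
```

A real recognizer would also need timing constraints and filtering of sensor noise, which are omitted here.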
S4 falls into the Audio AR category by presenting a wearable Audio Augmented Reality prototype which is tested in a gallery scenario. According to the usage scenario, visitors navigate in a gallery where each painting has a set of associated sounds related to the subject of the painting. The article presents an interesting idea, as the provided sounds are created to enliven the depicted scenes and objects. The prototype consists of: (i) a cap mounted with a visual-inertial sensor; (ii) a laptop in a backpack running the sound simulation engine; and (iii) a pair of unmodified headphones. The application is developed using the Unity game engine, and the Vuforia Augmented Reality engine is used to perform the initial user tracking by scanning the markers that have been placed in specific positions in the gallery space. A stereo camera with an integrated IMU (a visual-inertial odometry sensor available as a product under the title "ZED mini") is used to estimate the visitor's head pose while navigating in the gallery. It is noted that the use of the stereo camera allows the application to perform smoother tracking, as it does not require continuous visual contact with the markers. A number of test users are asked to use and evaluate the system, and the authors pay special attention to the tracking accuracy, as it is necessary for appropriate audio spatialization and affects the user experience. The article does not provide information regarding the system's interaction mechanisms or an evaluation of this aspect, but overall, the reported results are positive, and the authors draw guidelines for future improvements, including changes to make the system lightweight and to use more portable devices.

3) Mixed Reality Applications:
There are three works (S3, S9, and S10) focusing on Mixed Reality. Their contents and presented solutions are discussed in the following paragraphs.
S3 presents the Ambient Information Visualisation Concept (AIVC). Its purpose is to provide designers with a tool that enables them to place, within a real environment, virtual objects that will be experienced as real objects through the use of holographic devices. The AIVC puts the visitor in the center of a sphere of visuals. This sphere consists of three layers placed as mantles: the layer closest to the visitor holds the user interaction controls, the next layer holds the virtual guide and storytelling elements, and the outer layer is used to project virtual background objects that enhance the scene environment. The article presents an application titled "The Battle" in order to demonstrate and test AIVC. The Battle is a Mixed Reality application designed to be used in the Manchester Museum's Egyptian department, and it depicts ancient Egyptian kings and soldiers fighting their enemies. According to the AIVC structure described above, users of the application initially use the UI elements placed on the first layer. In the next layer, a virtual narrator is presented. After the narrator's introduction, the visuals of the third layer are accessible, as "the viewer can look around and see the temple of the king projected around accompanied with virtual supplementary characters such as guards and maids and with some visual effects". According to the article, the user can control (start/stop/pause) the story's narration by hand interactions and navigate between the narrative's scenes. The application is developed for the HoloLens device with the Unity game engine, the HoloToolKit Mixed Reality library for Unity, and the Visual Studio development environment. According to the article, the Technology Acceptance Model (TAM) framework is used in order to measure the users' acceptance of the system. According to the presented results, participants reported that they felt immersed and that they enjoyed the storytelling. The system is also evaluated as easy to use and useful for visualizing historical events.
S9 presents a Mixed Reality application titled "MuseumEye" which is developed for use in the Egyptian Museum in Cairo. The application is developed using the Unity game engine, and it is built and tested on the HoloLens Mixed Reality headset. MuseumEye design provides visitors with visualizations of virtual characters representing historical personalities, virtual objects by 3D scanned antiques, and a virtual guide providing information and presenting the aforementioned visuals. It is noted that the virtual objects resulting from 3D scanned antiques allow visitors to closely observe them and perform handling actions that are not allowed to be performed in the real exhibits. The augmentation overlays are structured in three layers similar to those proposed in S3. The authors present a set of functions that MuseumEye implements to fulfill visitor needs and accomplish museum guide objectives. These functions are reported as immersive, and they are categorized into the following four categories: (i) visual communication, which comprises 3D representations, 3D scanned artifacts, and animated characters positioned in the virtual environment; (ii) guidance which comprises storytelling functions and alternative methods for providing narrations (text and audio) as well as a portal functionality for navigation in the scenes; (iii) interaction which comprises the necessary hand gestures functionality that is employed on the particular interaction functions, interaction features for an integrated game, interaction with the portals' functionality of the guidance category, manipulation of 3D objects, and the User Interface navigation and controls; and (iv) communication which comprises of functions designed to promote MuseumEye as a collaborative and social experience. 
The authors state that the application and the designed scenario enable users to move freely between scenes in contrast to what usually happens with prepared thematic tours that are performed by human or audio guides.
Moreover, they state that when the visitor has more control over his/her visit, the possibility of learning and enjoying the tour increases. The article provides evaluation results with positive feedback from the users regarding the Mixed Reality perception, the tour navigation, the user interaction, the storytelling narration, and the 3D multimedia representations. Regarding the device's usability, users' comments show that the device is easy to use, but some of the participants expressed concerns about its limited field of view, weight, or battery life. Moreover, some participants agreed that MuseumEye could be used as an educational tool and an independent guided tool. S10 presents a Mixed Reality application titled "Augmented Telegrapher" developed for the Telegraph Museum, based in Porthcurno, Cornwall, UK. The research question of this article is the following: "What methods are needed to effectively realize a social immersive cultural heritage installation in a small, rural museum context?" and the development decisions were mostly driven by the question "How can a niche museum in an extremely remote and rural location entice "experience seekers" to their location?". The answer to these questions comes in the form of an immersive game experience creating a Mixed Reality escape room in the settings of the Telegraph Museum. In order to achieve the desired immersion and the feeling of Mixed Reality, the application is developed for Microsoft's HoloLens headset. Interestingly, the developed application allows users to explore a scene or perform interactions and is also designed to be a game experience. The game's theme is to simulate a telegraphy training exercise, and the story takes place in World War 2, as the museum is located on the telegraph's premises which were built in that period for global communications. According to the game's story, players are to be trained as telegraph operators. 
Their role is of great importance due to the conditions of international telecommunications during WWII. Among the tasks, players learn and use Morse code and use equipment (galvanometer, telephone, hand-wheel) to diagnose and support the repair of a break in the undersea intercontinental communications cable in order to complete the game. The application's design considers any disorientation issues that may occur while entering the experience and thus uses an animated character that acts as an orientation helper. In addition, this animated character is used for narration and for keeping users focused on specific tasks. The authors argue that the gesture interaction methods provided by HoloLens are not appropriate for museum use. Instead, they developed interaction methods with real objects in the environment (Morse device, galvanometer, telephone, and hand-wheel) by attaching appropriate sensors to the objects. Each interactive object has a WiFi-enabled microcontroller and sensors, and the interaction data are sent to the headset for their effects to be visualized. In addition, the application reportedly uses beacons, but no further information about them is provided. A small usability study compares task completion between standard gestures and the custom physical interfaces described above. According to the provided results, the custom physical interfaces outperformed the gestures. Finally, an analysis using the Technology Acceptance Model (TAM) is performed with positive results.

4) Virtual Reality Applications:
Two articles (S11 and S13) belong to the Virtual Reality category. Their contents and presented solutions are discussed in the following paragraphs. S11 presents an interactive Virtual Reality simulation of a Neolithic settlement developed to be used in itinerant archaeological exhibitions. Specifically, two different experiences are designed: the first is a 360° video tour of the Neolithic settlement designed to be used with a Samsung Gear VR headset, and the second is an interaction-rich application designed to be used with an HTC VIVE VR headset, in which users are able to explore the virtual environment, manipulate objects, and activate animations. It is noted that the considered applications are developed using the Unity game engine. Regarding the use of the applications and the equipment in the exhibitions, the authors state that they employed four Samsung Gear VR headsets with headphones for the 360° video tour and one HTC VIVE headset with its wireless hand controllers for the VR game. Additionally, for the HTC VIVE, a large zone (reportedly at least 3m × 4m) of the exhibition area is used in order to allow users to move freely in the application in room-scale mode. The article evaluates the presented system regarding its learning characteristics, overall experience, and usability. The results show that visitors highly valued the blended learning experience. Regarding usability, the analysis shows that experienced users did not have difficulties in using the application, but a significant proportion of novice users had difficulties with the tested tasks. For the latter case, the authors draw guidelines for future work, including the design of training tutorials. S13 presents a virtual reality recreation of a photographic exhibition called "Thresholds", which has toured multiple museums. Thresholds is a large-scale (room-size) VR installation suitable for a multi-user experience (up to six users, as stated in the article) and is designed to be tourable. 
Visitors use headsets with which they experience a virtual scene while they are able to walk physically in the installation. The authors stress the use of passive haptics [40], which provide visitors with the haptic sensation of the displayed objects. The installation's physical space is similar to the VR space in terms of geometry: the installation objects and surrounding walls are all colored white, and through the headset visitors see a fully textured 3D reconstruction of the space as a photographic exhibition. The installation dimensions are 8.5m × 6.5m × 2.5m, which is considered to be relatively large by current VR standards. In order to provide a realistic and safe experience to the visitors, the installation needs to provide real-time and accurate positioning. This is quite important as users can see an object through the headset and, at the same time, touch it with their hands. The presence of objects and other users in a space where users navigate without actually being able to see them, as they are immersed in the VR scene, poses a safety threat, and it is important to prevent any collisions that could harm visitors. An additional issue that is addressed is the visualization of other users in order to prevent collisions among them. Regarding position tracking, the authors state that they tried a set of solutions, but they finally chose to use the HTC VIVE technology. HTC VIVE employs outside-in tracking by using a pair of base stations that emit infrared pulses, which are then received by sensors on the headsets and the controllers to calculate their position in relation to the base stations. It is reported that tracking errors are not too common, but a mechanism informing users to take off their headset in case of persistent errors is used in order to avoid any accidents.
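The safeguard against persistent tracking errors mentioned above can be illustrated with a minimal sketch. It is not the mechanism implemented in S13, whose details are not reported here: the class name, the per-frame `tracking_ok` flag, and the threshold of 90 frames are all assumptions made for illustration.

```python
class TrackingWatchdog:
    """Flags persistent tracking loss so the user can be asked to remove the headset."""

    def __init__(self, max_lost_frames: int = 90) -> None:
        # Hypothetical threshold: roughly one second at 90 frames per second.
        self.max_lost_frames = max_lost_frames
        self._lost_frames = 0

    def update(self, tracking_ok: bool) -> bool:
        """Record one frame; return True once the loss has persisted long enough."""
        if tracking_ok:
            self._lost_frames = 0  # a single good frame resets the counter
        else:
            self._lost_frames += 1
        return self._lost_frames >= self.max_lost_frames
```

Counting consecutive lost frames, rather than reacting to a single one, reflects the distinction drawn in the text between occasional and persistent errors.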
Authors state that "At the time of writing, there are several inside-out solutions on the market, including the Vive Cosmos and Oculus Quest; however, when Thresholds was developed, outside-in tracking was the de facto standard.". HTC VIVE is a computer-powered headset, and it has to be connected to its link box via a cable. Obviously, having visitors navigate the installation with the cables attached to their headsets is impractical, and it increases the safety risks when multiple users are in the installation. In order to overcome this issue, the described solution was to use backpack PCs which users have to carry through their experience.
Regarding interactivity, the solution design describes that visitors should be allowed to pick up a photo and zoom it in and out. HTC VIVE enables user interaction through its handheld controllers, but as stated in the article, the need to hold controllers was considered a drawback, and interactions using users' hands were chosen. In order to enable hand tracking and gesture recognition, the authors state that they had to use an additional controller (Leap Motion), which was mounted on the front surface of the headset. It is noted that newer versions of HTC VIVE (VIVE Pro, VIVE Focus) support hand tracking and even finger tracking, but this option was probably not available to the team while developing their solution. The Thresholds application is developed with the Unity game engine. A server application is needed to collect the users' position data from the headsets and visualize other users as "ghostly figures" to prevent collisions and provide a multi-user experience. The article provides adequate information on the testing and evaluation of the solution. It is noted that, according to the authors' observations, many visitors appeared to be physically comfortable in the space. The tactility provided by the passive haptics created a feeling of immersion, and additional details (sounds, a virtual fireplace, a ceramic heater emitting heat, etc.) also increased the immersion.

B. The Employed Equipment
The analysis now focuses on the employed equipment per solution. Tables IV and V (two tables instead of one are used for readability reasons) present the equipment used in the selected articles, with an indication of the main technology (AAR, AR, VR, MR) in order to facilitate the identification of relationships between the technology and the employed equipment. In the next few paragraphs, a discussion of the used technologies is given.

1) Mobile Platforms:
The analysis shows that many of the considered applications (5/13) are developed for use on mobile devices. It is also noted that the majority of the proposed solutions are AR applications (7/13), and a closer examination shows that all the above mobile applications are focused on AR.
The dominance of the mobile platforms should not surprise us as they are the most common solution for Augmented Reality applications. This is expected as mobile devices (smartphones and tablets) are relatively cheap, users are familiar with their use, most users bring their own mobile devices, and their portability makes them suitable for use in museum visits. The most used mobile platform is reported to be Android (4/5), a fact that is easily explained as Android is considered to be friendly to developers and convenient for testing applications. Mobile devices share a common set of parts. The parts that are reportedly used in the discussed solutions are: (i) the touch screen (S2, S5, S6, S7, S12); (ii) the camera for scanning markers and providing images of the real environment (S2, S5, S6, S7, S12); (iii) Bluetooth for reading Beacons (S6); and (iv) the IMU (Inertial Measurement Unit) for orientation tracking (S6, S7).

2) Standalone VR Headset 1, Google Cardboard:
Continuing our analysis of the mobile platforms, we observe that there is one proposed solution (S7) developed for Google Cardboard that, paradoxically, is used in an AR application (Google Cardboard is mostly used for low-cost VR applications; more information regarding AR with Google is given in [53]). Google Cardboard is a Virtual Reality platform developed for mobile devices. When a Google Cardboard application is executed on a compatible mobile device, the screen is separated into two different views to provide stereoscopic vision. The Inertial Measurement Unit (IMU) provides orientation tracking and assists interaction. The Google Cardboard platform is completed with a potentially ultra-low-cost headset, which can even be made of cardboard, that holds the mobile device in the appropriate position in front of the user's eyes. Google Cardboard is used in the considered work (S7) as the authors needed a low-cost headset to provide a natural interaction AR experience. It is no surprise that the once predominant VR platform Google Cardboard is here found in only one solution. After Google stopped supporting it, other low-cost VR platforms emerged, and high-end headsets entered the market, decreasing its usage.

3) Standalone VR Headset 2, Samsung Gear:
Talking about mobile platforms and Google Cardboard, Samsung Gear can be considered to be Google Cardboard's rival. Samsung Gear is a headset designed to operate with specific Samsung smartphones in order to provide improved performance for VR experiences. The attentive reader will notice that the case of Samsung Gear (S11) is not tagged as a mobile solution. This is because Samsung Gear applications are designed to operate with the headset.
Samsung Gear is found in one article in which the focus is on VR. The article does not provide information regarding the selection criteria or the specific functionality that it uses, but its relatively low cost (in comparison to high-end HMDs), high performance, and high-quality materials are sufficient reasons to choose it.

4) PC-powered VR Headset, HTC VIVE: is a high-end Virtual Reality Head Mounted Display (HMD). It consists of: (i) the headset; (ii) a pair of handheld controllers; (iii) a pair of base stations used for tracking the user's position; and (iv) a device called a link box, which is used to connect the headset to the PC. HTC VIVE requires a high-end PC to operate, and the total purchase cost is relatively high. It is undoubtedly one of the more powerful Head Mounted Displays available at the moment for purchase and provides a number of features that make it appealing for various VR applications [54], [55].
HTC VIVE provides two functional modes: (i) stationary, in which the user does not change physical position during the experience; and (ii) room-scale, in which the user is allowed to move in a room-scale area surrounded by two base stations in order to track the user's position and movements. HTC VIVE is employed in two of the presented works (S11 and S13), the only discussed works focused on VR. Both works make full use of the 6 DoF tracking; its hand controllers are used in S11, while in S13 the controllers are not used as the research team stated that the visitors should not have to hold controllers.

5) Mixed Reality Headset, HoloLens:
HoloLens is a relatively new solution designed to provide Mixed Reality experiences. It has a set of quite important advantages, such as the wide field-of-view transparent display that allows the projection of virtual objects (also called holograms) while the user is still able to see the real environment, and the spatial mapping functionality, which scans and maps the surrounding environment, allowing the user's position to be tracked and holograms to be aligned with the real environment; it is also wearable and lightweight. Initially, it was available only to developers, and Microsoft still considers it to be more a tool for business than a home user product. Nevertheless, and despite its increased purchase cost, it is observed that an increasing number of solutions are based on it.
In our survey, four works are based on it, of which three are categorized as Mixed Reality and one as an Augmented Reality application. The advantages described above explain the presence of HoloLens in the solutions focused on Mixed Reality. HoloLens is able to perform: (i) head tracking (used in S3, S8, S9, S10); (ii) hand tracking (used in S3, S9; note that HoloLens 1 is able to track one hand, and HoloLens 2 is able to track both hands); (iii) eye-tracking (not available in HoloLens 1); (iv) voice commands; and (v) spatial mapping (used in S3, S8, S9, S10). Our analysis shows that the presented works take advantage of some HoloLens features, but new features have not been exploited yet, such as eye-tracking, which was introduced with HoloLens 2 and has the potential to enhance immersive experiences further.

6) Audio-Augmented Reality Custom Solutions:
Under the "Custom" category, we find equipment implementations that consist of commercially available components, none of which, taken alone, provides adequate functionality for the intended purpose to be described as an off-the-shelf solution.
We find two solutions that both serve Audio Augmented Reality applications in this category. S1 presents a system called SARIM, which consists of a Bluetooth stereo headset (Philips SHB9100) and an orientation sensor (InertiaCube BT from InterSense), while a backpack PC is required for its operation. In the selected article, the SARIM device is not reported to have a position tracking system and the user's position is simulated by the Wizard of Oz technique [50], so the SARIM device is not considered fully functional. In S4, the proposed system appears to be fully functional. It consists of a visual-inertial sensor performing both position and orientation tracking, a pair of headphones, and a backpack PC supporting the system's functionality.
It is noted that both AAR solutions are based on a backpack PC. At this point, the reader might question whether the backpack PC is necessary. The literature provides us with examples [56] of Audio AR applications providing position and orientation tracking and orientation-dependent binaural stimuli without the need for a backpack PC.

7) Backpack PC:
The use of a backpack PC is reported in three works in which the equipment providing the audiovisual stimuli requires the use of a computer. Two of the three solutions requiring a backpack PC are categorized as AAR applications using custom equipment (S1, S4), while the third solution is a VR application (S13) in which the backpack PC is required for supporting the VR headset (HTC VIVE).

8) Hand tracking Sensor, Leap Motion:
Two works (S7, S13) report that they employ a hand tracking sensor in order to provide hand-gesture interactions. The sensor used is the Leap Motion, a small and lightweight device that consists of infrared cameras and infrared LEDs used to scan a hemispherical area to a distance of about 1 meter. In S7, a Leap Motion is connected to a server for the hand tracking and the corresponding interaction to be performed, while in S13 the Leap Motion is attached and connected to the HTC VIVE headset. The above shows the ability of Leap Motion to interoperate with varying systems.

9) Beacons to assist indoor position tracking:
Another indoor position tracking technology that is mentioned is Beacons (iBeacons). A beacon is a Bluetooth signal transmitter. Each Beacon emits a distinguishable signal so that when a Bluetooth-enabled device (usually a smartphone or tablet) enters the Beacon's range, the application can recognize the area in which the device is located. By using beacons for location tracking, the need for scanning and processing visual markers or other visual information is eliminated. Nevertheless, Beacon technology alone cannot provide accurate position and orientation tracking. Beacons are reportedly used in three works, S5, S6, and S10, but only S6 provides adequate information about their use.
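The area-level recognition described above can be sketched as follows. This is a minimal illustration and not the implementation of any reviewed work: the beacon identifiers, zone names, and RSSI threshold are hypothetical, and the sketch assumes the application has already obtained, from the platform's Bluetooth API, a mapping of beacon identifiers to received signal strength (RSSI, in dBm).

```python
from typing import Dict, Optional

# Hypothetical mapping from beacon identifier to the museum area it marks.
BEACON_ZONES: Dict[str, str] = {
    "beacon-01": "Entrance hall",
    "beacon-02": "Gallery A",
    "beacon-03": "Gallery B",
}


def locate_zone(readings: Dict[str, float], min_rssi: float = -85.0) -> Optional[str]:
    """Return the zone of the strongest known beacon, or None if none is in range.

    `readings` maps beacon identifiers to RSSI in dBm (closer to 0 is stronger).
    """
    known = {
        beacon_id: rssi
        for beacon_id, rssi in readings.items()
        if beacon_id in BEACON_ZONES and rssi >= min_rssi
    }
    if not known:
        return None
    strongest = max(known, key=known.get)
    return BEACON_ZONES[strongest]
```

The result identifies only an area, not a pose, which is consistent with the observation that Beacons alone cannot provide accurate position and orientation tracking.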

10) Server:
In four works, servers are employed. In S5, a server is used to host the exhibition manager platform and the database. The server employs the widely used software XAMPP, the PHP scripting language, and a MySQL database. In S6, a server is used to maintain artifacts' status, including their 3D location and respective room location. This server also uses the PHP scripting language and a MySQL database. In S7, a server that supports the Leap Motion functionality and returns the interaction results to the user's headset is employed. Finally, in S13, the server is used to maintain users' positions and broadcast this information to the VR headsets in order to provide a multi-user experience and avoid collisions.
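The position-sharing role of the server in S13 can be illustrated with a short sketch of what each headset might receive. The function name, the two-dimensional floor coordinates, and the data layout are assumptions made for illustration; the article's actual protocol is not described here.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[float, float]  # assumed floor coordinates in metres


def ghosts_for(viewer_id: str, positions: Dict[str, Position]) -> List[Tuple[str, Position, float]]:
    """List every other user with position and distance from the viewer, nearest first."""
    viewer_pos = positions[viewer_id]
    others = [
        (user_id, pos, math.dist(viewer_pos, pos))
        for user_id, pos in positions.items()
        if user_id != viewer_id  # a viewer is never drawn as their own ghost
    ]
    return sorted(others, key=lambda entry: entry[2])
```

For example, with three tracked users the headset of `"u1"` would receive the other two entries, which it could then render as the "ghostly figures" mentioned in the article.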

C. The Employed Software
The results confirm what we already know about the importance of specific software products in application development and of the game engines used to develop interactive 3D experiences. The majority of the presented works are developed with the Unity game engine. The Vuforia Augmented Reality library, which also comes in the form of a Unity plugin, is also commonly deployed, and the HoloToolKit is reportedly used as well, while in other works different software products are used or no such information is provided. Table VI provides information regarding the software products that are reportedly used in each work.
1) Unity: also called Unity3D, is used in nine out of thirteen (9/13) applications. This game engine provides important advantages, e.g., the ease of use, the ability to use it without purchase cost, the numerous supported platforms including PCs (Windows, MAC and Linux), gaming consoles, mobile devices, exports for the Web and the latest Head Mounted Displays, as well as the interoperability with libraries that extend its functionality (Google Cardboard, Vuforia, Mixed Reality ToolKit, etc.) so it is quite possible to fill most of the needs in a project.
2) Vuforia: is used in five solutions. Four of the solutions that employ it focus on AR, while it is also used in one Audio AR solution. In total, six of the presented works focus on AR (without counting Audio AR, which is a separate category). In one AR article (S6), no information regarding the implementation software is given, and in S8, despite its being categorized as AR, the HoloLens headset is used and the application is developed with the MRTK. The wide adoption of Vuforia among the presented AR and Audio AR solutions is expected, as it is one of the most popular Augmented Reality engines. Vuforia comes in numerous versions: (i) Unity plugin; (ii) Android SDK; (iii) iOS SDK; and (iv) Universal Windows Platform SDK. According to our results, it is used in combination with the Unity game engine in all reported cases, which is considered logical as Unity facilitates its use. Vuforia implements a marker-based approach, where images or objects can serve as markers, and when an application integrating Vuforia scans one of its known markers, an action is triggered.
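The marker-based pattern described above (a known marker is recognized and an action is triggered) can be sketched generically. The sketch does not use or imitate the Vuforia API: the class, its methods, and the marker identifier are hypothetical, and the detection step itself is assumed to be supplied by the AR engine.

```python
from typing import Callable, Dict, Optional


class MarkerActions:
    """Maps known marker identifiers to the action each one triggers."""

    def __init__(self) -> None:
        self._actions: Dict[str, Callable[[], str]] = {}

    def register(self, marker_id: str, action: Callable[[], str]) -> None:
        """Associate a marker with the action to run when it is recognized."""
        self._actions[marker_id] = action

    def on_marker_detected(self, marker_id: str) -> Optional[str]:
        """Run the action for a recognized marker; unknown markers are ignored."""
        action = self._actions.get(marker_id)
        return action() if action is not None else None
```

In a museum application, the registered action would typically display a 3D model or play a narration associated with the scanned exhibit.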

3) Mixed Reality Toolkit: (MRTK) is an open-source Microsoft-driven project initially developed to provide access to the HoloLens functionality. The latest versions reportedly also support other headsets, e.g., HTC VIVE, Oculus Quest, Oculus Rift [57]. Among the MRTK's features are the input system that supports user input and input sources such as 6 DoF controllers, the hand tracking functionality, the eye-tracking functionality, the UI controls, spatial awareness, and speech input.

IV. CONCLUSION
In this section, a discussion regarding the above analysis and the conclusion is given while providing answers to the second research question: [Q2] Do these works reveal a set of good practices or innovative ideas for future applications? According to the above analysis, widely used practices are observed. Augmented Reality is a key technology in enhancing the museum visiting experience, and many works make use of it (S2, S5-S8, S12) and explore new capabilities such as natural interaction (S7), artifact management (S6), or Audio AR (S1, S4), while a smaller number of works present interesting applications of Mixed Reality (S3, S9, S10) or Virtual Reality (S11, S13).
Augmented Reality is easy to employ as it is mostly based on the Bring Your Own Device (BYOD) concept. BYOD has significant advantages as it reduces costs and it is easier for users to handle their own devices. Most of the presented AR solutions make use of point-and-touch interfaces in which users have to hold their mobile devices, point their camera in a direction, and use the touch screen for interaction. However, holding a smartphone during a visit may be tiring, while confining the interaction within the screen's boundaries increases the distance between visitor and exhibit.
Natural interaction in AR is an interesting concept proposed in S7, where a Leap Motion is employed to perform hand tracking. In this example, the Leap Motion sensor had to be placed on a stable surface and connected to a server. While a hand tracking controller in a permanent position is appropriate for use at the exhibit, a portable solution would require hand tracking to be performed anywhere. When the reported solutions that use Leap Motion (S7, S13) were developed, Leap Motion was considered to be one of the most appropriate solutions, if not the only one. Currently available mobile devices are not able to perform accurate hand tracking in three dimensions, but they can be used for 2D gesture recognition [58]. Their inability to perform hand tracking is due to the lack of stereo vision cameras, but advancements in mobile devices may change this soon. Moreover, the latest Virtual Reality headsets provide built-in, accurate, real-time hand tracking, so Leap Motion may no longer be needed in VR solutions such as S13. Additionally, as Mixed Reality headsets, e.g., HoloLens, are becoming increasingly obtainable, hand tracking and visualization can be performed on the same device. Nevertheless, the cost of purchasing Mixed Reality headsets is still considered one of the most significant limitations for museums wishing to make use of them.
Another interesting approach to interaction is the use of physical interfaces, as proposed in S10; real-world objects can be used as custom physical interfaces when combined with appropriate sensors. Interestingly, the reported evaluation results showed that custom physical interfaces outperformed hand gestures, and we can assume quite confidently that their use should feel more intuitive.
Alternative interfaces are also presented in Audio AR. S1 uses head-based gestures. An important aspect of using head-based gestures is their acceptance and usability, as in many cases these are not positively accepted [51], and their suitability in an application should be carefully examined.
An important aspect of Augmented Reality applications is indoor position tracking. While 3 DoF tracking is easily performed in most mobile devices using the IMU, 6 DoF tracking is not yet fully exploited. 6 DoF tracking requires additional information to track changes in position; this information is mostly provided by cameras, with Visual SLAM (Simultaneous Localization and Mapping) performed to assist the tracking [59]. It is noted that the widely used AR libraries, e.g., ARCore and ARKit, as well as Mixed Reality headsets, e.g., HoloLens, support SLAM. Specifically, the presented applications developed for Mixed Reality headsets take full advantage of spatial mapping and localization in order to assist 6 DoF tracking.
Regarding the advancements in VR systems as reported above, new standalone VR headsets (Oculus Quest, Oculus Quest 2) can perform inside-out position tracking without the need for base stations and offer accurate hand tracking. In combination with falling purchase costs, the above makes us assume that room-scale applications with natural interactions will be more easily developed and used in museums and other cultural spaces.
Regarding the Mixed Reality applications, our analysis shows that the presented works take advantage of a number of HoloLens features, but the device's full potential has not been used yet, as many capabilities were introduced in the latest HoloLens version (2). Although speech recognition has been available since the first HoloLens edition, none of the presented works makes use of it. This can be attributed to the fact that a visitor using a Mixed Reality device shares the same space with other visitors, and having multiple users trying to interact by using speech commands may not be an appropriate and effective solution. Moreover, users may consider it awkward to perform hand gestures in front of others, especially when others do not share similar experiences. In cases such as the above, eye-tracking functionality may be able to provide interesting interactions [60].
A considerable number of works (3/13) use backpack PCs, two of which are Audio AR applications and the third a VR application. It is questionable whether users will accept carrying a backpack while visiting a museum. A TAM (Technology Acceptance Model) approach could provide interesting answers to this question. Furthermore, we can assume that the decision whether to carry a device depends on its size, weight, ergonomics, perceived usefulness, and expected enjoyment, with the decision threshold possibly varying among users. Moreover, we can hypothesize that the need for backpack PCs will diminish, as there are examples of Audio AR that need only a mobile device [56] and new high-end standalone VR headsets are available.
Regarding the software used for creating immersive experiences, the dominance of the Unity game engine is obvious as it is used in nine out of thirteen works. Also, Vuforia is a strong player in the AR domain: even though numerous other solutions exist (EasyAR, Wikitude, etc.) [61], none of them is used in the discussed papers.
An interesting aspect that is highlighted by one work (S13) is the safety of users as they navigate in a real environment when parts of it, or all of it in the case of VR applications, are occluded by virtual elements. An additional interesting concept presented in the same work (S13) is the use of passive haptics, which can be used to provide a touch experience in virtual- and mixed-reality environments. Audio augmentations also provide interesting approaches to communicating the exhibits as they can be used to expand the provided information and enhance understanding. An interesting idea presented in S4 is that soundscapes that relate to the theme of a painting can be provided. Moreover, users can experience the properties of 3D sound as they move with respect to the painting.
The ability of immersive applications to provide multimedia information that is not restricted to a window of limited size poses the threat of sensory overload. One work (S8) reportedly focuses on this subject by developing and testing methods to reduce the clutter that occurs because of the density of AR augmentations. Two interesting approaches, the first based on gaze direction and the second on the distance between user and object, are tested. In addition, we suggest that eye-tracking could also be applied to this subject.
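The two decluttering criteria mentioned above, gaze direction and user-to-object distance, can be illustrated with a minimal sketch. It is not the method evaluated in S8: the function names, the thresholds of 3 metres and 30 degrees, and the representation of augmentations as labelled 3D positions are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Augmentation = Tuple[str, Vec3]  # (label, position of the annotated object)


def filter_by_distance(user_pos: Vec3, items: List[Augmentation],
                       max_distance: float = 3.0) -> List[str]:
    """Keep only augmentations whose object lies within max_distance of the user."""
    return [label for label, pos in items if math.dist(user_pos, pos) <= max_distance]


def filter_by_gaze(user_pos: Vec3, gaze_dir: Vec3, items: List[Augmentation],
                   max_angle_deg: float = 30.0) -> List[str]:
    """Keep only augmentations whose object lies within a cone around the gaze direction."""
    gaze_len = math.hypot(*gaze_dir)
    kept = []
    for label, pos in items:
        offset = tuple(p - u for p, u in zip(pos, user_pos))
        dist = math.hypot(*offset)
        if dist == 0.0 or gaze_len == 0.0:
            continue  # the angle is undefined, so the item is skipped
        cos_angle = sum(o * g for o, g in zip(offset, gaze_dir)) / (dist * gaze_len)
        # Clamp to guard against rounding slightly outside [-1, 1].
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
        if angle <= max_angle_deg:
            kept.append(label)
    return kept
```

The two filters are kept separate because the text describes them as two distinct approaches; an application could also apply both in sequence.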
The presented review shows that despite the large number of publications related to the use of immersive technologies in museums or other cultural heritage spaces, a small number of these (13 screened articles) are designed for on-site use. It also shows that most of the applications follow the Augmented Reality paradigm, and the developed solutions are mostly intended for users' mobile devices (Android smartphones and tablets). The used hardware platforms significantly affect the capabilities of the interaction mechanisms. Specifically, the analysis reveals that in mobile AR applications the interactions are performed through the touch screen. While this approach proved effective in the early years of Augmented Reality, its usability is now questioned as new approaches that bypass the screen emerge. Apart from AR applications, VR and MR applications are also discussed, and these applications use other hardware platforms: the Samsung Gear VR headset, the HTC VIVE VR headset, and the HoloLens MR headset. In addition, custom hardware platforms are designed for use in Audio AR applications. There are also works pointing in the direction of Mixed Reality, which is quite promising, and the provided evaluations give positive feedback. It is quite likely that the number of Mixed Reality solutions will increase in the next few years as the technology matures and purchase and development costs fall. Virtual Reality is also employed in museums and other cultural spaces but with fewer solutions than those presented in AR and MR. One reason that fewer solutions follow the VR paradigm may be that VR isolates visitors from their surroundings, which may run counter to the purpose of a museum.
The analysis shows that the most used software platforms are Unity game engine, Vuforia AR engine integrated into Unity projects, and Mixed Reality ToolKit. Besides the fact that there are alternative solutions, the aforementioned tools are widely adopted by the research community, a fact that can be attributed to their ease of use and their gentle learning curve. Relatively new approaches such as Web VR and AR frameworks, e.g., Three.js, A-Frame, AR.js, etc., are not detected in any of the discussed solutions. This can be attributed to the fact that many of these approaches are not yet mature, and there are ongoing challenges [62].
In summary, this review aims to recognize the immersive technologies and application paradigms that are mostly used in museums and cultural environments. The analysis provides interesting insights, shows that there is a wide field to be explored, and draws guidelines for future work regarding specific aspects of immersive experiences, such as the interaction approaches and the design of User Interfaces and User Experience in highly immersive and interactive environments intended for museum use.