Intelligent Automation & Soft Computing DOI:10.32604/iasc.2021.013732
Article
Design and Development of Collaborative AR System for Anatomy Training
1Company for Visualization & Simulation (CVS), Duy Tan University, Danang, 550000, Vietnam
2Hue University of Medicine and Pharmacy, Hue, 490000, Vietnam
3Institute of Research and Development, Duy Tan University, Danang, 550000, Vietnam
4Faculty of Information Technology, Duy Tan University, Danang, 550000, Vietnam
*Corresponding Author: Dac-Nhuong Le. Email: ledacnhuong@duytan.edu.vn
Received: 18 August 2020; Accepted: 30 November 2020
Abstract: Background: Augmented Reality (AR) incorporates both real and virtual objects in real-time environments and allows single and multiple users to interact with 3D models. Supporting multiple users in the same environment is often difficult because of device latency and the positional accuracy required to display the models simultaneously. Method: To address this concern, we present a multi-user sharing technique for AR-based human anatomy that supports learning with high quality, high stability, and low latency across multiple devices. The multi-user interactive display (HoloLens) is combined with our human body anatomy application (AnatomyNow) to teach and train students, academic staff, and hospital faculty in human body anatomy. We also introduce a pin system for sharing the same content between multiple users. A total of 5 groups were considered for the study, and the evaluation was performed using two parameters: latency between the devices and position accuracy of the 3D objects in the same environment. Results: The proposed multi-user interaction technique performs well in terms of 3D object position accuracy and latency between the devices. Conclusion: AR technology provides a multi-user interactive environment for teaching and training in human body anatomy.
Keywords: Augmented reality; collaboration; anatomy; sharing hologram
1 Introduction
Augmented Reality (AR) integrates real and virtual objects in a three-dimensional real-world environment and presents them through an interactive interface [1]. Single-user AR devices have been applied to a huge range of applications with significant success. One of the features that make AR highly usable is the ability of multiple interfaces to interact with each other virtually [2]. Over the last few decades, AR has become one of the most promising technologies for supporting 3D display tasks, and research in this field is gaining popularity as AR systems become more widely available [3–6]. Developers and researchers are therefore aiming to build optimal, practical AR products on a global scale in the near future. However, most AR products target a single-user experience and do not allow several users to share the same virtual object. AR collaboration scenarios can be classified into two types [7–9]:
a) Remote collaboration, which builds the illusion that two or more participants share the same space through telepresence
b) Co-located AR collaboration, which enhances a shared, mutual workspace so that the participants remain visible to each other during collaboration.
The world has witnessed many drastic changes and fluctuations due to natural disasters [10], epidemics [11], and wars [12] that diminish human interaction and restrict travel. Building virtual, interactive, shareable applications can make remote training, evaluation, and diagnosis possible [13], which may eventually alleviate these issues and lead to improved consultations in hospitals worldwide. Interactive environments can also be applied to teaching and training, and be considered for distance employee-training workshops. Such workshops may be led by a team of experts and doctors and supported by visual utilities such as photos, videos, or voice communication. One of the biggest challenges for medical experts is providing medical explanations, in addition to lessons, regarding specific details and notes in the classroom [14,15]. However, such issues can be addressed with a multi-user AR system, which allows interaction with and sharing of 3D objects among multiple users [16,17]. Many researchers have proposed AR-based solutions for exploring human body anatomy. In Stefan et al. [18–20], an AR-based system was proposed for human body anatomy; the main motive of such systems is to learn complex anatomy in a more detailed and faster way than traditional learning systems allow. Some studies [21–25] rely on exploring human body anatomy through mobile applications. Kurniawan et al. [26] proposed an AR-based mobile application for that purpose: a camera captured pictures, which were then split into several pieces and patterns to be matched against a database of images; in addition, a floating euphoria framework was deployed and integrated with an SQLite database. Kuzuoka [27] employed a video stream to simulate the actions of a skilled person: local users were captured on video, and the stream was transmitted to a remote expert who could annotate the captured content. Bauer et al.
[28] extended this approach and presented an expert-controlled mouse pointer on the head-mounted display (HMD) of the local workers. However, the pointer position lies in the two-dimensional display, and stability is poor while the HMD is moving. Chastine et al. [29] presented a system that exhibits three-dimensional pointers to represent the skilled person’s intention; however, the 3D pointers considered in the study are quite slow. In Bottecchia et al. [30], an AR-based system was proposed that allows 3D animated objects to be placed for observation by local staff; its purpose is to demonstrate to participants how to analyze or solve a problem. Other researchers [31,32] put forward solutions to diagnose heart disease, raise awareness about osteoarthritis, rheumatoid arthritis, and Parkinson’s disease, and support the diagnosis of autism in children. The Microsoft HoloLens has been a breakthrough in augmented reality, leading researchers and developers to produce AR applications and academic publications rapidly. Popular areas include medical visualization, molecular sciences, architecture, and telecommunications. Trestioreanu [31] offered a 3D medical visualization technique using HoloLens for visualizing a CT dataset in 3D; this technique is limited to a specific dataset and requires an external system for the rendering tasks. Si et al. [33] presented an AR interactive environment for neurosurgical training. The study considered two steps for developing the holographic visualization of the virtual brain: the first is to reconstruct personalized anatomical structures from segmented MR imaging, and the second is to deploy a precise registration method for mapping virtual-real spatial information. An AR-based solution [34] was proposed for testing clinical and non-clinical applications using HoloLens.
In that study, Microsoft HoloLens was employed for virtual annotation, observing three-dimensional gross specimens, whole-slide navigation, telepathology, and observing the real-time correlation between pathology and radiology. A comparative evaluation study [35] examined an optical AR system against a semi-immersive VR table; a total of 82 participants took part, and performance and preferences were evaluated via questionnaires. Maniam et al. [36] developed a mixed-reality application for teaching temporal bone anatomy and simulated it on the HoloLens. Through vertex displacement and texture stretching, the simulation replicates the real drilling experience in the virtual environment and allows the user to understand the variety of otological structures. Pratt et al. [37] asserted that AR techniques could help identify, dissect, and execute vascular pedunculated flaps during reconstructive surgery: through computed tomography angiography (CTA) scanning, three-dimensional models can be generated, transformed into polygonal models, and rendered inside the Microsoft HoloLens. Erolin [38] explored interactive three-dimensional digital models to enhance anatomy learning and medical education; the study developed virtual 3D anatomy resources using photogrammetry, surface scanning, and digital modeling.
The main objective of this study is to develop a multi-user human body anatomy application (AnatomyNow) for anatomy teaching and training. Multi-user interaction in the same environment can be challenging. To address this issue, we propose a sharing technique for multiple participants in the same environment and develop high-quality 3D objects for a detailed understanding of human body parts and organs. We consider two cases: first, different users in the same physical space; second, users in different geographical locations. The developed anatomy application is evaluated through these cases, and the performance of the sharing technique on the HoloLens is analyzed.
The main contributions of this paper are as follows:
• We integrate the human anatomy application (AnatomyNow) with the Microsoft HoloLens.
• We create a multi-user AR interaction platform for human body anatomy and test it for two different cases.
• A pin system is introduced for the interaction of multiple users in the same environment.
• Performance evaluation has been conducted using two parameters: data lag between the devices and the distance between the models on different devices.
The rest of the paper is organized as follows: Section 2 describes augmented reality technologies. Section 3 discusses the methodology used in the proposed system. Section 5 illustrates the performance evaluation of the study, followed by a discussion of the comparative analysis of the research. Section 6 concludes the study.
2 Augmented Reality Technologies
2.1 On-device Augmented Reality
For on-device augmented reality, a typical AR application annotates content from the real-world environment. This is done by estimating the pose of the on-chip or on-board camera and lining up the virtual camera rendering for each frame. Here, the pose has six degrees of freedom: three degrees of translation and three degrees of rotation. Two different types of mobile SDK are available for on-device augmented reality, as discussed in Tab. 1:
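As a concrete illustration of the six-degree-of-freedom pose (this sketch is not from the paper; it applies only the yaw component of the rotation for brevity, where a real tracker would use a full rotation such as a quaternion):

```cpp
#include <cmath>

// Minimal 6-DoF camera pose: 3 translational + 3 rotational degrees of freedom.
struct Pose6DoF {
    double tx, ty, tz;          // translation
    double roll, pitch, yaw;    // rotation (radians); only yaw is used below
};

// Transform a world-space point into the camera frame (yaw-only), so the
// virtual camera can be lined up with the physical one for each frame.
void worldToCamera(const Pose6DoF& p, double wx, double wy, double wz,
                   double& cx, double& cy, double& cz) {
    // Translate into the camera origin
    double dx = wx - p.tx, dy = wy - p.ty, dz = wz - p.tz;
    // Rotate by -yaw about the Y axis
    double c = std::cos(-p.yaw), s = std::sin(-p.yaw);
    cx = c * dx + s * dz;
    cy = dy;
    cz = -s * dx + c * dz;
}
```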
2.2 Cloud-Based Augmented Reality
With the introduction of cloud computing, virtual reality and AR moved into a new technological era: information and experiences can be tied to specific locations in real time and persist across user terminals and devices. Google Lens [39] allows a user to search for and identify objects by taking pictures; it also provides information about a product if the lens finds related information on its server. Many studies address the scalability problem [40,41] and combine an image tracking system with an image retrieval method; however, this approach was unable to tackle the mismatching problem. Another study [42] proposed a system that initializes image recognition and tracking with a region of interest (ROI). In Jain et al. [43–53], a cloud-based augmented reality solution was proposed that extracts features from the images instead of sending the full picture; this method reduces overall network usage. CloudAR performs better in terms of network latency and power consumption compared with traditional AR systems.
2.3 Augmented Reality Communications
Augmented Reality communication works as a bridge for ensuring communication between devices through the cloud server or direct contact (peer-to-peer). AR communication is classified into two different types (see Fig. 1):
1) Client-server architecture: In this architecture, when a user moves a 3D object, the performed action is sent to the server as a message. The server checks whether the received data is valid; if so, the application state is updated and replaced with the latest status, and all users connected to the server are provided with the same information.
2) Peer-to-peer architecture: In the peer-to-peer (P2P) model there is no centralized server. Every “peer”, i.e., a single user device, receives and updates information directly from the others; at the same time, all of its state changes are automatically sent to the other devices.
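The client-server variant can be sketched with an in-memory stand-in for the server. The types and names below (TransformMessage, SyncServer) are hypothetical illustrations, not part of AnatomyNow: the server validates an update, stores the latest status, and reports which clients it must be forwarded to.

```cpp
#include <array>
#include <map>
#include <string>
#include <vector>

// A state-update message as described above: the interacted object's name
// plus its new position, rotation, and scale.
struct TransformMessage {
    std::string objectName;
    std::array<double, 3> position, rotation, scale;
};

// Minimal in-memory "server": validates an update and, if accepted, stores
// the latest status and returns the clients it should be forwarded to.
class SyncServer {
public:
    void addClient(const std::string& id) { clients_.push_back(id); }

    // Returns everyone except the sender, or an empty list if rejected.
    std::vector<std::string> handleUpdate(const std::string& sender,
                                          const TransformMessage& msg) {
        if (msg.objectName.empty()) return {};   // reject malformed updates
        latest_[msg.objectName] = msg;           // replace with latest status
        std::vector<std::string> targets;
        for (const auto& c : clients_)
            if (c != sender) targets.push_back(c);
        return targets;
    }

    const TransformMessage* latest(const std::string& name) const {
        auto it = latest_.find(name);
        return it == latest_.end() ? nullptr : &it->second;
    }

private:
    std::vector<std::string> clients_;
    std::map<std::string, TransformMessage> latest_;
};
```

A P2P variant would simply have each peer run the `handleUpdate` logic locally and broadcast to all others.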
AnatomyNow is a virtual reality (VR)-based application for human anatomy. The project contributes to medical training and teaching: its 3D simulation models support teachers and students in visually observing individual body parts and interacting deeply in 3D space. Moreover, the virtual model may facilitate hands-on clinical skills for medical students or doctors, which saves costs and creates better chances of applying knowledge when physical samples are not available.
AnatomyNow: the 3D human body simulation system illustrates the entire body of a full virtual person with more than 3,924 units simulating the organs and structures of the human body. It includes bones, muscles, the vascular system, heart, nervous system and brain, respiratory system, digestive system, excretory and genital systems, glands, and lymph nodes. The anatomical details are simulated precisely according to the characteristics and anatomical features of Vietnamese people. The scientific accuracy of the data and the shape and position of all simulation models have been tested and examined by doctors, researchers, professors, and scientists working in medical universities and hospitals (Hue Medical University, Hanoi Medical University). The 3D simulation system provides learners with visualized body parts and details the aspects of anatomy. The interactive interface supports operations such as rotate, hide, show, move, and viewing common and scientific names. It can also provide a brief description of the gross anatomy, marked on a specific organ’s surface in different ways. Moreover, it can describe movements and signal reception, and supports searching, listing, and categorizing anatomical units such as anatomical landmarks, anatomical regions, anatomical subjects, anatomical systems, and anatomical sites. However, it may not be compatible with external anatomical pictures or templates.
The system allows users to interact directly in 3D space via 3D projectors, 3D glasses, or various headsets compatible with the VR system, including the Oculus Rift, Gear VR, and HTC Vive, as well as touch devices. It also supports interaction through a compatible screen on many operating systems, such as Windows, Mac, and Linux, and can be customized for different devices such as smartphones and tablets (Android or iOS).
3 Methodology
The methodology of the proposed study covers the architecture of the system, the implementation, and the algorithms, as follows:
3.1 Architecture of the Proposed Work
This study considers two cases for anatomical multi-user AR interaction: 1) different users in the same physical space; 2) different users in different geographical locations (see Fig. 2).
3.1.1 Case 1: Different Users in the Same Physical Space
In the first case, all devices observe 3D objects placed at a physical position. For instance, placing a virtual body on a floor or a table can be carried out as follows (see Fig. 3):
Device A (a phone or AR glasses such as the HoloLens):
• Step 1: The AR application executed on the device localizes and maps the real-world space by combining algorithms such as SLAM with the device’s sensors.
• Step 2: The device recognizes the planes of the surrounding space based on the shaping of a feature point cloud and saves these feature points in storage.
• Step 3: The application establishes virtual planes found from feature point clouds to form a corresponding virtual world map including all detectable planes.
• Step 4: Users place a 3D object on a plane in real space by interacting with the virtual plane positions, e.g., touching the screen on a phone or air tapping on the HoloLens. After that, users can set the anchor points, including location, rotation, and scale parameters.
Device B:
• Step 1: The application scans the actual space and records its own local world map while in operating mode. The scanning process on device B executes continuously and compares world maps with device A until the two maps match in the point cloud. The matching depends on factors such as lighting, the objects currently in the space, and the different device sensors.
• Step 2: Next, the anchor values are downloaded from the cloud server or a peer device.
• Step 3: AR application on device B re-creates the 3D objects on the device’s world map.
All information, interactions, or movements from any device (user) are sent to the other devices in the session via approaches such as P2P messages or client-server communication. Each user receives this information from the server or another device and makes the appropriate changes in the scene according to the received directives. The message includes the name of the interacted virtual object and its position, rotation, and scale. In this way, device A and device B can see the same 3D object in one place and can determine each other’s position through their positions relative to the anchor’s coordinate frame.
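The anchor-based sharing just described reduces to a small vector calculation: device A stores the object as an offset from the shared anchor, and device B, whose local origin is wherever it started tracking, rebuilds the object from that offset. A sketch with rotation omitted (the same idea applies to it):

```cpp
#include <array>

using Vec3 = std::array<double, 3>;

Vec3 sub(const Vec3& a, const Vec3& b) { return {a[0]-b[0], a[1]-b[1], a[2]-b[2]}; }
Vec3 add(const Vec3& a, const Vec3& b) { return {a[0]+b[0], a[1]+b[1], a[2]+b[2]}; }

// Device A: express the object's position as an offset from the anchor,
// so the offset is independent of A's local coordinate origin.
Vec3 offsetFromAnchor(const Vec3& anchorOnA, const Vec3& objectOnA) {
    return sub(objectOnA, anchorOnA);
}

// Device B: the same anchor appears at different local coordinates, but the
// offset re-creates the object at the same physical spot.
Vec3 recreateOnB(const Vec3& anchorOnB, const Vec3& offset) {
    return add(anchorOnB, offset);
}
```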
3.1.2 Case 2: Different Users in Different Geographical Locations
In the second case, the users joining the AR sharing experience are far apart geographically, and the implementation procedure is similar to Case 1. The difference is that when device B scans the surrounding space, the accuracy of the anchor point depends on whether the actual area around device B happens to coincide with the layout of device A’s location. Since the real spaces of the two devices are entirely different, the anchor point on device B is recreated at a relative position (on any plane). This is acceptable because the joining users do not see each other; they are aware of each other’s position through their avatars.
3.2 Implement AnatomyNow in an AR Environment
The implementation of the AnatomyNow in the AR environment is discussed as follows:
3.2.1 Create 3D Pin and Drawing
We create multi-user 3D interactive tools by dropping marker pins or interacting with colored areas on a 3D object. The steps to create the 3D pin are as follows:
• Step 1: Draw a straight line in 3D space such that line d passes through the clicked point M0 and has the surface normal at M0 as its direction vector (see Eqs. (1) and (2))
• Step 2: Draw a plane P passing through the 3 points M0, M1, M2 (see Fig. 4),
where M1 and M2 are determined by the smallest plane through M0, the clicked point; the plane is designated by its normal vector.
• Step 3: From M0, define N0 on line d, offset in the direction of the normal so that the pin stays outside the object (see Fig. 5)
The normal vector is obtained from the click intersection in the pick handler:
osgUtil::LineSegmentIntersector::Intersection& result = intersector->getFirstIntersection();
osg::Vec3 normal = result.getWorldIntersectNormal();
The code for drawing a plane through the 3 points M0, M1, M2:
osg::Plane* plane = new osg::Plane;
plane->set(osg::Vec3(x0, y0, z0), M1, M2);
osg::Vec3 normal = plane->getNormal();
• Step 4: Attach the Pins to the selected location
Draw a line through the two points M0 and N0 and then create a sphere centered at N0. Figs. 6 and 7 show the sphere and the resulting outcome of the pin interaction, respectively.
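The pin-placement steps above reduce to offsetting the clicked point M0 along the picked surface normal to obtain the pin tip N0 on line d. A self-contained sketch (plain vectors instead of osg types; the function name and offset parameter are hypothetical):

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// N0 = M0 + t * n : a point on line d (through M0 with direction n, the
// surface normal), offset outward by distance t so the pin head sits
// outside the object. The normal is normalized so t is in world units.
Vec3 pinTip(const Vec3& m0, const Vec3& normal, double t) {
    double len = std::sqrt(normal[0]*normal[0] + normal[1]*normal[1] +
                           normal[2]*normal[2]);
    return { m0[0] + t * normal[0] / len,
             m0[1] + t * normal[1] / len,
             m0[2] + t * normal[2] / len };
}
```

The sphere of Step 4 is then simply rendered centered on the returned point.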
3.2.2 Create Color Areas and Interact With 3D Objects
The method is based on selecting an RGB color area on the texture map of the 3D objects, as follows:
• Step 1: Draw the selection areas and their identifiers beforehand (see Fig. 8)
• Step 2: Create the data table (see Fig. 9)
• Step 3: Interact with the selected area using the following code:
virtual void doUserOperations(osgUtil::LineSegmentIntersector::Intersection& result);
osg::Texture* texture = result.getTextureLookUp(tc);
osg::Vec4 textureRGB = myImage->getColor(tc);
int red = textureRGB.r() * 255;    // read out the red channel
int green = textureRGB.g() * 255;  // read out the green channel
int blue = textureRGB.b() * 255;   // read out the blue channel
• Step 4: Processing code in the fragment and vertex shaders (see Fig. 10)
uniform sampler2D baseMap;
varying vec2 Texcoord;
vec4 fvBaseColor = texture2D(baseMap, Texcoord);
vec3 color = (fvTotalAmbient + fvTotalDiffuse + fvTotalSpecular);
gl_FragColor.rgb = color;
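Step 3’s texture lookup amounts to quantizing the picked texel to 8-bit channels and matching it against the identifier table from Step 2. A sketch with a hypothetical table type; the example area name is purely illustrative:

```cpp
#include <map>
#include <string>
#include <tuple>

// Identifier table from Step 2: each selectable area is painted with a
// unique RGB value on the object's texture map.
using RGB = std::tuple<int, int, int>;

// Convert a floating-point channel in [0,1] to the 0-255 range used above.
int quantize(double channel) { return static_cast<int>(channel * 255); }

// Look up which anatomical area the picked texel belongs to; returns an
// empty string if the color is not a registered selection area.
std::string areaForTexel(double r, double g, double b,
                         const std::map<RGB, std::string>& table) {
    auto it = table.find(std::make_tuple(quantize(r), quantize(g), quantize(b)));
    return it == table.end() ? std::string{} : it->second;
}
```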
3.2.3 Sharing Multi-Users Technique
Device A
• Step 1: When device A executes AnatomyNow, the application sets the device’s camera at the origin, CameraPosition (0,0,0), together with the initial angular rotation CameraRotation (0,0,0) (see Fig. 11).
• Step 2: The camera scans the real space and shapes the world map. When planes are created from feature points, we determine the Vector3 coordinates of each plane; for example, the table plane is made from the coordinates of the table: TablePlane (x, y, z)
• Step 3: Place a 3D object onto a plane by touching the screen (phone) or air tapping (HoloLens). At the touch point, a raycast hits the plane to position the 3D object. A 3D brain model is rendered at BrainPosition (x1, y1, z1); this point is called the anchor point.
• Step 4: Calculate the distance vector and angle deviation from BrainPosition to TablePlane:
Vector3 distance = TablePlane.transform.position - BrainPosition.transform.position;
Vector3 angle = TablePlane.transform.eulerAngles - BrainPosition.transform.eulerAngles;
Device B: the following describes device B taking part in AR sharing.
• Step 1: Like device A, device B joins the AR sharing experience and scans its surroundings to determine the planes and their Vector3 coordinates relative to its own original coordinates. Device B scans the environment and shapes its world map until a match with device A is found (the matched characteristics are quantity, scales, and detected plane positions), giving TablePlane1 (x’, y’, z’). A random TablePlane1 (x’, y’, z’) is created when there is no coincidence between the world maps of the participating devices (see Fig. 12).
• Step 2: Device B renders the 3D brain model at BrainPosition1 (x1’, y1’, z1’), calculated from the distance vector and angle received from device A.
BrainPosition1.transform.position = TablePlane1.transform.position - distance;
BrainPosition1.transform.eulerAngles = TablePlane1.transform.eulerAngles - angle;
Multiple interaction: After the two devices join the AR experience and share an anchor point, each device’s interactions on the 3D scene use the anchor coordinates as the landmark, and the resulting parameters are sent to the other devices. When a user on device A places a marking pin on the 3D brain model, an acknowledgment is sent to the other device. The process is as follows:
• Step 1: First, cast a ray onto the 3D brain model. At the hit position, a pin is placed with coordinates PinPosition (x2, y2, z2). Calculate the distance and angle values:
Vector3 distanceToAnchor = BrainPosition.transform.position - PinPosition.transform.position;
Vector3 angleToAnchor = BrainPosition.transform.eulerAngles - PinPosition.transform.eulerAngles;
• Step 2: Within the session, a message with the newly released values is sent to the other devices with the following code (see Fig. 13):
CustomMessages.Instance.SendTransform (PinName, distance, angle);
• Step 3: All the devices participating in the session receive the message, which is then parsed to extract parameters such as the name of the 3D object, the distance, and the angle.
The pin position on device B (PinPosition1) is calculated as follows:
PinPosition1.transform.position = BrainPosition1.transform.position - distanceToAnchor;
PinPosition1.transform.eulerAngles = BrainPosition1.transform.eulerAngles - angleToAnchor;
• Step 4: Change the state of the 3D object on device B.
When device B moves or rotates the pin, it shares the same parameter information with device A, so the two devices can fully interact on the 3D object in the AR sharing experience. With the mechanism above, along with manipulating pin positions or moving 3D objects, the system allows users to draw or select RGB color areas in 3D space together.
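Steps 1–3 can be checked end to end: device A derives distanceToAnchor from its local brain and pin positions, and device B subtracts the received value from its own brain position, exactly as in the equations above. A sketch with rotation omitted; the PinMessage type is hypothetical, standing in for the CustomMessages payload:

```cpp
#include <array>

using Vec3 = std::array<double, 3>;

// Hypothetical payload of the session message (cf. SendTransform above).
struct PinMessage {
    Vec3 distanceToAnchor;   // BrainPosition - PinPosition on device A
};

// Device A: build the message from its local brain and pin positions.
PinMessage makePinMessage(const Vec3& brainA, const Vec3& pinA) {
    return { { brainA[0] - pinA[0], brainA[1] - pinA[1], brainA[2] - pinA[2] } };
}

// Device B: PinPosition1 = BrainPosition1 - distanceToAnchor.
Vec3 applyPinMessage(const Vec3& brainB, const PinMessage& m) {
    return { brainB[0] - m.distanceToAnchor[0],
             brainB[1] - m.distanceToAnchor[1],
             brainB[2] - m.distanceToAnchor[2] };
}
```

Because only the anchor-relative offset travels over the network, the pin lands at the same spot relative to each device’s copy of the brain model regardless of either device’s local origin.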
3.3 Workflow of the Multi-Device AR Interaction
This section classifies multi-device AR interaction into three stages: 1) create the session; 2) generate the anchors; 3) locate the anchors. To create the session, a Microsoft HoloLens starts the first session by executing the developed application (AnatomyNow). In the next two stages, the HoloLens generates the anchor points, which are then located by the client devices for multi-device AR interaction.
Fig. 14 illustrates the workflow of the multi-device AR interaction. Device A executes AnatomyNow to generate anchor points and sends them to the Azure Spatial Anchors service. The service sends an acknowledgment to device A in the form of an anchor ID. After a successful anchor ID is generated, device A sends this ID to the sharing service and retrieves another ID from the sharing service as an acknowledgment. Device B then connects through this anchor ID: first, device B sends its information to the spatial service and retrieves the data, including the anchor points; the service also grants device B permission to enter the multi-user AR environment.
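The Fig. 14 handshake can be simulated with two in-memory stand-ins for Azure Spatial Anchors and the sharing service. The classes and methods below are hypothetical sketches of the message flow, not the actual Azure API:

```cpp
#include <map>
#include <string>

// Stand-in for the spatial-anchor service: stores anchor data, returns an ID.
class SpatialAnchorService {
public:
    // Device A uploads its anchor; the returned ID is the acknowledgment.
    std::string createAnchor(const std::string& anchorData) {
        std::string id = "anchor-" + std::to_string(next_++);
        anchors_[id] = anchorData;
        return id;
    }
    // Device B presents an ID and retrieves the anchor data (empty if unknown).
    std::string locate(const std::string& id) const {
        auto it = anchors_.find(id);
        return it == anchors_.end() ? std::string{} : it->second;
    }
private:
    std::map<std::string, std::string> anchors_;
    int next_ = 1;
};

// Stand-in for the sharing service: device A publishes the anchor ID,
// device B retrieves it to join the session.
class SharingService {
public:
    void publish(const std::string& anchorId) { anchorId_ = anchorId; }
    std::string retrieve() const { return anchorId_; }
private:
    std::string anchorId_;
};
```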
5 Performance Evaluation
For the AnatomyNow testing with the multi-sharing technique, we ran our experiment at the Company for Visualization & Simulation (CVS), Duy Tan University. The primary goal of this study is to ensure adequate teaching and training of human body anatomy for students, academic staff, and healthcare staff who lack professional experience with the Microsoft HoloLens. A total of 10 participants were registered for the experiment and split into five groups. The first step was to provide the experiment description and sign the mandatory terms-and-conditions agreement. Before starting the experiment, instructors gave a 10 min presentation on how to use the HoloLens. After 40 min of investigation, the five groups provided their feedback about the experiment. The feedback was evaluated using two parameters: 1) data lag between the devices, and 2) accuracy of the model’s position, i.e., the distances between the models on different devices.
Following the client-server model, the AnatomyNow services are executed on the server, and the clients connect to this server. When sharing at the same location, a socket-based communication protocol is implemented over the local area network (LAN) based on the Mixed Reality Toolkit provided by Microsoft. All participating HoloLens users were connected to the same LAN (see Fig. 15).
The proposed system tracks user interactions and synchronizes them so that they are applied to the shared objects. The synchronized state includes each object’s transformation matrix and any minimal state change. The sharing menu is designed to let any user manipulate the 3D objects in the easiest way.
Fig. 16 shows AnatomyNow executed on multiple devices. To analyze AnatomyNow in a multi-share environment, we consider two parameters: the data lag between the devices and how closely the positions of the AnatomyNow models agree across devices. The purpose of this study was to obtain feedback about the experience of AnatomyNow with multi-device AR interaction. A total of 10 participants took part in the multi-user AR environment experiment, divided into five groups of 2 users each. All the participants were familiar with the AR anatomy application, the HoloLens device, and the goals and objectives of the proposed solution.
The participants’ experience was measured using a Likert scale from 1 to 5, where 1 represents the most negative experience and 5 the most positive. Tab. 2 presents the analysis of AnatomyNow on the HoloLens.
Fig. 17 illustrates the user experience of AR anatomy with the HoloLens. The scores for data lag between the devices from groups 1 to 5 are 4, 4, 3, 5, and 4, respectively. The scores for the closeness of the model positions across devices from groups 1 to 5 are 5, 4, 4, 4, and 4, respectively. The overall results of the experiment are good, except for group 3; the lower data-lag score for group 3 may be due to fluctuations in internet speed or network latency on its devices.
Tab. 3 presents a comparative analysis of the proposed solution against other solutions. The overall evaluation of the experiment is quite satisfactory. However, the number of devices that can share the same environment on the HoloLens is limited, an observable delay occurred during device collaboration, and, due to limited internet bandwidth and latency, 3D objects were sometimes rendered in slightly different locations.
6 Conclusion
Human anatomy training with real corpses faces multiple practical problems; hence, it is essential to find alternatives to the traditional anatomy training method, and a multi-user 3D interactive system is one of the preeminent options. In this study, we designed and developed an anatomy application and used it in a multi-user environment. A multi-user simultaneous control strategy coordinates control of the users’ 3D virtual sets and avoids the concurrency conflicts that may arise during interaction. The interactive multi-user system realizes collaborative interaction for multiple HoloLens users in a shared 3D scene.
Furthermore, 10 participants used the AR interactive system running AnatomyNow. The results show that a comfortable interactive mode leads to a better, more robust, and more realistic interactive experience; it is an exciting concept that is worth exploring further. In the future, we would like to study multi-user interaction techniques that ensure better accuracy and lower latency. Likewise, multi-user sharing techniques may be combined with artificial intelligence and computer vision algorithms to make such systems more efficient and reliable.
Funding Statement: The authors received no specific funding for this study.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.