Update: Indoor Real-Time Navigation with SLAM on Your Mobile

SLAM (Simultaneous Localization and Mapping) has been in our phones since the release of Apple’s ARKit and Google’s ARCore. It underpins tasks ranging from tracking the camera pose in mobile AR and VR to more complex, high-level understanding of the real world seen through a camera. It is at the heart of devices such as HoloLens, and of solutions for self-driving cars, unmanned drones and planetary rovers. This blog explains the process of implementing indoor navigation using only Unity and the Google ARCore SDK with today’s handsets.

Introduction

GPS-based map technology has embedded itself in our daily lives. We use GPS applications such as Google Maps to find and get directions to any location, and even to get personalized data for driving, walking or public transport. We use it to find the shortest route to a destination, to meet friends, to share with others where we are, to find any point of interest, and for a long list of other uses. These are mostly personal uses, but we must also recognise that almost all infrastructure and the transport industry rely on GPS technology in some way.

However, GPS technology does have its limitations. There are some important drawbacks that limit this technology for the more connected new world flooded by IoT devices, drones, home robots and Mixed Reality (MR) glasses.

  • Low precision (~ few meters)
  • The fact that it works only outdoors
  • Poor signal reception in cities with tall buildings due to “urban canyon” effect

This is where navigation based on SLAM technology will play a significant role, as it can track position to within 1-2 cm and it works indoors as well as outdoors. Outdoors, it could also be combined with GPS to improve an application’s overall accuracy.

Indoor navigation based on SLAM is already a possibility in our phones. In a previous blog, I explained how to use the SLAM technology provided by Google ARCore to achieve inside-out mobile VR tracking. In this blog, I want to share how to extend the use of ARCore tracking capabilities to indoor navigation using Unity and the Google ARCore SDK.

Although I have used Unity and Google ARCore to develop a simple indoor navigation app, it is also possible to use other engines and SLAM SDKs. For example, both the Google ARCore and Apple ARKit SDKs are available for Unreal.

Before I get into the technical side of the blog, I hope you will enjoy building your own indoor navigation app and experience the same satisfaction my colleagues and I had while walking through the building where we work, seeing our position matched perfectly by the red dot on the office map.

Mobile indoor navigation using Google ARCore SDK in Unity

For this blog, I will assume that the reader is familiar with the steps needed to get Unity working and has the entire environment prepared to build Android apps. Google ARCore is only available on the Samsung Galaxy S8, Google Pixel and Pixel XL so far, so you will need one of these phones to build your own indoor navigation app. Personally, I have used a Samsung Galaxy S8 with an Exynos 8895 SoC. This mobile device has an Arm Mali-G71 MP20 GPU capable of delivering high image quality at 60 FPS while running Google ARCore in the background for position tracking. I have used Unity 2017.2.0b9 to develop this project.

The first step is to create an empty Unity 3D project and import the Google ARCore SDK Unity package. You will also need to download the Google ARCore service (an APK file) and install it on your device. In my previous blog I explain these steps in more detail, along with the file structure created under the GoogleARCore folder in your project’s Assets folder.

Now we can start integrating ARCore in our sample. Drag and drop the ARCore Device prefab that you will find in the Prefabs folder into the scene hierarchy. This time we will use the First-Person Camera prefab as our main camera. Under the camera I created a Canvas component (Canvas CamPose) with a Text component to print on the screen the camera position when debugging and any messages we need to show. Be sure the Canvas and Text components are configured correctly so you won’t miss an error message. For example, in the Editor when you select the First Person Camera, the camera preview should render the default message of the Text component.

The next step is to add our own ARCore controller. Create a new empty Game Object (GO) IndoorNavController and attach to it the script HelloARController.cs we will borrow from the GoogleARCore/HelloARExample/Scripts folder. At this point we have integrated ARCore into our project. We will come back later to these ARCore objects to apply some changes.

We need a map of the area we want to navigate. It is important that the map is drawn to scale. When creating the map texture, crop the image to a square so it can easily be scaled further if needed. Create a 3D Plane and assign a material to it. Assign the map image as the Albedo texture. We should see the map on the plane as in the image below. Now we need to scale the plane so the distances on the map equal the real distances. At this point we will need a couple of tools: a distance tool to measure distances in the Unity Editor and a “real” tape measure.

Figure 1. The plane with the texture map as shown in the Editor.

You can download any of the available measuring tools from the Asset Store. Use the tape measure to measure the distance between two points in the “real world” that you can easily identify later on the plane. Then use the measuring tool to check the distance between these two points on the plane in the Unity Editor, and scale the plane until the tool gives the same distance you obtained with the tape measure. This step is very important, otherwise we won’t get the correct positioning on the map later on.
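The scale correction described above comes down to a single ratio: the tape-measure distance divided by the distance measured in the Editor. As a minimal sketch only, assuming a hypothetical helper script attached to the plane and two empty child GOs placed on the two reference points (the names MapScaleCalibrator, pointA and pointB are illustrative and not part of the ARCore SDK or of my project), it could look like this:

```csharp
using UnityEngine;

// Hypothetical helper: attach to the map plane. Not part of the ARCore SDK.
// Assumes pointA and pointB are children of the plane, so they move with it
// when the plane is rescaled.
public class MapScaleCalibrator : MonoBehaviour {
    public Transform pointA;                  // first reference point on the map
    public Transform pointB;                  // second reference point on the map
    public float realDistanceMeters = 1.0f;   // tape-measure distance between the two points

    // Run from the component's context menu in the Inspector.
    [ContextMenu("Apply Map Scale")]
    public void ApplyScale() {
        float mapDistance = Vector3.Distance(pointA.position, pointB.position);
        if (mapDistance <= Mathf.Epsilon || realDistanceMeters <= 0.0f) {
            Debug.LogWarning("Reference points coincide or the real distance is not positive.");
            return;
        }
        // Uniformly rescale the plane so that one Unity unit on the map equals one metre.
        float factor = realDistanceMeters / mapDistance;
        transform.localScale *= factor;
    }
}
```

Because the reference points are children of the plane, running the command a second time should leave the scale unchanged, which is a quick way to check the result.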

Finally, create a sphere (SpherePointer GO) and assign a material and a color. The sphere will identify our position on the map when navigating, so scale it to an appropriate size and give it a color that stands out clearly on the map.

At this point the hierarchy of your scene should look like the picture below. The DistanceTool GO appears disabled, as you only need it to set the correct scale of the plane; you can simply delete it once you have finished setting up the plane scale.

Figure 2. All the game objects in the project hierarchy.

Now that we have got all the needed elements on the hierarchy let’s do some configuration work on the ARCore objects. Let’s start configuring the ARCore session to use exclusively what we need for indoor navigation. Click on the ARCore Device GO and you will see in the Inspector the scripts attached to it as in the picture below.

Figure 3. The AR Core Device prefab in the Inspector.

In the Inspector, set the Tracking Type to Position Only (I will explain this point later). Double click on the DefaultSessionConfig asset to open the configuration options and untick the “Plane Finding” and “Point Cloud” options, as we don’t need them and they add a substantial load on the CPU. We need to leave “Enable AR Background” (pass-through mode) ticked, otherwise the AR Session component won’t work and we won’t get any camera pose tracking. However, as we don’t want the camera feed to be rendered, you need to manually comment out the line that calls the _SetupVideoOverlay() method in the SessionComponent.cs script file, otherwise we will see the camera feed rendered as a background.

Figure 4. Session configuration set up.

The next step is to edit the HelloARController.cs script we previously attached to the IndoorNavController GO to remove some code we don’t need. The picture below shows how my IndoorNavController GO looks in the inspector.

Figure 5. The IndoorNavController GO in the Inspector.

As you can see I have modified the HelloARController.cs to expose the text component I use for displaying messages, the camera and the target the camera will follow. You need to populate these public fields by dragging and dropping the respective objects from the scene hierarchy.

As you can see in the code below, I removed the _ShowAndroidToastMessage method from the original HelloARController.cs script, as I use the Text component to display any message. I modified the _QuitOnConnectionErrors method accordingly to use the camPoseText component to show the messages on the screen. All the code in the Update() method related to plane detection, touch events, anchoring and adding the Andy robot has been removed. To save space I left only the explanatory comments.

namespace GoogleARCore.HelloAR {
    using System.Collections.Generic;
    using UnityEngine;
    using UnityEngine.Rendering;
    using GoogleARCore;
    using UnityEngine.UI;
    public class HelloARController : MonoBehaviour {
        public Text camPoseText;
        public GameObject m_firstPersonCamera;
        public GameObject cameraTarget;
        private Vector3 m_prevARPosePosition;
        private bool trackingStarted = false;
        public void Start() {
            m_prevARPosePosition = Vector3.zero;
        }
        public void Update() {
            _QuitOnConnectionErrors();
            if (Frame.TrackingState != FrameTrackingState.Tracking) {
                trackingStarted = false;                      // if tracking lost or not initialized
                camPoseText.text = "Lost tracking, wait ...";
                const int LOST_TRACKING_SLEEP_TIMEOUT = 15;
                Screen.sleepTimeout = LOST_TRACKING_SLEEP_TIMEOUT;
                return;
            } else {
                // Clear camPoseText if no error
                camPoseText.text = "";
            }
            Screen.sleepTimeout = SleepTimeout.NeverSleep;
            Vector3 currentARPosition = Frame.Pose.position;
            if (!trackingStarted) {
                trackingStarted = true;
                m_prevARPosePosition = Frame.Pose.position;
            }
            //Remember the previous position so we can apply deltas
            Vector3 deltaPosition = currentARPosition - m_prevARPosePosition;
            m_prevARPosePosition = currentARPosition;
            if (cameraTarget != null) {
                // The initial forward vector of the sphere must be aligned with the initial camera direction in the XZ plane.
                // We apply translation only in the XZ plane.
                cameraTarget.transform.Translate (deltaPosition.x, 0.0f, deltaPosition.z);  
                // Set the pose rotation to be used in the FollowTarget script
                m_firstPersonCamera.GetComponent<FollowTarget> ().targetRot = Frame.Pose.rotation;
            }
        }
        private void _QuitOnConnectionErrors() {
            // Show a message and quit if the ARCore session cannot be used.
            if (Session.ConnectionState == SessionConnectionState.DeviceNotSupported) {
                camPoseText.text = "This device does not support ARCore.";
                Application.Quit();
            }
            else if (Session.ConnectionState == SessionConnectionState.UserRejectedNeededPermission) {
                camPoseText.text = "Camera permission is needed to run this application.";
                Application.Quit();
            }
            else if (Session.ConnectionState == SessionConnectionState.ConnectToServiceFailed) {
                camPoseText.text = "ARCore encountered a problem connecting.  Please start the app again.";
                Application.Quit();
            }
        }
    }
}

The important bits are in the Update() method. I always retain the last camera position from Frame.Pose to calculate the translation vector deltaPosition from frame to frame. This is the translation applied to the sphere (cameraTarget) as we walk. As we are showing our position on a plane, we remove any translation along the Y axis to ensure that changes in the height of the phone we are holding won’t make the sphere go below the plane. Finally, we pass to the FollowTarget script the rotation (quaternion) provided by Frame.Pose.rotation.
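To make the role of the sphere’s orientation explicit, here is a minimal sketch of what the Translate call is doing, written as a standalone helper. MapDelta and ToMapSpace are hypothetical names used only for illustration. The sketch relies on Transform.Translate using the object’s local space by default, which is why the sphere’s forward vector has to match the initial camera direction.

```csharp
using UnityEngine;

// Hypothetical helper, for illustration only.
public static class MapDelta {
    // Converts a frame-to-frame displacement reported in the ARCore frame
    // into a displacement on the map: the height change is dropped and the
    // remaining XZ motion is rotated by the sphere's orientation.
    public static Vector3 ToMapSpace(Vector3 arDelta, Quaternion sphereRotation) {
        Vector3 flatDelta = new Vector3(arDelta.x, 0.0f, arDelta.z);
        return sphereRotation * flatDelta;
    }
}
```

Under that assumption, the line cameraTarget.transform.position += MapDelta.ToMapSpace(deltaPosition, cameraTarget.transform.rotation); should move the sphere the same way as the Translate call in the listing above.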

Initially, I tried to apply the translation and orientation provided by Frame.Pose directly to the sphere (the camera target), but I struggled to get a smooth motion of the sphere on the plane. This is the reason I set the Tracking Type to Position Only in the ARCore Device prefab. I then decoupled the position and rotation information provided by Frame.Pose: I use the position to apply the translation to the sphere from frame to frame, and the rotation to set the camera orientation in the FollowTarget script. The result is a kind of third-person camera that always stays behind the sphere and shows the map in the direction of our movement.

I am also providing the code that makes the camera smoothly follow the sphere motion on the plane.

using UnityEngine;
using System.Collections;
// Attach this script to the camera that you want to follow the target
public class FollowTarget : MonoBehaviour {
    public Transform targetToFollow;
    public Quaternion targetRot;                      // The rotation of the device camera from Frame.Pose.rotation    
    public float distanceToTargetXZ = 10.0f;    // The distance in the XZ plane to the target
    public float heightOverTarget = 5.0f;
    public float heightSmoothingSpeed = 2.0f;
    public float rotationSmoothingSpeed = 2.0f;
    // Use LateUpdate to ensure that the camera is updated after the target has been updated.
    void LateUpdate () {
        if (!targetToFollow)
            return;     
        Vector3 targetEulerAngles = targetRot.eulerAngles;
        // Calculate the current rotation angle around the Y axis we want to apply to the camera.
        // We add 180 degrees as the device camera points to the negative Z direction
        float rotationToApplyAroundY = targetEulerAngles.y + 180.0f;
        float heightToApply = targetToFollow.position.y + heightOverTarget;
        // Smooth interpolation between current camera rotation angle and the rotation angle we want to apply.
        // Use LerpAngle to handle correctly when angles > 360
        float newCamRotAngleY = Mathf.LerpAngle (transform.eulerAngles.y, rotationToApplyAroundY, rotationSmoothingSpeed * Time.deltaTime);
        float newCamHeight = Mathf.Lerp (transform.position.y, heightToApply, heightSmoothingSpeed * Time.deltaTime);
        Quaternion newCamRotYQuat = Quaternion.Euler (0, newCamRotAngleY, 0);
        // Set camera position the same as the target position
        transform.position = targetToFollow.position;
        // Move the camera back in the direction defined by newCamRotYQuat and the amount defined by distanceToTargetXZ
        transform.position -= newCamRotYQuat * Vector3.forward * distanceToTargetXZ;
        // Finally set the camera height
        transform.position = new Vector3(transform.position.x, newCamHeight, transform.position.z);
        // Keep the camera looking to the target to apply rotation around X axis
        transform.LookAt (targetToFollow);
    }
}

After all these changes we are almost ready. We just need to set up the player settings. Google recommends the settings below:

  • Other Settings > Multithreaded Rendering: Off
  • Other Settings > Minimum API Level: Android 7.0 or higher
  • Other Settings > Target API Level: Android 7.0 or 7.1
  • XR Settings > ARCore (Tango) Supported: On

Initial device-map synchronization

At this point we have all the ingredients of our indoor navigation app, but to make it work correctly we need one last step: syncing the initial position of the device camera with the position of the sphere on the map. What I did was place the sphere at the position, and with the orientation, from which I start my navigation. As you can see in the picture below, my initial position is at the office door looking into the office, so the forward vector of the sphere is oriented in the same direction.

Figure 6. Initial position synchronization.

Keep in mind that the initial position is the position of the phone camera, not your position. You will normally hold the phone at eye level, approximately 50 cm in front of you. Once you are at the initial spot with the right orientation, launch the navigation app and wait a couple of seconds until the screen message “Lost tracking, wait ...” vanishes. From now on you can move freely, going in and out of the offices, and the app will track your position on the map pretty accurately.

Syncing the start position on the map can be automated very easily, for example by pointing the camera at markers conveniently distributed around the building you are navigating. Each marker will encode its exact map position and will tell the app where you are on the map. Markers could also help to correct the drift that is intrinsic to SLAM technology.
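I have not implemented markers in this project, but as a rough sketch of the idea, the resynchronization step itself could be as small as the script below. Everything in it is an assumption: MarkerResync and OnMarkerRecognized are hypothetical names, and the marker detection and decoding (for example from a QR code) is left to whatever library you choose. It also treats the camera as being at the marker position, which is only approximately true.

```csharp
using UnityEngine;

// Hypothetical sketch: snaps the sphere to a known map position when a
// marker is recognized. Marker detection itself is not shown.
public class MarkerResync : MonoBehaviour {
    public Transform spherePointer;   // the sphere that shows our position on the map

    // Call this from your own marker-detection code, passing the map
    // position encoded in the marker.
    public void OnMarkerRecognized(Vector3 markerMapPosition) {
        if (spherePointer == null) {
            return;
        }
        // Keep the sphere's current height so it stays on the plane,
        // and only correct its XZ position.
        Vector3 current = spherePointer.position;
        spherePointer.position = new Vector3(markerMapPosition.x, current.y, markerMapPosition.z);
    }
}
```

Only the position is corrected here. The sphere’s orientation is left untouched because it defines how device motion is mapped onto the plane.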

But what if the device loses tracking? If for some reason you walk through an environment with no noticeable features, for example a completely white corridor with nothing on the walls and floor, then ARCore won’t have anything to track and the app will show the message “Lost tracking, wait ...”. Let’s see how ARCore can recover from this kind of situation.

Area learning

Although it is not exposed in the settings, if we look carefully at the SessionComponent.cs script we can see that the method _ConnectToService calls _GetSessionTangoConfiguration and passes to it the session configuration settings we set in DefaultSessionConfig.asset. The _GetSessionTangoConfiguration method initializes the Tango service that ARCore is built on top of. This method uses the settings from DefaultSessionConfig.asset to initialize some Tango options, but there are others that it sets directly, for example Area Learning, which is disabled:

tangoConfig.areaLearningMode = UnityTango.AreaLearningMode.None;

Area Learning is a powerful concept that allows the device to see and remember the key visual features of an area we walk through: the edges, corners and other unique features. In this way the device can recognize the area later. For that, ARCore stores a description of the visual features in a searchable index so it can check for matches against what it is currently seeing.

ARCore uses the area learning concept to improve the accuracy of the trajectory by correcting the intrinsic “drift” of SLAM technology, and to orient and position itself within an area that it has already learnt.

Let’s then activate the area learning option by changing the previous code line to:

tangoConfig.areaLearningMode = UnityTango.AreaLearningMode.CloudAreaDescription;

We can now run an interesting test to see how area learning works. Launch the navigation app and after walking for some time, with your hand, cover the phone’s camera so the app won’t be able to get any camera feed and will show the “lost tracking” message. Continue walking and you will see that the map is no longer showing your position correctly. At this point uncover the camera, go back to some point you have already been before you covered the camera and walk in the same direction you initially walked. You will then see how your physical position and the position in the map are synced again and the app starts working again as expected. The area learning functionality has done its job.

Conclusions

I hope you have been able to follow this blog and build your own indoor mobile navigation Unity project. Most importantly, you have been able to run your app and experience the new power libraries like Google ARCore and Apple ARKit bring to our phones. We can expect this technology to become available to more devices soon and hopefully we will be able to see it widely working in inside-out mobile VR tracking and in indoor navigation applications. SLAM based navigation can also be combined outdoors with GPS technology to improve the overall accuracy.

As always, I really appreciate your feedback on this blog and please leave any comments on your own indoor navigation experience below.

Update 7 November 2018

I wrote this blog almost one year ago, and most weeks I still receive many questions and comments, a clear sign of interest in this kind of technology. Since then, Google has released several updates of the ARCore SDK for Unity and the ARCore service. I will list the main changes to this project to ensure you can get it working with the latest ARCore release. I will also answer some of the questions I have received since I wrote this blog.

  1. The Google ARCore service now needs to be installed through the Google Play Store: just search for Google ARCore and install it like you would any app. If you can’t find it, then your phone isn’t compatible and you won’t be able to download and install Google ARCore.

  2. There should be a 1:1 relationship between the distance between any two points on the map and the corresponding distance in the “real world”. This way we guarantee that when we move in the physical world the sphere will move exactly the same amount on the plane. For measuring the distance between any two points on the map in the Editor, I used a free tool available in the Asset Store: Distance Tool. You need to scale the plane until the distance measured with the tool is exactly the distance between the same two points in the “real world”.

  3. Now when you access the DefaultSessionConfig component in the Inspector, the options available are a little different. Set the same values as shown below:

    Figure 7: DefaultSessionConfig component options

  4. You don’t need to hack the code anymore by commenting out the call to the _SetupVideoOverlay() method in SessionComponent.cs to avoid rendering the camera feed as a background. Instead, you just need to select the First Person Camera GO, where you will see the AR Core Background Renderer component in the Inspector as shown below:

    Figure 8: First person camera GO in the Inspector showing the attached background renderer script

    Set ‘Background Material’ to None to avoid the camera feed being rendered. Nevertheless, in the latest event demo I prepared using this project I decided to render the background to make it more evident to demo viewers that the navigation uses as input the camera feed. For this I first made the floor plan image used as the plane’s texture semi-transparent in Photoshop. Then I changed the shader of the plane’s material and set it to a legacy Transparent/Diffuse shader as below.

    Figure 9: Plane GO in the Inspector showing the transparent material used

    This way I rendered the semi-transparent plane on the phone’s screen, making the background visible through it as shown below:

    Figure 10: Example of semi-transparent map overlay on camera feed

  5. Initial synchronization is especially important; otherwise the red sphere on the map won’t accurately track our movements in the real world. We need to sync both the initial position and the orientation. First, we identify an initialization spot. Every time we launch the app in the physical world, we will do it from the same initialization spot. The initial position of the red sphere on the map should be exactly the same as the initialization spot. When launching the app, the back of the phone (where the rear camera is) should be pointing in the same direction as the Z axis (blue axis) of the red sphere on the map. Keep the phone upright, perpendicular to the floor; don’t tilt it up or down. The picture below shows the initial position and orientation I used on the floor plan.

Figure 11: Initialization spot and orientation

Convert your app into an audio guide

I have recently added a few improvements to this app that I would like to share. The accuracy we can reach when using ARCore for navigation is ~1-2 cm, so I decided to go further and add collision boxes around some points of interest (POI) in the map. When we approach a POI in the real world the sphere collides with the POI’s collision box and it triggers a collision event that we can use to perform any action. I additionally associated an audio file with each POI GO so when the collision takes place, the name of the POI is automatically displayed on the screen (see the picture of the phone in point 4) and the audio description of the POI is played.

Below is the Inspector view of one of the POIs, named “1- Mobile Asset Visibility”. It has Box Collider and Audio Source components. The Audio Source component has an audio clip associated with it.

Figure 12: POI GO in the Inspector showing the attached audio source

Below you can read the script attached to the red sphere, which displays the name of the POI and plays the audio file.

using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.UI;

public class CollisionManager : MonoBehaviour {

    public Text camPoseText;

    void OnTriggerEnter(Collider other)
    {
        camPoseText.text = other.gameObject.name;
        // Play the audio clip attached to the object the red pointer
        // is colliding with
        if (other.GetComponent<AudioSource> () != null) {
            other.GetComponent<AudioSource> ().Play ();
        }
    }

    void OnTriggerExit(Collider other)
    {
        // To stop playing the audio when the collision stops happening
        if (other.GetComponent<AudioSource> () != null) {
            other.GetComponent<AudioSource> ().Stop ();
        }
        // Clean the message
        camPoseText.text = "";
    }
} 

And below you can see how the red sphere GO looks in the Inspector:

Figure 13: Sphere pointer GO in the Inspector showing the attached collision manager
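One detail that is easy to miss if you are rebuilding this from the text alone: Unity only sends OnTriggerEnter and OnTriggerExit when at least one of the two colliders is marked as a trigger and at least one of the two objects has a Rigidbody. The sketch below shows one way to guarantee this from code on the sphere side. PointerPhysicsSetup is a hypothetical script name, not something from my project, and you can equally add and configure the Rigidbody by hand in the Inspector. On each POI, tick “Is Trigger” on the Box Collider.

```csharp
using UnityEngine;

// Hypothetical helper: attach to the red sphere next to CollisionManager.
// Makes sure the sphere can generate trigger events without being moved
// by the physics engine.
public class PointerPhysicsSetup : MonoBehaviour {
    void Awake() {
        if (GetComponent<Collider>() == null) {
            Debug.LogWarning("The pointer needs a Collider to generate trigger events.");
        }
        // Add a Rigidbody only if the sphere does not already have one.
        Rigidbody body = GetComponent<Rigidbody>();
        if (body == null) {
            body = gameObject.AddComponent<Rigidbody>();
        }
        // Kinematic and gravity-free: the sphere is driven only by the
        // translation applied in HelloARController.
        body.isKinematic = true;
        body.useGravity = false;
    }
}
```

A kinematic Rigidbody is used so that the physics engine never pushes the sphere off the plane; its position keeps coming only from the ARCore deltas.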

I hope these updates clarify things, help you to better understand the technology and, more importantly, enable you to get this app working. I also hope you will have some fun transforming the app into an audio guide.

As always, I really appreciate your feedback and questions.

Anonymous
Graphics & Multimedia blog