Augmented reality overlays computer-generated imagery onto camera-captured video in such a way that the virtual objects appear to have an absolute location in the real world, or a location relative to a real-world object such as a page in a book. In the case of a book, the augmented reality app uses the mobile device’s camera to display an overlay video superimposed on the corresponding illustration in the book.
Adding augmented reality to a book involves the following components:
- A book with illustrations
- Marker images that correspond to the “active” illustrations in the book, that is, the illustrations that are augmented
- Video or 3D model overlays that are superimposed over the footage coming from the mobile device’s camera.
- A Content Management System that matches marker images with video overlays, and is used to place the overlays with respect to the markers.
- An augmented-reality-enabled app that is loaded with the marker images and overlays. It also contains any logic or rules that define how the user can interact with the overlays.
- A mobile device with a camera through which the augmented reality app can search footage for marker images. Ideally, the device has a light that can be used as a flashlight to illuminate the marker images in dim light.
- Image recognition middleware that interprets the footage from the mobile device camera, determines when a marker image has been detected, and works out how to integrate virtual objects into the display stream.
- A display that streams the merged footage of the camera and the overlays back to the user. A cell phone acts as a “porthole” or “magic mirror” into the augmented world of the book, while a wearable display can provide total immersion.
To operate the augmented reality book, the user starts the app and points the camera at illustrations in the book. The app attempts to match the images in the camera view against the database of markers. When a match is found, the app retrieves from the database the overlay corresponding to that marker, together with the overlay’s placement with respect to the marker. The app then sends the overlay to the display, where it is placed as specified over the illustration containing the marker and streamed to the user along with the footage from the camera.
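The lookup step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the marker IDs, asset names, `Placement` fields and the `recognize_marker` stand-in are all hypothetical, and a real app would delegate recognition to its image recognition middleware.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Placement:
    """Where the overlay sits relative to the marker (hypothetical fields)."""
    x_offset: float  # horizontal offset as a fraction of marker width
    y_offset: float  # vertical offset as a fraction of marker height
    scale: float     # overlay size relative to the marker


@dataclass(frozen=True)
class Overlay:
    asset: str            # e.g. a video file or a 3D model (made-up names below)
    placement: Placement


# Hypothetical marker database: marker ID -> overlay and its placement.
MARKER_DB = {
    "page_03_dragon": Overlay("dragon_flight.mp4", Placement(0.0, 0.0, 1.0)),
    "page_07_castle": Overlay("castle_model.glb", Placement(0.1, -0.2, 0.5)),
}


def recognize_marker(frame: dict) -> Optional[str]:
    """Stand-in for the image recognition middleware (an assumption).

    A frame here is just a dict that may carry a 'visible_marker' key.
    """
    return frame.get("visible_marker")


def process_frame(frame: dict):
    """Return (frame, overlay); overlay is None when no known marker matched."""
    marker_id = recognize_marker(frame)
    overlay = MARKER_DB.get(marker_id) if marker_id is not None else None
    return frame, overlay
```

The display would then draw the returned overlay on top of the frame using its placement, or show the plain camera footage when the overlay is `None`.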
How Image Recognition Works
The most complex of the components is image recognition, or computer vision. The augmented reality book uses instance-level recognition to identify specific visual markers in the streamed real-world footage from the camera. (Category-level recognition has the much more difficult task of classifying image content into known categories such as pedestrians, faces, dogs or plants.) In instance-level recognition, the system stores image descriptors in a search index and operates in three stages:
- Key point detection: This stage is the bottleneck in instance-level recognition systems. It involves mathematical operations to detect corners and blobs, regions where discontinuities occur and which therefore carry much, if not all, of the image information. Corners are regions in an image that differ sharply from their immediately neighboring regions. Blobs are regions that are darker or brighter than their surroundings. In instance-level recognition, only these regions are processed further. The dominant orientation can also be extracted to make the recognition process rotation-invariant, and the scale-space approach makes it scale-invariant. Using greyscale markers and applying a greyscale filter to the camera footage before key point detection makes the process hue-invariant.
- Descriptor extraction: Descriptors are small patches around each key point, extracted in a way that introduces tolerance to distortions and to illumination changes in the scene. The number of key points and descriptors is typically between 100 and 2,000 per image, depending on its complexity, so it is important to find matching descriptors quickly between the database and the observed scene.
- Matching and verification: Once candidate matches are found, those that are not strictly consistent with a particular model of the scene or object are filtered out, based on the probability that the object is present. This stage thus determines whether the marker is actually present in the observed illustration.
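The matching and verification stage can be illustrated with a small, self-contained Python sketch. It rests on several assumptions that are not in the text above: descriptors are represented as binary strings packed into integers and compared by Hamming distance, ambiguous matches are discarded with a nearest-neighbour ratio test, and the "model of the scene" is simplified to a pure translation between marker and camera view. A production system would typically fit a richer geometric model, so treat the function names and thresholds here as hypothetical.

```python
from collections import Counter


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_descriptors(marker, scene, ratio=0.7):
    """Match each marker key point to its nearest scene key point.

    marker and scene are lists of ((x, y), descriptor) pairs.
    A match is kept only if the best distance is clearly smaller than the
    second-best one (ratio test), which discards ambiguous matches.
    """
    matches = []
    for marker_pos, marker_desc in marker:
        ranked = sorted(scene, key=lambda kp: hamming(marker_desc, kp[1]))
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        best, second = ranked[0], ranked[1]
        if hamming(marker_desc, best[1]) < ratio * hamming(marker_desc, second[1]):
            matches.append((marker_pos, best[0]))
    return matches


def verify(matches, min_inliers=3, tolerance=2):
    """Keep only matches that agree with the most common translation.

    Returns (marker_present, inlier_matches). The translation-only model
    is a deliberate simplification for illustration.
    """
    if not matches:
        return False, []

    def offset(match):
        (mx, my), (sx, sy) = match
        # Quantise so that nearly equal translations vote for the same bin.
        return (round((sx - mx) / tolerance), round((sy - my) / tolerance))

    votes = Counter(offset(m) for m in matches)
    best_offset, _ = votes.most_common(1)[0]
    inliers = [m for m in matches if offset(m) == best_offset]
    return len(inliers) >= min_inliers, inliers
```

Calling `verify(match_descriptors(marker, scene))` on the key points of a stored marker and of a camera frame yields a yes/no decision plus the surviving matches, which could then be used to position the overlay.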