Long Time No See: How Is AR Developing?
Year after year has gone by, so has the once-hyped virtual technology quietly disappeared? This article walks you through the current state of AR technology, briefly introduces the key technologies involved, and finally shows you how to run an AR demo yourself. If you find it helpful, please feel free to like, comment, and share.
I. An Introduction to Virtual Reality Technology
1.1 Distinguishing the AR, VR, and MR concepts
AR (Augmented Reality): the device recognizes the real world (shape, position, motion, and edges) and overlays virtual information onto it.
Playground AR game (image from Apple App Store)
VR (Virtual Reality): the device replaces the user's entire view with a rendered virtual world; the real environment is completely blocked out.
World of Diving
MR (Mixed Reality): virtual objects are not merely overlaid on reality but anchored in it and able to interact with it, mixing the real and virtual worlds.
View blood vessels on your body (image from Internet)
At the time of writing (December 2021), VR technology and devices have already reached early maturity, with various VR games and consumer-grade devices such as the Oculus Quest 2, and are being hotly hyped under the "metaverse" concept.
Compared with VR, AR still faces many technical hurdles before it can deliver a mature consumer experience. This article mainly discusses the development and current state of AR; VR is outside its main scope.
1.2 The history and current state of AR
- The beginning of AR history
The term AR was coined by researchers at Boeing in 1992, and the concept grew in popularity over the years, though for a long time it remained confined to the laboratory.
- The first AR SDK
By 1999, Hirokazu Kato was leading the ARToolKit project, which could recognize 2D objects and make virtual objects follow physical ones. ARToolKit was open-sourced in 2000, and in 2005 it added support for Symbian, becoming the world's first open-source AR SDK for mobile. Since 2008 it has been available for iPhone and Android. As of December 2021, ARToolKit supports most popular platforms and can be integrated into Unity to develop AR games.
- The first AR motion sensing game
In 2003, Sony released EyeToy for the PlayStation 2: an external camera detects the player's movements, letting them interact with games using their body. Motion-sensing AR games suddenly became popular.
- Google Glass popularizes the AR concept
Although AR technology had been advancing steadily, it was Google Glass that really brought the AR concept to the masses. Google Glass uses optical reflection projection (a HUD): a tiny projector casts light onto a reflective screen, which is then refracted through a convex lens into the eyeball, producing a virtual screen large enough to display simple text and data.
I don’t think this picture infringes
Google Glass is really more of a "phone worn on your face": it lacks many of the technologies AR involves, such as environment detection and interaction with virtual objects. But it does involve AR display technology, projecting light information into the eye. Its appearance therefore sparked wide discussion on two fronts: wearable devices and AR technology.
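As a rough illustration of the magnifier optics described above, a thin-lens calculation shows how a tiny near-eye display can read like a much larger, more distant virtual screen. The focal length and display distance below are hypothetical; Google's real optics are more complex.

```python
# Hypothetical numbers for illustration; Google's real optics are more complex.

def virtual_image(focal_mm, object_mm):
    """Thin-lens equation 1/f = 1/d_o + 1/d_i. Returns (image distance,
    magnification); a negative image distance means the image is virtual,
    i.e. it appears on the same side of the lens as the display."""
    d_i = 1.0 / (1.0 / focal_mm - 1.0 / object_mm)
    m = -d_i / object_mm
    return d_i, m

# A microdisplay 40 mm from a 50 mm convex lens yields a virtual image
# 200 mm away, magnified 5x: a tiny panel reads like a far larger screen.
d_i, m = virtual_image(50.0, 40.0)
```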
- HoloLens, the first "black tech" of the AR field
In 2015, Microsoft released HoloLens, an AR headset that shocked the world. It can detect the external environment, blend virtual objects with the real environment, and complete interactions by tracking the user's gestures. It is the first relatively mature AR device in the true sense: small, with independent computing and storage, and it does not block the user's view of the environment when worn. In 2019, Microsoft released HoloLens 2, which fixed the first generation's narrow field of view and significantly improved gesture-tracking accuracy.
Aided Industrial Design (image from web)
HoloLens 2, while relatively mature, is expensive: $3,500 at the time of writing. And although it is smaller than other AR devices, it still weighs about 500 grams, versus 164 grams for an iPhone 12; few ordinary users will accept wearing the weight of three phones on their head. The good news is that Microsoft has revealed that HoloLens 3 will weigh less than 90 grams.
- Other players in the AR world
Magic Leap, once widely questioned as a "PPT vaporware company," released its first-generation product, the Magic Leap One, in 2017. Apple and Samsung are both rumored to release AR glasses in 2022. Chinese companies offering AR products and devices include Coolpad (Xview series), Bright Vision, and Nreal (Nreal Light).
Concept of Apple smart Glasses
- An iconic AR game
Niantic released Pokemon Go in 2016. The LBS-based (location-based) game became a global hit, raking in more than $200 million in its first month and $2.65 billion in its first three years on the market. Pokemon Go's success rapidly ignited the AR gaming market, and AR games have continued to emerge since, in genres such as building and board games.
Pokemon Go and My Country game interfaces
- Popular mobile AR SDKs
The mobile AR SDK landscape is seriously fragmented: dozens of different SDKs have been produced in China and abroad. Some of the most influential are listed below.
- Apple's ARKit: released in 2017, it is considered the most capable and commercially promising SDK to date. Combined with Apple frameworks such as SceneKit, SpriteKit, and RealityKit, it enables polished apps that run on almost any modern iOS device (iOS 11.0+, iPhone 6s and later).
- Google's ARCore: supports both Android and iOS. Because Google does not control the hardware, its overall tracking quality is not quite as good as ARKit's, but the difference is not significant for consumers.
- The long-standing ARToolKit: as mentioned earlier; its open-source community is no longer very active.
- The commercial SDK Vuforia: good performance, with support for Android, iOS, and UWP. Where possible it uses ARKit or ARCore under the hood; otherwise it falls back on its own AR engine.
- Influential Chinese AR companies: SenseTime's SenseAR, Taixu AR, Laoyun, etc.
Most SDKs can be used as development plug-ins in Unity. To reduce developers' learning costs, Unity has also wrapped several SDKs, including ARKit and ARCore, into its official ARFoundation SDK. Of course, ARFoundation can only be used for development inside the Unity editor.
1.3 Status summary and outlook
At present, AR technology still has many pain points, mainly in the following areas:
- Environment detection is computationally heavy, so real-time performance suffers; on many devices, moving quickly causes virtual objects to drift out of alignment with the environment.
- Heavy computation and constant camera use give AR devices short battery life, mostly under 5 hours.
- Wearable AR display technology is immature: most devices fall short in resolution, field of view, and overall experience, and can even cause a degree of "motion sickness".
- Devices are not yet miniaturized; even the renderings of Apple's glasses are far bulkier than ordinary glasses.
- The software ecosystem is immature.
For these reasons, the AR ecosystem's overall user base is smaller than VR's, and AR glasses are mainly applied in customized enterprise (B2B) fields such as medicine, the military, and remote assistance. Mobile AR applications focus mostly on tools and games, and most suit only scenarios with slow movement and small (indoor) environments.
The development of 5G is considered a major boost to AR infrastructure. Following the cloud-computing model, the high transmission speeds of 5G could change the architecture of AR equipment: computation is offloaded to cloud resources, while the AR device is responsible only for capturing and displaying images. AR devices could then overcome their disadvantages in computing performance, power consumption, heat, and size.
II. Key Technologies in the AR Field (the Principles)
2.1 Environment detection (self-localization, 3D object detection, environment modeling)
The first technology unique to AR is how the device "understands" the external environment. Most AR systems record the environment as a "point cloud": for a 3D object, the device identifies its edges, dimensions, and key feature points, and these points, combined with depth (distance) information, form the object's three-dimensional coordinates. In general, the process of modeling the entire environment map is as follows:
- The user arrives at a new location and turns on the camera.
- The device determines its height and orientation from the gyroscope and other sensors, and takes this as the origin of the spatial coordinate system.
- The device processes the camera input and builds the point-cloud model.
- The user moves around until the entire map is scanned and modeled.
This process, known as SLAM (Simultaneous Localization And Mapping), is widely used in robotics, for example in robot vacuums and food-delivery robots.
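At the heart of the mapping loop above is one transform: take a feature point seen in the device's camera frame and express it in the world frame fixed at startup. A minimal 2D sketch, assuming a yaw-only rotation and points as (x, z) pairs (real SLAM works with full 6-DoF poses):

```python
import math

def to_world(device_pos, device_yaw_deg, local_point):
    """Rotate a camera-frame point by the device's yaw and translate it by
    the device's position, yielding coordinates in the world frame that was
    fixed when scanning started (2D simplification of a SLAM pose)."""
    yaw = math.radians(device_yaw_deg)
    x, z = local_point
    wx = device_pos[0] + x * math.cos(yaw) - z * math.sin(yaw)
    wz = device_pos[1] + x * math.sin(yaw) + z * math.cos(yaw)
    return (wx, wz)

# Device at (2, 3), turned 90 degrees from its starting orientation; a
# feature seen 4 m straight ahead (local (0, 4)) lands at world (-2, 3).
p = to_world((2.0, 3.0), 90.0, (0.0, 4.0))
```

Accumulating such points as the user walks around is what produces the point-cloud map.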
A key technical point in SLAM is how to obtain the depth (distance) information of a point. The industry has the following solutions:
- Binocular matching (two cameras)
This mode imitates how human eyes perceive depth: because we have two eyes, the distance from an object to the eyes can be computed by comparing the images the two eyes see. Binocular matching likewise uses two cameras and computes distance from the difference in an object's horizontal position between the two pictures. The figure below is a conceptual diagram; the real computation is not this simple.
Conceptual diagram of calculating distance by binocular matching
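Under an idealized pinhole-camera model, the computation sketched in the diagram reduces to one formula: depth = focal length x baseline / disparity. A minimal sketch with made-up numbers:

```python
def depth_from_disparity(focal_px, baseline_m, x_left_px, x_right_px):
    """Idealized pinhole stereo model: the same point appears shifted
    horizontally between the two images, and that shift (the disparity)
    is inversely proportional to the point's depth."""
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("expected the point further left in the right image")
    return focal_px * baseline_m / disparity

# 700 px focal length, 6 cm between the cameras, 35 px disparity: 1.2 m away.
depth = depth_from_disparity(700.0, 0.06, 400.0, 365.0)
```

Note the inverse relationship: nearby objects produce large disparities and are measured accurately, while distant objects produce tiny disparities, which is why stereo depth degrades with range.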
- Monocular ranging (one camera)
How can you judge how near or far objects are with one eye closed? Mainly through life experience and some eye-muscle memory. In a program, this "experience" corresponds to a trained algorithmic model. Apple has invested heavily in monocular visual ranging: many older devices have only one camera, and Apple uses image algorithms to recognize how far away objects are. This is the most cost-effective solution, and it fits most existing devices without extra hardware.
- Depth camera
Depth point clouds are obtained from the reflection of lidar or structured light (RGB-D) off object surfaces, or by illuminating the object with infrared light and measuring the time of flight (ToF). Most robot vacuums use depth cameras to probe the terrain. The iPhone 12 Pro series adds a LiDAR sensor below the camera, which uses lasers to judge distance and depth.
The robot uses depth cameras to probe the terrain
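The ToF idea fits in two lines: halve the measured round-trip time of the light pulse and multiply by the speed of light. With illustrative numbers:

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def tof_distance(round_trip_ns):
    """Time of flight: the light pulse travels to the object and back,
    so the one-way distance is (speed of light x round-trip time) / 2."""
    return SPEED_OF_LIGHT * round_trip_ns * 1e-9 / 2.0

# A 10 ns round trip corresponds to roughly 1.5 m.
distance = tof_distance(10.0)
```

The numbers also show the engineering challenge: resolving centimeters requires timing the pulse to within tens of picoseconds, which is why ToF needs dedicated sensor hardware.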
SLAM technology has been widely applied across industries beyond the AR field, such as autonomous driving, 3D modeling, robotics, and drones.
2.2 Display technology (HMD, light-field imaging, holographic projection)
Another key AR technology is display, which falls into three categories:
- Binocular parallax display
The principle is the same as 3D movies: the two eyes are shown different images, creating binocular parallax. Two separate displays or light projectors show the left and right eyes images of the same object from slightly different perspectives, and the brain fuses the two into a "three-dimensional perception."
- Holographic projection technology
Holographic projection involves projecting light onto air or a special medium, such as glass or holographic film, to create a 3D image. People can view the image from any angle and get exactly the same visual effect as in the real world. Many stage shows now use holographic projection, such as the Chinese Spring Festival Gala and Hatsune Miku concerts.
Hatsune Miku Concert
- Light field imaging technology
Rather than exploiting binocular parallax, the device projects the light field of a 3D object directly into the human eye, simulating the most realistic reflection of light found in nature. The concept comes from Magic Leap's 2015 promotional video, which turned out to be post-produced. Magic Leap has been valued at $4.5 billion by numerous investors, including Google, Alibaba, and Qualcomm. As of December 2021, Magic Leap has still not produced any light-field imaging device, so it is widely questioned and criticized in the industry.
Magic Leap’s naked eye AR concept video
Display technology shapes AR's most direct experience. None of the AR devices currently on the market achieves a "natural-level" experience. Once light-field imaging technology matures, the vergence distance and focal distance of AR objects will coincide, effectively alleviating the "motion sickness" problem.
2.3 Interaction technology
So far, besides traditional touch, button, and voice input, AR has roughly three virtual interaction technologies:
- Motion capture: mainly hand-motion capture. HoloLens 2 currently has the richest technology in this area: without any peripherals, the device can be controlled smoothly through gestures alone.
Hololens 2 gesture tracking
- Eye tracking: a camera captures images of the eyes or face, and algorithms detect, locate, and track them to estimate changes in the user's line of sight. Several mature eye trackers are already on the market; with them it is more convenient to play 3D games, for example moving the camera view by rotating the eyes instead of the mouse. Eye tracking has many applications, from lie detection to attention tests; some distance-education companies use eye-tracking devices to determine whether students are paying attention to the teacher.
Eye tracking – Locates a person’s pupil
- Brain-computer interfaces: devices operated by recognizing the electrical signals of brain activity. Musk's Neuralink detects and transmits brain signals by implanting neural threads one-tenth the width of a human hair into the brain. This implanted scheme detects and processes signals more sensitively and accurately than non-implanted schemes. Neuralink has shown a video of a monkey playing a video game through a brain-computer interface, linked below.
YouTube video: a monkey plays a game through a brain-computer interface
All of these interaction technologies look very exciting, and not just for AR but for VR as well. I believe the next decade will bring something far more dramatic than the adoption of the mobile Internet (Facebook officially changed its name to Meta on October 28, 2021).
III. Introduction to Development Technology
3.1 Technology selection
Head-mounted devices are not yet widespread, and each manufacturer runs its own closed ecosystem. Microsoft's HoloLens 2 is the strongest head-mounted AR device at present; development is done with Unity plus the MRTK plug-in. HoloLens is currently used mainly in enterprise-level scenarios such as the military and remote assistance. Other AR headsets can be set aside for now while we wait for the big players (Apple, Samsung, etc.) to enter.
Mobile AR app developers are advised to stay close to Google and Apple: following section 1.2, learn ARCore and ARKit respectively. In theory, both SDKs will also be used by future AR headsets, so effort invested now will not be wasted.
Cross-platform developers are advised to use Unity's officially packaged ARFoundation SDK, which abstracts away the platform differences.
3.2 Common features of AR SDKs
The following table, adapted from Unity's official documentation, summarizes most features of an AR SDK well and gives developers a first glimpse of what AR can do.
Feature | Description |
---|---|
Device tracking | Track the position and orientation of the device in physical space. |
Ray casting | Usually used to determine where virtual content should appear: a virtual ray is projected to determine the position of virtual objects relative to real ones. |
Plane detection | Detect the size and position of horizontal and vertical surfaces, such as coffee tables and walls. These surfaces are called "planes". |
Anchors (feature points) | Track the positions of planes and feature points over time. |
Point cloud detection | A key mechanism in SLAM; see section 2.1 above. |
Gestures | Recognize human hand gestures as input events. |
Face tracking | Detect faces, for example to attach an AR effect to a face when it appears. |
2D image tracking | Detect specific 2D images in the environment, for example recognizing a particular QR code to trigger an AR effect. |
3D object tracking | The same, but for specific 3D objects. |
Environment probes | Detect lighting and color information in specific areas of the environment, which helps AR content blend in better, for example by casting shadows. |
Meshing | Generate a triangle mesh corresponding to the physical space, enabling richer interaction with, and visual overlays on, the physical environment. |
2D and 3D body tracking | Say no more; experience it yourself. |
Occlusion | Apply the distances of physical-world objects to rendered 3D content, so physical and virtual objects mix realistically. |
Multi-device tracking | Share and display the positions of other AR devices in the same AR session (game). |
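Ray casting, the second row of the table, is at its core a ray-plane intersection test: shoot a ray out from the camera and find where it meets a detected plane. A minimal geometric sketch (not any SDK's actual API):

```python
def ray_hit_plane(ray_origin, ray_dir, plane_point, plane_normal):
    """Return where a ray meets an infinite plane, or None if the ray is
    parallel to the plane or the plane lies behind the ray's origin."""
    denom = sum(d * n for d, n in zip(ray_dir, plane_normal))
    if abs(denom) < 1e-9:
        return None  # ray parallel to the plane
    diff = [p - o for p, o in zip(plane_point, ray_origin)]
    t = sum(f * n for f, n in zip(diff, plane_normal)) / denom
    if t < 0:
        return None  # plane is behind the camera
    return tuple(o + t * d for o, d in zip(ray_origin, ray_dir))

# Camera 1.5 m above a floor plane (y = 0), looking forward and down at 45 degrees:
hit = ray_hit_plane((0, 1.5, 0), (0, -1, 1), (0, 0, 0), (0, 1, 0))
# hit == (0.0, 0.0, 1.5): the virtual object goes 1.5 m in front of the user.
```

Real SDKs additionally clip the hit against the detected plane's boundary and can cast against feature points or meshes, but the geometry is the same.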
3.3 Hands-on: ARFoundation SDK (Mac + iOS)
I am an Android developer, but I don't have an Android device that supports AR, so I chose iOS to try out AR. First, download the necessary tools:
- Unity: you need to register a Unity ID, similar to an Apple ID. For learning purposes, the free Unity Personal edition is enough (Unity Hub is recommended for downloading and managing Unity versions).
- Xcode: Unity's build process produces an Xcode project; you configure parameters in Xcode to compile the final app.
- Visual Studio Code and its plug-ins: Unity has no built-in code editor; clicking a script opens it in the editor the user has configured.
Start a new project:
- Create a new project through Unity Hub; there is a default AR project template.
- After the project is created, you can see in the upper-left corner that a Scene has been created by default, containing an AR Session Origin and an AR Session.
- Switch the target build platform to iOS via Unity's File -> Build Settings.
- Click Player Settings… and make sure the AR packages (such as the Apple ARKit plug-in) are enabled.
- Reading the demo code (C#), you can see that the demo detects planes, and the user taps a plane to place an object.
- Create a Cube prefab in Unity and drag it onto the corresponding member variable of the AnchorCreator.cs script (don't panic, that's just how Unity works).
- Finally, build and run: click File -> Build And Run in Unity. An Xcode project is generated by default and Xcode opens automatically. Sign in with your Apple ID in Xcode, configure the signing and other settings, and debug on a real device.
- The result is as follows: first, the carpet is detected as a plane and covered with white dots; tap the carpet to create a Cube on the ground, and tap again to create another.
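The demo's tap-to-place flow boils down to a few lines of logic: each tap casts a ray from the camera, and if it hits a detected plane, an anchor is stored and a cube is spawned there. A toy Python model of that logic (the class and method names are made up for illustration and are not the actual ARFoundation API):

```python
# Toy model of the demo's tap-to-place logic; not the real ARFoundation API.

class ARSessionToy:
    def __init__(self, floor_height=0.0):
        self.floor_y = floor_height  # one detected horizontal plane (the carpet)
        self.anchors = []            # world positions where cubes were placed

    def tap(self, cam_pos, ray_dir):
        """Cast a ray from the camera; on a floor hit, store an anchor there."""
        if ray_dir[1] >= 0:
            return None  # ray points up or sideways and never reaches the floor
        t = (self.floor_y - cam_pos[1]) / ray_dir[1]
        hit = tuple(p + t * d for p, d in zip(cam_pos, ray_dir))
        self.anchors.append(hit)  # in Unity, this is where the Cube prefab spawns
        return hit

session = ARSessionToy()
session.tap((0.0, 1.4, 0.0), (0.0, -1.0, 1.0))   # first tap places a cube
session.tap((0.0, 1.4, 0.0), (0.0, -0.5, 1.0))   # second tap places another
```

In the real demo, the C# AnchorCreator script does the same thing with ARRaycastManager hits and AR anchors, so placed cubes stay fixed in the world as the device moves.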
Limited by space and energy (laziness?), the demo will not be expanded further here. Readers can try the following themselves:
- Replace the dung-yellow Cube with a prettier AR object, such as a small dinosaur, a tank, or a puppy.
- Add simple interaction events to AR objects, for example making one follow your device around (walking the dog?).
- Make objects respond to taps, for example clicking a tank makes it fire bullets.
If readers leave requests in the comment section, the author will write a detailed follow-up article on how to achieve more AR effects.
Conclusion
At present, AR technology has achieved considerable maturity. I am looking forward to the AR glasses Apple and Samsung are rumored to launch in 2022, which may officially open the era of AR for everything. If you missed the mobile-Internet wave, you might as well grab a share of the AR or VR one. Will some people worry that entering now makes them cannon fodder? I don't think so. Developers and interaction designers can first gain experience in mobile AR/VR. AR/VR interaction modes and technologies differ significantly from mobile ones, and that accumulated experience and technology will transfer to AR glasses, which can, after all, be regarded as a phone worn on the face, right?
Both Apple's and Google's mobile AR technologies have now reached a plateau, and AR glasses will likely continue to use the existing SDK technology, so this is a good time for developers to enter the market.
So, as ordinary developers, what technologies should we stock up on?
- Unity tech stack (C#): Unity is a must-learn whether or not you do game development, and is perfect for building 3D content.
- ARKit SDK tech stack (plus a little of Apple's ecosystem): the biggest tech stack in the consumer market; hard to make money without it.
- HoloLens tech stack (C#): optional. It is the most powerful AR wearable currently on the market; you can't rule out a sudden boom, and the track is very new.
- ARCore SDK (Java + Kotlin): optional, for the same reasons you would learn Android: a huge device base, a low barrier to entry, and wide application fields. It may be the stack with the most potential; perhaps in the future you will even develop AR programs for robots.
I currently prefer the Unity + ARKit ecosystem (lack of money), so I will continue working in this area. If you would like more, please leave a comment.
Reference documentation
Web and virtual reality (WebAR): kstack.corp.kuaishou.com/article/433…
Kuaishou AR: mp.weixin.qq.com/s/0LcJwlJor…
The difference between AR and MR: www.zhihu.com/question/39…
The history of Apple AR: zhuanlan.zhihu.com/p/420474476
An overview of AR development: zhuanlan.zhihu.com/p/87017830
ARToolKit introduction: www.cnblogs.com/polobymulbe…
Monocular depth estimation: www.zhihu.com/question/38…
Monocular visual ranging: zhuanlan.zhihu.com/p/56263560
Exploring Magic Leap's light-field technology: www.zhihu.com/question/42…
Eye-tracking technology: zhuanlan.zhihu.com/p/101479231
Unity Chinese community classroom: learn.u3d.cn/
Official Unity documentation: docs.unity3d.com/cn/current/…
Unity's official Bilibili channel: space.bilibili.com/386224375/c…
Hi, I'm Yuntai from Kuaishou E-commerce.
The Kuaishou e-commerce wireless technology team is hiring 🎉🎉🎉! We are the company's core business line, full of talent, opportunities, and challenges. As the business grows rapidly, the team is expanding quickly too. Welcome to join us and build world-class e-commerce products together.
Hot positions: Android/iOS senior developer, Android/iOS expert, Java architect, product manager (e-commerce background), test development… plenty of headcount waiting for you.
For internal referral, please send your resume to >>> our email: [email protected] <<<; mentioning my referral gives a higher success rate ~ 😘