Over the last few years, the growing ubiquity of 3D sensors such as radar, depth-sensing cameras, and LiDAR has created a need for scene understanding technology that can process the data these devices capture. Such technology enables machine learning systems that rely on these sensors, such as robots and self-driving cars, to navigate and operate in the real world, and it can also create an enhanced Augmented Reality (AR) experience on mobile devices.
Recently, Google launched TensorFlow 3D, which works with data from depth sensors and LiDAR to enable state-of-the-art AR experiences. It is one of Google’s most notable recent contributions to the world of AR. So, what does it offer? Let’s have a look:
3D Scene Understanding
Recently, the field of computer vision has started making progress in 3D scene understanding, including transparent object detection and mobile 3D object detection. However, entering the field has been challenging because few tools and resources can be applied to 3D data.
To make 3D scene understanding more accessible, Google has launched TensorFlow 3D (TF 3D), an efficient and highly modular library designed to bring 3D deep learning capabilities into TensorFlow.
TensorFlow 3D (TF 3D) offers a set of data processing tools, popular operations, loss functions, metrics, and models that enable the wider research community to create, train, and deploy advanced 3D scene understanding models. It contains training and evaluation pipelines for 3D semantic segmentation, 3D instance segmentation, and 3D object detection, along with support for distributed training.
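To make the idea of a segmentation metric concrete, here is a minimal sketch of mean intersection-over-union (IoU), a common score for per-point semantic segmentation. It is written in plain Python and does not use the TensorFlow 3D API; the function name `mean_iou` and the class names are assumptions made for illustration only.

```python
from collections import defaultdict


def mean_iou(true_labels, pred_labels):
    """Mean intersection-over-union over the classes seen in either list."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    intersection = defaultdict(int)
    union = defaultdict(int)
    for t, p in zip(true_labels, pred_labels):
        if t == p:
            # Point counted once in both intersection and union of its class.
            intersection[t] += 1
            union[t] += 1
        else:
            # Point belongs to the union of both the true and predicted class.
            union[t] += 1
            union[p] += 1
    if not union:
        raise ValueError("no labels given")
    return sum(intersection[c] / union[c] for c in union) / len(union)


# Toy example: six points labelled "floor" or "chair" (hypothetical classes).
truth = ["floor", "floor", "floor", "chair", "chair", "chair"]
guess = ["floor", "floor", "chair", "chair", "chair", "chair"]
score = mean_iou(truth, guess)
```

Here one floor point is mislabelled as a chair, so the floor class scores 2/3 and the chair class 3/4, and the mean is the average of the two.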
The 3D Semantic Segmentation Model
The 3D Semantic Segmentation model allows apps to differentiate between the foreground object or objects and the background of a scene, as with Zoom’s virtual backgrounds.
Google has implemented similar technology for virtual video backgrounds on YouTube. The 3D Instance Segmentation model, on the other hand, identifies a group of objects as individual objects, as with Snapchat lenses that can put virtual masks on more than one person in the camera view.
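The difference between the two kinds of segmentation can be sketched with a toy point cloud. This is an illustrative example in plain Python, not TensorFlow 3D code; the point layout, the labels, and the helper `group_by` are all assumptions. Semantic segmentation only says which class a point belongs to, while instance segmentation also says which individual object it belongs to.

```python
from collections import defaultdict

# Each toy point: (x, y, z, semantic_class, instance_id).
# instance_id is None for background points.
points = [
    (0.0, 0.0, 2.0, "person", 1),
    (0.1, 0.5, 2.0, "person", 1),
    (1.5, 0.0, 2.5, "person", 2),
    (1.6, 0.4, 2.5, "person", 2),
    (0.8, 0.2, 5.0, "background", None),
]


def group_by(points, key_index):
    """Group the xyz coordinates of points by the field at key_index."""
    groups = defaultdict(list)
    for point in points:
        groups[point[key_index]].append(point[:3])
    return dict(groups)


# Semantic view: both people end up in a single "person" group.
semantic_groups = group_by(points, 3)
# Instance view: each person gets a separate group.
instance_groups = group_by(points, 4)
```

In the semantic view a virtual background could replace everything outside the "person" group; in the instance view each person could receive a separate mask.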
The 3D Object Detection Model
The 3D Object Detection model takes instance segmentation a step further by also classifying the objects in view. These capabilities have already been demonstrated with standard smartphone cameras; depth data from LiDAR and other time-of-flight sensors opens up new possibilities for state-of-the-art AR experiences.
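As a rough illustration of what a 3D object detection result looks like, the sketch below attaches a class label and a 3D box to each group of points. It is a simplification under stated assumptions: the input data is made up, the helper `axis_aligned_box` is hypothetical, and real detectors typically predict oriented boxes directly rather than deriving them from labelled points.

```python
# Hypothetical labelled point groups standing in for a detector's view of a scene.
objects = {
    "car": [(0.0, 0.0, 0.0), (4.0, 1.8, 1.5), (2.0, 0.9, 0.7)],
    "pedestrian": [(6.0, 0.0, 0.0), (6.5, 0.5, 1.8)],
}


def axis_aligned_box(xyz_points):
    """Return (min_corner, max_corner) of the smallest axis-aligned box
    that contains all the given points."""
    xs, ys, zs = zip(*xyz_points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# Each detection pairs a class label with the box enclosing that object.
detections = [
    {"label": label, "box": axis_aligned_box(pts)}
    for label, pts in objects.items()
]
```

An AR app could use such boxes to anchor an overlay, for example information about a detected car, at the right position and scale.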
TensorFlow had already contributed to some nifty AR experiences before the 3D repository existed. Wannaby used it for its nail polish try-on tool, and it helped Capital One build a mobile feature that identifies cars and overlays information about them in Augmented Reality. Independent developers have also used TensorFlow, for example to turn a rolled-up sheet of paper into a lightsaber with InstaSaber.
Google has harnessed machine learning through TensorFlow for other Augmented Reality purposes as well. It is the technology behind the Augmented Faces API, which brings Snapchat-like filters to mobile devices, and behind MobileNets, which are used for image detection of the kind seen in Google Lens.
The release of the TensorFlow 3D codebase and models is the result of collaboration among Google researchers, with feedback and testing from product groups. It will be interesting to see what new advances TensorFlow 3D makes possible for the future of the digital world.
For more information, the latest news, and updates on Augmented Reality and machine learning, stay tuned!