TL;DR
- Niantic is building a new type of AI model that can understand and navigate the physical world.
- The company is training the AI on data gathered from its mobile apps, such as Pokémon Go and Scaniverse.
- Niantic suggests the model could support AR, robotics, content creation, and more.
AR mobile game maker Niantic is working on a new type of AI model meant to help computers better understand and navigate physical spaces. As with any AI model, it needs data to train on, and the company appears to be leaning on the copious amounts of data its players provide.
If you have even a passing interest in Pokémon, you might recognize Niantic as the company behind the popular AR game Pokémon Go. It has also created a number of other AR games and apps, such as its 3D scanning app Scaniverse. These games and apps scan the surrounding environment so their AR features can work.
In a blog post, first spotted by 404 Media, Niantic announced that it is developing what it calls a large geospatial model (LGM). Drawing a comparison to large language models (LLMs) such as Gemini and ChatGPT, which train on collections of text to generate written language, the company explains that its LGM trains on “billions of images of the world, all anchored to precise locations on the globe,” allowing computers to “perceive, comprehend, and navigate the physical world.”
As for what data the LGM trains on, Niantic reveals it is using the scans collected through its mobile games and Scaniverse:
Over the past five years, Niantic has focused on building our Visual Positioning System (VPS), which uses a single image from a phone to determine its position and orientation using a 3D map built from people scanning interesting locations in our games and Scaniverse.
If you have played Pokémon Go, you have likely experienced this VPS through the Pokémon Playgrounds feature, which lets a player place a Pokémon at a specific location. That placement persists, so other players can interact with the digital creature when they visit the same spot.
According to the company, it has trained over 50 million neural networks, each representing a specific location or viewing angle. These networks compress thousands of mapping images into a compact representation of a physical space, which can position a query image with “centimeter-level accuracy.” Combining the knowledge of many such networks would let the system map an area and understand locations even from unfamiliar angles.
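To make that idea concrete, here is a minimal sketch of how a query image might be routed to a handful of location-specific models, keeping the most confident pose estimate. Everything below (the class names, the confidence field, the five-model cutoff) is a hypothetical illustration of the general approach, not Niantic’s actual VPS code.

```python
# Conceptual sketch only: a fleet of small, location-specific models serving
# "image in, pose out" queries. All names and structures are hypothetical.
from dataclasses import dataclass


@dataclass
class Pose:
    lat: float         # degrees
    lon: float         # degrees
    heading_deg: float
    confidence: float  # 0.0 - 1.0


class LocalSceneModel:
    """A tiny model trained on scans of one specific place."""

    def __init__(self, lat: float, lon: float):
        self.lat = lat
        self.lon = lon

    def estimate_pose(self, query_image: bytes) -> Pose:
        # A real model would regress position/orientation from image features.
        # Here we just return the model's own anchor with a placeholder score.
        return Pose(self.lat, self.lon, heading_deg=0.0,
                    confidence=0.5 if query_image else 0.0)


def localize(query_image: bytes, coarse_gps: tuple[float, float],
             models: list[LocalSceneModel]) -> Pose:
    """Ask the local models nearest a coarse GPS fix and keep the most
    confident answer, standing in for "centimeter-level" visual positioning."""
    nearby = sorted(
        models,
        key=lambda m: (m.lat - coarse_gps[0]) ** 2 + (m.lon - coarse_gps[1]) ** 2,
    )[:5]  # consult only the handful of models covering this area
    estimates = [m.estimate_pose(query_image) for m in nearby]
    return max(estimates, key=lambda p: p.confidence)
```

An LGM, as Niantic describes it, would go a step further than this sketch: instead of each local model knowing only its own place, knowledge would be shared across places, so even a viewpoint no local model has seen could be inferred.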
The example the firm gives is standing behind a church that has only been scanned from one angle. An LGM would let the AI fill in the blanks, drawing on similar images of churches captured elsewhere:
Imagine yourself standing behind a church. Let us assume the closest local model has seen only the front entrance of that church, and thus, it will not be able to tell you where you are. The model has never seen the back of that building. But on a global scale, we have seen a lot of churches, thousands of them, all captured by their respective local models at other places worldwide. No church is the same, but many share common characteristics. An LGM is a way to access that distributed knowledge.
The scale of Niantic’s operation is pretty impressive, to say the least. It claims that it receives over a million new user-contributed scans of real-world places per week.
How do you feel about Niantic using your data to train its LGM? Let us know in the comments below.