Through the looking glass: The number of potential use cases for generative AI tools is growing fast. With its newly introduced model in the Genie line, Google now wants to provide "creatives" who have no worldbuilding skills with a straightforward and rapid way to prototype their ideas.
Genie 2 is a new foundation model capable of generating an "endless" variety of 3D environments that can be controlled by the user, Google said. The generative AI can spin up an entire virtual world from a single text prompt and a sample image, which can be either user-made or generated by external AI models.
Other companies are trying to turn generative AI tech into a worldbuilding wonder, though results may vary. The Oasis experiment was designed to generate a Minecraft-like experience frame by frame, but it just resembles a low-definition Minecraft world affected by dementia at this point. Genie 2 can maintain an apparently consistent world for up to a minute, Google assures, though we're still talking about a low-res, garbled, and very unpleasant-to-watch visual mess.
Anyhow, Google doesn't seem interested in visual repulsion or uncanny valley issues in the slightest. Mountain View said that games play a key role in AI research, providing an ideal environment to test new capabilities. The Genie 2 worlds can be controlled by a puny human using a traditional keyboard and mouse combo, with the generative model simulating all the consequences of the player's actions.
Unlike the recently unveiled SIMA, Genie 2 can provide "intelligent" visual reactions in an endlessly generated virtual environment. The AI model can generate different routes, or "counterfactual experiences," for training agents, all starting from the same initial frame. As the human player takes different actions, the surrounding world changes and hallucinates accordingly.
Genie 2 can also remember previously generated parts of the virtual world that are outside the player's camera, and even render them "accurately" when they come back into frame. The model can create different player perspectives, including first-person views, isometric views, and third-person driving views. Complex 3D structures and object interactions are also part of the mix.
Additional capabilities of the new foundation model include character animation, NPCs, physics, smoke, gravity, lighting, and reflections. Google said that Genie 2 and similar generative AI tech could be useful for prototyping and experimenting with interactive experiences, with gaming being the first potential application that comes to mind. The research is still in its early stages, which means that there is a lot of room for improvement during the next few model training sessions.