
Why Genie 3 suggests that AI “world models” are the path to photorealistic interactive VR

Published August 7, 2025 11 Min Read

Could AI “world models” be the most viable path to fully photorealistic, interactive VR?

Many depictions of “virtual reality” in science fiction involve putting on a headset or connecting a neural interface to enter an interactive virtual world that looks completely realistic. In contrast, high-end VR today may have relatively realistic graphics, but it is still clearly not real, and these virtual worlds cost hundreds of thousands or millions of dollars and take years of development to build. Worse, mainstream standalone VR has graphics that in the absolute best cases look like an early PS4 game, and on average look closer to a late PS2 game.

With each new Quest headset generation, Meta and Qualcomm have doubled GPU performance. That is impressive, but at this pace it will take decades to reach the performance of today’s PC graphics cards, never mind getting anywhere near photorealism. And much of the future gains will be spent on increasing resolution. Techniques like eye-tracked foveated rendering and neural upscaling help, but they can only go so far.
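
To make the “decades” estimate concrete, here is a rough back-of-the-envelope sketch. Only the doubling per generation comes from the paragraph above. The 50x performance gap and the three-year gap between headset generations are illustrative assumptions, not measured figures.

```python
import math


def years_to_close_gap(gap: float, gain_per_gen: float = 2.0,
                       years_per_gen: float = 3.0) -> float:
    """Years needed for performance to grow by a factor of `gap`,
    given one `gain_per_gen` step every `years_per_gen` years."""
    # Whole generations are needed, so round up.
    generations = math.ceil(math.log(gap) / math.log(gain_per_gen))
    return generations * years_per_gen


# ASSUMPTION: a desktop GPU is ~50x faster than a standalone headset GPU.
# ASSUMPTION: a new headset generation ships every ~3 years.
print(years_to_close_gap(50))   # 18.0 (6 doublings x 3 years)
print(years_to_close_gap(100))  # 21.0 (7 doublings x 3 years)
```

Under these assumed numbers the gap takes roughly two decades to close, and that is before any of the gains are spent on higher resolution.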

Gaussian splatting does allow for photorealistic graphics on standalone VR, but splats only represent a single moment in time, and they must be captured from the real world or pre-rendered as a 3D environment in the first place. Adding real-time interactivity requires a hybrid approach that incorporates traditional rendering.

But there may be a completely different path to photorealistic interactive virtual worlds. It is far stranger and has its own problems, but it is potentially far more promising.

https://www.youtube.com/watch?v=pdkhuknuqdg

Yesterday, Google DeepMind revealed Genie 3, an AI model that generates a real-time interactive video stream from a text prompt. It’s essentially a near-photorealistic video game, but each frame is entirely AI-generated, with no traditional rendering or image input involved.

Google calls Genie 3 a “world model,” but it could also be described as an interactive video model. The initial input is a text prompt, the real-time input is mouse and keyboard, and the output is a video stream.
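
The input and output just described can be pictured as a simple loop. The sketch below is purely illustrative: Genie 3 has no public API, so `ControlInput`, `InteractiveVideoModel`, `StubModel`, and `run_session` are all hypothetical names, and the stub just returns blank frames so the loop can run.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class ControlInput:
    """One tick of keyboard and mouse state (hypothetical format)."""
    forward: float = 0.0  # W/S axis, -1..1
    strafe: float = 0.0   # A/D axis, -1..1
    look_dx: float = 0.0  # mouse movement since the last tick
    look_dy: float = 0.0


class InteractiveVideoModel(Protocol):
    """Hypothetical interface: text prompt in, one frame out per input."""
    def start(self, prompt: str) -> None: ...
    def step(self, controls: ControlInput) -> bytes: ...


class StubModel:
    """Stand-in for a real model; returns black 720p RGB frames."""
    WIDTH, HEIGHT = 1280, 720

    def start(self, prompt: str) -> None:
        self.prompt = prompt
        self.frames_generated = 0

    def step(self, controls: ControlInput) -> bytes:
        self.frames_generated += 1
        return bytes(self.WIDTH * self.HEIGHT * 3)


def run_session(model: InteractiveVideoModel, prompt: str,
                inputs: list[ControlInput]) -> list[bytes]:
    """Start a world from a prompt, then produce one frame per input."""
    model.start(prompt)
    return [model.step(controls) for controls in inputs]


frames = run_session(StubModel(), "a misty forest trail",
                     [ControlInput(forward=1.0)] * 3)
print(len(frames))  # 3
```

The point of the sketch is that there is no scene graph, mesh, or renderer anywhere in the loop: the only state the caller ever sees is the prompt, the controls, and the resulting frames.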

As with many other generative AI systems, what’s striking about the Genie series is the incredible pace of progress.

The original Genie, revealed in early 2024, focused primarily on generating 2D side-scrollers at a resolution of 256×256, and only a couple of seconds of sample clips were shown, as the world could only run for a few dozen frames before glitching and smearing into an inconsistent mess.

Genie 1, from February 2024.

Then in December, Genie 2 stunned the AI industry by achieving a world model for 3D graphics, with first-person or third-person control via standard mouse-look and WASD or arrow keys. It output at 360p 15fps and could run for around 10 to 20 seconds before the world began to lose consistency.

Genie 2’s output was also blurry and low-detail, with the distinctly AI-generated look recognizable from the video generation models of a few years ago.

Genie 2, from December (left), and Genie 3 (right).

Genie 3 is a big step forward. It outputs very realistic graphics at 720p 24fps, with environments that stay fully consistent for a minute and “almost” consistent for several minutes.

If you’re still not sure what Genie 3 actually does, let me spell it out clearly: you enter a description of the virtual world you want, and within seconds it appears on screen, navigable with standard keyboard and mouse controls.

And these virtual worlds are not static. Doors open as you approach them, moving objects cast dynamic shadows, and you can even see physical interactions such as splashes and ripples in water as objects disturb it.

In this demo, the character’s boots appear to disturb the puddles on the ground.

Without a doubt, the most fascinating aspect of Genie 3 is that these behaviors are emergent: they arise from the underlying AI model as it developed during training, and are not pre-programmed. Human developers often spend months simulating just one aspect of physics, while Genie 3 simply has this knowledge baked in. That’s why Google calls it a “world model.”

More complex interactions can be achieved by specifying them in the prompt.

In one example clip, the prompt “POV action camera of a tan house being painted by a first-person agent with a paint roller” was entered, producing what is essentially a photorealistic wall-painting minigame.

Prompt: “POV action camera of a tan house being painted by a first-person agent with a paint roller”

Genie 3 also adds support for “promptable world events,” from changing the weather to adding new objects and characters.

These event prompts could come from the player, for example via voice input, or be pre-scheduled by the creator of the world.

This could one day allow for a nearly endless variety of new content and events in virtual worlds.

Genie 3’s “promptable world events” in action.

Of course, 720p 24fps is far below what modern gamers expect, and gameplay sessions last much longer than a minute or two. However, given the pace of progress, these basic technical limitations could disappear in the coming years.

When it comes to adapting a model like Genie 3 for VR, other, more general problems arise.

The model would need to take at least a 6DoF head pose as input, in addition to directional movement. Ideally it would also take hand and body poses, unless you only want to roam the world without directly interacting with objects.

While not impossible in theory, this would likely require a much broader training dataset and significant architectural changes to the model.
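
To give a sense of how much larger a VR input would be than keyboard and mouse, here is a minimal sketch of one possible per-frame input record. The layout is an assumption for illustration only: `Pose6DoF` and `VRInput` are hypothetical names, and nothing here reflects how Genie 3 or any real model encodes its inputs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pose6DoF:
    """Position (3 values) plus orientation as a quaternion (4 values)."""
    position: tuple[float, float, float] = (0.0, 0.0, 0.0)
    orientation: tuple[float, float, float, float] = (0.0, 0.0, 0.0, 1.0)

    def as_vector(self) -> list[float]:
        return [*self.position, *self.orientation]


@dataclass
class VRInput:
    """Hypothetical per-frame input: head pose, movement, optional hands."""
    head: Pose6DoF
    move: tuple[float, float] = (0.0, 0.0)  # thumbstick x/y
    left_hand: Optional[Pose6DoF] = None
    right_hand: Optional[Pose6DoF] = None

    def as_vector(self) -> list[float]:
        vec = self.head.as_vector() + list(self.move)
        for hand in (self.left_hand, self.right_hand):
            if hand is None:
                # Untracked hand: zero pose plus a 0.0 "tracked" flag.
                vec += [0.0] * 7 + [0.0]
            else:
                vec += hand.as_vector() + [1.0]
        return vec


head_only = VRInput(head=Pose6DoF(position=(0.0, 1.7, 0.0)))
print(len(head_only.as_vector()))  # 25
```

Even this simplified record carries 25 values per frame, and unlike key presses they change continuously, which is one way to see why a broader training dataset would be needed.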

It would also, of course, need to output stereoscopic imagery. However, the other eye could be synthesized, either with AI view synthesis or with a conventional technique such as YORO.

Latency is also a concern, though Google claims Genie 3 has 50ms end-to-end control latency. This should not be a problem if a future model runs at 90fps and is combined with VR reprojection.
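
For a rough sense of scale, the figures quoted in this article can be turned into frame counts. The sketch below simply assumes latency and frame rate are independent, which may not hold for a real system.

```python
def frame_time_ms(fps: float) -> float:
    """Duration of one frame in milliseconds."""
    return 1000.0 / fps


def latency_in_frames(latency_ms: float, fps: float) -> float:
    """How many frames are displayed during `latency_ms`."""
    return latency_ms / frame_time_ms(fps)


print(round(frame_time_ms(24), 1))          # 41.7 ms per frame today
print(round(frame_time_ms(90), 1))          # 11.1 ms per frame at a VR-class rate
print(round(latency_in_frames(50, 24), 2))  # 1.2 frames of control latency
print(round(latency_in_frames(50, 90), 2))  # 4.5 frames of control latency
```

At 90fps, a 50ms control latency would span about four and a half displayed frames. Reprojection addresses this for head movement by re-warping the most recent frame with the latest head pose just before display, so the view tracks the head even while the generated content lags behind.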

Google also acknowledges that Genie 3’s action space is limited, that it cannot model complex interactions between multiple independent agents, and that it cannot simulate real-world locations with full geographical accuracy. These are described as “ongoing research issues.”

However, there is another, much more fundamental problem with AI “world models” like Genie 3 that limits their scope, and it means traditional rendering won’t disappear anytime soon.

That problem is called steerability: how closely the output matches the details of your text prompt.

In recent years you may have seen impressive examples of highly realistic AI image generation, and in recent months AI video generation too (such as Google DeepMind’s Veo 3). But if you haven’t used these models yourself, you may not realize that while they follow your instructions in a general sense, they often fail to match the details you specified.

Furthermore, when the output contains something unwanted, adjusting the prompt to remove it often fails. As an example, I recently asked Veo 3 to generate a video of someone eating a hot dog with only ketchup and no mustard. But no matter how strongly I emphasized this detail, the model would not produce a hot dog without mustard.

In traditionally rendered video games, developers get exactly what they intend to see. The art direction and stylistic details create a unique atmosphere for the virtual world, often achieved through years of painstaking refinement.

In contrast, the output of an AI model comes from a latent space shaped by the patterns in its training data. A text prompt is closer to a high-dimensional coordinate than a truly understood command, so the result will not exactly match what the artist had in mind. This becomes even more difficult when promptable world events are involved.

Of course, the steerability of AI world models will also improve over time. However, it is a far more demanding challenge than increasing the resolution and memory horizon, and may never allow the precise control of a traditional game engine.

Prompt: “A classroom where, on the blackboard at the front of the room, it says ‘Genie-3 memory test’ and below it is a beautiful chalk picture of an apple, a coffee mug and a tree. The classroom is empty except for this.”

Still, even with the steerability problem, it would be foolish not to see the appeal of eventually having photorealistic interactive VR worlds that can be brought into existence simply by speaking or typing a description. AI world models appear uniquely positioned to deliver on the promise of Star Trek’s Holodeck, more so than AI that generates assets for traditional rendering.

To be clear, we are still in the early days of AI “world models.” There are some big challenges to solve, and it will probably be years before a VR-capable version arrives that can run for hours on your headset. But the pace of progress here is astonishing, and the possibilities whet the appetite. This is a field of research we will be following very closely.
