The Sora neural network can model digital virtual worlds: a demo using Minecraft

Published by
Катерина Даньшина

OpenAI's first video generator appears to have a strong chance of finding use in film and game production.

An article by the company's researchers, titled «Video generation models as world simulators», reveals key aspects of Sora's architecture. For example, the neural network can generate videos at any resolution and aspect ratio (up to 1080p) from a text prompt, and it can perform a number of image and video editing tasks, from creating looping videos and extending videos forward or backward in time to changing the background.

The most intriguing part, however, is the mention of «digital world modeling»: during the experiment, the researchers gave Sora prompts containing the word «Minecraft», and it rendered a convincingly game-like interface and dynamics while simultaneously controlling a character.

So how does Sora do it? Nvidia senior researcher Jim Fan (via TechCrunch) notes that this neural network is more like a «data-driven physics engine» than a creative tool. According to Fan, it is not just creating a single image or video, but determining the physics of every object in the environment and rendering the photo or video (or interactive 3D world, as the case may be) based on those calculations.

«These capabilities suggest that continued scaling of video models is a promising path towards the development of highly-capable simulators of the physical and digital world, and the objects, animals and people that live within them», the OpenAI researchers write.

Sora may pave the way for more realistic, perhaps even photorealistic, games created from text descriptions alone. This is both exciting and terrifying (given the problems with deepfakes), which is probably why OpenAI has so far released it with rather limited access.
