Sora — a new AI model from OpenAI that generates video from text

OpenAI’s latest model turns text prompts into “complex realistic scenes with multiple characters, specific types of movement, and precise object and background details” that last up to a minute.

The company also states that Sora can understand how objects “exist in the physical world” and can “accurately interpret props and generate convincing characters that express vivid emotions”. The model can also generate video from a still image, fill in missing frames in an existing video, or extend an existing video.

Demo videos generated by Sora and published on X include a camera flyover of a snowy Tokyo street, though a close look reveals signs that the footage is AI-generated (just look at the trees).

Introducing Sora, our text-to-video model.

Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. https://t.co/7j2JN27M3W

Prompt: “Beautiful, snowy… pic.twitter.com/ruTEWn87vf

— OpenAI (@OpenAI) February 15, 2024

A few years ago, text-to-image generators such as Midjourney drew much of the attention in the AI industry, but companies such as Runway and Pika have since begun improving the technology for video. Google’s Lumiere can currently be considered OpenAI’s main competitor in this area, although its videos are limited to 5 seconds.

Prompt: “A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. she wears a black leather jacket, a long red dress, and black boots, and carries a black purse. she wears sunglasses and red lipstick. she walks confidently and casually.… pic.twitter.com/cjIdgYFaWq

— OpenAI (@OpenAI) February 15, 2024

Currently, Sora is available only to “red teams” that evaluate the model for potential harms and risks. OpenAI is also giving access to some artists, designers, and filmmakers to gather feedback.

Earlier this month, OpenAI announced that it is adding watermarks to its text-to-image tool DALL-E 3, but noted that they can be “easily removed”.
