OpenAI Unveils New AI Model that Creates 60-Second Films with Just a Sentence!
From engaging in seamless conversations with humans and writing code to passing Google engineer interviews, OpenAI’s generative AI has showcased numerous capabilities. Now, they have added a new skill to their repertoire: video creation. The newly introduced AI model called “Sora” allows users to generate realistic videos lasting up to one minute with just a short sentence.
“Introducing Sora, our text-to-video model. Sora can generate videos up to one minute long while ensuring visual quality and adhering to user prompts,” OpenAI stated on their official website.
The generated videos are highly lifelike, showcasing OpenAI’s latest image generation technology.
AI-generated video is not a new concept: technology giants such as Google and Meta, as well as startups such as Pika Labs, founded less than a year ago, have all released AI video generation tools. However, Sora stands out for its exceptional realism.
According to Wired, no other AI video generation model matches Sora’s realism, and its videos are also longer than those produced by other models.
According to OpenAI’s website, Sora can generate complex scenes with multiple characters, specific action types, and intricate details. The AI not only understands various objects mentioned in the user prompts but also knows how these objects exist in the real world, creating a stunning sense of realism.
Furthermore, Sora demonstrates a deep understanding of language, accurately presenting the content mentioned in the prompts. It generates captivating characters and can create multiple different shots in a video while maintaining the style of the characters and visuals.
OpenAI has also revealed numerous demonstration videos on their website. For example, in a short film featuring a woman walking on the streets of Tokyo, the prompt is as follows:
“A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.”
This one-minute film has some flaws: the signage text and road layout do not hold up, and the pedestrians move too smoothly. Even so, it appears extremely realistic at first glance. If the focus is on the fashionable woman, one might not immediately notice that this is a video entirely generated by AI.
Sora is not limited to realistic modern footage. Given a prompt like “Historical footage of California during the gold rush,” it gives the video a vintage look that evokes the era. However, careful observation still reveals inconsistencies in details such as architectural layout.
OpenAI acknowledges that the current model has weaknesses in accurately simulating the physical principles in complex scenes and understanding causality. For example, if asked to generate a video of a person eating a cookie, the video might show a person taking a bite but the cookie remaining intact. Sora also has difficulty distinguishing left from right and accurately representing events that change over time.
OpenAI has not disclosed how long it takes to generate such a realistic video, saying only that it takes about “as long as going out to eat a Mexican burrito.”
Sora has additional capabilities that have not been publicly showcased, such as generating short films from still images, filling in missing frames in existing videos, and extending existing content. OpenAI researcher Bill Peebles stated, “It’s a really cool way to enhance storytelling ability. You can draw an idea and make it real.”
Currently, Sora cannot revolutionize the film industry as the generated content varies each time, making it impossible to string together 120 one-minute videos into a movie. However, for short video platforms like TikTok, it could be a disruptive new tool. Even ordinary people can utilize AI technology to generate high-quality short films.
The general public will have to wait a bit longer to use Sora! OpenAI is currently collaborating with various stakeholders to address security concerns.
However, what if this highly realistic video generation capability is used to create fake news? This is one of the reasons why OpenAI has not publicly released Sora. Currently, the model is available only to red-team members for adversarial testing and to a select few artists, designers, and filmmakers.
OpenAI emphasizes that it is developing tools to detect fake news and plans to embed C2PA metadata in Sora’s output, much as image files generated by DALL-E 3 carry metadata showing that they were created with DALL-E. Additionally, OpenAI says it will apply the DALL-E 3 usage policy, refusing to generate celebrity likenesses as well as violent, sexual, or hateful content.
OpenAI states that they are collaborating with governments, educators, and artists from various countries to understand concerns and promote positive usage. “Just as we cannot predict all the positive use cases, we also cannot foresee all the malicious ones,” they stated on their official website. “This is why we believe that learning from real-world use, building, and deploying safer AI systems is crucial.”
Sources:
OpenAI, Wired, The Verge