Generative video is currently being tested by the AI startup behind Stable Diffusion.

By thaw • Published 7 months ago • 3 min read

Stability AI, the developer of Stable Diffusion, has announced that generative art created with Stable Diffusion can now be animated. With a new product called Stable Video Diffusion, users can produce a video from a single image, according to the company's research preview. "This state-of-the-art generative AI video model represents a significant step in our journey toward creating models for everyone of every type," the company stated.

The release consists of two image-to-video models that generate 14 and 25 frames, respectively, at customizable frame rates of 3 to 30 frames per second and a resolution of 576 × 1024. With fine-tuning on multi-view datasets, the model can also perform multi-view synthesis from a single frame. Stability AI compared it with the text-to-video platforms Runway and Pika Labs. "At the time of release in their foundational form, through external evaluation, we have found these models surpass the leading closed models in user preference studies," the company stated.
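For a rough sense of what those numbers mean in practice, clip length is simply the frame count divided by the playback frame rate. The short Python sketch below works through that arithmetic. It assumes the two models emit 14 and 25 frames per clip and that the frame rate can be set anywhere from 3 to 30 frames per second; the function and constant names are illustrative and are not part of any Stability AI code.

```python
# Illustrative arithmetic only: these names are hypothetical helpers,
# not part of Stable Video Diffusion or any Stability AI library.

FRAME_COUNTS = (14, 25)   # assumed frames per clip for the two models
MIN_FPS, MAX_FPS = 3, 30  # assumed range of selectable frame rates


def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Return the playback length in seconds of num_frames shown at fps."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


def duration_range(num_frames: int) -> tuple[float, float]:
    """Return (shortest, longest) clip length across the assumed fps range."""
    shortest = clip_duration_seconds(num_frames, MAX_FPS)
    longest = clip_duration_seconds(num_frames, MIN_FPS)
    return shortest, longest


if __name__ == "__main__":
    for frames in FRAME_COUNTS:
        shortest, longest = duration_range(frames)
        print(
            f"{frames} frames: {shortest:.2f} s at {MAX_FPS} fps, "
            f"{longest:.2f} s at {MIN_FPS} fps"
        )
```

Under these assumptions, a lower frame rate stretches the same handful of frames over a longer clip without adding any new content, so smooth playback implies clips of only a second or so.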

Stable Video Diffusion is available only for research purposes at this point, not real-world or commercial applications. Potential users can sign up to get on a waitlist for access to an "upcoming web experience featuring a text-to-video interface," Stability AI wrote. The tool will showcase potential applications in sectors including advertising, education, entertainment and more.

The samples Stability AI has shown appear to be on par with competing generative systems in terms of quality. Nevertheless, the company noted that the tool has certain drawbacks: it produces comparatively short videos (less than 4 seconds), lacks photorealism, is only capable of slow pans for camera motion, lacks text control, produces unreadable text, and might not correctly create faces and people. The tool was trained on a dataset of millions of videos and then refined on a smaller set. As for where that data came from, Stability AI states only that it used publicly available video for research purposes. The source of the dataset is significant because Getty Images recently filed a lawsuit against Stability AI for allegedly scraping its image archives.

Video is one of the main objectives of generative AI because it could make content creation much easier. It is also the tool with the greatest potential for misuse, through copyright violations, deepfakes, and other issues. TechCrunch pointed out that Stability AI has burned through cash quickly and has had less success commercializing Stable Diffusion than OpenAI has had with ChatGPT. In addition, Ed Newton-Rex, vice president of audio at Stability AI, resigned last week over concerns about using copyrighted material to train generative AI models.

In a commitment to transparency and collaborative development, Stability AI has released the source code for Stable Video Diffusion on GitHub. Furthermore, the necessary weights for local deployment are available on the Hugging Face platform.

Stable Video Diffusion is highly customisable for various tasks, allowing users to configure the model to generate videos from a single image as a reference point. Beyond its immediate applications, it serves as the foundational platform for a family of derivative models, demonstrating Stability AI’s dedication to building a comprehensive ecosystem.

For all its capabilities, the current version of Stable Video Diffusion is not without flaws. It may produce videos with little or no motion or with only slow camera pans, it cannot render readable text or consistently create realistic faces, and it does not allow control through text input.

Stability AI has big plans for the platform. Users should soon be able to create videos from text descriptions through a web interface.

Most importantly, the project is just getting started. The current version of the model is not intended for building fully functional or commercial applications. Rather, it is a research project aimed at gathering useful feedback from users.
