Google DeepMind recently introduced Veo, a new generative AI model for video production. Positioned as a competitor to OpenAI’s Sora and Luma Labs’ Dream Machine, Veo can generate 1080p videos longer than one minute and is designed to translate text prompts accurately into visually appealing footage.
Key features of Veo:
- Advanced language understanding and visual semantics: Veo captures the nuance and tone of a text prompt and can render a variety of cinematic effects, such as time-lapses or aerial shots.
- Consistency across video frames: A standout feature is the visual consistency provided by latent diffusion transformers. This technology reduces unwanted flickering or morphing between frames, which improves the realism and coherence of the videos.
- Advanced creative control: Veo supports masked editing, so specific regions of a video can be changed precisely while the rest stays untouched. This is particularly useful for filmmakers and creative professionals.
- Responsible use: All generated videos are watermarked with SynthID so that AI-generated content can be identified, which helps minimize potential privacy and copyright risks.
Google invites filmmakers and creatives to test Veo and to contribute to its further development by providing feedback.
For more details and a hands-on demonstration, visit the official DeepMind Veo page.
Sources:
https://deepmind.google/technologies/veo/
https://www.ultralytics.com/de/blog/generating-video-with-google-deepmind-veo
https://techcrunch.com/2024/05/14/google-veo-a-serious-swing-at-ai-generated-video-debuts-at-google-io-2024/
https://ar5iv.org/html/2401.03048v1
https://ar5iv.org/abs/2401.03048
https://blog.google/technology/ai/google-generative-ai-veo-imagen-3
https://9to5google.com/2024/05/14/google-launches-veo-an-ai-video-generation-tool-alongside-imagen-3-upgrade/