HyperAI

Veo 3 Adds Image Input Feature, Generates Talking Videos

7 days ago

Google's Veo 3 video model has gained one of its most anticipated features: support for image inputs. Users can now generate a video from a single image. For example, you can upload a portrait photo and enter a script in the prompt field to produce a video of the subject speaking. Subjects can also be made to sing, read promotional lines, or tell jokes, all from a static image.

The update addresses a significant challenge in AI video creation: keeping characters consistent across multiple scenes. By generating multiple portraits of a character from a single trained image model and using each one as the input image, users can ensure that the same character appears throughout different video clips.

The image input feature is part of a series of updates rolled out in June 2025. When you open the Flow app on Google Labs, a message introduces the new capability: "You can now make your images talk with Veo 3." The first-frame-to-video function now supports speech, so you can upload a picture of your character and give them a voice. The audio feature is still in beta, however, so generated videos may occasionally have sound issues.

Google reminds users to make sure they have the necessary rights to any content they upload, and to avoid generating content that infringes on intellectual property or violates user agreements.

The update makes Veo 3 more flexible and extends Google's generative AI capabilities to a wider range of applications.
