Google launches Veo 3.1 video generation model, enhancing image-to-video conversion capabilities
Google has updated its AI video generation model, Veo, to version 3.1, claiming better prompt adherence and improved image-to-video conversion. Veo 3.1 is currently available for trial through Google's Gemini API and is integrated into Google's Flow video editing tool.

Veo 3.1 builds on Veo 3, which was unveiled at this year's Google I/O conference. According to Google, the new model follows prompts more closely and more readily creates videos from user-uploaded image "materials" combined with text prompts. Veo 3.1 can also generate audio while converting images into video, a capability Veo 3 did not have.

Enhanced Flow editor functionality: in the Flow video editor, Veo 3.1 supports a new "Scene-to-Video" feature that gives users more precise control over the generated video. Users upload a start frame and an end frame, and the AI generates the video content in between. Adobe's Firefly offers similar functionality, but Flow stands out by generating audio at the same time. The same audio generation also applies to the editor's video extension and object insertion functions.

As for the current state of the technology and its prospects, the samples Google has shared suggest that videos generated with Veo 3.1 still look slightly unrealistic, and that quality varies widely with the prompt and subject matter. Although the output may not yet be as realistic as that of OpenAI's Sora 2, Google is trying to make Veo more practical for professionals who actually work on video, rather than just a source of social media spam.

As AI video generation technology develops rapidly, competition among tech giants in this field is becoming increasingly fierce. With its continuous updates to the Veo model, Google is signaling its determination to remain competitive in the creative tools market.







