
Google launches Veo 3.1 video generation model, enhancing image-to-video conversion capabilities


Google announced an update to its AI video generation model, Veo, to version 3.1, citing improved prompt adherence and image-to-video conversion. Veo 3.1 is currently available for trial via Google's Gemini API and is integrated into Google's Flow video editing tool.

Veo 3.1's technical upgrades build upon Veo 3, unveiled at this year's Google I/O conference. According to Google, the new model follows prompts more faithfully and more easily creates videos from user-uploaded image "materials" combined with text prompts. Veo 3.1 also adds the ability to generate audio while converting images into video, a feature absent from Veo 3.

In the Flow video editor, Veo 3.1 supports a new "Scene-to-Video" feature, giving users more precise control over the generated video: users upload start and end frames, and the AI automatically generates the intermediate video content. While Adobe's Firefly offers similar functionality, Flow's distinguishing feature is its ability to generate audio simultaneously. This audio generation capability also applies to the editor's video extension and object insertion functions.

As for the current state of the technology, the samples Google shared show that videos generated with Veo 3.1 still have a slightly unrealistic feel, and results vary greatly depending on the prompts and subjects. Although it may not yet be as realistic as OpenAI's Sora 2, Google is trying to make Veo genuinely practical for video professionals, rather than just a source of social media spam. As AI video generation develops rapidly and competition among tech giants intensifies, Google's continuous updates to Veo demonstrate its determination to stay competitive in the creative tools market.

YouTube announces several new tools for creators, including AI video editing, voice narration, and automatic song generation


YouTube unveiled a series of new features at a creator event, most of them closely tied to AI. The goal is to make content creation more intuitive and faster, and to help creators improve video performance and audience engagement.

Among the new features is an AI-powered ability to automatically edit footage into "draft videos." It not only selects the best shots automatically but also adds transitions, music, and voice-over narration, letting creators quickly produce usable footage for further adjustment before uploading directly to YouTube. This feature is currently in testing and is expected to roll out gradually over the next few weeks.

Additionally, YouTube will launch an automatic voice-over narration feature, initially supporting English and Hindi. It is expected to be integrated into the YouTube Create app and the Shorts creation interface, launching later this year. More interestingly, YouTube will provide a tool that converts spoken dialogue in videos into catchy background music, making it easier for creators to produce content that fits the rhythm of Shorts and increasing the likelihood of shares and derivative works.

YouTube also announced the expansion of Veo 3's use in Shorts, allowing users to generate higher-quality short videos from simple text prompts and add sound effects or music directly. This update adds character animation and stylized effects while improving prompt-matching accuracy, making it easier for creators to produce personalized short videos.

Beyond AI creation features, YouTube Studio will introduce a "conversational AI assistant" to help creators quickly obtain analytics such as traffic sources and viewing data, aiding channel-management decisions. This feature is already rolling out gradually.

On another front, YouTube has added several features to enhance collaboration and interaction, including the ability to co-create videos with up to four other creators and have them featured on all participating channels, increasing reach and discovery. This feature will launch globally in the coming weeks. YouTube also announced the official launch of the long-awaited A/B testing feature, which lets creators set three different titles for the same video and observe which version delivers the best viewership.

Overall, this YouTube update clearly uses AI as its core driver, enabling content creation and data analysis to be completed more quickly through smart tools, reducing creators' workload and lowering the barrier to entry for video creation.

Google Veo 3, OpenAI Sora, Runway, Pika, Firefly: Which is the most powerful AI video generation platform currently?


Since tech giants like Google, OpenAI, and Adobe entered the AI video generation field, several strong products have emerged within just six months, including Veo 3, Sora, Firefly, Runway Gen-3, and Pika Labs, each focusing on different image styles and application scenarios. With Veo 3's official launch in Taiwan, this article compares these mainstream platforms to see which AI video generation tool best suits which user needs.

Platform comparison, rated on image quality, semantic understanding, dynamic coherence, ease of use, style diversity, and features:

Google Veo 3 — Image quality: ★★★★★ | Semantic understanding: ★★★★★ | Dynamic coherence: ★★★★☆ | Ease of use: ★★★★★ | Style diversity: ★★★★☆ | Features: native Gemini integration, supports Chinese prompts, natural and detailed lighting.
OpenAI Sora — Image quality: ★★★★★ | Semantic understanding: ★★★★★ | Dynamic coherence: ★★★★★ | Ease of use: ★★★★☆ | Style diversity: ★★★★★ | ...

Google Veo 3's video generation feature officially launched: Using AI to create dreamlike images


Following its debut at Google I/O 2025 and heavy promotion at subsequent events, Google's AI video generation tool Veo 3 is now officially available in Taiwan and other regions, allowing creators, video enthusiasts, and general users to create high-quality, cinematic dynamic images with simple commands.

From macro-shot bubble tea to dreamlike lighting, the boundless imagination enabled by AI is evident in the demonstration video shared by the Google team. Simply input a description like "bubble tea under a macro lens, with pearls floating and glowing, arranged in a circle to form 'VEO 3'; a light touch of the finger creates dreamlike ripples," and Veo 3 automatically generates delicate, artistic dynamic visuals, interpreting the user's creative intent with extremely high-quality lighting and detail.

Compared with the previously released Veo 2, Veo 3 generates more delicate, natural, and accurate video content, and also produces matching sound, enhancing the practicality of video generation and letting users express all kinds of imaginative ideas through AI. Taiwanese users can access the feature through the Gemini app starting today, though a Google AI Pro subscription is required.

To ensure content safety, Veo 3 has undergone extensive security testing and red-team verification, and strictly adheres to Google's AI safety policies to prevent the generation of inappropriate or harmful content. Furthermore, all videos created with Veo 3 carry a visible watermark and an invisible SynthID digital watermark indicating that the video is AI-generated, ensuring content transparency and creative responsibility.

Google stated that it will continue to collect user feedback through a "like/dislike" mechanism and keep adjusting Veo 3's generation quality and functions. It also plans to expand to more image styles and interactive effects, allowing every user to easily create their own visual narratives from text descriptions and pushing AI video creation toward a more widespread and diverse stage.


Google makes its Veo 2 video creation tool available to more people through its Gemini Advanced subscription plan.

Veo 2, the video generation tool announced late last year whose pricing was recently revealed, will be made available to more people through Google's Gemini Advanced subscription plan. However, compared with the version offered to businesses via Google Cloud, the Gemini Advanced version can only generate videos up to 8 seconds long, at 720p resolution, and only in a 16:9 aspect ratio; these limits are likely intended to prevent abuse and excessive load. Videos generated through the Gemini Advanced subscription can still be uploaded to sharing platforms like YouTube or downloaded as MP4 files for further editing. Google is also using Veo 2 in its experimental AI service Whisk, allowing users to generate additional image content from text and images, which Whisk can then further process into video content.

Google adds Lyria, a text-to-music model, to its Vertex AI platform to accelerate the creation of richer content.


In addition to continuously expanding its Vertex AI model lineup, Google announced that it will offer the Lyria text-to-music model in preview through the Vertex AI platform, making Vertex AI the only platform currently providing models for generating image, voice, video, and music content.

Lyria can quickly create high-fidelity audio with detailed sonic nuances and rich musical styles from text commands. It can help brands quickly create soundtracks for product marketing, launch events, or in-store immersive experiences, customizing details to fit the brand image. For creators, it can also reduce production time for videos, podcasts, and other content, producing music that fits the context in just a few minutes, without copyright infringement concerns.

Google has also updated the Veo 2 video generation tool launched at the end of last year, adding more editing functions and camera-angle controls, so creators can adjust presentation details more precisely and quickly generate the desired footage, even editing video details such as removing a character or changing the aspect ratio. Other updates include upgrades to the Imagen 3 text-to-image model, adding the ability to reconstruct missing or damaged details in images and improving image quality after object removal. Chirp 3, used for generating audio content and supporting over 35 languages, now allows real-time customization of voice content: users can input a 10-second audio sample to generate customized speech, integrate AI-generated speech into existing real speech, or transcribe dialogue into text with annotations distinguishing different speakers.

With this update, Google pitches Vertex AI as a one-stop platform where users can quickly generate still images from text commands, intuitively create video content and background music, and add custom narration, producing a marketing video in a short time. As with its previously released AI tools, Google emphasizes that features like Lyria will incorporate SynthID digital watermarking and security filtering mechanisms and adhere to data governance principles; Google will also assume responsibility for copyright disputes arising from users' use of its services, offering indemnification and other measures. Currently, advertising agency Goodby, Silverstein & Partners and the Dalí Museum have used Veo 2 and Imagen 3 to bring Salvador Dalí's 1937 screenplay "Giraffes on Horseback Salad" (also known as "The Surrealist Woman") to life. L'Oréal SA has also used Veo and Imagen to create marketing materials for its products, and Kraft Heinz...

Google explains how it used AI to bring the 1939 film "The Wizard of Oz" to life in its giant Sphere theater.


On the eve of the Google NEXT'25 conference, Google explained how it used Google Cloud services and artificial intelligence technology to bring the 1939 film "The Wizard of Oz," shot on film, to the Sphere, a giant dome theater in Las Vegas.

▲ The 1939 film "The Wizard of Oz" will be shown at the Sphere, a giant dome theater in Las Vegas.

Presenting a 1939 film, shot on film in a 4:3 aspect ratio at far lower resolution than today's standards, on the Sphere's 16K-resolution OLED screen, while also fitting its immersive curved design, is clearly not as simple as using artificial intelligence to upscale the image.

▲ How to present the 1939 film "The Wizard of Oz" in the most suitable format on the Sphere screen was a significant challenge.

The project combined the technical resources of Google Cloud and Google DeepMind, in collaboration with companies including Sphere Studios, California software company Magnopus, and Warner Bros. Discovery.

▲ The project combined the technological resources of Google Cloud and Google DeepMind, in collaboration with Sphere Studios, Magnopus, and Warner Bros. Discovery.

To maintain a good viewing experience, simply upscaling the image and enlarging it to fill the entire screen would feel visually oppressive to viewers. The better approach was to use artificial intelligence to generate additional content around the original frame, so that characters appear at a size consistent with natural viewing.

▲ Simply enlarging the image at its original proportions would make the characters visually oppressive, so the overall image display on the Sphere screen had to be readjusted.

▲ In addition, more audio-visual interactive effects were added to provide a more immersive experience.

To make the overall presentation more natural, Google also had to adjust details such as shooting angles and character positions in specific segments to ensure the best viewing experience on the giant Sphere screen. For example, when the Cowardly Lion first appears, the camera was originally focused on the Scarecrow...


Google announces pricing for its new video creation tool, Veo 2: 50 cents per second, or $30 per minute.

Google recently announced pricing for its new video generation tool, Veo 2, unveiled late last year: 50 cents per second, which works out to $30 per minute of video and $1,800 for an hour. Google emphasizes that Veo 2 is not primarily designed for long videos; its main purpose is to help creators fill gaps in video length or generate content more cost-effectively, increasing flexibility in video creation. Veo 2 mainly improves the realism of generated footage by capturing realistic physics, human movement, and subtle facial expressions. It also understands cinematic terminology, allowing users to request low-angle panning shots, generate footage as if shot on an 18mm lens, and even create shallow depth-of-field effects that blur the background and focus attention on the subject. Currently, Veo 2 can generate videos at up to 4K resolution and up to 2 minutes long, almost four times the maximum resolution and more than six times the maximum length of OpenAI's Sora. Google also says it reliably follows user instructions, generates videos grounded in real physical phenomena, and reduces hallucinations.
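The per-second rate translates directly into the per-minute and per-hour figures quoted above; a quick sketch of the arithmetic:

```python
# Sanity check of Veo 2's announced per-second pricing:
# $0.50/second implies $30 per minute and $1,800 per hour.
PRICE_PER_SECOND = 0.50  # USD, per Google's announcement


def veo2_cost(seconds: int) -> float:
    """Return the generation cost in USD for a clip of the given length."""
    return seconds * PRICE_PER_SECOND


print(veo2_cost(60))    # one minute -> 30.0
print(veo2_cost(3600))  # one hour   -> 1800.0
```

Note that with Veo 2's 2-minute cap, a single generated clip would cost at most $60; the $1,800 figure only applies if many clips are stitched together.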

Google announces new versions of its video generation tool, Veo 2, and image generation tool, Imagen 3, offering more possibilities for image creation.


Following the announcement of the video generation tool Veo in May, Google DeepMind recently unveiled a new version, Veo 2, along with a new version of the image generation tool Imagen 3 and a new wave of experimental projects.

Veo 2 mainly improves the realism of generated footage, making it more consistent with real-world physics, human movement, and subtle facial expressions. It also understands cinematic terminology: users can request low-angle panning shots, generate footage as if shot on an 18mm lens, and even create shallow depth-of-field effects that blur the background and focus attention on the subject. Currently, Veo 2 can generate videos at up to 4K resolution and up to 2 minutes long, almost four times the maximum resolution and more than six times the maximum length of OpenAI's Sora. It also reliably follows user input, generates videos grounded in real-world physics, and is claimed to reduce hallucinations.

The newly released Imagen 3 can generate better-composed and brighter images, producing realistic, impressionistic, abstract, or anime-style results on request, with greater detail and texture. Google will begin accepting sign-ups for Veo 2 access through Google Labs today, with plans to bring it to services such as YouTube Shorts next year. The new Imagen 3 has already been deployed in over 100 countries and regions and is available through Google Labs' image generation tool, ImageFX.

Google Labs also launched a new experimental tool called Whisk, which is claimed to generate more expressive images. It combines Imagen 3 with the new Gemini model, using computer-vision analysis to understand images and generate prompts, from which Imagen 3 produces entirely new images. It is currently available in the United States.
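The "four times the resolution, six times the length" comparison checks out in pixel count and duration, assuming Sora's publicized limits of 1080p and 20-second clips (figures not stated in this article):

```python
# Compare Veo 2's stated maximums against Sora's commonly cited limits.
# Sora's 1080p / 20-second figures are an assumption for illustration.
veo2_pixels = 3840 * 2160   # 4K frame
sora_pixels = 1920 * 1080   # 1080p frame

veo2_seconds = 120          # 2 minutes
sora_seconds = 20

resolution_ratio = veo2_pixels / sora_pixels   # 4x the pixels per frame
length_ratio = veo2_seconds / sora_seconds     # 6x the clip length

print(resolution_ratio, length_ratio)
```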

Google has launched a number of artificial intelligence model applications at once, further competing with OpenAI, which will launch several new features by the end of the year.


Perhaps to compete with emerging AI players like OpenAI, Google recently made its Imagen 3 image generation model available to all Vertex AI platform users, while Veo, which generates video content from text, is now available in private preview. Google DeepMind further launched Genie 2, an AI model that can generate 3D scenes from a single image and allow interactive control via mouse and keyboard, and touted GenCast, an AI model that can produce more accurate weather forecasts for the next 15 days.

Imagen 3 and Veo are clearly designed to compete with offerings from AI startups like OpenAI, generating static images or one-minute 1080p videos from a single text description and image. They can also incorporate cinematic camera work and visual effects, making the generated videos look more professional. Previously, Veo was tested mainly with select creators through the VideoFX app, with future integration into YouTube Shorts planned, while Imagen 3 was initially available through Google Labs. With this update, Veo becomes available in private preview through the Vertex AI platform, and Imagen 3 will be available to all Vertex AI users starting next week. Travel service provider Agoda has already begun using AI tools such as Veo, Gemini, and Imagen to simplify the production of promotional videos. Both Veo and Imagen 3 will use SynthID digital watermarking technology to mark generated content.

These moves position Google against OpenAI's recently launched Sora, a generative AI model that can produce up to one minute of realistic video from text and still images. Meanwhile, the newly launched GenCast can forecast weather over the next 15 days with higher claimed accuracy than existing models. It is based on the GraphCast weather prediction model DeepMind proposed last year, using generative techniques to increase accuracy, and Google emphasizes that the model will be open-sourced and available for use. As for OpenAI, CEO Sam Altman confirmed that starting December 5th it will release new features over 12 consecutive days, expected to include a new version of the Sora model.

