Google launches LLM to generate videos from text, audio input

Companies like OpenAI, Microsoft and Adobe have launched AI tools powered by generative models that turn a text input into an image. Google has also been in the fray, and it has now taken a step forward by releasing an LLM called VideoPoet that can turn text into videos.
To showcase VideoPoet’s capabilities, Google Research has produced a short movie composed of several short clips generated by the model.
How the VideoPoet model works
For the script of the short movie, Google explains, it asked Bard to write a series of prompts detailing a short story about a travelling raccoon. It then generated a video clip for each prompt and stitched all the resulting clips together to produce the final YouTube Short.
“VideoPoet is a simple modelling method that can convert any autoregressive language model or large language model (LLM) into a high-quality video generator,” Google said.
A pre-trained MAGVIT V2 video tokenizer and a SoundStream audio tokenizer transform images, video and audio clips of variable lengths into a sequence of discrete codes in a unified vocabulary.
These codes are compatible with text-based language models, which makes it easier to integrate them with other modalities such as text. The LLM learns across the video, image, audio and text modalities to predict the next video or audio token in the sequence.
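To make the idea of a unified vocabulary concrete, here is a minimal sketch in Python. It is not Google's implementation: the vocabulary sizes, the offset scheme and every function name below are assumptions chosen for illustration. It shows one common way to place codes from separate tokenizers into a single ID space, and how a mixed sequence becomes next-token prediction examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modality:
    name: str
    vocab_size: int  # hypothetical codebook size, not taken from the article


# Hypothetical sizes, chosen only to keep the arithmetic easy to follow.
MODALITIES = [
    Modality("text", 1000),
    Modality("video", 4096),
    Modality("audio", 1024),
]


def build_offsets(modalities):
    """Give each modality its own contiguous ID range in one shared vocabulary."""
    offsets = {}
    start = 0
    for m in modalities:
        offsets[m.name] = start
        start += m.vocab_size
    return offsets, start


OFFSETS, UNIFIED_VOCAB_SIZE = build_offsets(MODALITIES)
SIZES = {m.name: m.vocab_size for m in MODALITIES}


def to_unified(modality, local_ids):
    """Map tokenizer-local codes to IDs in the shared vocabulary."""
    base, size = OFFSETS[modality], SIZES[modality]
    for i in local_ids:
        if not 0 <= i < size:
            raise ValueError(f"code {i} is out of range for {modality}")
    return [base + i for i in local_ids]


def from_unified(unified_id):
    """Recover (modality, local code) from a shared-vocabulary ID."""
    for m in MODALITIES:
        base = OFFSETS[m.name]
        if base <= unified_id < base + m.vocab_size:
            return m.name, unified_id - base
    raise ValueError(f"ID {unified_id} is outside the unified vocabulary")


def next_token_pairs(sequence):
    """Turn one token sequence into (context, target) pairs for next-token prediction."""
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


if __name__ == "__main__":
    # A text prompt followed by video codes, all in one ID space.
    seq = to_unified("text", [1, 2]) + to_unified("video", [3])
    for context, target in next_token_pairs(seq):
        print(context, "->", target, from_unified(target))
```

Because every code lives in one ID space, a single autoregressive model can be trained on text, video and audio tokens with the same next-token objective.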
“A mixture of multimodal generative learning objectives are introduced into the LLM training framework, including text-to-video, text-to-image, image-to-video, video frame continuation, video inpainting and outpainting, video stylisation, and video-to-audio,” the company said, noting that the result is an AI-generated video.
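As a rough illustration of what a mixture of training objectives can look like in practice, the Python sketch below draws one task per training step from the list Google gives. The uniform weights, the seed and the function name are assumptions for illustration only; the article does not say how VideoPoet actually weights or schedules its tasks.

```python
import random
from collections import Counter

# Task names as listed in Google's statement.
TASKS = [
    "text-to-video",
    "text-to-image",
    "image-to-video",
    "video frame continuation",
    "video inpainting and outpainting",
    "video stylisation",
    "video-to-audio",
]

# Hypothetical uniform mixture weights; the real proportions are not given.
WEIGHTS = [1.0] * len(TASKS)


def sample_task_schedule(num_steps, seed=0):
    """Pick one training objective per step according to the mixture weights."""
    rng = random.Random(seed)  # fixed seed so the schedule is reproducible
    return rng.choices(TASKS, weights=WEIGHTS, k=num_steps)


if __name__ == "__main__":
    schedule = sample_task_schedule(1000)
    for task, count in Counter(schedule).most_common():
        print(f"{task}: {count}")
```

Each sampled task would then decide how a training example is laid out, for instance a text prompt followed by video tokens for text-to-video.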
In layman’s terms, VideoPoet integrates many video generation capabilities into a single LLM, rather than relying on separately trained components that each specialise in one task.
