Introduction to the Huggingface Text to Video Generation Pipeline and the PyPI Package text2video
In this blog, we give a brief introduction to the Huggingface Text to Video pipeline and to a wrapper API around it. Installing the pipeline requires many Python package dependencies, such as transformers, torch, and diffusers, so we provide an API wrapper around common text to video interfaces for users who are not AI or machine learning experts, and publish it as the PyPI package text2video (https://pypi.org/project/text2video/). The package is still in development, and we will keep updating this blog. The API wrapper is also open to contributions.
1. Illustration of a Huggingface Pipeline for Text to Video Generation Models
1.1 Huggingface Text to Video Pipeline
This Python script generates a short video with the default length of 16 frames (2 s at 8 fps). It uses the "damo-vilab/text-to-video-ms-1.7b" model.
    ## code for huggingface diffusion pipeline
    import torch
    from diffusers import DiffusionPipeline
    from diffusers.utils import export_to_video

    # Load the pretrained text to video model in half precision and move it to the GPU
    pipe = DiffusionPipeline.from_pretrained("damo-vilab/text-to-video-ms-1.7b", torch_dtype=torch.float16, variant="fp16")
    pipe = pipe.to("cuda")

    # Generate the frames for one prompt and write them to a video file
    prompt = "Spiderman is surfing"
    video_frames = pipe(prompt).frames[0]
    video_path = export_to_video(video_frames)
    print(video_path)
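The default clip length quoted above (16 frames, 2 s at 8 fps) is simply the frame count divided by the frame rate. The small helper below is a sketch for planning clip lengths; `clip_duration_seconds` is a hypothetical name introduced here, not part of diffusers or text2video. In recent diffusers releases the text to video pipeline call accepts a `num_frames` argument and `export_to_video` accepts `fps`, but treat both as assumptions to check against your installed version.

```python
def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Return the playback length in seconds of num_frames frames shown at fps."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


# Default setting described in the text: 16 frames at 8 fps
print(clip_duration_seconds(16, 8))  # 2.0

# A longer clip at the same frame rate
print(clip_duration_seconds(24, 8))  # 3.0
```

Longer clips need more GPU memory and generation time, so it is worth computing the target frame count before running the pipeline.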
1.2 text2video Package Wrapper API for the Pipeline
    ## code for text2video text to video wrapper
    import text2video as t2v

    # `pipe` is the DiffusionPipeline object created in section 1.1
    input_dict = {"text": "Text to Video"}
    res_dict = t2v.api(input_dict, model=pipe, api_name="hf_diffusion_pipeline")
    video_path = res_dict["video"]
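The call above follows a common wrapper pattern: one entry point takes an input dictionary, a model object, and an `api_name` string, and returns a result dictionary. The sketch below illustrates that pattern only. It is not the text2video implementation; `dispatch`, `register`, the `"echo_pipeline"` handler, and `fake_model` are all hypothetical names made up for this example.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping an api_name string to a handler function.
_HANDLERS: Dict[str, Callable[..., Dict[str, Any]]] = {}


def register(api_name: str):
    """Decorator that stores a handler under the given api_name."""
    def decorator(func):
        _HANDLERS[api_name] = func
        return func
    return decorator


@register("echo_pipeline")
def _echo_handler(input_dict, model=None, **kwargs):
    # Stand-in handler: passes the prompt to the model and
    # returns whatever the model produces under the "video" key.
    return {"video": model(input_dict["text"])}


def dispatch(input_dict, model=None, api_name="", **kwargs):
    """Look up the handler for api_name and call it with the inputs."""
    if api_name not in _HANDLERS:
        raise ValueError(f"unknown api_name: {api_name!r}")
    return _HANDLERS[api_name](input_dict, model=model, **kwargs)


# A fake model that only builds a file name, so the sketch runs without a GPU.
def fake_model(prompt: str) -> str:
    return "/tmp/" + prompt.replace(" ", "_") + ".mp4"


res_dict = dispatch({"text": "Text to Video"}, model=fake_model, api_name="echo_pipeline")
print(res_dict["video"])
```

Keeping the inputs and outputs as plain dictionaries lets callers switch backends by changing only the `api_name` string.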
2. API to Download Latest Text to Video Papers from arxiv.org
Let's start with an example that fetches the 3 latest papers matching the keywords "Text to Video" and prints them out.
    ## code for text2video latest research papers download
    import json
    import text2video as t2v

    input_dict = {"text": "Text to Video"}
    res = t2v.api(input_dict, model=None, api_name="ArxivPaperAPI", start=0, max_results=3)

    # The "text" field holds a JSON-encoded list of papers
    paper_list = json.loads(res["text"])
    print("###### Text to Video Recent Paper List:")
    for paper_json in paper_list:
        print("|" + paper_json["id"] + "|" + paper_json["title"].replace("\n", "") + "|" + paper_json["updated"])

    ###### Text to Video Recent Paper List:
    |http://arxiv.org/abs/2410.08211v1|LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts|2024-10-10T17:59:59Z
    |http://arxiv.org/abs/2410.08210v1|PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection|2024-10-10T17:59:56Z
    |http://arxiv.org/abs/2410.08209v1|Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision|2024-10-10T17:59:55Z
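For readers who want to see what such a lookup involves without the wrapper, here is a minimal sketch that builds a query URL for the public arXiv API and parses its Atom response with the standard library. It assumes that the wrapper's `start` and `max_results` arguments correspond to the arXiv API parameters of the same names; that mapping is a guess, not something confirmed by the package. `build_arxiv_url`, `parse_arxiv_feed`, and the sample feed (with a made-up placeholder entry) are hypothetical names and data introduced for illustration.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def build_arxiv_url(keywords: str, start: int = 0, max_results: int = 3) -> str:
    """Build an arXiv API query URL, most recently updated papers first."""
    params = {
        "search_query": f'all:"{keywords}"',
        "start": start,
        "max_results": max_results,
        "sortBy": "lastUpdatedDate",
        "sortOrder": "descending",
    }
    return ARXIV_API + "?" + urllib.parse.urlencode(params)


def parse_arxiv_feed(atom_xml: str) -> list:
    """Extract id, title, and updated fields from an Atom feed string."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        papers.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM_NS).strip(),
            "title": " ".join(title.split()),  # collapse line breaks in titles
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM_NS).strip(),
        })
    return papers


# Made-up placeholder feed so the parser can be tried offline.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <updated>2024-01-01T00:00:00Z</updated>
    <title>A Placeholder
  Title</title>
  </entry>
</feed>"""

print(build_arxiv_url("Text to Video", start=0, max_results=3))
for paper in parse_arxiv_feed(SAMPLE_FEED):
    print("|" + paper["id"] + "|" + paper["title"] + "|" + paper["updated"])

# To query the live API (network required), fetch the URL and parse the body:
# import urllib.request
# with urllib.request.urlopen(build_arxiv_url("Text to Video")) as resp:
#     papers = parse_arxiv_feed(resp.read().decode("utf-8"))
```

Sorting by last updated date returns the newest papers first, which is why results can be dominated by whatever was submitted most recently.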