Official implementation from Papers with Code · Matched via arXiv identifier search
- Stars
- 429
- Last push
- Aug 25, 2025 (289d ago)
Risk flags
- No CI pipeline detected
- No tagged releases
- No Docker setup
Haonan Qiu, Menghan Xia, Yong Zhang, Yingqing He, Xintao Wang, Ying Shan, Ziwei Liu
Core AI workload signals detected from paper context and implementation/artifact evidence.
With the availability of large-scale video datasets and the advances of diffusion models, text-driven video generation has achieved substantial progress. However, existing video generation models are typically trained on a limited number of frames, resulting in the inability to generate high-fidelity long videos during inference. Furthermore, these models only support single-text conditions, whereas real-life scenarios often require multi-text conditions as the video content changes over time. To tackle these challenges, this study explores the potential of extending the text-driven capability to generate longer videos conditioned on multiple texts. 1) We first analyze the impact of initial noise in video diffusion models. Then building upon the observation of noise, we propose FreeNoise, a tuning-free and time-efficient paradigm to enhance the generative capabilities of pretrained video diffusion models while preserving content consistency. Specifically, instead of initializing noises for all frames, we reschedule a sequence of noises for long-range correlation and perform temporal attention over them by window-based function. 2) Additionally, we design a novel motion injection method to support the generation of videos conditioned on multiple text prompts. Extensive experiments validate the superiority of our paradigm in extending the generative capabilities of video diffusion models. It is noteworthy that compared with the previous best-performing method which brought about 255% extra time cost, our method incurs only negligible time cost of approximately 17%. Generated video samples are available at our website: http://haonanqiu.com/projects/FreeNoise.html.
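The two ideas named in the abstract, rescheduling a short noise sequence to cover a long video and attending over frames in windows, can be illustrated roughly as follows. This is a minimal sketch, not the code from AILab-CVC/FreeNoise: the function names (make_base_noise, reschedule_noise, sliding_windows), the local-shuffle rule, and the window and stride values are all assumptions chosen for illustration, and each frame's noise is reduced to a short list of floats.

```python
import random


def make_base_noise(window, dim, seed=0):
    """Sample independent Gaussian noise for `window` frames (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(window)]


def reschedule_noise(base_noise, total_frames, stride, seed=0):
    """Extend per-frame noise to `total_frames` by reusing the base frames.

    Assumed rule for this sketch: keep the base frames first, then append
    repeats of the base sequence in which every chunk of `stride` frames is
    shuffled locally. Reusing the same noise frames is what gives distant
    frames correlated starting points.
    """
    window = len(base_noise)
    if window == 0:
        raise ValueError("base_noise must contain at least one frame")
    if stride <= 0 or window % stride != 0:
        raise ValueError("stride must be positive and divide the base window length")
    rng = random.Random(seed)
    frames = list(base_noise)
    while len(frames) < total_frames:
        for start in range(0, window, stride):
            chunk = base_noise[start:start + stride]  # slicing copies the list
            rng.shuffle(chunk)                        # reorder frames inside the chunk
            frames.extend(chunk)
    return frames[:total_frames]


def sliding_windows(total_frames, window, stride):
    """Return (start, end) index pairs for overlapping temporal-attention windows."""
    if total_frames <= window:
        return [(0, total_frames)]
    starts = list(range(0, total_frames - window + 1, stride))
    if starts[-1] + window < total_frames:
        starts.append(total_frames - window)  # make sure the tail is covered
    return [(s, s + window) for s in starts]


# Example: a 16-frame base window extended to 64 frames.
base = make_base_noise(window=16, dim=4, seed=1)
long_noise = reschedule_noise(base, total_frames=64, stride=4, seed=2)
windows = sliding_windows(total_frames=64, window=16, stride=4)
```

In this sketch every frame beyond the first 16 reuses one of the 16 base noise frames, and each attention window spans the same number of frames as the base sequence. How the outputs of overlapping windows are merged is left out here.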
No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.
AILab-CVC/FreeNoise is the strongest maintained implementation based on ranking signals. License is declared (Apache-2.0). Dependency/environment manifests are present.
Evidence graph: 3 refs, 3 links.
Utility signals: depth 55/100, grounding 75/100, status medium.
Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.
Official implementation from Papers with Code · Repository link is mentioned in the paper metadata
[ICLR 2024] Code for FreeNoise based on VideoCrafter
Preserved for provenance. Not recommended as the default path for new builds.
Dependencies pinned, manual setup needed
Quick start
git clone https://github.com/AILab-CVC/FreeNoise.git
cd FreeNoise
pip install -r requirements.txt
No benchmark numbers could be verified. You will not be able to validate reproduction correctness against published numbers.
[ICLR 2024] Code for FreeNoise based on AnimateDiff
No additional community repositories detected yet.
These repositories had low-confidence matching signals and are hidden by default.
No direct paper-linked artifacts were found. Showing strongest curated related artifacts for faster exploration.
No trustworthy model matches right now.
No trustworthy dataset matches right now.
Tasks
None detected
Methods
Transformer, Diffusion
Domains
Computer vision