How to Install and Use the Wan VACE AI Video Model in ComfyUI

Wan VACE represents the latest breakthrough in AI video generation, allowing users to create stunning videos from simple text prompts, transform images into dynamic content, or control video sequences using reference footage. This comprehensive guide will walk you through installing and using this powerful model in ComfyUI.

Getting Started with ComfyUI Updates

Before diving into Wan VACE, ensure your ComfyUI installation is current. Open the Manager and click "Update All." If the automatic update fails, locate your ComfyUI installation folder, open the update directory, and run the "update_comfyui.bat" script. After updating, restart ComfyUI and run "Update All" once more to refresh all nodes.

Essential Components for Wan VACE

The workflow requires minimal setup: you only need the GGUF node pack, which may already be installed. If it is missing, open the Manager, go to the Custom Nodes Manager, search for "GGUF," and install it. Remember to restart ComfyUI after installation.

Model Selection and Installation

Choose your model size based on your hardware capabilities and patience level. The Q4 quantization offers a balance between quality and speed, while Q8 provides higher quality at the cost of generation time. For users with powerful graphics cards, the full FP16 version delivers optimal results. Those with limited VRAM should consider the Q3 quantization.

Place the downloaded model in the diffusion_models folder within your ComfyUI installation. Additionally, download the fp8-scaled UMT5 text encoder (different from the Flux text encoders) and place it in the text_encoders folder. Finally, download the Wan VAE and place it in the vae folder.
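The placement described above can be summarized in a short sketch, assuming a default ComfyUI folder layout (adjust COMFY_ROOT to your installation; the filenames shown are typical examples, not the only valid ones):

```python
# Sketch of where each downloaded file belongs in a default ComfyUI layout.
# COMFY_ROOT and the filenames are illustrative assumptions.
from pathlib import Path

COMFY_ROOT = Path("ComfyUI")

destinations = {
    # GGUF diffusion model (example Q4 quantization filename)
    "wan2.1-vace-14b-Q4_K_M.gguf": COMFY_ROOT / "models" / "diffusion_models",
    # fp8-scaled UMT5 text encoder
    "umt5-xxl-fp8-scaled.safetensors": COMFY_ROOT / "models" / "text_encoders",
    # Wan VAE
    "wan_2.1_vae.safetensors": COMFY_ROOT / "models" / "vae",
}

for filename, folder in destinations.items():
    print(f"{filename} -> {folder}")
```

If a model does not appear in a node's dropdown after copying it, refresh or restart ComfyUI so the file lists are rescanned.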

Setting Up Your First Text-to-Video Workflow

The basic workflow includes several key components: positive and negative prompts, the WanVaceToVideo node, a KSampler, a TrimVideoLatent node for frame management, and video creation nodes. ComfyUI's built-in video nodes eliminate the need for additional custom installations.

Configure your dimensions carefully - avoid exceeding 1280 pixels width to prevent extremely long generation times. Use multiples of 32 for optimal results. The frame calculation follows this formula: desired seconds multiplied by frames per second (typically 16) plus one additional frame. For example, a 3-second video requires 49 frames (3 × 16 + 1).
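The frame formula and the multiple-of-32 rule can be expressed as two small helpers (function names are illustrative, not part of ComfyUI's API):

```python
# Helpers for the sizing rules above; names are illustrative only.

def frames_for(seconds: int, fps: int = 16) -> int:
    """Frame count = seconds * fps + 1 (the extra frame is the start frame)."""
    return seconds * fps + 1

def snap32(value: int) -> int:
    """Round a width or height down to the nearest multiple of 32."""
    return max(32, (value // 32) * 32)

print(frames_for(3))              # 3 s at 16 fps -> 49 frames
print(snap32(850))                # 850 px -> 832 px
```

So a 5-second clip at the default 16 fps would need frames_for(5) = 81 frames.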

Creating Better Prompts

Leverage AI assistants like ChatGPT to generate effective video prompts. Simply describe your vision and specify camera movements if needed. The AI will provide detailed prompts that you can directly paste into ComfyUI's positive prompt field.

Image-to-Video Transformation

Converting to an image-to-video workflow is straightforward. Add a Load Image node to the canvas, upload your reference image, and connect its output to the reference image input. Ensure your video dimensions match the uploaded image's aspect ratio for best results.
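Matching the aspect ratio while respecting the 1280-pixel cap and the multiple-of-32 rule can be done with a small helper (a sketch; the function name is an assumption, not a ComfyUI call):

```python
# Scale a reference image's dimensions to fit within max_w pixels wide,
# preserving aspect ratio and snapping both sides to multiples of 32.
# Illustrative helper, not part of ComfyUI's API.

def fit_dimensions(img_w: int, img_h: int, max_w: int = 1280) -> tuple[int, int]:
    scale = min(1.0, max_w / img_w)
    w = round(img_w * scale) // 32 * 32
    h = round(img_h * scale) // 32 * 32
    return max(32, w), max(32, h)

print(fit_dimensions(1920, 1080))  # a 1080p image -> (1280, 704)
print(fit_dimensions(832, 480))    # already small -> unchanged (832, 480)
```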

For optimal outcomes, focus your prompts on elements visible in the source image. Avoid describing movements of objects not present in the original image, as this can cause the AI to generate unwanted artifacts or glitches.

Advanced Video-to-Video Control

The video-to-video workflow requires the ControlNet Auxiliary Preprocessors node pack (comfyui_controlnet_aux). Install it through the Manager if it is not already available. This workflow uses a reference video to control motion, applying preprocessing such as Canny edge, depth, or pose detection.

Load your control video and add appropriate prompts describing both the subject and desired motion. The AI will follow the reference video's movement patterns while applying your specified styling and content changes.

Optimizing Generation Speed with LoRA

Speed up generation significantly using rgthree's Power Lora Loader node. Download the Wan VACE speed LoRA and place it in your loras folder. Set the strength to 0.25 for balanced results, though experimentation may yield better settings for your specific use case.

Adjust the generation parameters: use 4-6 steps instead of the standard 20, set CFG to 6, and select the Euler Ancestral sampler with the Beta scheduler. These optimizations can cut generation time by more than half.
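For reference, these settings correspond to a KSampler entry like the following in ComfyUI's exported API-format workflow JSON (a sketch: node IDs, seed, and the model/latent links are omitted here and will differ in your own export):

```python
# Fragment of an API-format workflow showing the KSampler settings above.
# Links to the model/conditioning/latent nodes are omitted for brevity.
import json

ksampler_node = {
    "class_type": "KSampler",
    "inputs": {
        "steps": 4,                       # 4-6 with the speed LoRA, vs. ~20 without
        "cfg": 6.0,
        "sampler_name": "euler_ancestral",
        "scheduler": "beta",
        "denoise": 1.0,
        "seed": 0,                        # vary the seed to re-roll artifacts
    },
}

print(json.dumps(ksampler_node, indent=2))
```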

Color Correction for LoRA Results

LoRA acceleration sometimes introduces color shifts and increased contrast. Counter this by installing the ComfyUI-Easy-Use node pack and adding a Color Match node after the VAE Decode node. Use your original input image as the reference to maintain color accuracy throughout the generated sequence.

Quality Comparison and Expectations

While Wan VACE offers impressive free video generation capabilities, paid services like Kling AI currently deliver superior results. However, Wan VACE continues evolving and provides excellent value for users seeking cost-effective AI video creation.

Best Practices for Success

Keep video lengths under 5 seconds for optimal results and reasonable generation times. Use appropriate hardware: even high-end graphics cards like the RTX 4090 require several minutes per generation. Experiment with different model sizes and settings to find your optimal balance between quality and speed.

Test various seeds if initial results contain artifacts or don't meet expectations. Small prompt adjustments can significantly impact final output quality.

Level up your team's AI usage—collaborate with Promptus. Be a creator at https://www.promptus.ai
