Wan 2.2 in Promptus with ComfyUI

A hands-on guide to Wan 2.2 in Promptus with ComfyUI—covering text, image, and video inputs, parameter cheatsheets, onboarding tips, and practical FAQs to help you create professional AI videos faster.

CosyFlows are curated ComfyUI workflows that hide the plumbing but keep the creative control. Under the hood, Wan 2.2 is a latent video diffusion system: it (1) works in a compressed latent space produced by a video VAE, (2) denoises that latent over a series of sampling steps with a spatio-temporal diffusion transformer guided by your prompt(s)/frames, then (3) decodes to video and optionally post-processes (debanding, slight sharpening, frame pacing).

Two ready-to-run flows:

(cosy) Wan 2.2 5B Video Generation – fast iteration, great detail at 720p (and draft 1080p). Use for ideation, social, reels, marketing snippets.
(cosy) Wan 2.2 14B First–Last Frame to Video – maximum fidelity at 1080p, optimized for keyframe-driven storytelling (smooth transitions between two designed frames).

Promptus hosts the compute, so you don’t install nodes or models; you focus on prompts, references, and a few high-impact knobs.

Three creative modes

1. Text → Video

5B, quick ideation; 14B via keyframe prompts if you provide frames

Input: A descriptive prompt (optionally negative prompt).
Mechanics: The model synthesizes a temporally consistent scene. Your words steer content, camera, mood, and motion.
Key controls to dial:

Duration (s): Shorter = crisper motion & fewer artifacts (e.g., 3–6s).
FPS: 24–30 for natural motion; higher = smoother but costlier.
CFG / Guidance: Higher = stick closer to prompt; too high may oversaturate or “lock” weird details. Start ~5–7.
Steps / Sampler: More steps = more detail/coherence (diminishing returns past a point).
Seed: Lock it to make comparable variations; change to explore.
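
To make the CFG knob concrete: classifier-free guidance typically blends two predictions at every sampling step, one conditioned on your prompt and one unconditioned (often driven by the negative prompt). The sketch below shows that standard blend on plain Python lists. The function name `apply_cfg` and the toy numbers are illustrative only, and how the Promptus flows wire this internally is an assumption.

```python
def apply_cfg(uncond, cond, scale):
    """Standard classifier-free guidance blend: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

# Toy predictions for three latent values (made-up numbers).
uncond = [0.0, 1.0, -1.0]
cond = [1.0, 1.0, 0.0]

no_push = apply_cfg(uncond, cond, 1.0)  # scale 1 returns the conditioned prediction
strong = apply_cfg(uncond, cond, 6.0)   # scale 6 extrapolates well past it
print(no_push, strong)
```

Higher scales push the result further beyond what the model predicted for the prompt alone, which is one way to see why very high values can oversaturate or exaggerate details.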

Prompt formula (works great):
[Subject] + [Action/Motion] + [Camera] + [Lighting] + [Style/Medium] + [Era/Lens] + [Color/Grade] + [Mood/Adjectives] + quality tags (e.g., film grain, high detail) + NEGATIVE: [unwanted stuff]

Example:
“golden retriever splashing through shallow lake, handheld medium shot, backlit sunset, cinematic color grade, gentle bokeh, natural film grain, warm tones — NEGATIVE: text overlays, watermark, motion blur, double faces”
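
If you assemble prompts programmatically (for example, to A/B one slot at a time), the formula above maps naturally onto a small helper. This is a minimal sketch: `build_prompt` and its parameter names are hypothetical, not part of Promptus or ComfyUI, and it simply joins the non-empty parts with commas.

```python
def build_prompt(subject, action, camera="", lighting="", style="", extras=(), negative=()):
    """Join the non-empty prompt parts in the order the formula suggests."""
    parts = [subject, action, camera, lighting, style, *extras]
    positive = ", ".join(p.strip() for p in parts if p and p.strip())
    negative_text = ", ".join(n.strip() for n in negative if n and n.strip())
    return positive, negative_text

positive, negative = build_prompt(
    subject="golden retriever",
    action="splashing through shallow lake",
    camera="handheld medium shot",
    lighting="backlit sunset",
    style="cinematic color grade",
    extras=["gentle bokeh", "natural film grain", "warm tones"],
    negative=["text overlays", "watermark", "motion blur", "double faces"],
)
print(positive)
print("NEGATIVE:", negative)
```

Keeping each slot separate makes it easy to change only the camera or only the lighting between runs while the seed stays locked.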

2. Image → Video

5B best; 14B if you treat the still as a “first frame”

Input: One reference image (style/subject).
Mechanics: The image anchors structure & style; diffusion invents plausible motion around it.
Key controls to dial:

Init Strength / Denoise Strength (often called “strength”):
- Lower (~0.3–0.45) = preserve more of your image (gentle parallax, breathing, small camera moves).
- Higher (~0.5–0.65) = allow new content/motion; risk drifting off-style.
Motion Presets (if available) or simple prompt verbs: “slow dolly-in”, “subtle breeze”, “light camera sway”.
Duration / FPS as above.

Tip: Add a motion description (“soft wind moving grass; camera dolly left 10%”) so the model adds believable dynamics rather than hallucinating large actions.
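
One common way image-to-image style pipelines implement strength is to skip the early part of the denoising schedule: the input is partially noised, and only about `steps × strength` of the steps actually run. Whether the Wan 2.2 flows in Promptus follow exactly this convention is an assumption; the sketch (with the hypothetical helper `effective_steps`) only illustrates why lower values preserve more of the source.

```python
def effective_steps(total_steps, strength):
    """Steps that actually run under the 'skip the early schedule' convention."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return round(total_steps * strength)

# With a 20-step schedule, compare the ranges mentioned above.
for s in (0.3, 0.45, 0.65):
    print(f"strength {s}: {effective_steps(20, s)} of 20 steps run")
```

Fewer executed steps means the model has less opportunity to move away from the input, which matches the "gentle parallax" behaviour at the low end of the range.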

3. Video → Video

5B for speed; 14B for upscale polish via two keyframes or short chunks

Input: A source clip (live-action, 3D render, or a plain draft).
Mechanics: The model stylizes or modifies the input while preserving core motion.
Key controls to dial:

Denoise Strength:
- 0.35–0.5 = keep structure/motion, add style (best for brand consistency).
- 0.5–0.65 = allow bigger redesigns (costs fidelity to original).
Style Prompt: Be explicit about medium (cel animation, oil paint, photoreal), grade, lens, era.
Negative Prompt: “no text, no extra logos, no heavy blur, no jitter”.
Frame Rate Match: Matching your source fps reduces jitter.

Pro move: Feed a clean, contrasty source with consistent exposure. Garbage in = flicker out.

14B First–Last Frame

This flow shines when you upload two art-directed frames (first & last) and describe the transition:

Frames: 1920×1080 PNG/JPG with consistent grade (white balance, contrast).
Transition Prompt: Describe what changes over time (lighting, weather, pose, camera path).
Frame Rate: 24fps is a great baseline for cinematic pacing.
Duration: 3–8s sequences tend to look most “premium” and coherent.

Example brief:
First frame: “Forest at dawn” → Last frame: “Same forest at dusk”
Prompt: “sun climbs from dawn, then sinks into a warm golden-hour dusk; slow crane-up, leaves rustle lightly; cinematic”
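
The "consistent grade" requirement for the two keyframes can be sanity-checked before upload by comparing their average colors. The sketch below does this on tiny hand-written pixel lists so it stays dependency-free; in practice you would load real pixels with an imaging library. The helper names and sample values are illustrative assumptions.

```python
def mean_rgb(pixels):
    """Average each channel over a list of (r, g, b) tuples."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def grade_mismatch(first, last):
    """Largest per-channel difference between the two frames' mean colors."""
    a, b = mean_rgb(first), mean_rgb(last)
    return max(abs(x - y) for x, y in zip(a, b))

# Tiny made-up frames: the second is the same image with a warm cast.
first = [(100, 100, 100), (200, 200, 200)]
last = [(120, 100, 90), (220, 200, 190)]
print(grade_mismatch(first, last))
```

A large value is not automatically a problem (a dawn-to-dusk transition should differ), but an unexpected one is a hint that white balance or contrast drifted between the two frames.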

High-impact parameter cheatsheet

Prompt / Negative Prompt → semantic steering & guardrails.
Seed → repeatability; lock to compare tweaks apples-to-apples.
CFG (Guidance Scale) → prompt adherence vs. freedom (start 5–7).
Steps → detail/coherence (start mid-range; go higher if frames look mushy).
Strength (init/denoise) → how much to deviate from input image/video.
Duration & FPS → total frames; affects motion smoothness and artifact risk.
Resolution → 720p for drafts; 1080p for finals (esp. 14B).
Motion/Camera Hints → dolly, pan, tilt, zoom, parallax—small numbers feel real.
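
Duration and FPS together determine how many frames the model has to keep coherent, which is why both affect cost and artifact risk. A minimal sketch of that arithmetic follows; `frame_count` is a hypothetical helper, and a given flow may round to the frame counts it supports.

```python
def frame_count(duration_s, fps):
    """Total frames the model must generate for a clip."""
    return round(duration_s * fps)

for duration_s, fps in [(3, 24), (6, 24), (8, 30)]:
    print(f"{duration_s}s @ {fps}fps -> {frame_count(duration_s, fps)} frames")
```

Going from a 3s clip at 24 fps to an 8s clip at 30 fps more than triples the number of frames, so shortening the clip is often the cheapest fix for flicker or drift.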

Proven recipes

Draft fast, finish premium:
5B @ 720p (short clips) → iterate prompts → lock seed → 14B @ 1080p with tight transition description (if using keyframes).

Cohesive brand style (video→video):
Keep denoise ~0.4–0.5, use a strong style prompt (“clean commercial look, soft key light, neutral backgrounds”), and add a negative prompt (“extra logos, vignette”).

Image→video parallax loop:
One hero still + prompt “subtle camera push-in, shallow depth of field, gentle hair movement”—strength ~0.4 to preserve identity.

Troubleshooting

Flicker/“texture swim”: Shorten duration; increase steps slightly; add “stable textures” to the prompt and “flicker” to the negative prompt; reduce strength.
Faces/hands drift: Tighten prompt (“single subject, clean face geometry”), reduce strength, raise steps; try a new seed.
Motion too wild: Lower strength; add explicit “slow” camera/action verbs; drop FPS or duration.
Blurry frames: Increase steps a bit; try another sampler; ensure 1080p on 14B for finals.
Color/exposure pops (first–last): Match grading between keyframes; describe lighting evolution clearly.

Quick start checklist

Pick Flow: 5B for drafts / 14B for keyframed finals.
Write the prompt: subject + action + camera + lighting + style + negative.
Set basics: 720p/1080p, duration, fps, steps ~mid, CFG ~5–7, lock seed.
(Image/Video inputs?) Set strength ~0.4–0.55 depending on how much change you want.
Generate → Review: If off-style, lower strength or add more style words; if off-prompt, raise CFG slightly.
Finalize: Re-run best take at 1080p (14B) with matched grades and explicit transition description.
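
The review rule in the checklist (off-style: lower strength; off-prompt: raise CFG slightly) can be written down as a small adjustment function. This is a sketch under stated assumptions: the step sizes (0.05 and 0.5), the clamping limits, and the settings keys are illustrative choices, not values Promptus prescribes.

```python
def adjust(settings, off_style=False, off_prompt=False):
    """Return a copy of settings nudged per the review rules in the checklist."""
    new = dict(settings)
    if off_style:
        # Lower strength to stay closer to the input, but not below 0.3.
        new["strength"] = round(max(0.3, new["strength"] - 0.05), 2)
    if off_prompt:
        # Raise CFG slightly, staying within the suggested 5-7 range.
        new["cfg"] = round(min(7.0, new["cfg"] + 0.5), 2)
    return new

base = {"resolution": "720p", "duration_s": 4, "fps": 24,
        "cfg": 5.5, "strength": 0.45, "seed": 1234}
print(adjust(base, off_style=True))
print(adjust(base, off_prompt=True))
```

The seed is left untouched on purpose, so each adjusted run stays comparable with the previous one.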

When to choose which

5B Text→Video: Ideation, social cuts, rapid A/B prompts, storyboarding.
5B Image→Video: Photographic parallax “alive stills,” gentle motion logos/packshots.
5B Video→Video: Consistent stylization of recorded footage or CG playblasts.
14B First–Last: Hero transitions, brand reveal sequences, cinematic micro-stories.

Wan 2.2 in Promptus ComfyUI – FAQ

What is Wan 2.2 in Promptus with ComfyUI?

Wan 2.2 is a latent video diffusion model, offered in Promptus as CosyFlows. CosyFlows package curated ComfyUI node graphs into a no-code cloud workflow so creators can generate videos from text, images, or video without installing nodes or model files. Everything runs on cloud GPUs with simple, ComfyUI-like controls.

How do I choose between Wan 2.2 5B and Wan 2.2 14B in Promptus for text-to-video and keyframe animation?

Pick 5B for fast drafts and social clips (great for text→video and image→video at 720p; some setups can draft 1080p). Pick 14B for final-quality 1080p and keyframe-driven sequences (first→last frame). A common workflow is: draft on 5B, finalize on 14B.

What are the best prompt strategies for Wan 2.2 text-to-video in Promptus (camera, lighting, style, negative prompts)?

Use a structured prompt: Subject + Action + Camera + Lighting + Style/Medium + Lens/Era + Color Grade + Mood; include Negative terms to block artifacts. Example: “coastal lighthouse at blue hour, slow dolly-in, soft fog, cinematic grade, 35mm look — NEGATIVE: text overlay, logos, heavy blur.”

How does image-to-video work with Wan 2.2 in ComfyUI, and what strength/denoise value should I use?

Upload a still to anchor composition and style; Wan 2.2 synthesizes motion around it. Start with strength ~0.3–0.45 to preserve the image (parallax, subtle push). Increase to ~0.5–0.65 for more creative change. Describe the desired motion in the prompt.

How do I run video-to-video style transfer with Wan 2.2 in Promptus without losing motion coherence or brand look?

Import a clean source clip, set denoise/strength ~0.4–0.55 to preserve structure, add a precise style prompt (medium, grade, lens), and a negative prompt to avoid artifacts. Match source fps for stability and favor shorter durations for cleaner results.

What settings deliver the cleanest results in Wan 2.2 (duration, FPS, guidance/CFG, steps, seed, 720p vs 1080p)?

Aim for 3–8s duration, 24–30 fps, CFG 5–7, and increase steps if frames look soft. Lock the seed when iterating for consistent comparisons. Draft at 720p on 5B; finalize at 1080p on 14B.

How do I use Wan 2.2 14B First–Last Frame to create cinematic 1080p transitions in Promptus?

Upload two graded frames (e.g., 1920×1080), describe the transition over time (lighting, weather, camera path), choose ~24 fps and a concise 3–8s duration, then render. Matching color and contrast between frames is essential.

What are common artifacts in Wan 2.2 (flicker, blur, face drift) and how do I fix them in the Promptus workflow panel?

For flicker, shorten duration, add “flicker” to the negative prompt, raise steps, and lower strength. For blur, raise steps or try another sampler, and finalize at 1080p on 14B. For face/hand drift, reduce strength, add identity details, test a new seed, and keep shots shorter.

Can I iterate fast at 720p with Wan 2.2 5B and upscale or finalize at 1080p with 14B—what’s the ideal workflow?

Yes. Iterate ideas quickly with 5B at 720p (lock seed, refine prompt and strength), then re-run the best take with 14B at 1080p for a cinematic final.

Do I need a local GPU for Wan 2.2 in Promptus, and how does cloud rendering compare to running ComfyUI locally?

No local GPU is required. Promptus uses distributed cloud GPUs for immediate runs. Local ComfyUI offers full DIY control but requires VRAM and maintenance; CosyFlows remove that setup while keeping creative control.

How do I keep color and exposure consistent between first and last frames for 14B keyframe animation in Promptus?

Grade both frames before upload (same white balance, contrast, or LUT). Describe the lighting evolution in the prompt, and avoid mixing radically different grades unless that change is intentional.

What file formats, aspect ratios, and frame rates does Wan 2.2 in Promptus support for export-ready MP4 videos?

Exports are MP4 for easy sharing. Common presets include 1280×720 (16:9) and 1920×1080 (16:9); additional aspect ratios depend on the specific CosyFlow. Typical frame rates are 24–30 fps—match your platform’s recommendations.
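
When checking a preset against a platform's requirements, it helps to reduce the resolution to its aspect ratio. The sketch below does that with the standard library; the vertical 1080×1920 entry is a hypothetical example and not a confirmed Promptus preset.

```python
from math import gcd

def aspect_ratio(width, height):
    """Reduce a resolution to its simplest width:height ratio."""
    d = gcd(width, height)
    return f"{width // d}:{height // d}"

for w, h in [(1280, 720), (1920, 1080), (1080, 1920)]:
    print(f"{w}x{h} -> {aspect_ratio(w, h)}")
```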

Written by:

Duni

Duni is an Artificial Intelligence engineer at Promptus, specializing in AI workflow design. Duni builds and documents ComfyUI workflows that empower creators to push the boundaries of what’s possible with Promptus and ComfyUI.
