Multi Modal Input

Feed in text, voice narration, or short audio clips to guide both the visuals and the sound in your video. Promptus with ComfyUI lets you combine nodes to generate a unified result.

Rich Guidance

Use text, voice, or both to describe your creative vision and give nuanced direction for visuals and audio alike.
Try Promptus for free ➜

Hands‑Free Option

Create videos by speaking, without typing a word.

Which audio formats are supported? Can I combine text and audio?

Multi Modal AI Generation

Multi Modal Input adapts to your workflow, letting you speak or write as you prefer. It delivers richer, more precise video outputs by combining inputs.

Start using Promptus
Just create your next AI workflow with Promptus