What we make

Versegen.AI assembles music videos automatically. Drop in a song and a folder of video clips; Versegen analyses both locally, generates ten beat-synced cut variants, and lets you pick the one that lands hardest. Export to FCPXML, Premiere XML, DaVinci XML, or universal EDL and finish in your existing non-linear editor (NLE).

The thesis: generation is cheap, evaluation is what humans are good at, and microsurgery on a timeline is what existing NLEs are best at. Each layer should do what it's best at: Versegen owns the cut order, you own the pick, and your NLE owns the finish.

Why local-first

Unreleased music and unreleased video both deserve tighter custody than "trust the SaaS bucket." Versegen runs every AI model on your own machine, so no byte of original material ever leaves it. The upside compounds: no per-render bill, no rate limits, no content policy, no risk of the service being sunset next quarter.

Who builds it

Versegen is built by Versefactory.AI, Inc. in Tokyo, Japan. The company was founded in 2026 by Makoto Yamada. It is an indie shop with no VC funding, and the local-first stance is a deliberate consequence of that.

Open source

The source code is available on GitHub under the MIT license. Bundled ML model weights retain their individual licenses (mostly MIT or Apache; the face-swap weights are research-only).

Contact

For press, partnerships, or anything that needs a real human: hello@versefactory.ai.