Video generation AI "Wan 2.5-Preview" released! New architecture brings stronger modal alignment across text, image, video, and audio, with support for 1080p, 10-second, 24 fps output

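On Wednesday, September 24, China's Alibaba Group released the video generation AI model "Wan 2.5-Preview". Its new architecture generates audio and video together and enables advanced image editing while keeping faces and styles consistent. It also supports 1080p, 10-second, 24 fps output and video generation from audio. Unlike the open-source Wan 2.2, it is available through the official Wan website and API. On the official website, the Free plan can also generate with Wan 2.5-Preview, but with a waiting time.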

One Prompt, Master Image Magic: Meet Wan 2.5-Preview!
Wan 2.5-Preview is now live with upgraded image generation capabilities!
Enhanced Aesthetic Quality
Realistic lighting, refined details, excels in diverse aesthetic styles and design expressions.
Stable Text Generation… pic.twitter.com/PkMRmhzwxr
— Wan (@Alibaba_Wan) September 26, 2025

Wan2.5: One Prompt, Perfect 'Vibe PSing'!
Wan 2.5-Preview is now live with image editing.
Instruction-based Image Editing. Supports a wide range of image-editing tasks and reliably follows instructions.
Visual Elements Consistency. Supports generation from single- or… pic.twitter.com/LyMEuDNv3Y
— Wan (@Alibaba_Wan) September 26, 2025

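The design philosophy of the architecture underlying "Wan 2.5-Preview" is "native multimodality and deep alignment". It flexibly supports text, image, video, and audio as both input and output and trains on them jointly, which yields stronger cross-modal alignment. As a result, audio and video can be synchronized (A/V sync), and prompt adherence is said to be greatly improved. The model also incorporates reinforcement learning from human feedback (RLHF) to continually align it with human preferences.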

Wan2.5: Let Sound Take the Director’s Chair!
Today, we’re excited to unveil another major feature in our powerful Wan 2.5 Preview: Native Audio-Driven Video Generation.
Now you can use audio input directly for both text-to-video and image-to-video generation. Combine audio… pic.twitter.com/9wQZq4zEEz
— Wan (@Alibaba_Wan) September 28, 2025

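The core of the video generation feature, which supports up to 10 seconds at 1080p and 24 fps, is "A/V sync and cinematic quality". It natively supports generating video with high-fidelity, consistent audio (multiple speakers' voices, sound effects, BGM, ASMR, music, and so on). Text-to-Video and Image-to-Video generation can also be driven directly by audio input.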
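To put the output specification in concrete terms, the following minimal sketch works out how many frames a maximum-length clip contains and how large it would be as uncompressed RGB. It assumes that "1080p" means a 1920×1080 frame and 3 bytes per pixel; the article states neither, and the function names are purely illustrative.

```python
# Assumption: "1080p" is taken as 1920x1080; the article does not give the exact resolution.
WIDTH, HEIGHT = 1920, 1080
FPS = 24          # frame rate stated in the article
DURATION_S = 10   # maximum clip length stated in the article


def frame_count(duration_s: int, fps: int) -> int:
    """Number of frames in a clip of the given length."""
    return duration_s * fps


def raw_rgb_bytes(width: int, height: int, frames: int, bytes_per_pixel: int = 3) -> int:
    """Uncompressed size in bytes, assuming 8-bit RGB (an assumption, not a Wan spec)."""
    return width * height * bytes_per_pixel * frames


frames = frame_count(DURATION_S, FPS)
size = raw_rgb_bytes(WIDTH, HEIGHT, frames)
print(frames)  # 240
print(size)    # 1492992000
```

A 10-second clip is therefore 240 frames, or roughly 1.5 GB before any compression under these assumptions.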

Wan2.5: Where Visuals Find Their Voice
Wan2.5-preview is now available:
Natively equips video generation with high-fidelity and synchronized audio, covering human voice (including multi-speakers), ASMR, sound effects, music, and beyond.
Offers multilingual voice support,… pic.twitter.com/5CGQKYTJ7q
— Wan (@Alibaba_Wan) September 29, 2025

Wan 2.5: Know You Better. Create Whatever You Want!
Wan2.5 Preview is now live!
Significantly improves instruction understanding and following capability, comprehending complex instructions, continuous motion sequences, and advanced camera motions, supporting structured… pic.twitter.com/50pvpGT7ZF
— Wan (@Alibaba_Wan) September 30, 2025

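Wan 2.5-Preview also has advanced image generation and editing features. It handles photorealistic quality and diverse art styles, and supports creative typography (in Chinese, English, and other languages). It can also generate graphs, flowcharts, data visualizations, architectural drawings, formatted tables, and similar content with embedded text. For image editing, it can perform a wide range of instruction-based editing tasks in a conversational manner. Tasks such as fusing multiple concepts, transforming materials, and swapping product colors are supported with pixel-level precision.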

■Wan official website
https://wan.video/

Plans and pricing

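In addition to the no-cost Free plan, Wan offers two paid plans: Pro (5 USD per month when billed annually, about 750 yen) and Premium (20 USD per month when billed annually, about 2,960 yen). The Free plan can also generate with Wan 2.5-Preview, but there is a long wait before generation starts.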
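The monthly figures above are quoted for annual billing, so the amount charged per year can be worked out as in this small sketch. It assumes annual billing is simply twelve times the quoted monthly rate, which the article does not state explicitly; the dictionary and function names are illustrative.

```python
# Monthly prices in USD as quoted in the article (annual-billing rates).
MONTHLY_USD = {"Free": 0, "Pro": 5, "Premium": 20}


def annual_cost_usd(plan: str) -> int:
    """Yearly charge, assuming 12 x the quoted monthly rate (an assumption)."""
    return MONTHLY_USD[plan] * 12


for plan in MONTHLY_USD:
    print(plan, annual_cost_usd(plan))
# Free 0
# Pro 60
# Premium 240
```

The yen amounts in the article are approximate conversions, so they are left out of the calculation.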

■Free Generation & Membership (official Wan documentation)
https://alidocs.dingtalk.com/i/nodes/14lgGw3P8vxjwogPCjeO32jdV5daZ90D

Related CGWORLD articles

●Video generation AI "Kling AI 2.5 Turbo" released! Improved consistency and stability, a 30% price cut, and the node-based workspace "Kling Lab" opened to all users

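Kuaishou has released the video generation AI "Kling AI 2.5 Turbo". It delivers better prompt fidelity and temporal control, improved fluidity and stability in dynamic scenes, and consistency across diverse styles. The node-based workspace "Kling Lab" has also been opened to all users and now supports 2.5 Turbo.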
https://cgworld.jp/flashnews/01-202510-Kling-AI-25-Turbo.html

●Video generation AI model "VEED Fabric 1.0" released! Generates realistic video of a person talking from a single image and an audio file

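VEED.IO has added its in-house video generation AI model "VEED Fabric 1.0" to the video generation features of its online video editing platform "VEED". The model generates realistic video of a person talking from a single image and an audio file, and is offered to paid subscribers.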
https://cgworld.jp/flashnews/01-202509-VEED-Fabric.html