This document describes how to use LightX2V for video generation, covering basic usage and advanced configurations.
- Environment Setup
- Basic Usage Examples
- Model Path Configuration
- Creating Generator
- Advanced Configurations
Please refer to the main project's Quick Start Guide for environment setup.
A minimal code example can be found in `examples/wan_t2v.py`:
```python
from lightx2v import LightX2VPipeline

pipe = LightX2VPipeline(
    model_path="/path/to/Wan2.1-T2V-14B",
    model_cls="wan2.1",
    task="t2v",
)

pipe.create_generator(
    attn_mode="sage_attn2",
    infer_steps=50,
    height=480,
    width=832,
    num_frames=81,
    guidance_scale=5.0,
    sample_shift=5.0,
)

seed = 42
prompt = "Your prompt here"
negative_prompt = ""
save_result_path = "/path/to/save_results/output.mp4"

pipe.generate(
    seed=seed,
    prompt=prompt,
    negative_prompt=negative_prompt,
    save_result_path=save_result_path,
)
```

Pass the model path to `LightX2VPipeline`:
```python
pipe = LightX2VPipeline(
    model_path="/path/to/Wan2.2-I2V-A14B",
    model_cls="wan2.2_moe",  # For wan2.1, use "wan2.1"
    task="i2v",
)
```

When the `model_path` directory contains multiple versions of bf16-precision DIT model safetensors files, use the following parameters to specify which weights to load:
- `dit_original_ckpt`: specifies the original DIT weight path for models such as wan2.1 and hunyuan15
- `low_noise_original_ckpt`: specifies the low-noise branch weight path for wan2.2 models
- `high_noise_original_ckpt`: specifies the high-noise branch weight path for wan2.2 models
Usage Example:
```python
pipe = LightX2VPipeline(
    model_path="/path/to/Wan2.2-I2V-A14B",
    model_cls="wan2.2_moe",
    task="i2v",
    low_noise_original_ckpt="/path/to/low_noise_model.safetensors",
    high_noise_original_ckpt="/path/to/high_noise_model.safetensors",
)
```

The generator can be loaded directly from a JSON configuration file. Configuration files are located in the `configs` directory:
```python
pipe.create_generator(config_json="../configs/wan/wan_t2v.json")
```

You can also create the generator manually and configure multiple parameters:
```python
pipe.create_generator(
    attn_mode="flash_attn2",                    # Options: flash_attn2, flash_attn3, sage_attn2, sage_attn3 (Blackwell-architecture GPUs)
    infer_steps=50,                             # Number of inference steps
    num_frames=81,                              # Number of video frames
    height=480,                                 # Video height
    width=832,                                  # Video width
    guidance_scale=5.0,                         # CFG guidance strength (CFG disabled when set to 1)
    sample_shift=5.0,                           # Sample shift
    fps=16,                                     # Frame rate
    aspect_ratio="16:9",                        # Aspect ratio
    boundary=0.900,                             # Boundary value
    boundary_step_index=2,                      # Boundary step index
    denoising_step_list=[1000, 750, 500, 250],  # Denoising step list
)
```

Parameter Description:
- Resolution: specified via `height` and `width`
- CFG: specified via `guidance_scale` (set to 1 to disable CFG)
- FPS: specified via `fps`
- Video Length: specified via `num_frames`
- Inference Steps: specified via `infer_steps`
- Sample Shift: specified via `sample_shift`
- Attention Mode: specified via `attn_mode`; options include `flash_attn2`, `flash_attn3`, `sage_attn2`, and `sage_attn3` (for Blackwell-architecture GPUs)
Note: all of the advanced configuration methods below (`enable_offload`, `enable_quantize`, `enable_parallel`, `enable_cache`, `enable_lora`, `enable_lightvae`) must be called before `create_generator()`, otherwise they will not take effect!
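For example, a minimal sketch of the required call order, reusing the parameters from the examples above (paths and prompt are placeholders):

```python
from lightx2v import LightX2VPipeline

pipe = LightX2VPipeline(
    model_path="/path/to/Wan2.1-T2V-14B",
    model_cls="wan2.1",
    task="t2v",
)

# Advanced configuration goes first ...
pipe.enable_offload(cpu_offload=True, offload_granularity="block")

# ... then create the generator, so it picks up the configuration ...
pipe.create_generator(infer_steps=50, height=480, width=832, num_frames=81)

# ... and finally run inference
pipe.generate(
    seed=42,
    prompt="Your prompt here",
    negative_prompt="",
    save_result_path="/path/to/save_results/output.mp4",
)
```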
CPU offloading significantly reduces memory usage with almost no impact on inference speed, and is suitable for consumer GPUs such as the RTX 30/40/50 series.
```python
pipe.enable_offload(
    cpu_offload=True,              # Enable CPU offloading
    offload_granularity="block",   # Offload granularity: "block" or "phase"
    text_encoder_offload=False,    # Whether to offload the text encoder
    image_encoder_offload=False,   # Whether to offload the image encoder
    vae_offload=False,             # Whether to offload the VAE
)
```

Notes:
- For Wan models, `offload_granularity` supports both `"block"` and `"phase"` (see the sketch after this list)
- For HunyuanVideo-1.5, only `"block"` is currently supported
Quantization can significantly reduce memory usage and accelerate inference.
```python
pipe.enable_quantize(
    dit_quantized=False,                # Whether to use a quantized DIT model
    text_encoder_quantized=False,       # Whether to use a quantized text encoder
    image_encoder_quantized=False,      # Whether to use a quantized image encoder
    dit_quantized_ckpt=None,            # DIT quantized weight path (required when model_path doesn't contain quantized weights or contains multiple weight files)
    low_noise_quantized_ckpt=None,      # Wan2.2 low-noise branch quantized weight path
    high_noise_quantized_ckpt=None,     # Wan2.2 high-noise branch quantized weight path
    text_encoder_quantized_ckpt=None,   # Text encoder quantized weight path (same requirement as dit_quantized_ckpt)
    image_encoder_quantized_ckpt=None,  # Image encoder quantized weight path (same requirement as dit_quantized_ckpt)
    quant_scheme="fp8-sgl",             # Quantization scheme
)
```

Parameter Description:
- `dit_quantized_ckpt`: when the `model_path` directory doesn't contain quantized weights, or contains multiple weight files, specify the DIT quantized weight path explicitly
- `text_encoder_quantized_ckpt` and `image_encoder_quantized_ckpt`: similarly, specify the encoder quantized weight paths
- `low_noise_quantized_ckpt` and `high_noise_quantized_ckpt`: specify the dual-branch quantized weights for Wan2.2 models
Quantized Model Downloads:
- Wan-2.1 Quantized Models: Download from Wan2.1-Distill-Models
- Wan-2.2 Quantized Models: Download from Wan2.2-Distill-Models
- HunyuanVideo-1.5 Quantized Models: Download from Hy1.5-Quantized-Models
`hy15_qwen25vl_llm_encoder_fp8_e4m3_lightx2v.safetensors` is the quantized weight for the text encoder.
Usage Examples:
```python
# HunyuanVideo-1.5 quantization example
pipe.enable_quantize(
    quant_scheme="fp8-sgl",
    dit_quantized=True,
    dit_quantized_ckpt="/path/to/hy15_720p_i2v_fp8_e4m3_lightx2v.safetensors",
    text_encoder_quantized=True,
    image_encoder_quantized=False,
    text_encoder_quantized_ckpt="/path/to/hy15_qwen25vl_llm_encoder_fp8_e4m3_lightx2v.safetensors",
)

# Wan2.1 quantization example
pipe.enable_quantize(
    dit_quantized=True,
    dit_quantized_ckpt="/path/to/wan2.1_i2v_480p_scaled_fp8_e4m3_lightx2v_4step.safetensors",
)

# Wan2.2 quantization example
pipe.enable_quantize(
    dit_quantized=True,
    low_noise_quantized_ckpt="/path/to/wan2.2_i2v_A14b_low_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors",
    high_noise_quantized_ckpt="/path/to/wan2.2_i2v_A14b_high_noise_scaled_fp8_e4m3_lightx2v_4step_1030.safetensors",
)
```

Quantization Scheme Reference: for detailed information, please refer to the Quantization Documentation.
LightX2V supports multi-GPU parallel inference; the script must be launched with `torchrun`:
```python
pipe.enable_parallel(
    seq_p_size=4,                # Sequence parallel size
    seq_p_attn_type="ulysses",   # Sequence parallel attention type
)
```

Running Method (here `--nproc_per_node` matches `seq_p_size=4`):
```bash
torchrun --nproc_per_node=4 your_script.py
```

You can set the cache method to `Tea` or `Mag` to use the TeaCache or MagCache method, respectively:
```python
pipe.enable_cache(
    cache_method="Tea",      # Cache method: "Tea" or "Mag"
    coefficients=[-3.08907507e+04, 1.67786188e+04, -3.19178643e+03,
                  2.60740519e+02, -8.19205881e+00, 1.07913775e-01],  # Coefficients
    teacache_thresh=0.15,    # TeaCache threshold
)
```

Coefficient Reference: refer to the configuration files in the `configs/caching` or `configs/hunyuan_video_15/cache` directories.
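If you prefer not to hard-code the coefficients, and assuming the caching configs follow the same JSON format that `create_generator(config_json=...)` consumes, they can presumably be loaded straight from a file (the filename below is illustrative):

```python
# Sketch: load caching settings from a bundled config file
# (assumes the caching configs are valid create_generator JSON configs;
#  the exact filename is hypothetical)
pipe.create_generator(config_json="../configs/caching/wan_t2v_teacache.json")
```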
LightX2V supports loading distilled LoRA weights, which can significantly accelerate inference.
Usage Examples:
```python
# Qwen-Image single-LoRA example
pipe.enable_lora(
    [
        {"path": "/path/to/Qwen-Image-2512-Lightning-4steps-V1.0-fp32.safetensors", "strength": 1.0},
    ],
    lora_dynamic_apply=False,
)

# Wan2.2 multiple-LoRA example
pipe.enable_lora(
    [
        {"name": "high_noise_model", "path": "/path/to/wan2.2_i2v_A14b_high_noise_lora_rank64_lightx2v_4step_1022.safetensors", "strength": 1.0},
        {"name": "low_noise_model", "path": "/path/to/wan2.2_i2v_A14b_low_noise_lora_rank64_lightx2v_4step_1022.safetensors", "strength": 1.0},
    ],
    lora_dynamic_apply=False,
)
```

Parameter Description:
- `lora_configs`: list of LoRA configurations, each containing:
  - `path`: path to the LoRA weight file (required)
  - `name`: LoRA name (optional; used when multiple LoRAs are needed, e.g., for Wan2.2)
  - `strength`: LoRA strength, default 1.0
- `lora_dynamic_apply`: whether to apply LoRA weights dynamically (see the sketch after this list)
  - `False` (default): merge LoRA weights at load time; faster inference but higher memory usage
  - `True`: apply LoRA weights dynamically during inference; saves memory but is slower
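For example, a hedged sketch of the memory-saving dynamic mode (the path and strength are placeholders):

```python
# Apply the LoRA on the fly during inference instead of merging at load time;
# lowers memory usage at the cost of some speed
pipe.enable_lora(
    [
        {"path": "/path/to/lora.safetensors", "strength": 1.0},
    ],
    lora_dynamic_apply=True,
)
```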
LoRA Model Downloads:
- Wan-2.1 LoRA: Download from Wan2.1-Distill-Models
- Wan-2.2 LoRA: Download from Wan2.2-Distill-Models
- Qwen-Image LoRA: Download from Qwen-Image-2512-Lightning or Qwen-Image-Edit-2511-Lightning
Using a lightweight VAE can accelerate decoding and reduce memory usage.
```python
pipe.enable_lightvae(
    use_lightvae=False,   # Whether to use LightVAE
    use_tae=False,        # Whether to use LightTAE
    vae_path=None,        # Path to LightVAE weights
    tae_path=None,        # Path to LightTAE weights
)
```

Support Status:
- LightVAE: Currently only supports wan2.1, wan2.2 moe
- LightTAE: Currently only supports wan2.1, wan2.2-ti2v, wan2.2 moe, HunyuanVideo-1.5
Model Downloads: Lightweight VAE models can be downloaded from Autoencoders
- LightVAE for Wan-2.1: lightvaew2_1.safetensors
- LightTAE for Wan-2.1: lighttaew2_1.safetensors
- LightTAE for Wan-2.2-ti2v: lighttaew2_2.safetensors
- LightTAE for HunyuanVideo-1.5: lighttaehy1_5.safetensors
Usage Example:
# Using LightTAE for HunyuanVideo-1.5
pipe.enable_lightvae(
use_tae=True,
tae_path="/path/to/lighttaehy1_5.safetensors",
use_lightvae=False,
vae_path=None
)
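Similarly, a hedged sketch using LightVAE with a Wan-2.1 pipeline (the path points at the `lightvaew2_1.safetensors` weights listed above and is a placeholder):

```python
# Using LightVAE for Wan-2.1 (sketch)
pipe.enable_lightvae(
    use_lightvae=True,
    vae_path="/path/to/lightvaew2_1.safetensors",
    use_tae=False,
    tae_path=None,
)
```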