FP8 Quantized model of ANIMA

!! Currently, FP8, MXFP8 and NVFP4 do not work properly with torch.compile, so it is better to use the original BF16 model. !!

There are two models: FP8 and NVFP4Mixed.

  • FP8 (2.4GB): (recommended) maximizes generation speed while preserving as much quality as possible.
  • NVFP4Mixed (2.0GB): (marginal quality) a mixture of FP8 and NVFP4 layers.

To use torch.compile, use the TorchCompileModelAdvanced node from KJNodes, set the mode to max-autotune-no-cudagraphs, and make sure dynamic is set to false.
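
Outside ComfyUI, the same two settings correspond to keyword arguments of `torch.compile`. The sketch below is only an illustration: it assumes the KJNodes node forwards `mode` and `dynamic` to `torch.compile` in this way, and `compile_model` is a hypothetical helper name, not part of any library.

```python
# Settings named in the text above, expressed as torch.compile keyword arguments.
COMPILE_KWARGS = {
    "mode": "max-autotune-no-cudagraphs",  # autotuned kernels, CUDA graphs disabled
    "dynamic": False,                      # compile for static shapes only
}


def compile_model(model):
    """Hypothetical helper: wrap a torch.nn.Module with the settings above."""
    # torch is imported lazily so the settings can be inspected without it installed.
    import torch

    return torch.compile(model, **COMPILE_KWARGS)
```

Compilation itself is deferred until the first forward pass of the returned module, so the first generation after loading is expected to be slower than the following ones.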

Generation speed

Tested on

  • RTX 5090 (400W), ComfyUI with the --fast option, torch 2.10.0+cu130
  • Generation settings: 832x1216, 30 steps, cfg 4.0, sampler er_sde, scheduler simple
quant      none                      sage+torch.compile
bf16       7.13s/4.21it/s            5.16s/5.81it/s (+38%)
fp8        6.66s/4.50it/s (+11%)     4.52s/6.64it/s (+58%)
nvfp4mix   6.37s/4.71it/s (+12%)     4.99s/6.01it/s (+43%)
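
As a rough guide to reading the table, the it/s figure is the step count divided by the wall time. The percentage is assumed here to be the speedup over the bf16 run without sage or torch.compile. The function names below are illustrative only.

```python
def iterations_per_second(steps: int, seconds: float) -> float:
    """Sampler steps completed per second of wall time."""
    return steps / seconds


def speedup_percent(baseline_seconds: float, seconds: float) -> float:
    """Speedup relative to a baseline run (assumed: plain bf16), in percent."""
    return (baseline_seconds / seconds - 1.0) * 100.0


STEPS = 30            # from the test settings
BF16_PLAIN = 7.13     # seconds, bf16 without sage or torch.compile
BF16_COMPILED = 5.16  # seconds, bf16 with sage+torch.compile

print(f"{iterations_per_second(STEPS, BF16_PLAIN):.2f} it/s")   # 4.21 it/s
print(f"{speedup_percent(BF16_PLAIN, BF16_COMPILED):+.0f}%")    # +38%
```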

Sample

anima-preview3-base

26-04-09-Anima_00005_

anima-preview2

26-03-12-Anima_00008_

anima-preview

quant        sample
bf16         anima-preview-bf16
fp8          anima-preview-fp8
nvfp4mixed   anima-preview-nvfp4

Quantized layers

fp8

{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0", "blocks.1."] },
    { "policy": "float8_e4m3fn", "match": ["q_proj", "k_proj", "v_proj", "o_proj", "output_proj", ".mlp"] },
    { "policy": "nvfp4", "match": [] }
  ]
}

nvfp4mixed

{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0."] },
    { "policy": "float8_e4m3fn", "match": ["v_proj", "adaln_modulation", ".mlp"] },
    { "policy": "nvfp4", "match": ["k_proj", "q_proj", "output_proj"] }
  ]
}
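
One plausible way to read these rule files is sketched below. It is not the actual quantization tool, and it makes three assumptions: only layers under a `block_names` prefix are considered, each `match` entry is a plain substring test, and the first matching rule wins. The layer names in the usage lines are hypothetical.

```python
import json

# The fp8 rule set from above, parsed as-is.
FP8_CONFIG = json.loads("""
{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0", "blocks.1."] },
    { "policy": "float8_e4m3fn", "match": ["q_proj", "k_proj", "v_proj", "o_proj", "output_proj", ".mlp"] },
    { "policy": "nvfp4", "match": [] }
  ]
}
""")


def resolve_policy(layer_name: str, config: dict, default: str = "keep") -> str:
    """Return the policy for a layer under the assumed matching semantics."""
    # Assumption: layers outside the listed block prefixes are left untouched.
    if not any(layer_name.startswith(prefix) for prefix in config["block_names"]):
        return default
    # Assumption: rules are checked in order and the first substring match wins.
    for rule in config["rules"]:
        if any(pattern in layer_name for pattern in rule["match"]):
            return rule["policy"]
    return default


# Hypothetical layer names, for illustration only.
print(resolve_policy("net.blocks.0.self_attn.q_proj.weight", FP8_CONFIG))   # keep
print(resolve_policy("net.blocks.10.self_attn.q_proj.weight", FP8_CONFIG))  # float8_e4m3fn
print(resolve_policy("net.blocks.7.mlp.layer1.weight", FP8_CONFIG))         # float8_e4m3fn
```

Under this reading, the "keep" rule leaves the first blocks in their original precision, and layers that no rule matches also stay unquantized.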