This document provides a comprehensive overview of ComfyUI, a node-based visual AI engine for Stable Diffusion and related generative AI models. It covers the system's purpose, key features, and high-level architecture, serving as an entry point for understanding the codebase.
For installation instructions, see Installation and Setup. For a hands-on introduction, see Quick Start Guide. For detailed architectural information, see Core Architecture.
ComfyUI is a node-based visual AI engine and application for Stable Diffusion and other generative AI models. It provides a graph/nodes/flowchart interface for designing and executing advanced diffusion pipelines without writing code. The system consists of a Python backend and a web-based frontend, designed for local execution on consumer hardware (Windows, Linux, macOS) with support for a range of accelerators (NVIDIA, AMD, Intel, Apple Silicon, Ascend, Cambricon, Iluvatar).
Sources: README.md1-37
ComfyUI is architected around three core principles:
| Principle | Implementation |
|---|---|
| Node-Based Composition | All operations are nodes in a directed acyclic graph (DAG). The execution.py:PromptExecutor traverses nodes in topological order based on dependencies. |
| Incremental Execution | Only nodes with changed inputs are re-executed. The comfy_execution/caching.py:HierarchicalCache stores outputs keyed by input hashes. |
| Hardware Flexibility | comfy/model_management.py:VRAMState enum enables dynamic loading/offloading. Models can run on GPUs with <2GB VRAM via LOW_VRAM mode. |
The codebase follows a layered architecture where high-level nodes in nodes.py and comfy_extras/ call into lower-level subsystems in the comfy/ package for model loading, sampling, and memory management.
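The toy sketch below illustrates the first two principles only; it is not the actual PromptExecutor code. The simplified workflow format (node ids mapping to class_type/inputs, with links written as [node_id, output_index] pairs) and the node names are illustrative.

```python
# Illustrative only: a toy topological ordering over a ComfyUI-style workflow dict.
# The real logic lives in execution.py:PromptExecutor and comfy_execution/graph.py.

def execution_order(workflow: dict) -> list[str]:
    """Return node ids so that every node runs after the nodes it reads from."""
    order, visited = [], set()

    def visit(node_id: str):
        if node_id in visited:
            return
        visited.add(node_id)
        for value in workflow[node_id]["inputs"].values():
            # A link is encoded as [upstream_node_id, output_index]; literals are plain values.
            if isinstance(value, list) and len(value) == 2 and str(value[0]) in workflow:
                visit(str(value[0]))
        order.append(node_id)

    for node_id in workflow:
        visit(node_id)
    return order

# Example: node "2" consumes output 0 of node "1", so "1" must execute first.
toy_workflow = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "example.png"}},
    "2": {"class_type": "ImageInvert", "inputs": {"image": ["1", 0]}},
}
print(execution_order(toy_workflow))  # ['1', '2']
```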
Sources: README.md88-92 execution.py447-665 comfy_execution/caching.py1-50 comfy/model_management.py30-36
ComfyUI supports a wide range of generative model architectures through automatic model detection implemented in comfy/model_detection.py:detect_unet_config() and comfy/supported_models.py:
Model Detection and Loading Pipeline
Supported Model Types and Detection Logic
| Model Family | Class Location | Detection Keys | Text Encoder | VAE |
|---|---|---|---|---|
| SD1.5 | comfy/supported_models.py:SD15 | context_dim=768, model_channels=320 | sd1_clip.py:SD1ClipModel | ldm/modules/../AutoencoderKL |
| SDXL | comfy/supported_models.py:SDXL | context_dim=2048, adm_in_channels=2816 | sdxl_clip.py:SDXLClipModel | latent_formats.py:SDXL |
| SD3/SD3.5 | comfy/supported_models.py:SD3 | joint_blocks.*context_block.attn | sd3_clip.py:SD3ClipModel | ldm/modules/../AutoencoderKL |
| Flux | comfy/supported_models.py:Flux | double_blocks.*img_attn.qkv | flux.py:FluxClipModel | ldm/flux/../autoencoder.py |
| Hunyuan Video | comfy/supported_models.py:HunyuanVideo | decoder.conv_in.weight.shape[1]==64 | hunyuan_video.py | ldm/hunyuan_video/vae.py |
| Wan 2.1/2.2 | comfy/supported_models.py:Wan* | Various detection patterns | wan.py:WanClipModel | ldm/wan/vae*.py |
The detection system in comfy/model_detection.py:detect_unet_config() analyzes state dict keys and tensor shapes to identify model architecture, then instantiates the appropriate model class from comfy/supported_models.py.
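A highly simplified sketch of this idea follows; it is not the real detect_unet_config() code, which inspects many more keys and builds a full model config rather than returning a label. The key patterns come from the table above; everything else (function name, fallbacks) is illustrative.

```python
# Illustrative only: guess a model family from checkpoint state-dict keys and shapes.
# The real logic is comfy/model_detection.py:detect_unet_config().

def guess_model_family(state_dict: dict) -> str:
    keys = list(state_dict.keys())
    if any("double_blocks." in k and "img_attn.qkv" in k for k in keys):
        return "Flux"
    if any("joint_blocks." in k and "context_block.attn" in k for k in keys):
        return "SD3"
    # Cross-attention K projections reveal the text-encoder context dimension.
    for k in keys:
        if k.endswith("attn2.to_k.weight"):
            context_dim = state_dict[k].shape[1]
            if context_dim == 2048:
                return "SDXL"
            if context_dim == 768:
                return "SD1.5"
    return "unknown"
```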
Sources: comfy/model_detection.py37-273 comfy/supported_models.py34-1826 comfy/sd.py413-826 comfy/utils.py59-102
| Feature | Description | Implementation |
|---|---|---|
| Asynchronous Queue | Multiple prompts can be queued and executed sequentially | execution.py |
| Smart Memory Management | Dynamic model offloading supports models on GPUs with <2GB VRAM | comfy/model_management.py |
| Model Patching | Non-destructive model modifications (LoRA, hooks, weights) | comfy/model_patcher.py |
| Attention Optimization | Automatic selection between xFormers, SDPA, Flash Attention | comfy/ldm/modules/attention.py |
| Area Conditioning | Apply different prompts to different image regions | comfy/conds.py |
| Workflow Serialization | Save/load complete workflows including seeds | execution.py |
Sources: README.md87-109
ComfyUI uses a plugin architecture where functionality is added through nodes. Nodes are registered in NODE_CLASS_MAPPINGS dictionaries:
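For illustration, a node module typically exposes a mapping like the sketch below. The node class itself is a made-up placeholder (its full anatomy is covered in the Node Registration and Execution Pipeline section below); NODE_DISPLAY_NAME_MAPPINGS is the optional companion dictionary for UI display names.

```python
# Hypothetical example of the module-level registration pattern used by
# nodes.py and the comfy_extras/ modules.
class ExampleInvertMask:
    CATEGORY = "mask"
    RETURN_TYPES = ("MASK",)
    FUNCTION = "invert"

    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"mask": ("MASK",)}}

    def invert(self, mask):
        return (1.0 - mask,)

NODE_CLASS_MAPPINGS = {"ExampleInvertMask": ExampleInvertMask}
NODE_DISPLAY_NAME_MAPPINGS = {"ExampleInvertMask": "Example: Invert Mask"}
```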
Sources: comfy_extras/nodes_mask.py396-415 comfy_extras/nodes_compositing.py203-214
Overall System Architecture
Architectural Layers:
| Layer | Key Code Entities | Responsibilities |
|---|---|---|
| User Interface Layer | server.py:PromptServer, server.py:routes, web/index.html | REST API endpoints (/prompt, /queue, /history), WebSocket communication for progress updates, frontend serving |
| Core Execution Engine | execution.py:PromptExecutor, execution.py:PromptQueue, main.py:prompt_worker, comfy_execution/graph.py:DynamicPrompt | Request queuing via PromptQueue.put(), worker thread polling via get(), dependency graph construction, topological execution with caching |
| Model Management System | comfy/model_management.py:load_models_gpu, comfy/model_management.py:VRAMState, comfy/sd.py:load_checkpoint_guess_config, comfy/model_patcher.py:ModelPatcher | Automatic model detection, checkpoint loading, LoRA/hook patching via add_patches(), dynamic VRAM state management, model loading/unloading |
| Node System | nodes.py:NODE_CLASS_MAPPINGS, custom_nodes/, comfy/samplers.py:KSampler | Node registration and discovery, workflow graph processing, sampling algorithms (Euler, DPM++, etc.) |
| Model Components | comfy/text_encoders/, comfy/sd.py:VAE, comfy/ldm/modules/, comfy/cldm/ | Text encoding (CLIP/T5/LLAMA), VAE encode/decode, UNet/DiT forward passes, ControlNet conditioning |
| Low-Level Operations | comfy/ldm/modules/attention.py, comfy/ops.py, comfy/cuda_malloc.py | Optimized attention implementations (xFormers/SDPA/Flash), FP8/FP16 operations, CUDA memory management |
Sources: server.py195-795 execution.py199-665 main.py199-275 comfy/model_management.py30-717 comfy/model_patcher.py215-276 comfy/sd.py74-1031 nodes.py1-80 comfy/samplers.py1-50
Prompt Execution Flow
Key Execution Functions:
| Component | Function/Method | Purpose |
|---|---|---|
| server.py:PromptServer | post_prompt() at server.py710-795 | Receives workflow JSON, validates, adds to queue |
| execution.py:PromptQueue | put() at execution.py683-708 | Adds prompt to heapq with priority |
| execution.py:PromptExecutor | execute() at execution.py447-665 | Main execution loop, builds DynamicPrompt, orchestrates node execution |
| execution.py:PromptExecutor | recursive_execute() at execution.py270-446 | Executes single node, handles dependencies recursively |
| comfy_execution/graph.py | DynamicPrompt.__init__() at comfy_execution/graph.py161-199 | Parses workflow dict, builds node dependency graph |
| comfy_execution/caching.py | HierarchicalCache.get()/set() | Caches node outputs by input signature |
| main.py | prompt_worker() at main.py199-275 | Worker thread that continuously processes queue |
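For illustration, a workflow can be queued from any HTTP client via the POST /prompt endpoint described above. The snippet below is a minimal sketch: the default port and the "prompt"/"client_id" field names follow common client examples and should be verified against server.py:post_prompt(); the one-node workflow is a toy.

```python
# Minimal client sketch: queue a (toy) workflow via the POST /prompt endpoint.
# Assumes a ComfyUI server running locally on the default port 8188.
import json
import urllib.request

workflow = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "example.png"}},
}

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow, "client_id": "docs-example"}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))  # response includes the queued prompt id
```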
Execution Flow Details:
1. server.py:PromptServer receives workflow JSON via the POST /prompt endpoint at server.py710
2. The prompt is added to execution.py:PromptQueue using the put() method
3. main.py:prompt_worker() continuously calls PromptQueue.get()
4. PromptExecutor.execute() builds a DynamicPrompt object representing the dependency graph
5. recursive_execute() traverses nodes in topological order, checking the cache first
6. Each node's execute() method is called with resolved inputs
7. comfy/model_management.py:load_models_gpu() loads the required models to the device
8. Outputs are stored in outputs_cache (an instance of HierarchicalCache or LRUCache)
9. Progress updates are sent via send_sync() to WebSocket clients

Sources: server.py710-795 execution.py199-665 main.py199-275 comfy_execution/graph.py161-199 comfy/model_management.py632-717
Node Registration and Execution Pipeline
All operations in ComfyUI are implemented as nodes registered in NODE_CLASS_MAPPINGS dictionaries. The system supports both legacy class-based nodes and the new comfy_api.latest:ComfyExtension framework for API nodes:
Node Class Structure:
| Component | Purpose | Code Reference |
|---|---|---|
| INPUT_TYPES() classmethod | Declares required/optional inputs with type constraints and UI widgets | nodes.py60-66 returns dict with "required" and "optional" keys |
| RETURN_TYPES tuple | Output type declarations (e.g., (IO.CONDITIONING,)) | nodes.py67 class attribute |
| FUNCTION string | Name of method to execute (e.g., "encode") | nodes.py69 |
| CATEGORY string | UI organization path (e.g., "conditioning") | nodes.py71 |
| execute() method | Core node logic, called by execution.py:recursive_execute | nodes.py75-79 |
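Putting the pieces in the table together, a legacy class-based node might look like the sketch below. The node itself is hypothetical and not part of ComfyUI; only the class attributes and the INPUT_TYPES()/FUNCTION contract mirror what the table describes (IMAGE tensors are assumed to be [B, H, W, C] and MASK tensors [B, H, W]).

```python
import torch

class ExampleImageBrightness:
    """Hypothetical node: scales image brightness, optionally through a mask."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "image": ("IMAGE",),
                "strength": ("FLOAT", {"default": 1.0, "min": 0.0, "max": 10.0, "step": 0.01}),
            },
            "optional": {
                "mask": ("MASK",),
            },
        }

    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "apply"          # name of the method the executor calls
    CATEGORY = "image/adjust"   # UI menu placement

    def apply(self, image, strength, mask=None):
        result = image * strength
        if mask is not None:
            # Blend only where the mask is set; mask/shape handling is simplified here.
            result = image * (1 - mask.unsqueeze(-1)) + result * mask.unsqueeze(-1)
        return (torch.clamp(result, 0.0, 1.0),)

NODE_CLASS_MAPPINGS = {"ExampleImageBrightness": ExampleImageBrightness}
```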
Node Discovery and Loading:
Nodes are discovered through four mechanisms:
1. Core nodes are defined in nodes.py at startup and registered in NODE_CLASS_MAPPINGS at nodes.py2425-2455
2. Built-in extra nodes are loaded from comfy_extras/*.py files (e.g., nodes_mask.py, nodes_compositing.py)
3. Third-party nodes are loaded from custom_nodes/*/ directories by main.py:load_custom_nodes() at main.py111-157
4. API-based nodes are registered via comfy_api.latest:ComfyExtension.get_node_list() (e.g., comfy_api_nodes/nodes_gemini.py805-814)

Node Execution Flow:
The execution.py:recursive_execute() function at execution.py270-446 handles node execution:
1. Checks outputs_cache for cached results
2. Resolves input data via get_input_data()
3. Validates inputs with validation.py:validate_inputs()
4. Calls the node's FUNCTION method with resolved inputs
5. Stores outputs in outputs_cache via comfy_execution/caching.py:HierarchicalCache

Sources: nodes.py58-2455 execution.py148-446 main.py111-157 comfy_api_nodes/nodes_gemini.py805-814 comfy_execution/caching.py1-50
Model Loading and Memory Management Pipeline
Model Management Components:
| Component | File/Class | Key Functions | Purpose |
|---|---|---|---|
| Model Detection | comfy/model_detection.py | detect_unet_config() comfy/model_detection.py37-273 | Analyzes state_dict to identify model architecture |
| Model Loading | comfy/sd.py | load_checkpoint_guess_config() comfy/sd.py890-1031 | Loads checkpoint, instantiates model/CLIP/VAE |
| Model Patcher | comfy/model_patcher.py:ModelPatcher | add_patches() comfy/model_patcher.py348-380 clone() comfy/model_patcher.py287-331 | Non-destructive model modifications |
| Memory Manager | comfy/model_management.py | load_models_gpu() comfy/model_management.py632-717 free_memory() comfy/model_management.py595-630 | VRAM allocation, model offloading |
| Device Selection | comfy/model_management.py | get_torch_device() comfy/model_management.py170-188 | Chooses CUDA/MPS/CPU/XPU device |
| LoRA Loading | comfy/sd.py | load_lora_for_models() comfy/sd.py74-102 | Applies LoRA patches to model and CLIP |
VRAM States and Memory Management:
The VRAMState enum at comfy/model_management.py30-36 defines memory management strategies:
- DISABLED: No VRAM (CPU-only mode)
- NO_VRAM: Very low VRAM - aggressive offloading
- LOW_VRAM: Partial model loading, offload unused layers
- NORMAL_VRAM: Keep models in VRAM when possible
- HIGH_VRAM: Keep all models loaded (--highvram flag)
- SHARED: Shared memory between CPU/GPU (MPS on macOS)
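The sketch below only illustrates how such a state might be chosen from available memory; the enum values, thresholds, and the selection function are hypothetical, not the logic in comfy/model_management.py.

```python
from enum import Enum

class VRAMState(Enum):
    # Mirrors the states listed above; the numeric values are illustrative.
    DISABLED = 0
    NO_VRAM = 1
    LOW_VRAM = 2
    NORMAL_VRAM = 3
    HIGH_VRAM = 4
    SHARED = 5

def pick_vram_state(free_vram_gb: float, highvram_flag: bool = False, is_mps: bool = False) -> VRAMState:
    """Hypothetical heuristic: choose an offloading strategy from available VRAM."""
    if is_mps:
        return VRAMState.SHARED          # unified memory on Apple Silicon
    if free_vram_gb <= 0:
        return VRAMState.DISABLED        # CPU-only execution
    if highvram_flag:
        return VRAMState.HIGH_VRAM       # keep everything resident
    if free_vram_gb < 2:
        return VRAMState.LOW_VRAM        # partially load models, offload unused layers
    return VRAMState.NORMAL_VRAM
```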
ModelPatcher Operation:
The ModelPatcher class at comfy/model_patcher.py215-1034 provides non-destructive model modifications:
- patches dict mapping weight keys to patch tuples
- clone() creates a child patcher sharing the parent's model
- add_patches() registers LoRA/hook/weight modifications
- patch_model() at comfy/model_patcher.py661-852 applies patches without modifying the original weights (sketched below)
- set_hook_mode() for dynamic weight injection

Sources: comfy/sd.py74-1031 comfy/model_patcher.py215-852 comfy/model_management.py30-717 comfy/model_detection.py37-273
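The general pattern, reduced to a sketch: this is not the real ModelPatcher API; it only illustrates the "store patches separately, apply on load, restore on unload" idea described above, with a simplified patch format (weight key to delta tensor plus strength).

```python
import torch

class ToyPatcher:
    """Illustrative stand-in for the ModelPatcher pattern: deltas are stored
    separately from the model and only folded into the weights on demand."""

    def __init__(self, model, patches=None):
        self.model = model                      # shared, never copied
        self.patches = dict(patches or {})      # weight key -> (delta_tensor, strength)
        self.backup = {}

    def clone(self):
        # Child shares the same underlying model but owns its patch set.
        return ToyPatcher(self.model, self.patches)

    def add_patches(self, new_patches, strength=1.0):
        for key, delta in new_patches.items():
            self.patches[key] = (delta, strength)

    def patch_model(self):
        sd = self.model.state_dict()
        with torch.no_grad():
            for key, (delta, strength) in self.patches.items():
                self.backup[key] = sd[key].clone()               # remember original weight
                sd[key] += strength * delta.to(sd[key].dtype)
        return self.model

    def unpatch_model(self):
        sd = self.model.state_dict()
        with torch.no_grad():
            for key, original in self.backup.items():
                sd[key].copy_(original)                          # restore untouched weights
        self.backup.clear()
```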
Extension System for Custom Nodes
ComfyUI provides an extension framework for both built-in and external integrations through the comfy_api.latest:ComfyExtension abstract base class:
Extension Types:
| Extension Type | Implementation | Example Nodes | Purpose |
|---|---|---|---|
| Built-in Extensions | Extend ComfyExtension in comfy_extras/ | Flux video generation, Hunyuan3D processing | Local model processing, specialized features |
| External API Extensions | Extend ComfyExtension in comfy_api_nodes/ | Gemini at comfy_api_nodes/nodes_gemini.py805-814 OpenAI at comfy_api_nodes/nodes_openai.py568-692 | Cloud service integration, paid API access |
| Custom Nodes | Traditional NODE_CLASS_MAPPINGS in custom_nodes/ | Community third-party plugins | User-created extensions |
API Node Structure (Example: Gemini):
Extension Discovery:
Extensions are loaded through comfy_entrypoint() functions that return extension instances. The system calls get_node_list() to register nodes into the global NODE_CLASS_MAPPINGS dictionary.
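A rough sketch of that shape is shown below. The base-class location and exact signatures should be taken from comfy_api.latest (for instance, the real get_node_list() may be asynchronous); the class and the empty node list here are purely illustrative.

```python
# Illustrative only: the entry-point shape described above. Real extensions should
# subclass comfy_api.latest:ComfyExtension; names here mirror the text, not the API.

class ExampleExtension:
    def get_node_list(self):
        # Node classes contributed by this extension (see the node example earlier
        # on this page); the loader merges them into NODE_CLASS_MAPPINGS.
        return []

def comfy_entrypoint():
    # Called by the loader; returns the extension instance.
    return ExampleExtension()
```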
Sources: comfy_api_nodes/nodes_gemini.py238-817 comfy_api_nodes/nodes_openai.py568-692 comfy_api_nodes/nodes_moonvalley.py523-535 comfy_api_nodes/nodes_runway.py1-13
Attention Implementation Selection
ComfyUI automatically selects the optimal attention implementation based on hardware availability and model requirements. The selection logic in comfy/ldm/modules/attention.py prioritizes performance and memory efficiency:
Attention Implementations:
| Implementation | File/Function | Memory Behavior | Use Case |
|---|---|---|---|
| SageAttention | Hardware-specific implementation | Avoids materializing the full attention matrix | Modern GPUs with specialized hardware |
| Flash Attention | Fused CUDA kernels | Linear in sequence length via tiling | NVIDIA A100/H100 GPUs |
| xFormers | memory_efficient_attention() | Chunked, avoids the full attention matrix | General NVIDIA/AMD GPUs |
| PyTorch SDPA | torch.nn.functional.scaled_dot_product_attention | Backend-dependent (native kernels) | PyTorch 2.0+ with CPU/GPU backends |
| Sub-Quadratic | comfy/ldm/modules/sub_quadratic_attention.py187-276 | O(√n) via query/key chunking | Limited VRAM, CPU execution |
The sub-quadratic attention implementation provides memory-efficient attention with O(√n) memory requirements through query/key chunking, based on the paper "Self-attention Does Not Need O(n²) Memory". The efficient_dot_product_attention() function at comfy/ldm/modules/sub_quadratic_attention.py187-276 implements this algorithm.
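The sketch below shows the core of that chunking idea in a simplified form (single head, no masking, no dropout, fixed chunk sizes); it is not the efficient_dot_product_attention() implementation itself, which also handles heads, masks, dtype upcasting, and chunk-size selection.

```python
import torch

def chunked_attention(q, k, v, q_chunk=128, k_chunk=256):
    """Numerically stable attention computed one query/key chunk at a time.
    q, k, v: (batch, seq_len, dim). Never materializes the full attention matrix."""
    scale = q.shape[-1] ** -0.5
    out = torch.empty_like(q)
    for i in range(0, q.shape[1], q_chunk):
        qc = (q[:, i:i + q_chunk] * scale).float()
        # Streaming-softmax accumulators for this query chunk.
        acc = torch.zeros_like(qc)                                       # weighted sum of values
        running_max = torch.full(qc.shape[:-1], float("-inf"), device=q.device)
        denom = torch.zeros(qc.shape[:-1], device=q.device)              # softmax normalizer
        for j in range(0, k.shape[1], k_chunk):
            kc = k[:, j:j + k_chunk].float()
            vc = v[:, j:j + k_chunk].float()
            scores = qc @ kc.transpose(-1, -2)                           # (b, q_chunk, k_chunk)
            chunk_max = scores.amax(dim=-1)
            new_max = torch.maximum(running_max, chunk_max)
            correction = torch.exp(running_max - new_max)                # rescale old accumulators
            weights = torch.exp(scores - new_max.unsqueeze(-1))
            acc = acc * correction.unsqueeze(-1) + weights @ vc
            denom = denom * correction + weights.sum(dim=-1)
            running_max = new_max
        out[:, i:i + q_chunk] = (acc / denom.unsqueeze(-1)).to(q.dtype)
    return out

# Sanity check against a reference on a small input:
# q = k = v = torch.randn(1, 512, 64)
# ref = torch.softmax((q * 64 ** -0.5) @ k.transpose(-1, -2), dim=-1) @ v
# assert torch.allclose(chunked_attention(q, k, v), ref, atol=1e-5)
```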
Sources: comfy/ldm/modules/sub_quadratic_attention.py1-276
ComfyUI implements intelligent caching to avoid redundant computation: node outputs are cached keyed by their input signatures, so only nodes whose inputs have changed are re-executed on subsequent runs. This enables interactive workflows where tweaking one parameter only re-executes the affected nodes.
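A minimal sketch of the idea follows; it is not the HierarchicalCache implementation. It keys each node's output on a hash of its class, its literal inputs, and the signatures of its upstream nodes, so editing one widget value only invalidates that node and everything downstream of it. The helper names and workflow format are illustrative.

```python
import hashlib
import json

# Illustrative only: cache node outputs by a signature of (node type, literal inputs,
# upstream signatures). The real logic lives in comfy_execution/caching.py.
_cache: dict[str, object] = {}

def node_signature(workflow: dict, node_id: str) -> str:
    node = workflow[node_id]
    parts = [node["class_type"]]
    for name, value in sorted(node["inputs"].items()):
        if isinstance(value, list) and len(value) == 2 and str(value[0]) in workflow:
            parts.append(f"{name}<-{node_signature(workflow, str(value[0]))}")  # link to upstream
        else:
            parts.append(f"{name}={json.dumps(value)}")                          # literal widget value
    return hashlib.sha256("|".join(parts).encode()).hexdigest()

def run_node(workflow: dict, node_id: str, execute_fn):
    sig = node_signature(workflow, node_id)
    if sig not in _cache:                 # only re-execute when the signature changed
        _cache[sig] = execute_fn(workflow[node_id])
    return _cache[sig]
```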
Sources: execution.py README.md318-321
The codebase is organized into several key directories:
```
ComfyUI/
├── comfy/                       # Core library code
│   ├── sd.py                    # Model loading
│   ├── model_patcher.py         # Model patching system
│   ├── model_management.py      # Memory and device management
│   ├── samplers.py              # Sampling algorithms
│   ├── model_base.py            # Base model classes
│   ├── supported_models.py      # Model detection
│   └── ldm/modules/             # Attention implementations
├── comfy_extras/                # Extra node implementations
│   ├── nodes_mask.py            # Mask operations
│   ├── nodes_compositing.py     # Porter-Duff compositing
│   └── nodes_post_processing.py # Image post-processing
├── custom_nodes/                # Third-party extensions
├── web/                         # Frontend (compiled from separate repo)
├── nodes.py                     # Core node definitions
├── server.py                    # HTTP/WebSocket server
├── execution.py                 # Execution engine
├── folder_paths.py              # Model path configuration
└── main.py                      # Entry point
```
Sources: README.md109 pyproject.toml1-25
ComfyUI follows a weekly release cycle with semantic versioning. The current version is tracked in:
- pyproject.toml: version = "0.3.60"
- comfyui_version.py: __version__ = "0.3.60"

The version file is automatically synchronized with pyproject.toml via the GitHub Actions workflow at .github/workflows/update-version.yml1-59.
The release process involves three repositories.
Sources: pyproject.toml1-25 comfyui_version.py1-4 README.md113-128 .github/workflows/update-version.yml1-59
To begin using ComfyUI, follow the Installation and Setup guide and then the Quick Start Guide linked at the top of this page. ComfyUI can be run via the desktop application, the Windows portable package, or a manual install.
Sources: README.md39-52 README.md167-296