comfy-pack: Serving ComfyUI Workflows as APIs

December 17, 2024 • Written By Bo Jiang, Larme Zhao and Sherlock Xu

ComfyUI has rapidly gained popularity in the AI art community, driven by its vibrant ecosystem of community-created resources. Its intuitive interface lets users tap into an extensive collection of shared workflows, custom nodes, and models to easily generate creative content without dealing with complex code. This freedom fuels creativity and simplifies experimentation.

However, once you’ve built a workflow and want to run it in production, new challenges arise. Production deployment requires exposing the workflow through an API interface, but ComfyUI itself isn’t designed to serve as an API-driven inference tool. At a high level, it faces the following limitations:

  • No standard API interface: ComfyUI is built for use with its graphical interface, where inputs and outputs are managed within the UI. There's no straightforward way to expose a workflow as a simple RESTful API.
  • Limited portability: Workflows can't be easily packaged and deployed elsewhere while maintaining consistent behavior. Users must manually manage Python dependencies, download custom nodes, and source specific model versions.
  • No scaling capabilities: ComfyUI doesn’t support dynamic scaling, such as scaling down to zero when idle or scaling up to handle high traffic.

In this blog post, we’ll introduce comfy-pack, a solution that transforms ComfyUI workflows into production-ready APIs.

Challenges in turning ComfyUI workflows into scalable APIs

Before diving into our approach, let's examine the technical difficulties in converting ComfyUI workflows into well-defined, deployable, and scalable APIs.

Input and output definitions are unclear

A ComfyUI workflow is built by connecting various nodes, each potentially requiring different inputs. While this is intuitive for experimentation, it creates ambiguity when exposing workflows as APIs. Key questions include:

  • Which parameters should be user-configurable and exposed as API inputs?
  • Which should remain constant with default values?
  • How should inputs be validated (e.g., for allowed types and ranges)?

Without clear answers to these questions, it’s difficult to design well-defined, production-grade endpoints.

Recreating the workspace is difficult

To ensure a ComfyUI workflow runs smoothly in production, users need to reproduce the exact workspace, which typically involves:

  • Custom nodes: Many workflows depend on third-party custom nodes, often with specific version requirements. Installing them can mean cloning repositories manually from GitHub, which is time-consuming. While ComfyUI Manager can help install custom nodes, its default behavior of fetching the latest versions can break workflows that depend on older ones.
  • Python packages: Some custom nodes require specific Python libraries to function. Without proper version pinning, reproducing an identical environment becomes a game of chance. What works on one system might fail on another due to subtle version differences.
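
For a sense of what proper pinning looks like, here is an illustrative requirements file (the package names and versions below are purely hypothetical; the real list depends on the custom nodes in use):

```
# Illustrative only: the actual packages and versions depend on your workflow.
torch==2.1.2
transformers==4.36.2
opencv-python==4.9.0.80
```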

Models are hard to track and source

AI models in a ComfyUI workflow might come from various repositories like Hugging Face or Civitai. Even if you know a model name, identifying the exact version used by the workflow can be time-consuming and error-prone. This makes it difficult to ensure anyone running the workflow will achieve identical, reproducible results.

comfy-pack: Your pathway to production-grade APIs

To address these limitations, we developed comfy-pack and contributed it to the ComfyUI community. As the creators of BentoML, the unified model serving framework, we bring years of experience in building tools for scalable and reliable AI deployment. With comfy-pack, we aim to provide a streamlined process for defining, packaging, and deploying ComfyUI workflows as robust, production-ready APIs.

Clear input and output declaration

comfy-pack introduces dedicated input and output nodes that make it easy to define which parameters are user-configurable and how outputs are structured. With these nodes, users can:

  • Explicitly declare inputs like prompts, images, dimensions, or seeds
  • Add constraints and automatic type validation for inputs, ensuring API requests are clear and reliable
  • Define exactly what the API returns and preview the output

This clear interface definition makes it simple for others to understand and interact with your API endpoint.
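
Conceptually, the inputs you declare map to a typed request schema. Here is a minimal sketch of that idea in Python using Pydantic; the field names and constraints are hypothetical examples, not comfy-pack's actual generated schema:

```python
# Conceptual sketch only: comfy-pack derives the real schema from your nodes.
# The fields (prompt, width, height, seed) are hypothetical examples of
# inputs you might declare with comfy-pack's input nodes.
from pydantic import BaseModel, Field

class WorkflowInput(BaseModel):
    prompt: str = Field(..., description="Text prompt for generation")
    width: int = Field(1024, ge=64, le=2048)   # validated against a fixed range
    height: int = Field(1024, ge=64, le=2048)
    seed: int = Field(0, ge=0)                 # default keeps requests reproducible
```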

Locked versions for all components

comfy-pack ensures consistency by locking the version of every component in the workflow: custom nodes (pinned to their exact Git commit hashes), Python packages, and ComfyUI itself. With one click on the Serve button, comfy-pack generates an endpoint with OpenAPI documentation. This API can be called using standard tools like curl or BentoML clients, making integration with other applications straightforward.
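
For example, a workflow served locally might be called like this (the port, route name, and request fields below are assumptions for illustration; the generated OpenAPI documentation shows the actual schema):

```bash
# Hypothetical request: the route and field names depend on your declared inputs.
curl -X POST http://localhost:3000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a watercolor fox", "width": 1024, "height": 1024, "seed": 42}'
```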

Model hash matching

comfy-pack solves model tracking through hash-based verification:

  • Automatically computes and records model file hashes
  • Generates download URLs (e.g., from Hugging Face or Civitai)
  • Supports automatic retrieval of models during deployment, with no manual searching or downloading
  • Keeps workflows consistent and reproducible by ensuring the exact same model files are used
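
To illustrate the principle, here is a minimal sketch of hash-based verification (this is not comfy-pack's internal implementation, and it assumes SHA-256 as the hash function):

```python
# Minimal sketch of hash-based model verification; comfy-pack automates this.
import hashlib

def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 digest of a file, reading in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

# The hash is recorded at packaging time; at deployment time, the downloaded
# file is checked against it. The expected value below is a placeholder.
expected = "<recorded-hash-placeholder>"
if file_sha256("downloaded.safetensors") != expected:
    raise ValueError("model file does not match the recorded hash")
```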

From local prototyping to scalable cloud APIs

comfy-pack makes cloud deployment seamless. With just one click, you can deploy your ComfyUI workflow to BentoCloud, our AI inference platform designed to build fast and scalable AI systems.

Let's explore how this process works end-to-end.

  1. When you click the Deploy button, comfy-pack locks the versions of all workflow components (e.g., custom nodes, Python packages, and model files) to ensure consistent and reproducible behavior.

  2. comfy-pack bundles the locked components into a deployable artifact called a Bento. This portable package contains everything needed to recreate the ComfyUI workspace.

  3. Once the Bento is created, comfy-pack deploys it to BentoCloud. During deployment, the data in the Bento is used to reproduce the original ComfyUI workspace:

    • Custom nodes: Pull specific versions from GitHub repositories using commit hashes.
    • Python packages: Install dependencies with pinned versions.
    • Models: Retrieve models using their recorded URLs and validate them against the stored hashes. If a required model is already stored on BentoCloud, it’s pulled directly from there, thus saving download time.

Once deployed, your ComfyUI workflow benefits from BentoCloud's production-ready features, such as:

  • High-performance inference with easy access to various cloud GPUs like T4, L4, and A100
  • Automatic scaling based on traffic with blazing-fast cold starts
  • Comprehensive monitoring through built-in observability dashboards
  • Dedicated AI deployment in your own VPC

This end-to-end process allows your ComfyUI workflow to transition seamlessly from local prototyping to robust, scalable cloud APIs without any guesswork.
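
For example, calling the deployed workflow from Python might look like this (the deployment URL and the endpoint name `generate` are placeholders; they depend on your deployment and declared inputs):

```python
# Hypothetical client call using BentoML's HTTP client.
# The URL and endpoint name are placeholders for your actual deployment.
import bentoml

with bentoml.SyncHTTPClient("https://my-comfy-workflow.example.bentoml.ai") as client:
    result = client.generate(
        prompt="a watercolor fox",
        width=1024,
        height=1024,
        seed=42,
    )
```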

Getting started

The simplest way to install comfy-pack is through ComfyUI Manager. Simply search for the comfy-pack custom node and install it. After installation, you'll notice new buttons in your ComfyUI dashboard.
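
If you prefer a manual setup, the usual pattern for installing ComfyUI custom nodes also works (the repository URL below is an assumption; verify it on the project page):

```bash
# Manual alternative: clone into ComfyUI's custom_nodes directory,
# then restart ComfyUI to load the new nodes.
cd ComfyUI/custom_nodes
git clone https://github.com/bentoml/comfy-pack.git
```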

To learn more and get support: