
ComfyUI Extension Unifies Midjourney, GPT Image, Gemini

💡 A new open-source ComfyUI node extension lets creators access all 3 major commercial AI image models from a single unified workflow.

One Extension Now Connects ComfyUI to 3 Major AI Image Services

A new open-source ComfyUI custom node project called ComfyUI-MidjourneyHub aims to solve one of the biggest pain points in AI-assisted image creation: the fragmented experience of working with multiple commercial image generation models. The extension provides a single, unified interface to access Midjourney, OpenAI GPT Image, and Google Gemini image generation capabilities — all from within ComfyUI's popular node-based workflow environment.

For AI artists and developers who rely on ComfyUI as their primary creative pipeline, this means no more hunting for separate node packages, configuring multiple API connections, or switching between disconnected tools. One installation, one API configuration, and all 3 major commercial image generation services become available as drag-and-drop nodes.

Key Takeaways at a Glance

  • Unified access to Midjourney, OpenAI GPT Image, and Google Gemini image generation within ComfyUI
  • Supports the latest gpt-image-2-all model from OpenAI and Gemini 3 Pro Image Preview from Google
  • Midjourney features include text-to-image, upscale, variation, batch operations, and image blending
  • All nodes organized under a single MidjourneyHub category for streamlined workflow building
  • API access managed through a single proxy service, eliminating multi-key configuration headaches
  • Fully open-source, maintained by an individual developer in the ComfyUI community

Why Fragmented Tooling Is a Real Problem for AI Creators

The AI image generation landscape in 2025 is more powerful than ever, but also more fragmented. Midjourney remains the go-to for stylized, artistic output. OpenAI's GPT Image (the successor to its DALL-E line) excels at instruction-following and text rendering. Google Gemini's image capabilities bring multimodal reasoning to the table.

Professional creators and studios increasingly need access to all of these services. Different projects, different clients, and different creative goals demand different models.

The problem is that each service has its own API structure, authentication method, rate limiting approach, and output format. Within ComfyUI — which has become the de facto standard for advanced AI image workflows — users previously had to install separate custom node packages for each service. Each package came from a different developer, with different update cycles, different configuration patterns, and different levels of maintenance.

This created what many in the community describe as a 'tooling tax' — the overhead of managing multiple integrations that should, ideally, just work together. ComfyUI-MidjourneyHub directly addresses this friction.

What the Extension Actually Offers

The project organizes its functionality into 3 clear node families, each mapped to a major commercial service. Here is what is currently supported:

Midjourney Nodes:
- Imagine — standard text-to-image generation using Midjourney's latest models
- Upscale — high-resolution enhancement of generated images
- Variation — create style and composition variations from existing outputs
- Batch — process multiple prompts or generations in a single workflow run
- Blend — merge multiple source images into cohesive compositions

OpenAI GPT Image Nodes:
- Generate — text-to-image creation, now supporting the newest gpt-image-2-all model
- Edit — instruction-based editing of existing images using natural language prompts

Google Gemini Nodes:
- Generate — text-to-image via Gemini 3 Pro Image Preview
- Edit — image editing leveraging Gemini's multimodal understanding

All nodes appear under a single 'MidjourneyHub' category in ComfyUI's node browser. This is a small but significant UX decision — it means users do not need to remember which package provides which node, or search across multiple category trees.
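In ComfyUI, a node's place in the browser comes from a CATEGORY attribute on its class, and nodes are discovered through a module-level NODE_CLASS_MAPPINGS dictionary. The following is a minimal sketch of how several nodes could share one category. The class names, inputs, and stubbed behavior are hypothetical and are not taken from the project's source.

```python
# Sketch of ComfyUI custom nodes sharing one category.
# Class names and inputs are hypothetical, not the project's actual code.


class HubImagineSketch:
    CATEGORY = "MidjourneyHub"   # menu group shown in the node browser
    RETURN_TYPES = ("STRING",)   # output socket types
    FUNCTION = "run"             # name of the method ComfyUI calls

    @classmethod
    def INPUT_TYPES(cls):
        # One required text input named "prompt".
        return {"required": {"prompt": ("STRING", {"multiline": True, "default": ""})}}

    def run(self, prompt):
        # A real node would call the remote service; this stub only echoes.
        return (f"imagine: {prompt}",)


class HubGeminiGenerateSketch(HubImagineSketch):
    # Inherits CATEGORY, so it lands in the same menu group.
    def run(self, prompt):
        return (f"gemini: {prompt}",)


# ComfyUI reads these module-level mappings when loading a custom node package.
NODE_CLASS_MAPPINGS = {
    "HubImagineSketch": HubImagineSketch,
    "HubGeminiGenerateSketch": HubGeminiGenerateSketch,
}
NODE_DISPLAY_NAME_MAPPINGS = {
    "HubImagineSketch": "Imagine (sketch)",
    "HubGeminiGenerateSketch": "Gemini Generate (sketch)",
}
```

Because both classes carry the same CATEGORY string, they would be listed together regardless of which service they talk to.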

How the API Proxy Architecture Works

One of the more interesting technical decisions in ComfyUI-MidjourneyHub is its use of a unified API proxy layer. Rather than requiring users to independently configure API keys for Midjourney (which does not even offer a public API in the traditional sense), OpenAI, and Google, the extension routes all requests through a single proxy service.

This approach offers several practical advantages. First, it dramatically simplifies setup. Users configure one API endpoint and one set of credentials, and all 3 services become accessible. Second, it abstracts away the significant differences in how each service handles authentication, request formatting, and response parsing.

For Midjourney in particular, this is notable. Unlike OpenAI and Google, Midjourney has historically not provided a straightforward REST API for third-party integrations. Accessing it programmatically has typically required workarounds involving Discord bot interactions or unofficial reverse-engineered endpoints. The proxy approach sidesteps this complexity entirely from the end user's perspective.
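To make the single-endpoint idea concrete, here is a rough sketch of a client that builds requests for all three services from one base URL and one key. The proxy URL, route paths, and payload shape are assumptions made up for illustration; the actual proxy used by the extension may look quite different. The code only constructs requests and does not send them.

```python
import json
import urllib.request

# Hypothetical values: the real proxy's URL, routes, and payload fields may differ.
PROXY_BASE_URL = "https://proxy.example.com"
ROUTES = {
    "midjourney": "/mj/imagine",
    "gpt_image": "/openai/images/generate",
    "gemini": "/gemini/images/generate",
}


def build_request(service: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for one service via the shared proxy."""
    if service not in ROUTES:
        raise ValueError(f"unknown service: {service}")
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        PROXY_BASE_URL + ROUTES[service],
        data=body,
        headers={
            # The same credential is reused for every service.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    for name in ROUTES:
        req = build_request(name, "a lighthouse at dusk", api_key="demo-key")
        print(req.get_method(), req.full_url)
```

The point of the design is that only the route changes per service, while the base URL and credential stay constant.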

The tradeoff, of course, is that users become dependent on the proxy service's availability and pricing. This is a common pattern in the ComfyUI ecosystem — many popular node packages for commercial services use similar intermediary layers — but it is worth understanding before committing to a production workflow.

How This Fits Into the Broader ComfyUI Ecosystem

ComfyUI has experienced explosive growth over the past 18 months, establishing itself as the preferred workflow tool for serious AI image practitioners. Unlike simpler interfaces such as Automatic1111 or Fooocus, ComfyUI's node-based approach allows for complex, reproducible, and shareable pipelines that can mix local models (like Stable Diffusion XL, Flux, or SD3) with cloud-based commercial services.

The trend toward 'hybrid workflows' — combining local open-source models for certain tasks with commercial APIs for others — has been accelerating throughout 2025. A creator might use a local Flux model for initial concept exploration, then route the best candidates through Midjourney for stylistic refinement, and finally use GPT Image for text overlay and fine editing.

ComfyUI-MidjourneyHub fits squarely into this trend. By lowering the barrier to incorporating commercial models, it makes hybrid workflows more accessible to a broader range of users — not just those comfortable writing custom API integration code.

Compared to alternative approaches like n8n or LangChain for orchestrating multi-model pipelines, ComfyUI's visual node graph remains more intuitive for image-focused work. Extensions like MidjourneyHub strengthen this advantage by expanding the range of services available within the visual environment.

Practical Implications for Different User Groups

For individual AI artists and hobbyists, the extension reduces setup time and cognitive overhead. Instead of maintaining three separate integrations, they manage one. This is especially valuable for users who want to experiment across models to find the best fit for a given creative brief.

For studios and production teams, the unified approach simplifies onboarding and standardization. A team can share ComfyUI workflow files that reference Midjourney, GPT Image, and Gemini nodes, and any team member who has the extension installed and the proxy credentials configured can run them without further setup.
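A team could check a shared workflow against a machine's installed nodes before running it. The sketch below assumes the workflow was exported in ComfyUI's API format, where each entry is keyed by node id and carries a class_type field. The helper names and the HubImagineSketch node type are hypothetical; SaveImage is one of ComfyUI's built-in nodes.

```python
import json


def required_node_types(workflow_json: str) -> set:
    """Collect the class_type of every node in an API-format workflow."""
    workflow = json.loads(workflow_json)
    return {
        node["class_type"]
        for node in workflow.values()
        if isinstance(node, dict) and "class_type" in node
    }


def missing_node_types(workflow_json: str, installed) -> set:
    """Return node types the workflow needs that are not in `installed`."""
    return required_node_types(workflow_json) - set(installed)


# Tiny example workflow; "HubImagineSketch" is a made-up node type.
EXAMPLE_WORKFLOW = json.dumps({
    "1": {"class_type": "HubImagineSketch", "inputs": {"prompt": "a red fox"}},
    "2": {"class_type": "SaveImage", "inputs": {"images": ["1", 0]}},
})

if __name__ == "__main__":
    print(sorted(missing_node_types(EXAMPLE_WORKFLOW, installed=["SaveImage"])))
```

Workflows saved from the ComfyUI interface use a different layout, so a real check would need to handle that format as well.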

For developers building AI-powered applications, ComfyUI-MidjourneyHub serves as a rapid prototyping tool. Testing how different commercial models handle the same prompt — and comparing outputs side by side within a single workflow — accelerates model selection decisions.
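The same-prompt comparison described above can be sketched outside ComfyUI as a simple fan-out. The backends here are stub functions standing in for real service calls, and every name is hypothetical; inside ComfyUI the equivalent would be one prompt node wired into several generation nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def compare_models(prompt: str, backends: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Send one prompt to every backend concurrently and collect results by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(backends))) as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in backends.items()}
        # Results are gathered in the same order the backends were given.
        return {name: future.result() for name, future in futures.items()}


# Stub backends: each returns a label instead of calling a real image service.
STUB_BACKENDS = {
    "midjourney": lambda p: f"[mj] {p}",
    "gpt_image": lambda p: f"[gpt] {p}",
    "gemini": lambda p: f"[gemini] {p}",
}

if __name__ == "__main__":
    for name, output in compare_models("a koi pond at dawn", STUB_BACKENDS).items():
        print(name, "->", output)
```

Keeping the prompt fixed and varying only the backend is what makes the outputs comparable.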

Key practical benefits include:

  • Reduced context switching between different tools and interfaces
  • Easier A/B testing across commercial models using identical prompts
  • Workflow portability — share complete multi-model pipelines as single .json files
  • Simplified credential management with a single API proxy configuration
  • Faster iteration cycles thanks to async processing and batch operation support

Considerations and Limitations

As with any community-maintained open-source project, there are important caveats to consider. The extension relies on a third-party API proxy, which introduces a dependency on that service's uptime, pricing, and terms of service. Users should evaluate whether this fits their reliability and compliance requirements, especially for commercial production work.

Additionally, while the extension surfaces the core functionality of each service, it may not expose every advanced parameter or feature that the native APIs offer. Power users who need granular control over specific Midjourney parameters or OpenAI image generation settings may find the unified abstraction layer occasionally limiting.

Finally, the project is maintained by an individual developer rather than a large team or company. While this is common in the ComfyUI custom node ecosystem, it does mean that update frequency and long-term maintenance depend on that individual's continued involvement.

Looking Ahead: The Multi-Model Future of AI Image Creation

ComfyUI-MidjourneyHub reflects a broader industry trajectory: the future of AI image creation is not about choosing one model, but about orchestrating many. As commercial image generation services continue to differentiate (Midjourney on aesthetics, GPT Image on instruction-following, Gemini on multimodal reasoning), the tools that help creators move fluidly between them will become increasingly valuable.

The open-source community around ComfyUI continues to drive innovation at a remarkable pace. Extensions like MidjourneyHub demonstrate how community developers are solving real workflow problems that the commercial platforms themselves have little incentive to address. After all, OpenAI, Google, and Midjourney each want users locked into their own ecosystems.

For creators and developers looking to stay model-agnostic and workflow-flexible, projects like this represent an important piece of the puzzle. The AI image generation space is moving fast, and the ability to adopt new models without rebuilding entire workflows from scratch is becoming a genuine competitive advantage.

The extension is available now on GitHub and can be installed through ComfyUI's standard custom node installation process. Users need only configure their API proxy credentials once to begin building workflows that span all 3 commercial image generation services.