The GeminiImage node generates text and image responses from Google’s Gemini AI models. It allows you to provide multimodal inputs including text prompts, images, and files to create coherent text and image outputs. The node handles all API communication and response parsing with the latest Gemini models.

Inputs

| Parameter | Description | Data Type | Input Type | Default | Range |
|---|---|---|---|---|---|
| prompt | Text prompt for generation | STRING | required | "" | - |
| model | The Gemini model to use for generating responses | COMBO | required | gemini_2_5_flash_image_preview | Available Gemini models (options extracted from the GeminiImageModel enum) |
| seed | When seed is fixed to a specific value, the model makes a best effort to provide the same response for repeated requests. Deterministic output isn’t guaranteed. Also, changing the model or parameter settings, such as the temperature, can cause variations in the response even when you use the same seed value. By default, a random seed value is used. | INT | required | 42 | 0 to 18446744073709551615 |
| images | Optional image(s) to use as context for the model. To include multiple images, you can use the Batch Images node. | IMAGE | optional | None | - |
| files | Optional file(s) to use as context for the model. Accepts inputs from the Gemini Generate Content Input Files node. | GEMINI_INPUT_FILES | optional | None | - |
Note: The node includes hidden parameters (auth_token, comfy_api_key, unique_id) that are automatically handled by the system and do not require user input.
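
As an illustration of the documented defaults and the seed range, the sketch below checks a set of required inputs before they are handed to the node. It is not part of ComfyUI or the Gemini API: `GeminiImageInputs`, `SEED_MAX`, and `random_seed` are hypothetical names introduced here, and the checks only mirror the Inputs table above.

```python
from dataclasses import dataclass
import random

# Upper bound of the documented seed range (the largest unsigned 64-bit integer).
SEED_MAX = 18446744073709551615  # equal to 2**64 - 1


@dataclass
class GeminiImageInputs:
    """Hypothetical container mirroring the node's documented required inputs."""

    prompt: str = ""                                # documented default: ""
    model: str = "gemini_2_5_flash_image_preview"   # documented default
    seed: int = 42                                  # documented default

    def __post_init__(self):
        # Reject values that fall outside the documented types and range.
        if not isinstance(self.prompt, str):
            raise TypeError("prompt must be a STRING")
        if isinstance(self.seed, bool) or not isinstance(self.seed, int):
            raise TypeError("seed must be an INT")
        if not 0 <= self.seed <= SEED_MAX:
            raise ValueError(f"seed must be between 0 and {SEED_MAX}")


def random_seed(rng=None):
    """Draw a seed uniformly from the documented range (illustrative helper)."""
    if rng is None:
        rng = random.Random()
    return rng.randint(0, SEED_MAX)


# Fixing the seed asks the model for a best-effort repeatable response;
# per the description above, identical output is still not guaranteed.
inputs = GeminiImageInputs(prompt="A watercolor fox in a snowy forest", seed=42)
print(inputs)
```

Validating the seed up front is a design convenience only: it surfaces an out-of-range value locally instead of relying on the node or the remote API to reject it.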

Outputs

| Output Name | Description | Data Type |
|---|---|---|
| IMAGE | The generated image response from the Gemini model | IMAGE |
| STRING | The generated text response from the Gemini model | STRING |
This documentation was AI-generated.