Overview
Whisk AI is an experimental generative AI tool from Google Labs that lets you create images by prompting with images instead of text. Instead of writing long, detailed text prompts, you drag and drop three images: one for the subject (what you want to create), one for the scene (where it should appear), and one for the style (the artistic approach). Whisk uses Gemini to automatically write detailed captions of your images, which are then fed into Imagen 3 to generate the final image. Because generation works from captions rather than pixels, the result captures the essence of your subject, not an exact replica, which makes it easy to remix subjects, scenes, and styles in novel ways. It's like a creative remixer for pictures, suited to creating digital plushies, enamel pins, stickers, concept art, and storyboards without needing prompt engineering skills.
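The caption-then-generate flow described above can be sketched in a few lines. This is a minimal illustration only, not Whisk's actual implementation or API: the names `WhiskInputs`, `compose_prompt`, and `fake_caption` are hypothetical, the prompt template is an assumption, and the placeholder captioner stands in for the Gemini captioning step. The final image generation step (Imagen 3 in Whisk) is left out.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in type: a captioner maps an image path to a text caption.
# In Whisk this role is played by Gemini; here it is any plain function.
Captioner = Callable[[str], str]


@dataclass
class WhiskInputs:
    subject: str  # path to the subject image (what to create)
    scene: str    # path to the scene image (where it appears)
    style: str    # path to the style image (artistic approach)


def compose_prompt(inputs: WhiskInputs, caption: Captioner) -> str:
    """Caption each image and merge the captions into one text prompt.

    The template below is an assumption made for illustration.
    """
    subject = caption(inputs.subject)
    scene = caption(inputs.scene)
    style = caption(inputs.style)
    return f"{subject}, placed in {scene}, rendered in the style of {style}"


def fake_caption(path: str) -> str:
    """Placeholder captioner: derives a caption from the file name only."""
    name = path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    return name.replace("_", " ")


prompt = compose_prompt(
    WhiskInputs("plush_dinosaur.png", "snowy_forest.jpg", "enamel_pin.png"),
    fake_caption,
)
print(prompt)
```

Since only the text prompt reaches the image model in this sketch, details missing from the captions cannot appear in the output, which is consistent with the result capturing the essence of a subject rather than an exact copy.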
Key Features
Three-Input Visual Blending: Combine subject, scene, and style images into a single new image, with each input steering one aspect of the result
Image-to-Image Generation: Create images by supplying images instead of text prompts, with no need to write detailed descriptions
Automatic Caption Writing: Gemini automatically analyzes your images and writes detailed captions, then Imagen 3 generates the final output
Creative Remixing: Easily remix subjects, scenes, and styles to create unique content like digital plushies, pins, stickers, and concept art
Use Cases
Concept Art & Storyboards: Independent creators and designers generate concept art, storyboard illustrations, and visual ideas without prompt engineering expertise
Product Design & Merchandise: Create designs for digital plushies, enamel pins, stickers, and custom merchandise, guided by reference images rather than written descriptions
Visual Storytelling: Artists and hobbyists explore ideas and create new visual content by mixing different elements for creative projects