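Reported by 机器之心
机器之心 Editorial Team
Over the past few days, users have been finding all sorts of new ways to play with Google's Nano Banana.
For example, creating polished, cute product photos: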
https://x.com/azed_ai/status/1962878353784066342
Merging 13 images into a single image:
https://x.com/MrDavids1/status/1960783672665128970
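Changing a person's outfit in one click:
Whatever idea you can think of, and plenty you can't, users have already dug it up.
But don't forget that these results do not come out of nowhere. The real magic behind them is the prompt. It is with one clever prompt after another that users have pushed this model in so many directions.
Just now, Google officially released six text-to-image prompts for Nano Banana: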
Original post: https://x.com/googleaistudio/status/1962957615262224511
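With these prompts, you can do the following:
Text-to-image: generate high-quality images from simple or complex text descriptions.
Image + text-to-image (image editing): provide an image and use a text prompt to add, remove, or modify elements, or to adjust style or color.
Multi-image composition and style transfer: supply several images to compose a new scene, or transfer the style of one image onto another.
Iterative refinement: refine an image step by step through conversation, making a small adjustment each turn until you reach the result you want.
Text rendering: generate images containing clear, well-laid-out text, suitable for logos, diagrams, posters, and other visual work.
Google stresses that these prompts get the most out of Nano Banana's image generation capabilities.
Next, let's look at what these prompts actually contain:
1. Photorealistic scenes
For photorealistic images, think like a photographer. Mention the camera angle, lens type, lighting, and fine details in the prompt to steer the model toward a more realistic result.
The template is as follows: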
A photorealistic [shot type] of [subject], [action or expression], set in [environment]. The scene is illuminated by [lighting description], creating a [mood] atmosphere. Captured with a [camera/lens details], emphasizing [key textures and details]. The image should be in a [aspect ratio] format.
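As a small illustration of how a bracketed template like the one above might be filled in programmatically, here is a minimal sketch. The `fill_template` helper and all example values are hypothetical and are not part of Google's post; the helper simply replaces each `[placeholder]` with a supplied value and raises an error if one is missing.

```python
import re

PHOTOREAL_TEMPLATE = (
    "A photorealistic [shot type] of [subject], [action or expression], "
    "set in [environment]. The scene is illuminated by [lighting description], "
    "creating a [mood] atmosphere. Captured with a [camera/lens details], "
    "emphasizing [key textures and details]. "
    "The image should be in a [aspect ratio] format."
)


def fill_template(template: str, values: dict) -> str:
    """Replace every [placeholder] with its value; fail loudly if one is missing."""

    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder [{key}]")
        return values[key]

    # A placeholder is any run of non-bracket characters between [ and ].
    return re.sub(r"\[([^\[\]]+)\]", substitute, template)


# Hypothetical example values, chosen only for illustration.
prompt = fill_template(PHOTOREAL_TEMPLATE, {
    "shot type": "close-up portrait",
    "subject": "an elderly potter",
    "action or expression": "smiling as he inspects a glazed bowl",
    "environment": "a sunlit workshop",
    "lighting description": "soft golden-hour light from a window",
    "mood": "serene",
    "camera/lens details": "50mm prime lens",
    "key textures and details": "the clay dust on his hands",
    "aspect ratio": "vertical portrait",
})
print(prompt)
```

Failing on a missing placeholder is a deliberate choice here: a prompt that still contains a literal `[mood]` would otherwise be sent to the model unnoticed.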
The code block is as follows:
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO

client = genai.Client()

# Generate an image from a text prompt
response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents="A kawaii-style sticker of a happy red panda wearing a tiny bamboo hat. It's munching on a green bamboo leaf. The design features bold, clean outlines, simple cel-shading, and a vibrant color palette. The background must be white.",
)

# Keep only the parts of the response that carry image bytes
image_parts = [
    part.inline_data.data
    for part in response.candidates[0].content.parts
    if part.inline_data
]

# Save and display the first returned image, if any
if image_parts:
    image = Image.open(BytesIO(image_parts[0]))
    image.save('red_panda_sticker.png')
    image.show()
The full prompt used for the image below is: "A kawaii-style sticker of a happy red panda wearing a tiny bamboo hat. It's munching on a green bamboo leaf. The design features bold, clean outlines, simple cel-shading, and a vibrant color palette. The background must be white."
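The article's code keeps only the image parts of the response. As a hedged sketch, assuming that parts may also carry text in a `text` attribute (an assumption about the response shape, alongside the `inline_data.data` field used in the code above), a helper could separate the two. The `split_parts` name is hypothetical, and `fake` is a stand-in object built for illustration, not real API output.

```python
from types import SimpleNamespace


def split_parts(response):
    """Return (texts, images) collected from the first candidate's parts."""
    texts, images = [], []
    for part in response.candidates[0].content.parts:
        if getattr(part, "text", None):
            texts.append(part.text)
        elif getattr(part, "inline_data", None):
            images.append(part.inline_data.data)
    return texts, images


# Stand-in response mimicking the attribute layout used in the article's code.
fake = SimpleNamespace(candidates=[SimpleNamespace(content=SimpleNamespace(parts=[
    SimpleNamespace(text="Here is your sticker.", inline_data=None),
    SimpleNamespace(text=None, inline_data=SimpleNamespace(data=b"\x89PNG...")),
]))])

texts, images = split_parts(fake)
print(len(texts), len(images))
```

Using `getattr` with a default keeps the helper tolerant of parts that lack one of the two attributes.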
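3. Accurate text in images
Gemini is good at rendering text. For this kind of task, the prompt should state explicitly the text content, the font style (described in words), and the overall design.
The template is as follows: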
Create a [image type] for [brand/concept] with the text "[text to render]" in a [font style]. The design should be [style description], with a [color scheme].
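One possible way to keep the pieces of this template explicit is a small dataclass, sketched below. The `TextImageSpec` class, its field names, and the example values are all hypothetical; only the sentence structure comes from the template above. Rejecting double quotes in the rendered text is an assumption made here because the template already wraps that text in double quotes.

```python
from dataclasses import dataclass


@dataclass
class TextImageSpec:
    image_type: str
    brand: str
    text: str
    font_style: str
    style: str
    color_scheme: str

    def to_prompt(self) -> str:
        # The text is delimited by double quotes in the prompt,
        # so an embedded double quote would make it ambiguous.
        if '"' in self.text:
            raise ValueError("text to render must not contain double quotes")
        return (
            f'Create a {self.image_type} for {self.brand} with the text '
            f'"{self.text}" in a {self.font_style}. '
            f'The design should be {self.style}, with a {self.color_scheme}.'
        )


# Hypothetical example values, chosen only for illustration.
spec = TextImageSpec(
    image_type="logo",
    brand="a hypothetical coffee shop",
    text="The Daily Grind",
    font_style="clean, bold sans-serif font",
    style="modern and minimalist",
    color_scheme="black-and-white color scheme",
)
prompt = spec.to_prompt()
print(prompt)
```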
The code block is as follows:
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO

client = genai.Client()

# Generate an image from a text prompt
response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents="A high-resolution, studio-lit product photograph of a minimalist ceramic coffee mug in matte black, presented on a polished concrete surface. The lighting is a three-point softbox setup designed to create soft, diffused highlights and eliminate harsh shadows. The camera angle is a slightly elevated 45-degree shot to showcase its clean lines. Ultra-realistic, with sharp focus on the steam rising from the coffee. Square image.",
)

# Keep only the parts of the response that carry image bytes
image_parts = [
    part.inline_data.data
    for part in response.candidates[0].content.parts
    if part.inline_data
]

# Save and display the first returned image, if any
if image_parts:
    image = Image.open(BytesIO(image_parts[0]))
    image.save('product_mockup.png')
    image.show()
The full prompt used for the image below is: "A high-resolution, studio-lit product photograph of a minimalist ceramic coffee mug in matte black, presented on a polished concrete surface. The lighting is a three-point softbox setup designed to create soft, diffused highlights and eliminate harsh shadows. The camera angle is a slightly elevated 45-degree shot to showcase its clean lines. Ultra-realistic, with sharp focus on the steam rising from the coffee. Square image."
5. Minimalist and negative-space design
Well suited to creating backgrounds for websites, presentations, or marketing material, with text overlaid on top.
The template is as follows:
A minimalist composition featuring a single [subject] positioned in the [bottom-right/top-left/etc.] of the frame. The background is a vast, empty [color] canvas, creating significant negative space. Soft, subtle lighting. [Aspect ratio].
The code block is as follows:
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO

client = genai.Client()

# Generate an image from a text prompt.
# The prompt contains double quotes, so the string is delimited with single quotes.
response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents='A single comic book panel in a gritty, noir art style with high-contrast black and white inks. In the foreground, a detective in a trench coat stands under a flickering streetlamp, rain soaking his shoulders. In the background, the neon sign of a desolate bar reflects in a puddle. A caption box at the top reads "The city was a tough place to keep secrets." The lighting is harsh, creating a dramatic, somber mood. Landscape.',
)

# Keep only the parts of the response that carry image bytes
image_parts = [
    part.inline_data.data
    for part in response.candidates[0].content.parts
    if part.inline_data
]

# Save and display the first returned image, if any
if image_parts:
    image = Image.open(BytesIO(image_parts[0]))
    image.save('comic_panel.png')
    image.show()
The full prompt used for the image below is: 'A single comic book panel in a gritty, noir art style with high-contrast black and white inks. In the foreground, a detective in a trench coat stands under a flickering streetlamp, rain soaking his shoulders. In the background, the neon sign of a desolate bar reflects in a puddle. A caption box at the top reads "The city was a tough place to keep secrets." The lighting is harsh, creating a dramatic, somber mood. Landscape.'
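Work through this set of prompt templates and you will have more or less grasped the essentials of using Nano Banana.
Users do run into other problems, though. One example: "When editing an existing image, the model often returns an identical image."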
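One possible guard against the "identical image" problem, sketched under stated assumptions: compare the returned bytes with the input and retry a bounded number of times. The `edit_until_changed` function and the `fake_edit` stand-in are hypothetical and do not call any real API. The comparison is byte-level only, so an image that was re-encoded but looks the same would not be caught.

```python
from typing import Callable, Optional


def edit_until_changed(
    edit_fn: Callable[[bytes, str], bytes],
    original: bytes,
    prompt: str,
    max_attempts: int = 3,
) -> Optional[bytes]:
    """Call edit_fn until its output differs from the input, or give up."""
    for _ in range(max_attempts):
        edited = edit_fn(original, prompt)
        if edited != original:  # byte-level comparison only
            return edited
    return None


# Hypothetical stand-in for a real editing call: it echoes the input once,
# then returns different bytes on the second attempt.
_responses = iter([b"original-bytes", b"edited-bytes"])


def fake_edit(image: bytes, prompt: str) -> bytes:
    return next(_responses)


result = edit_until_changed(fake_edit, b"original-bytes", "Make the hat blue")
print(result)
```

Returning `None` after the last attempt leaves the decision to the caller, for instance rephrasing the prompt or reporting the failure.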
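Others pointed out further problems with Nano Banana's editing: "It is less consistent than Qwen and Kontext Pro, and also less stable, especially over an ongoing conversation. For text-to-image, using Imagen directly is better and more controllable."
What tips and tricks of your own have you picked up while using Nano Banana? Share them in the comments.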