Gemini will soon generate AI images of people again with the upgraded Imagen 3
The chatbot’s Gems, user-created chatbots with custom instructions, also begin rolling out this week.
Google’s generative AI tools are getting some of the boosts the company previewed at Google I/O. Starting this week, the company is rolling out the next-gen version of its Imagen image generator, which reintroduces the ability to generate AI people (after an embarrassing controversy earlier this year). Google’s Gemini chatbot also adds Gems, the company’s take on bots with custom instructions, similar to ChatGPT’s custom GPTs.
Google’s Imagen 3 is the upgraded version of its image generator, coming to Gemini. The company says the next-gen AI model “sets a new standard for image quality” and is built with guardrails to avoid overcorrecting for diversity, like the bizarre historical AI images that went viral early this year.
“Across a wide range of benchmarks, Imagen 3 performs favorably compared to other image generation models available,” Gemini Product Manager Dave Citron wrote in a press release. The tool allows you to guide the image generation with additional prompts if you don’t like what it spits out the first time.
Imagen 3 also includes Google’s SynthID tool to watermark images, making it clear that they’re AI-made and not the genuine article.
Citron says the ability to generate people will return in the coming days for paid users, months after Google yanked the feature. He says new guardrails will prevent the generation of “photorealistic, identifiable individuals,” a far cry from the problematic deepfakes generated by Elon Musk’s Grok. Also off-limits are children and (as with other image generators) any gory, violent or sexual scenes. The product manager tempers expectations by saying Gemini’s images won’t be perfect, but he promises the company will continue to listen to user feedback and refine the tool accordingly.
Starting this week, the Imagen 3 model will be available to all users, but image generation featuring people will return to paid users first. English-speaking Gemini Advanced, Business and Enterprise users can expect human image generation to return “over the coming days.”
Initially previewed at Google I/O 2024, Gems are Google’s custom chatbots with user-created instructions. They’re essentially Gemini’s answer to OpenAI’s GPTs, which Google’s competitor rolled out late last year. Gems begin rolling out in the next few days.
“With Gems, you can create a team of experts to help you think through a challenging project, brainstorm ideas for an upcoming event, or write the perfect caption for a social media post,” Citron wrote. “Your Gem can also remember a detailed set of instructions to help you save time on tedious, repetitive or difficult tasks.”
In addition to the blank slate of custom Gems, Gemini will include premade ones “to help you get started” and inspire new ideas. Prebuilt Gems include:
Learning coach - to help you understand complex topics
Brainstormer - to inspire new ideas
Career guide - to walk you through skill upgrades, decisions and goals
Writing editor - to provide constructive feedback on grammar, tone and structure
Coding partner - to upgrade coding skills for developers and inspire new projects
Gems begin rolling out today on desktop and mobile. However, they’re only available for Gemini Advanced, Business and Enterprise subscribers, so you’ll need a paid plan to check them out.