Ideogram Enhances AI Image Generation with Description-Based Referencing

Following the launch of its most advanced text-to-image model, Ideogram has released an update that adds several capabilities, including description-based referencing and negative prompting.

Available via Ideogram’s web platform, these features aim to give users more control over how they produce images while improving the overall quality and coherence of outputs. The update further strengthens Ideogram, marking a notable step toward surpassing rival image generation services such as Midjourney and DALL-E in performance.

In February, when Ideogram first introduced version 1.0 of its model, it also launched a Magic Prompt feature designed to expand and add detail to user inputs. Building on that success, the company has now implemented a new Describe feature, which creates descriptions or captions from reference images.

Users can now take a public Ideogram image or upload their own to obtain a text-based description, then use that text as a prompt to generate a similar image. The description can also be edited to tailor the output to specific needs.

But Ideogram offers more. In addition to providing descriptions for reference images, the platform now supports negative prompting as well as Fast, Default and Quality modes of operation. Negative prompts tell the model what users don’t want in its output, and are designed specifically to remove certain objects or adapt a generation’s style.

The speed modes let users control how quickly an output is produced. Ideogram claims that Fast mode can generate images in about five seconds at basic quality, while Quality mode focuses on photorealism and detail but takes about 20 seconds. Default mode sits between the two, taking 12 seconds.

Although Ideogram does not yet know exactly how many users will take advantage of these modes, its website suggests that users could generate a basic image quickly and then modify it for high-quality results.

Improved Photorealism and Text Rendering

Ideogram has also announced that its latest update enhances text rendering, reducing error rates by a further 15%. This may not seem like much, but the company states that the model performs better than DALL-E 3 Vivid at rendering characters and words.

Ideogram did not release statistics comparing its upgraded model with Midjourney, the leader in AI image generation, but it did claim that the model’s outputs show greater coherence and photorealism than those of its prior iteration.

Human raters overwhelmingly prefer images created using the upgraded model over those from its prior version in prompt alignment, image coherence and text rendering quality, according to a blog post from the company, which has attracted over seven million creators since launching its public beta last year.

Presently, negative prompting and the new speed modes are available only to users paying for Ideogram’s Basic or Plus plans. There is no clear information regarding reference image captioning, but we expect it may be available for free, as it closely resembles Ideogram’s Remix feature, which lets users generate images similar to existing reference images. The text rendering and image coherence enhancements are available to all users.