
OpenAI added improved image generation to GPT-4o — now in ChatGPT by default

Published by Vadym Karpus

OpenAI has introduced a long-awaited improvement in image generation. Instead of a separate model such as DALL-E, ChatGPT now uses an improved image generator built directly into GPT-4o.

Many AI models on the market can create impressive visual scenes, but they often struggle to render text, logos, and other elements that are common in everyday life.

OpenAI claims that image generation in GPT-4o addresses these shortcomings. It can reproduce text accurately and follow instructions more closely thanks to its knowledge base and the chat context. In addition, the new model lets you edit uploaded images or create new ones using an uploaded image as visual inspiration.

Availability and new features

The updated image generator in GPT-4o is rolling out to all ChatGPT Plus, Pro, Team, and Free users. Because this model will become the default for image generation in ChatGPT, users will no longer need to select it manually before entering a prompt.

Users can customize images by specifying aspect ratio, exact colors (HEX codes), or even a transparent background. In the coming weeks, the new generator will also be available for ChatGPT Enterprise and Edu.

The new model can also be used to create images in Sora. Users who prefer DALL-E can still access it through a dedicated DALL-E GPT. For developers, image generation through the GPT-4o API will become available in the coming weeks.

Limitations of the model

Despite numerous improvements, the model still has some limitations:

  • Generation time – due to the increased detail, image creation can take up to one minute.
  • Cropping – long images, such as posters, can be cropped too tightly, especially at the bottom.
  • Inventing details – when a prompt provides too little context, the model may "guess" the missing details.
  • Knowledge limitations – when creating complex concepts (for example, the complete periodic table), the model may not accurately reproduce more than 10-20 objects at a time.
  • Difficulties with non-Latin languages – characters may be displayed incorrectly or distorted.
  • Editing parts of an image – correcting individual details (e.g., spelling mistakes) does not always work without introducing side effects elsewhere in the image.
  • Problems with detail at small sizes – the model may not display small details correctly.

OpenAI plans to fix these limitations in the coming weeks and months.

All images created with this generator will contain C2PA metadata, and OpenAI’s internal tool will be able to verify their origin.

Despite some limitations, the new GPT-4o image generator significantly improves the accuracy and flexibility of image creation. OpenAI promises further improvements, so users will have an even better and more convenient tool for working with images.

Recently, OpenAI launched GPT-4.5, but with limited access because the company had run out of GPUs.

Source: neowin