OpenAI is rolling out its latest text-to-image generator more widely. On Thursday, the company will make the new DALL-E 3 model available inside the ChatGPT app to ChatGPT Plus and Enterprise customers. OpenAI said it has prepared a safety mitigation stack for the model, which allowed it to expand the release.
DALL-E 3 was first announced last month, when OpenAI showed how it improves on its predecessor, DALL-E 2, by letting users rely on ChatGPT to write longer, more visually descriptive prompts for the image generator. DALL-E 3 was then added to Bing Chat and Bing Image Creator, making Microsoft's platform the first to offer the model to the broader public, even before ChatGPT.
The advertised safeguards against harmful imagery did not always work: users generated images of SpongeBob SquarePants and other characters flying planes toward the World Trade Center. Microsoft has tried blocking certain prompts, but simple workarounds have produced similar results.
Text-to-image generators such as Midjourney, Stable Diffusion, and earlier versions of DALL-E have all caused controversy. The technology has been used to reproduce copyrighted imagery, create non-consensual nude images, alter the race of subjects, and produce photorealistic misrepresentations of public figures.
OpenAI promises a broader approach this time around and has published a web page showcasing the research behind DALL-E 3. The company says it will "limit the likelihood that models will generate content styled by living artists and images of public figures, and improve the demographic representation of generated images." OpenAI also has an internal "provenance classifier" tool, which it says is 99% accurate at detecting whether an image was generated by DALL-E 3.