OpenAI opened DALL-E 2 to waitlist applicants in April 2022 and removed the waitlist in September 2022. The quality jump from DALL-E 1 to DALL-E 2 is what made AI image generation a practical tool rather than a curiosity.

The diffusion model difference

DALL-E 2 uses a diffusion model rather than the autoregressive approach of DALL-E 1. Diffusion models start with noise and progressively denoise toward an image. The result is more coherent images with better spatial relationships, texture, and lighting consistency. The photorealistic results DALL-E 2 produced on many prompts were qualitatively different from anything publicly available before.
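The start-with-noise-and-denoise idea can be sketched in a few lines. This is a toy DDPM-style loop, not DALL-E 2's actual architecture: the `predict_noise` argument stands in for the trained neural network, and the beta schedule and array shapes are illustrative assumptions.

```python
import numpy as np

def forward_noise(x0, t, betas):
    """Sample a noised image x_t from a clean image x_0 (closed-form forward process)."""
    alpha_bar = np.cumprod(1.0 - betas)[t]
    noise = np.random.randn(*x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise

def reverse_denoise(x, steps, betas, predict_noise):
    """Start from pure noise and iteratively denoise toward an image."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    for t in reversed(range(steps)):
        eps = predict_noise(x, t)  # stand-in for the trained denoising network
        # Remove the predicted noise component at this timestep.
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            # Re-inject a small amount of noise on all but the final step.
            x = x + np.sqrt(betas[t]) * np.random.randn(*x.shape)
    return x

# Toy run: a 4x4 "image", a 10-step linear schedule, and a dummy predictor.
np.random.seed(0)
betas = np.linspace(1e-4, 0.02, 10)
noised = forward_noise(np.ones((4, 4)), 9, betas)
result = reverse_denoise(np.random.randn(4, 4), 10, betas,
                         lambda x, t: np.zeros_like(x))
```

With a real trained predictor in place of the dummy lambda, the same loop is what turns random noise into a coherent image.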

Inpainting and outpainting

DALL-E 2's inpainting (replacing selected regions with AI-generated content) and outpainting (extending an image beyond its original borders) were capabilities that professional image editors immediately found useful. Removing an object from a photo and filling the background, extending a landscape image, or replacing a person's clothing with different options: these are tasks with clear commercial applications.
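Mechanically, inpainting comes down to a mask: the user erases a region, the model generates content for it, and the generated pixels are composited back while everything outside the mask is preserved. A minimal sketch of that compositing step, with illustrative arrays standing in for real images:

```python
import numpy as np

def inpaint_composite(original, generated, mask):
    """Paste generated pixels into the masked region; keep the rest untouched.
    mask == 1 marks the erased region to be filled."""
    return original * (1.0 - mask) + generated * mask

# Toy 2x2 "images": replace only the top-left pixel.
original = np.full((2, 2), 0.5)
generated = np.full((2, 2), 0.9)
mask = np.array([[1.0, 0.0],
                 [0.0, 0.0]])
result = inpaint_composite(original, generated, mask)
```

Outpainting works the same way with the mask covering newly added border regions instead of an interior selection.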

The prompt craft

Getting consistent, high-quality results from DALL-E 2 required learning the prompt patterns the model responds to well. Artists and photographers discovered that including style references ("in the style of", "photorealistic", "cinematic lighting", "shot with 85mm lens") dramatically improved outputs. This prompt engineering for image models is distinct from text model prompting and became a skill of its own.
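In practice this often means programmatically appending style modifiers to a subject description. A trivial helper, included only to make the pattern concrete (the function name and example modifiers are illustrative, not part of any API):

```python
def build_prompt(subject, style_refs):
    """Append comma-separated style modifiers to a subject description."""
    return ", ".join([subject] + list(style_refs))

prompt = build_prompt(
    "a lighthouse at dusk",
    ["photorealistic", "cinematic lighting", "shot with 85mm lens"],
)
```

The same subject with different modifier sets yields noticeably different renderings, which is why teams kept libraries of modifiers that worked.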

What Midjourney did differently

Midjourney launched its public beta in July 2022 via Discord. Where DALL-E 2 optimised for photorealism and prompt accuracy, Midjourney optimised for aesthetic quality. Midjourney images have a distinctive look that many people find beautiful. The Discord-based interface, with its community aspect, created a different adoption path from DALL-E 2's API-based access. The two products validated different user needs.