2024-02-23

Stability AI has announced Stable Diffusion 3, its «next generation» image synthesis model. It «builds on the work of its predecessors to generate detailed, multi-subject images with improved quality and accuracy from textual descriptions».

Stability says that the Stable Diffusion 3 family of models (which take textual descriptions, called «prompts», and convert them into corresponding images) ranges from 800 million to 8 billion parameters. This range makes it possible to run different versions of the model locally on different devices, from smartphones to servers. The number of parameters roughly corresponds to the model’s capability in terms of how much detail it can generate. Larger models also require more VRAM on GPUs to run, Ars Technica reports.
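To give a rough sense of why parameter count matters for VRAM, the sketch below estimates the memory needed just to hold the weights at the two ends of the announced range. The 2 bytes per parameter (16-bit precision) is an assumption, not a figure from Stability, and the helper name `weight_memory_gb` is made up for illustration. Real memory use is higher because of activations, text encoders and other components.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory for the weights alone, in GB (10**9 bytes).

    bytes_per_param=2 assumes 16-bit weights; this is an assumption,
    not a published specification.
    """
    return num_params * bytes_per_param / 1e9


# The two ends of the parameter range mentioned in the announcement.
for label, params in [("800 million", 800e6), ("8 billion", 8e9)]:
    print(f"{label} parameters: ~{weight_memory_gb(params):.1f} GB of weights")
```

Under that assumption, the largest model needs about ten times as much memory for its weights as the smallest, which is consistent with the smartphone-to-server spread described above.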

Stability has been creating advanced AI image generation models since 2022: Stable Diffusion 1.4, 1.5, 2.0, 2.1, XL, XL Turbo, and now 3. The company has made a name for itself as a more open alternative to proprietary image synthesis models such as DALL-E 3 by OpenAI, although not without controversy over the use of copyrighted training data, bias, and the possibility of abuse. Stable Diffusion models can be run locally and fine-tuned to change the results.

Stability CEO Emad Mostaque wrote on X:

Some notes:

– This uses a new type of diffusion transformer (similar to Sora) combined with flow matching and other improvements.

– This takes advantage of transformer improvements & can not only scale further but accept multimodal inputs..

– More technical details soon — Emad (@EMostaque) February 22, 2024

Stable Diffusion 3 also uses «flow matching», a technique for training AI models that generate images by learning to transition smoothly from random noise to a structured image. It does this without having to model every step of the process, instead focusing on the general direction or flow that image creation should follow.
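The idea can be illustrated with a toy one-dimensional sketch. It is not Stability's implementation, and it assumes the simplest common variant of flow matching, where noise and data are joined by a straight line and the model is trained to predict the constant velocity along that line. All names here (`interpolate`, `flow_matching_loss`, `euler_sample`, `TARGET`, `oracle_velocity`) are hypothetical, and a hand-written "oracle" velocity stands in for a trained neural network.

```python
def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1


def target_velocity(x0, x1):
    """Velocity along the straight path; it does not depend on t."""
    return x1 - x0


def flow_matching_loss(velocity_fn, pairs, times):
    """Mean squared error between predicted and target velocities."""
    errors = []
    for (x0, x1), t in zip(pairs, times):
        xt = interpolate(x0, x1, t)
        errors.append((velocity_fn(xt, t) - target_velocity(x0, x1)) ** 2)
    return sum(errors) / len(errors)


def euler_sample(velocity_fn, x0, steps=10):
    """Generate a sample by integrating dx/dt = v(x, t) from t=0 to t=1."""
    x = x0
    dt = 1.0 / steps
    for i in range(steps):
        x = x + dt * velocity_fn(x, i * dt)
    return x


# Toy "dataset" with a single data point (an assumption for illustration).
TARGET = 2.0


def oracle_velocity(x, t):
    """Ideal velocity field for the one-point dataset; stands in for a model."""
    return (TARGET - x) / (1.0 - t)


noise_samples = [-1.5, 0.3, 4.0]
pairs = [(x0, TARGET) for x0 in noise_samples]
times = [0.1, 0.5, 0.9]
print("loss of the ideal field:", flow_matching_loss(oracle_velocity, pairs, times))
print("generated:", [euler_sample(oracle_velocity, x0) for x0 in noise_samples])
```

In a real system the velocity function would be a neural network trained to minimize this loss over many noise and image pairs; sampling then follows the learned flow from noise to an image in a modest number of steps.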

Stable Diffusion 3 is not yet widely available, but Stability says that once testing is complete, it will be free to download and run locally.