Investigating Toxicity and Bias in Stable Diffusion Text-To-Image Models

Matthias Schneider
Thilo Hagendorff

Published on May 2, 2025

Abstract

Text-to-image models are increasingly popular and impactful, yet concerns regarding their safety and fairness remain. This study investigates the ability of ten popular Stable Diffusion models to generate harmful images, including sexual, violent, and personally sensitive material. We show that these models respond to harmful prompts by generating inappropriate content, which frequently displays troubling biases, such as the disproportionate portrayal of Black individuals in violent contexts. Our findings reveal a complete absence of refusal behavior or safety measures in the models examined. We emphasize the importance of addressing this issue as image generation technologies become more accessible and more deeply incorporated into everyday applications.