Explainable Generative AI: Enhancing Stable Diffusion with Machine Learning and Generative AI
| Published in | 2025 5th International Conference on Pervasive Computing and Social Networking (ICPCSN), pp. 571-577 |
|---|---|
| Main Authors | , , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 14.05.2025 |
| DOI | 10.1109/ICPCSN65854.2025.11035339 |
Summary: Explainable AI (XAI) is a set of techniques that increases the transparency of AI by helping us understand how machine learning algorithms make decisions; however, issues persist, particularly in image generation tasks. Numerical interpretability is the major issue: diffusion models struggle to understand numerical values in prompts, such as the number of subjects to render in a generated image. In this study, prompt engineering using generative AI and SHAP-based (Shapley Additive Explanations) explainability are used to enhance the prompt and to understand which features contribute to the generated image. The Gemini API refines prompts and model responses and increases numerical interpretability, because generative AI can reduce human error and enhance image features through structured prompts. Prompt engineering techniques guide AI models to produce the desired output; techniques such as dynamic prompting expand the prompt and improve the numerical interpretability and clarity of the actual input. Synthetic images were tested for distortion using the Peak Signal-to-Noise Ratio (PSNR: 9.11 dB, 20% better than the existing model), the Learned Perceptual Image Patch Similarity (LPIPS: 0.69, indicating 69% dissimilarity between images generated by our extensible diffusion model and the existing Stable Diffusion model), the Structural Similarity Index Measure (SSIM: 24%) for structural consistency, and the Contrastive Language-Image Pre-training (CLIP) score for semantic alignment between text and image. The results indicated reduced noise, better prompt alignment, and greater transparency. By combining structured prompting, explainability, and quantitative testing, this method improves generative model control, ensuring efficiency, interpretability, and better image quality for real-world applications.
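The paper itself does not include code, but a minimal sketch of the dynamic-prompting step described above might look like the following, using Google's `google-generativeai` Python client. The meta-prompt wording, model name, and function name are illustrative assumptions, not details from the paper:

```python
# Sketch of dynamic prompting: a terse user prompt is expanded by Gemini so
# that numeric quantities (e.g. number of subjects) are stated explicitly.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model choice

def expand_prompt(user_prompt: str) -> str:
    """Rewrite an image-generation prompt with explicit counts and detail."""
    meta_prompt = (
        "Rewrite this image-generation prompt so every numeric quantity "
        "(e.g. the number of subjects) is stated explicitly, and add "
        "concrete visual detail without changing the meaning:\n\n"
        + user_prompt
    )
    response = model.generate_content(meta_prompt)
    return response.text.strip()

print(expand_prompt("three cats sitting on a sofa"))
```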
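The distortion metrics quoted in the summary can be reproduced with standard tooling. The sketch below, with placeholder image paths, assumes two same-size RGB images and computes PSNR as 10·log10(MAX²/MSE) with NumPy and SSIM with scikit-image:

```python
# Sketch of the PSNR and SSIM measurements reported in the abstract.
import numpy as np
from PIL import Image
from skimage.metrics import structural_similarity

def psnr(a: np.ndarray, b: np.ndarray, max_val: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(max_val**2 / mse))

# Placeholder paths; both images must have identical dimensions.
ours = np.asarray(Image.open("generated_ours.png").convert("RGB"))
base = np.asarray(Image.open("generated_baseline.png").convert("RGB"))

print(f"PSNR: {psnr(ours, base):.2f} dB")
# channel_axis=2 treats the last axis as RGB; data_range matches uint8 images.
print(f"SSIM: {structural_similarity(ours, base, channel_axis=2, data_range=255):.2f}")
```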
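A CLIP score of the kind reported for text-image alignment is typically computed as the cosine similarity between CLIP text and image embeddings. The sketch below uses a public Hugging Face `transformers` CLIP checkpoint as an assumed stand-in for whichever CLIP variant the authors used:

```python
# Sketch of a CLIP score: cosine similarity of text and image embeddings.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("generated_ours.png").convert("RGB")  # placeholder path
text = "three cats sitting on a sofa"

inputs = processor(text=[text], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)

# Normalize the projected embeddings, then take their dot product.
img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
print(f"CLIP score: {(img @ txt.T).item():.3f}")
```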