Generation of artificial facial drug abuse images using deep de-identified anonymous dataset augmentation through genetics algorithm (3DG-GA)

Bibliographic Details
Published in	Multimedia Tools and Applications, Vol. 84, no. 28, pp. 34629-34643
Main Authors	Zein, Hazem; Laurent, Lou; Fournier, Régis; Nait-Ali, Amine
Format	Journal Article
Language	English
Published	New York: Springer US, 01.08.2025 (Springer Nature B.V.)
Subjects	Artificial intelligence; Asymmetry; Computer Communication Networks; Computer Science; Data augmentation; Data Structures and Information Theory; Datasets; Drug abuse; Face recognition; Genetic algorithms; Image quality; Multimedia Information Systems; Mutation; Privacy; Special Purpose and Application-Based Systems; Synthetic data; Track 8: Medical Imaging; Biometrics; Synthetic database; Genetic algorithm; Facial drug abuse
ISSN	1573-7721, 1380-7501
DOI	10.1007/s11042-025-20639-y

More Information
Summary:	In biomedical research and artificial intelligence, access to large, well-balanced, and representative datasets is crucial for developing trustworthy applications that can be used in real-world scenarios. However, obtaining such datasets can be challenging, as they are often restricted to hospitals and specialized facilities. To address this issue, the study proposes generating highly realistic synthetic faces that exhibit drug abuse traits through augmentation. The proposed method, called "3DG-GA" (Deep De-identified anonymous Dataset Generation), uses a Genetic Algorithm as its strategy for synthetic face generation. The algorithm includes GAN-based artificial face generation, forgery detection, and face recognition. Initially, a dataset of 120 images of actual facial drug abuse is used. By preserving the drug traits, 3DG-GA provides a dataset containing 3000 synthetic facial drug abuse images. The dataset will be open to the scientific community, allowing others to reproduce our results and benefit from the generated datasets while avoiding legal or ethical restrictions. Additionally, we validated the dataset by training a CNN model on the synthetic images and evaluating it on previously unseen real images. The model achieved an accuracy of 97.2% on the unseen real images, demonstrating the high quality and applicability of the synthetic data.
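The summary names the ingredients of the method (a genetic algorithm, GAN-based generation, forgery detection, and face recognition) but does not say how they are combined. The following is a minimal illustrative sketch, not the authors' implementation: it assumes candidates are GAN latent vectors, and it replaces the forgery detector and the face recognizer with toy numeric stand-ins. Every name and parameter here (`LATENT_DIM`, `realism`, `identity_distance`, `fitness`, `min_dist`, `evolve`) is hypothetical, and drug-trait preservation is not modeled at all.

```python
import random

LATENT_DIM = 8  # hypothetical latent size; a real GAN latent would be much larger


def realism(z):
    """Toy stand-in for a forgery detector: scores in (0, 1], highest near the origin."""
    return 1.0 / (1.0 + sum(x * x for x in z) / len(z))


def identity_distance(z, sources):
    """Toy stand-in for face recognition: Euclidean distance to the closest source."""
    return min(sum((a - b) ** 2 for a, b in zip(z, s)) ** 0.5 for s in sources)


def fitness(z, sources, min_dist=1.0):
    """Realism, scaled down while the candidate is still too close to a source (assumed rule)."""
    anonymity = min(identity_distance(z, sources) / min_dist, 1.0)
    return realism(z) * anonymity


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from one of the two parents at random."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(z, rng, rate=0.2, sigma=0.3):
    """Add Gaussian noise to a random subset of genes."""
    return [x + rng.gauss(0.0, sigma) if rng.random() < rate else x for x in z]


def evolve(sources, pop_size=30, generations=40, seed=0):
    """Run the toy genetic algorithm and return the final population, best first."""
    rng = random.Random(seed)
    # Seed the population with perturbed copies of the source latents.
    population = [mutate(rng.choice(sources), rng, rate=1.0) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda z: fitness(z, sources), reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection keeps the top half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return sorted(population, key=lambda z: fitness(z, sources), reverse=True)
```

Calling `evolve(sources)` with a small list of source latent vectors returns candidates ranked by the assumed fitness. In a real pipeline, each latent would be decoded by a GAN, and the two scoring functions would be replaced by actual forgery-detection and face-recognition models.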