Generative transformer-based deep hierarchical VAE model for the automated generation of chemical process topologies

Bibliographic Details
Published in Computers & chemical engineering Vol. 205; p. 109431
Main Authors Son, Yeong Woo; Pak, Ji Hun; Kim, Chan; Lee, Jong Min
Format Journal Article
Language English
Published Elsevier Ltd 01.02.2026
ISSN 0098-1354
DOI 10.1016/j.compchemeng.2025.109431

Summary: Chemical process synthesis involves two key challenges: defining the process topology and specifying the physicochemical details. To address the first challenge, this work presents a data-driven framework for the automated generation of diverse and structurally valid process topologies. Our approach utilizes a transformer-based generative model to learn the underlying grammar of process structures from a large dataset of designs. By learning a flexible latent representation and enabling constraint-aware generation, our framework rapidly produces a wide range of novel candidate topologies for subsequent engineering analysis. We compile a database of real-world process flow diagrams (PFDs) and augment it with synthetically generated process topologies using a higher-order Markov model. All flowsheets are encoded as structured text sequences using the simplified flowsheet input-line entry system (SFILES), which makes them compatible with transformer architectures. We train a generative model that integrates a modified transformer architecture with a deep hierarchical variational autoencoder (VAE), and apply a constrained beam search algorithm to ensure syntactic validity and design feasibility. Key contributions include: (1) a transformer-based method for flexible, latent-vector-guided process topology generation; (2) data augmentation using a higher-order Markov model; (3) an SFILES structural validator that checks the grammar and logic of process topologies; (4) a novel model architecture integrating a modified transformer decoder with a hierarchical VAE; and (5) a constrained beam search decoding strategy that enforces design requirements during sequence generation. Our results show that the proposed framework is capable of generating diverse, valid, and feasible topologies, offering a scalable approach to early-stage process development.
• Presents a transformer-based framework for automated process topology generation.
• Transformer-VAE learns structural patterns and latent representations of topologies.
• A higher-order Markov model augments PFD data for improved diversity and coverage.
• An SFILES structural validator ensures syntactic and logical validity of topologies.
• A constrained beam search enforces token count, ordering, and equality constraints.
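
The summary mentions augmenting the PFD database with a higher-order Markov model but gives no implementation details. The following is only a minimal sketch of the general technique, not the authors' code: it fits an order-k Markov model to token sequences and samples new ones. The function names (`fit_markov`, `sample_sequence`), the `<s>`/`</s>` padding tokens, and the unit-operation tokens are all hypothetical and are not real SFILES strings.

```python
import random
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical padding tokens, not part of SFILES


def fit_markov(sequences, order=2):
    """Count next-token frequencies given the previous `order` tokens (order >= 1)."""
    table = defaultdict(Counter)
    for seq in sequences:
        padded = [START] * order + list(seq) + [END]
        for i in range(order, len(padded)):
            context = tuple(padded[i - order:i])
            table[context][padded[i]] += 1
    return table


def sample_sequence(table, order=2, max_len=30, rng=None):
    """Sample one token sequence by walking the fitted transition table."""
    rng = rng or random.Random()
    context = (START,) * order
    tokens = []
    while len(tokens) < max_len:
        counts = table.get(context)
        if not counts:  # unseen context: stop rather than guess
            break
        candidates = list(counts.keys())
        weights = list(counts.values())
        nxt = rng.choices(candidates, weights=weights, k=1)[0]
        if nxt == END:
            break
        tokens.append(nxt)
        context = context[1:] + (nxt,)
    return tokens


# Toy training set: illustrative unit-operation tokens (assumed, not from the paper).
training = [
    ["raw", "hex", "reactor", "flash", "prod"],
    ["raw", "pump", "hex", "reactor", "column", "prod"],
    ["raw", "hex", "reactor", "column", "prod"],
]

table = fit_markov(training, order=2)
rng = random.Random(0)
samples = [sample_sequence(table, order=2, rng=rng) for _ in range(5)]
for s in samples:
    print(" ".join(s))
```

With an order of 2, each sampled token depends on the two preceding tokens, so the model can recombine fragments of the training sequences into sequences that are not in the training set (for example, a pump-fed variant ending in a flash). A Markov sampler of this kind does not guarantee structural validity on its own, which is presumably why the framework described above pairs augmentation with a separate validator.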