Training-free text-to-image generation method, device, and medium based on a diffusion model

Bibliographic Details
Format	Patent
Language	Chinese
Published	24.12.2024
Subjects	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; IMAGE DATA PROCESSING OR GENERATION, IN GENERAL; PHYSICS
Summary:	The invention discloses a training-free, diffusion-model-based text-to-image generation method, device, and medium. The method comprises: automatically extracting the concepts in an input text with a syntactic parser; using the pre-trained natural language model BERT to predict the neighbor objects of each subject object; constructing positive-sample, negative-sample, and unconditional-sample concepts from the neighbor objects; estimating the target magnetic field direction of each subject object in the feature space; and applying the target magnetic field direction to the subject object and generating the image from the text with a diffusion model. The invention uses a training-free approach to make the output of a diffusion-based image generation model conform more closely to the text. All modulation of the feature vectors is completed outside the diffusion denoising stage, so the added computational cost is negligible. The method achieves better text-image alignment and improves the image-text matching and accuracy of the diffusion model's output.
Bibliography:	Application Number: CN202410341226
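
Illustrative note: the summary names the steps (positive, negative, and unconditional samples; a target "magnetic field" direction; applying that direction to the subject) but does not give the formulas. The following is a minimal sketch of one possible reading, not the patented method. It assumes the concepts have already been embedded as plain vectors, and every function name, the `neg_weight` and `strength` parameters, and the particular direction formula are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean(vectors: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def subtract(a: Sequence[float], b: Sequence[float]) -> Vector:
    """Component-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def normalize(v: Sequence[float]) -> Vector:
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def estimate_direction(
    positive: Sequence[Sequence[float]],
    negative: Sequence[Sequence[float]],
    unconditional: Sequence[float],
    neg_weight: float = 0.5,  # hypothetical weighting, not from the source
) -> Vector:
    """Assumed formula: pull toward positives, push away from negatives,
    both measured relative to the unconditional embedding."""
    pos_offset = subtract(mean(positive), unconditional)
    neg_offset = subtract(mean(negative), unconditional)
    return normalize([p - neg_weight * n for p, n in zip(pos_offset, neg_offset)])


def apply_direction(
    subject: Sequence[float],
    direction: Sequence[float],
    strength: float = 0.1,  # hypothetical step size, not from the source
) -> Vector:
    """Shift the subject embedding along the estimated direction.
    This runs once, before any denoising step would start."""
    return [s + strength * d for s, d in zip(subject, direction)]


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for real text-encoder outputs.
    direction = estimate_direction(
        positive=[[1.0, 0.0], [1.0, 0.2]],
        negative=[[0.0, 1.0]],
        unconditional=[0.0, 0.0],
    )
    print(apply_direction([0.5, 0.5], direction))
```

In this sketch the adjusted embedding would simply replace the subject's original embedding in the conditioning passed to the diffusion model, which is consistent with the summary's point that the modulation happens outside the denoising stage.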