DDA-MSLD: A Multi-Feature Speech Lie Detection Algorithm Based on a Dual-Stream Deep Architecture

Bibliographic Details
Published in	Information (Basel) Vol. 16; no. 5; p. 386
Main Authors	Guo, Pengfei; Huang, Shucheng; Li, Mingxing
Format	Journal Article
Language	English
Published	Basel MDPI AG 01.05.2025
Subjects	Accuracy; Acoustics; Algorithms; Analysis; Attention; Chinese languages; Computational linguistics; Context; Datasets; Deep learning; dual-stream architecture; Dynamic characteristics; dynamic features; Feature extraction; Generalization; Language processing; Linguistics; Lying; Mel spectrogram; multi-feature fusion; Natural language interfaces; Nonlinear dynamics; Pattern analysis; Physiology; Prosodic features; Signal processing; Speech; speech lie detection
ISSN	2078-2489
DOI	10.3390/info16050386

Summary:	Speech lie detection analyzes speech signals in detail to determine whether a speaker is lying. It has significant application value and has attracted attention from various fields. However, existing speech lie detection algorithms have limitations: they do not fully exploit manually extracted features based on prior knowledge, and they neglect the dynamic characteristics of speech and the impact of temporal context, which reduces detection accuracy and generalization. To address these issues, this paper proposes a multi-feature speech lie detection algorithm based on a dual-stream deep architecture (DDA-MSLD). The algorithm uses a dual-stream structure to learn different types of features simultaneously. The first stream combines a gated recurrent unit (GRU) network with an attention mechanism, which lets the network capture the context of the speech signal more comprehensively and focus on the parts most critical for lie detection. It performs in-depth sequence pattern analysis on manually extracted static prosodic features and nonlinear dynamic features to obtain high-order dynamic features related to lies. The second stream uses the encoder part of the transformer to capture both the macroscopic structure and the microscopic details of the speech signal; it performs high-precision feature extraction on Mel spectrogram features to obtain deep features related to lies. Because the two streams process different speech features simultaneously, they describe the subjective state of speech signals from different perspectives, which improves detection accuracy and generalization. Experiments on the multi-person scenario lie detection dataset CSC show that the algorithm outperformed existing state-of-the-art algorithms in detection performance.
Because lie speech differs significantly across lying scenarios, a single-person scenario Chinese lie speech dataset, Local, was constructed to further evaluate the algorithm's generalization performance, and experiments were conducted on it. The results indicate that the algorithm generalizes well across different scenarios.
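Note:	The summary describes two ideas that a small example can make concrete: attention that weights the frames most relevant to the decision, and a dual-stream design whose outputs are combined before classification. The following is a minimal illustrative sketch in plain Python, not the authors' implementation. It does not implement the GRU or the transformer encoder; the frame vectors stand in for encoder outputs. The dot-product scoring, the fusion by concatenation, and the names `attention_pool`, `fuse_and_score`, and `query` are all assumptions, since the summary does not state how the streams are scored or fused.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Pool a sequence of frame vectors into a single vector.

    Each frame is scored by its dot product with `query` (a hypothetical
    learned vector); the softmax of the scores gives the attention weights,
    and the result is the weighted sum of the frames.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return pooled, weights


def fuse_and_score(stream_a, stream_b, classifier_weights, bias=0.0):
    """Concatenate two stream embeddings (assumed fusion) and apply a
    logistic classifier, returning a score in (0, 1)."""
    fused = stream_a + stream_b  # list concatenation, not element-wise sum
    logit = sum(w * x for w, x in zip(classifier_weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))
```

In use, `attention_pool` would be applied to the per-frame outputs of the first stream, and its pooled vector passed to `fuse_and_score` together with an embedding from the second stream. In a trained model the query and classifier weights would be learned rather than set by hand.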