DDA-MSLD: A Multi-Feature Speech Lie Detection Algorithm Based on a Dual-Stream Deep Architecture
| Published in | Information (Basel) Vol. 16; no. 5; p. 386 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | Basel: MDPI AG, 01.05.2025 |
| ISSN | 2078-2489 |
| DOI | 10.3390/info16050386 |
| Summary: | Speech lie detection analyzes speech signals to determine whether a speaker is lying. It has significant application value and has attracted attention across many fields. Existing speech lie detection algorithms, however, have limitations: they do not fully exploit manually extracted features based on prior knowledge, and they neglect the dynamic characteristics of speech and the influence of temporal context, which reduces detection accuracy and generalization. To address these issues, this paper proposes a multi-feature speech lie detection algorithm based on a dual-stream deep architecture (DDA-MSLD). The algorithm uses a dual-stream structure to learn different types of features simultaneously. The first stream combines a gated recurrent unit (GRU) network with an attention mechanism, which lets the network capture the context of the speech signal more comprehensively and focus on the parts most critical for lie detection. It performs in-depth sequence pattern analysis on manually extracted static prosodic features and nonlinear dynamic features to obtain high-order dynamic features related to lying. The second stream uses the encoder part of the transformer to capture both the macroscopic structure and the microscopic details of the speech signal, extracting deep lie-related features from its Mel spectrogram. Because the two streams process different features of speech simultaneously, they describe the subjective state reflected in the speech signal from different perspectives, which improves detection accuracy and generalization. Experiments on the multi-person scenario lie detection dataset CSC show that the algorithm outperformed existing state-of-the-art algorithms in detection performance. Because lie speech differs significantly across lying scenarios, a single-person scenario Chinese lie speech dataset, Local, was constructed to further evaluate generalization; experiments on it indicate that the algorithm generalizes well across scenarios. |
|---|---|
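
The dual-stream idea described in the summary can be illustrated with a small standard-library Python sketch. It is not the paper's implementation: the summary does not state how the two streams are combined, so fusion by concatenation is an assumption here, and the names `attention_pool`, `fuse`, `handcrafted_states`, and `spectrogram_embedding` are hypothetical. The input vectors are hand-written stand-ins for what a GRU and a transformer encoder would produce, and the attention query is fixed rather than learned.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames: Sequence[Vector], query: Vector) -> Vector:
    """Collapse a sequence of per-frame vectors into a single vector.

    Each frame is scored by its dot product with `query` (a learned
    parameter in a real model, fixed here), and the frames are averaged
    using the softmax of those scores as weights.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [sum(w * frame[d] for w, frame in zip(weights, frames))
            for d in range(dim)]


def fuse(stream_a: Vector, stream_b: Vector) -> Vector:
    """Late fusion by concatenation (an assumption, not from the source)."""
    return stream_a + stream_b


# Stand-in for recurrent hidden states over 3 time steps (stream 1).
handcrafted_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Stand-in for an encoder embedding of the Mel spectrogram (stream 2).
spectrogram_embedding = [0.5, -0.5, 0.25]

# A zero query scores every frame equally, so pooling is a plain mean.
pooled = attention_pool(handcrafted_states, query=[0.0, 0.0])
# A query aligned with the first dimension favors frames 0 and 2.
focused = attention_pool(handcrafted_states, query=[10.0, 0.0])

# The fused vector is what a classifier head would receive.
fused = fuse(pooled, spectrogram_embedding)
```

Changing `query` changes which time steps dominate the pooled vector, which is the sense in which attention lets a model focus on the parts of an utterance that matter most for the decision.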