ConvConcatNet: A Deep Convolutional Neural Network to Reconstruct Mel Spectrogram from the EEG

Bibliographic Details
Published in: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), pp. 113-114
Main Authors: Xu, Xiran; Wang, Bo; Yan, Yujie; Zhu, Haolin; Zhang, Zechen; Wu, Xihong; Chen, Jing
Format: Conference Proceeding
Language: English
Published: IEEE, 14.04.2024
DOI: 10.1109/ICASSPW62465.2024.10626859

Summary: To investigate the processing of speech in the brain, simple linear models are commonly used to establish a relationship between brain signals and speech features. However, these linear models are ill-equipped to model a highly dynamic and complex non-linear system like the brain. Although non-linear methods based on neural networks have been developed recently, reconstructing unseen stimuli from unseen subjects' EEG remains a highly challenging task. This work presents a novel method, ConvConcatNet, which combines a deep convolutional neural network with extensive concatenation operations to reconstruct mel spectrograms from EEG. With the ConvConcatNet model, the Pearson correlation between the reconstructed and target mel spectrograms reached 0.0420, which ranked No. 1 in Task 2 of the Auditory EEG Challenge. The code and models implementing this work will be available on GitHub: https://github.com/xuxiran/ConvConcatNet
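
The abstract does not spell out the network's internals, but one plausible reading of "deep convolutional network combined with extensive concatenation" is a DenseNet-style stack in which each convolutional layer's output is concatenated with its input, so later layers see all earlier feature maps. The PyTorch sketch below illustrates that pattern only; the module names, channel counts, depth, and kernel sizes are hypothetical and are not taken from the paper or the linked repository.

```python
# Hypothetical sketch of a "conv + concatenate" block; all sizes and the
# overall structure are assumptions, not the published ConvConcatNet.
import torch
import torch.nn as nn

class ConvConcatBlock(nn.Module):
    """1-D conv layer whose output is concatenated with its input,
    so later layers see all earlier feature maps (DenseNet-style)."""
    def __init__(self, in_channels: int, growth: int, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels, growth, kernel_size, padding=kernel_size // 2),
            nn.BatchNorm1d(growth),
            nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time); concatenate along the channel axis
        return torch.cat([x, self.conv(x)], dim=1)

class ToyConvConcatNet(nn.Module):
    """Stacks concat blocks, then projects to mel bins (count assumed)."""
    def __init__(self, eeg_channels: int = 64, n_mels: int = 10,
                 depth: int = 4, growth: int = 32):
        super().__init__()
        blocks, channels = [], eeg_channels
        for _ in range(depth):
            blocks.append(ConvConcatBlock(channels, growth))
            channels += growth  # channels accumulate with each block
        self.blocks = nn.Sequential(*blocks)
        self.head = nn.Conv1d(channels, n_mels, kernel_size=1)

    def forward(self, eeg: torch.Tensor) -> torch.Tensor:
        # eeg: (batch, eeg_channels, time) -> (batch, n_mels, time)
        return self.head(self.blocks(eeg))

if __name__ == "__main__":
    model = ToyConvConcatNet()
    mel = model(torch.randn(2, 64, 640))  # 2 trials, 64 channels, 640 samples
    print(mel.shape)  # torch.Size([2, 10, 640])
```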
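
The reported score is a Pearson correlation of 0.0420 between reconstructed and target mel spectrograms. A common way to compute such a score, shown below as an assumption about the challenge's exact protocol rather than its published evaluation code, is to correlate each mel bin over time and average across bins.

```python
# Sketch of the evaluation metric; per-bin correlation averaged across
# mel bins is an assumption about the exact challenge protocol.
import numpy as np

def mel_pearson(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean Pearson correlation over mel bins.
    pred, target: arrays of shape (n_mels, time)."""
    corrs = []
    for p, t in zip(pred, target):  # iterate over mel bins
        p = p - p.mean()
        t = t - t.mean()
        denom = np.sqrt((p ** 2).sum() * (t ** 2).sum())
        corrs.append((p * t).sum() / denom if denom > 0 else 0.0)
    return float(np.mean(corrs))

rng = np.random.default_rng(0)
target = rng.standard_normal((10, 640))
pred = 0.05 * target + rng.standard_normal((10, 640))  # weakly correlated
print(f"{mel_pearson(pred, target):.4f}")  # small positive value, as in the paper
```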