Semi-supervised Speech Enhancement in Envelop and Details Subspaces
| Main Authors | Sun, Pengfei; Qin, Jun |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | 29.09.2016 |
| Subjects | Computer Science - Sound |
| DOI | 10.48550/arxiv.1609.09443 |
| Abstract | In this study, we propose a modulation-decoupling-based single-channel speech
enhancement subspace framework, in which the spectrogram of noisy speech is
decoupled into the product of a spectral envelope subspace and a spectral details
subspace. This decoupling makes it possible to specifically target
the noise components that most degrade intelligibility. Two
supervised low-rank and sparse decomposition schemes are developed in the
spectral envelope subspace to obtain a robust recovery of speech components. A
Bayesian formulation of non-negative factorization is used to learn the speech
dictionary from the spectral envelope subspace of clean speech samples. In the
spectral details subspace, a standard robust principal component analysis is
implemented to extract the speech components. The validation results show that,
compared with four speech enhancement algorithms (MMSE-SPP, NMF-RPCA,
RPCA, and LARC), the proposed MS-based algorithms achieve satisfactory
performance in improving perceptual quality and, especially, speech
intelligibility. |
|---|---|
| Author | Sun, Pengfei; Qin, Jun |
| Copyright | http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
| OpenAccessLink | https://arxiv.org/abs/1609.09443 |
| SubjectTerms | Computer Science - Sound |