Automatic indexing of key sentences for lecture archives using statistics of presumed discourse markers
Automatic extraction of key sentences from lecture audio archives is addressed. The method makes use of the characteristic expressions used in initial utterances of sections, which are defined as discourse markers and derived in a totally unsupervised manner based on word statistics. The statistics...
| Published in | 2004 IEEE International Conference on Acoustics, Speech and Signal Processing Vol. 1; pp. I - 449 |
|---|---|
| Main Authors | Nanjo, H.; Kitade, T.; Kawahara, T. |
| Format | Conference Proceeding |
| Language | English Japanese |
| Published | Piscataway, N.J: IEEE, 28.09.2004 |
| Subjects | |
| ISBN | 9780780384842 0780384849 |
| ISSN | 1520-6149 |
| DOI | 10.1109/ICASSP.2004.1326019 |
| Abstract | Automatic extraction of key sentences from lecture audio archives is addressed. The method makes use of the characteristic expressions used in initial utterances of sections, which are defined as discourse markers and derived in a totally unsupervised manner based on word statistics. The statistics of the presumed discourse markers are then used to define the importance of the sentences. It is also combined with the conventional tf-idf measure of content words. Experimental results using a large corpus of lectures confirm the effectiveness of the method based on the discourse markers and its combination with the keyword-based method. It is also shown that the method is robust against ASR errors and sentence segmentation accuracy is more vital. Thus, we also enhance segmentation by incorporating prosodic information. |
|---|---|
| Author | Kitade, T. Nanjo, H. Kawahara, T. |
| Author_xml | – sequence: 1 givenname: H. surname: Nanjo fullname: Nanjo, H. organization: Sch. of Informatics, Kyoto Univ., Japan – sequence: 2 givenname: T. surname: Kitade fullname: Kitade, T. organization: Sch. of Informatics, Kyoto Univ., Japan – sequence: 3 givenname: T. surname: Kawahara fullname: Kawahara, T. organization: Sch. of Informatics, Kyoto Univ., Japan |
| BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=17565953$$DView record in Pascal Francis |
| ContentType | Conference Proceeding |
| Copyright | 2006 INIST-CNRS |
| Copyright_xml | – notice: 2006 INIST-CNRS |
| DOI | 10.1109/ICASSP.2004.1326019 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP) 1998-present Pascal-Francis |
| Discipline | Engineering Statistics Applied Sciences |
| EndPage | 449 |
| ExternalDocumentID | 17565953 1326019 |
| Genre | orig-research |
| ISBN | 9780780384842 0780384849 |
| ISSN | 1520-6149 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Keywords | Performance evaluation Archive Keyword Segmentation Unsupervised classification Signal classification Accuracy Prosody Automatic indexing Signal processing Feature extraction Discourse analysis Automatic recognition |
| Language | English Japanese |
| License | CC BY 4.0 |
| MeetingName | 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing (proceedings) |
| PublicationCentury | 2000 |
| PublicationDate | 2004-09-28 |
| PublicationDateYYYYMMDD | 2004-09-28 |
| PublicationDate_xml | – month: 09 year: 2004 text: 2004-09-28 day: 28 |
| PublicationDecade | 2000 |
| PublicationPlace | Piscataway, N.J |
| PublicationPlace_xml | – name: Piscataway, N.J |
| PublicationTitle | 2004 IEEE International Conference on Acoustics, Speech and Signal Processing |
| PublicationTitleAbbrev | ICASSP |
| PublicationYear | 2004 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SourceID | pascalfrancis ieee |
| SourceType | Index Database Publisher |
| StartPage | I |
| SubjectTerms | Acoustic testing Applied sciences Automatic speech recognition Data mining Exact sciences and technology Informatics Information, signal and communications theory Machine assisted indexing Natural languages Robustness Signal and communications theory Signal representation. Spectral analysis Signal, noise Speech recognition Statistics Telecommunications and information theory Vocabulary |
| Title | Automatic indexing of key sentences for lecture archives using statistics of presumed discourse markers |
| URI | https://ieeexplore.ieee.org/document/1326019 |
| Volume | 1 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| link | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV07T8MwED5RJliAtojykgdGUprYTpwRIVBBAiFBJTbkJ0JAg_pY-PXcOaEUxMCWKHZiO5bvPvu77wCOcNpkwmmTyNyXieBWJsoqj1CFG55xrdN4Ynp9kw9H4upBPqzA8SIWxnsfyWe-T5fxLN9Vdk5bZSeInBA_lC1oFSqvY7UW-ykkJZeSaWxWYVXEzFlonggeiTJCdjXgSihRNso7X_dZI0eUDsqTy7PTu7vbCBz7zfeaxCtEm9RTHLlQp7xYskMXG3D91YOafvLSn89M3378Enf8bxc3ofsd8cduF7ZsC1b8uA3rS2KFbVgjv7SWde7A0-l8VkW1Vxb1FrEEqwLDFYFRNFMkZzP0h9lrfUbBdKNwy4ho_8Smi3dRNSLjYsMcoxjhimgl7I1oQ5NpF0YX5_dnw6TJ2ZA8p0LyhIfUGenSYBGrDBzaRx2McmXuSiFtRvnaOa71uQtKFgF9M6eMkKEorFCBa8-3YXVcjf0OMIueabDKuIJjkUKYVPPg0N3QVvrUqx50aPwe32tZjsdm6Hpw-OM3fT8vJIkn8t2_6-3BWs3KKZNM7cPqbDL3B-hwzMxhnGmf-PbP9g |
| linkProvider | IEEE |
| linkToHtml | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV3JTsMwEB2xHIALu9jxgSMpTWw3zrFCoLIUIQESt8orQkCDulz4emactCziwC1R7MR2LM88-80bgCOcNplw2iSy5YtEcCsTZZVHqMINz7jWaTwx7d60Og_i8lE-zsDxNBbGex_JZ75Bl_Es35V2TFtlJ4icED8UszAvhRCyitaa7qiQmFxKxrFeh1Uec2ehgSKAJIoI2lWTK6FEUWvvTO6zWpAobRYnF6ftu7vbCB0b9Rfr1CtEnNRDHLtQJb34ZonOl6E76UNFQHlpjEemYT9-yTv-t5MrsPEV88dup9ZsFWZ8fw2WvskVrsEieaaVsPM6PLXHozLqvbKouIglWBkYrgmM4pkiPZuhR8xeq1MKpmuNW0ZU-yc2nL6LqhEdFxvmGEUJl0QsYW9EHBoMN-Dh_Oz-tJPUWRuS51RInvCQOiNdGiyilaZDC6mDUa5ouUJIm1HGdo6rfcsFJfOA3plTRsiQ51aowLXnmzDXL_t-C5hF3zRYZVzOsUguTKp5cOhwaCt96tU2rNP49d4rYY5ePXTbcPDjN309zyXJJ_Kdv-sdwkLnvnvdu764udqFxYqjUySZ2oO50WDs99H9GJmDOOs-AQht00M |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2004+IEEE+International+Conference+on+Acoustics%2C+Speech%2C+and+Signal+Processing&rft.atitle=Automatic+indexing+of+key+sentences+for+lecture+archives+using+statistics+of+presumed+discourse+markers&rft.au=Nanjo%2C+H.&rft.au=Kitade%2C+T.&rft.au=Kawahara%2C+T.&rft.date=2004-09-28&rft.pub=IEEE&rft.isbn=9780780384842&rft.issn=1520-6149&rft.volume=1&rft.spage=I&rft.epage=449&rft_id=info:doi/10.1109%2FICASSP.2004.1326019&rft.externalDocID=1326019 |
| thumbnail_l | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=1520-6149&client=summon |
| thumbnail_m | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=1520-6149&client=summon |
| thumbnail_s | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=1520-6149&client=summon |
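
The abstract describes ranking sentences by statistics of presumed discourse markers (expressions typical of section-initial utterances) and combining that ranking with a tf-idf measure of content words. This record does not give the paper's formulas, so the following is only a minimal sketch under stated assumptions: the marker weight (the fraction of a word's sentences that are section-initial), the max-based sentence score, the linear interpolation with `alpha`, and all function names are hypothetical illustrations, not the authors' method.

```python
import math
from collections import Counter


def marker_weights(initial_sentences, all_sentences):
    """Assumed statistic: for each word seen in a section-initial sentence,
    the fraction of sentences containing it that are section-initial.
    initial_sentences must be a subset of all_sentences."""
    init = Counter(w for s in initial_sentences for w in set(s))
    total = Counter(w for s in all_sentences for w in set(s))
    return {w: init[w] / total[w] for w in init}


def marker_scores(sentences, weights):
    """Score a sentence by its strongest presumed discourse marker."""
    return [max((weights.get(w, 0.0) for w in s), default=0.0) for s in sentences]


def tfidf_scores(sentences, documents):
    """Score a sentence by the summed tf-idf of its words.
    tf is counted over the lecture (all sentences); df over documents."""
    n_docs = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(doc))
    tf = Counter(w for s in sentences for w in s)
    return [
        sum(tf[w] * math.log(n_docs / df[w]) for w in s if df[w])
        for s in sentences
    ]


def combine(marker, keyword, alpha=0.5):
    """Linearly interpolate the two scores after scaling each to [0, 1].
    alpha is a hypothetical mixing weight."""
    def norm(xs):
        top = max(xs, default=0.0)
        return [x / top if top > 0 else 0.0 for x in xs]
    return [alpha * a + (1 - alpha) * b for a, b in zip(norm(marker), norm(keyword))]


# Toy lecture: sentences 0 and 2 open a section.
lecture = [
    ["today", "we", "discuss", "parsing"],
    ["parsing", "uses", "grammar"],
    ["next", "we", "discuss", "tagging"],
    ["tagging", "uses", "labels"],
]
initial = [lecture[0], lecture[2]]
corpus = [[w for s in lecture for w in s], ["we", "discuss", "uses", "other"]]

weights = marker_weights(initial, lecture)
ranking = combine(marker_scores(lecture, weights), tfidf_scores(lecture, corpus))
```

In this toy data, words such as "we" and "discuss" occur only in section-initial sentences, so they receive the highest marker weight, while their tf-idf contribution is zero because they appear in every document; the interpolation lets the two cues complement each other.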