Effective Thematic Words Extraction from a Book using Compound Noun Phrase Synthesis Method
Published in | 한국컴퓨터정보학회논문지 Vol. 22; no. 3; pp. 107 - 113 |
---|---|
Main Authors | 안희정(Hee-Jeong Ahn), 김기원(Kee-Won Kim), 김승훈(Seung-Hoon Kim) |
Format | Journal Article |
Language | Korean |
Published | 한국컴퓨터정보학회, 01.03.2017 |
ISSN | 1598-849X 2383-9945 |
DOI | 10.9708/jksci.2017.22.03.107 |
Abstract | Most online bookstores provide users with bibliographic book information rather than concrete information such as thematic words and atmosphere. Thematic words in particular help a user understand books and cast a wide net. In this paper, we propose an efficient method for extracting thematic words from book text by applying a compound noun and noun phrase synthesis method. Compound nouns represent the characteristics of a book in more detail than single nouns. The proposed method extracts thematic words from book text by recognizing two types of noun phrases: single nouns and compound nouns formed by combining single nouns. The recognized single nouns, compound nouns, and noun phrases are weighted with TF-IDF and extracted as main words. In addition, this paper suggests calculating the frequencies of nouns in subject, object, and other roles separately in the TF-IDF calculation, rather than simply summing the frequencies of all nouns. Experiments were carried out in the field of economic management, and the extracted thematic words were verified through a survey and a book search. In 9 out of the 10 experimental results, the thematic words extracted by the proposed method were more effective for understanding the content. It was also confirmed that the thematic words extracted by the proposed method yield better book search results. KCI Citation Count: 0 |
Author | 김기원(Kee-Won Kim) 김승훈(Seung-Hoon Kim) 안희정(Hee-Jeong Ahn) |
Author_xml | – sequence: 1 fullname: 안희정(Hee-Jeong Ahn) – sequence: 2 fullname: 김기원(Kee-Won Kim) – sequence: 3 fullname: 김승훈(Seung-Hoon Kim) |
BackLink | https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART002210324$$DAccess content in National Research Foundation of Korea (NRF) |
ContentType | Journal Article |
DBID | DBRKI TDB ACYCR |
DOI | 10.9708/jksci.2017.22.03.107 |
DatabaseName | DBPIA - 디비피아 Nurimedia DBPIA Journals Korean Citation Index |
DatabaseTitleList | |
DeliveryMethod | fulltext_linktorsrc |
DocumentTitleAlternate | Effective Thematic Words Extraction from a Book using Compound Noun Phrase Synthesis Method |
DocumentTitle_FL | Effective Thematic Words Extraction from a Book using Compound Noun Phrase Synthesis Method |
EISSN | 2383-9945 |
EndPage | 113 |
ExternalDocumentID | oai_kci_go_kr_ARTI_1349503 NODE07131313 |
GroupedDBID | .UV ALMA_UNASSIGNED_HOLDINGS DBRKI TDB ACYCR M~E |
ID | FETCH-LOGICAL-n643-dc98a4f2d3c4776ce509cb8df4cb2b3382c974c3fe31e95f0d5b45a6208c5a263 |
ISSN | 1598-849X |
IngestDate | Tue Nov 21 21:40:36 EST 2023 Thu Feb 06 13:40:10 EST 2025 |
IsPeerReviewed | false |
IsScholarly | false |
Issue | 3 |
Keywords | Text mining Thematic word Compound Noun Phrase Extraction Book |
Language | Korean |
LinkModel | OpenURL |
MergedId | FETCHMERGED-LOGICAL-n643-dc98a4f2d3c4776ce509cb8df4cb2b3382c974c3fe31e95f0d5b45a6208c5a263 |
Notes | G704-001619.2017.22.3.003 |
PageCount | 7 |
ParticipantIDs | nrf_kci_oai_kci_go_kr_ARTI_1349503 nurimedia_primary_NODE07131313 |
PublicationCentury | 2000 |
PublicationDate | 2017-03 |
PublicationDateYYYYMMDD | 2017-03-01 |
PublicationDate_xml | – month: 03 year: 2017 text: 2017-03 |
PublicationDecade | 2010 |
PublicationTitle | 한국컴퓨터정보학회논문지 |
PublicationYear | 2017 |
Publisher | 한국컴퓨터정보학회 |
Publisher_xml | – name: 한국컴퓨터정보학회 |
SSID | ssib036279256 ssib001107257 ssib044738270 ssib012146333 ssib008451689 ssib053377514 |
Score | 1.6288209 |
Snippet | Most online bookstores provide users with bibliographic book information rather than concrete information such as thematic words and... |
SourceID | nrf nurimedia |
SourceType | Open Website Publisher |
StartPage | 107 |
SubjectTerms | Computer science |
Title | Effective Thematic Words Extraction from a Book using Compound Noun Phrase Synthesis Method |
URI | https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE07131313 https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART002210324 |
Volume | 22 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
ispartofPNX | 한국컴퓨터정보학회논문지, 2017, 22(3), 156, pp.107-113 |
journalDatabaseRights | – providerCode: PRVHPJ databaseName: ROAD: Directory of Open Access Scholarly Resources customDbUrl: eissn: 2383-9945 dateEnd: 99991231 omitProxy: true ssIdentifier: ssib044738270 issn: 1598-849X databaseCode: M~E dateStart: 19960101 isFulltext: true titleUrlDefault: https://road.issn.org providerName: ISSN International Centre |
link | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwrR1Na9RANNR60IsoKtaPEsQ5ldTszCSZOSbZLVVovVTsbdlMEtGFrZT24kFQ1rMX10PZguDFyh7U2tKDP8Nf0aT_wTcv2WxaKlSRhfDy5s2b9-Ztdt7LzrxnGPc6ujZazLgVNxLb4m7qWdJTiZUql7ssZalIdKC4tOwuPuYPV53VqXO_aruWNjeiefXy1HMl_2JVwIFd9SnZv7BsxRQQAIN94QoWhuuZbExaAQkYCRzSahLpEB8ifED5PhGAComgxA9Jy9cUfjhXUWEbJ0EDyQMbUQBQ3QxtsgnNcxrybXA1EeUjETKQJStACeQOIoTHqZskEMhcLOiRivGwXyGxQDnDggo61jhIBKRLgkKEwCWSj4Gw7kxX4oxVbCIZjMCxiaF8TVTVHnN3xiLwcXdZKiNCFNhBUUrpsJent4PAV5FScMfrb6ZRUIcIHAV09isdnAkJiCY1Aw2IUhAZEGmfQgLThLBmB6KI-jsZWOerTWn4FP139esLlBSW4Fh-GNZvxIHXxSwpuVNbiMpawqVP0yjO-55cLqWHR0Ced8Hb0rscvXlKdcrfqnM9O_kJr-FYfvKuetZ-utburrchCnvQ1ikvHZ2C9zz1XFfXFVl61Zo46cCe1v5rF7pW9KTmQUOXmmeTIAQ8LE_SScYnzj0mKFaBrCajOBOr1bl_mjLgN_bWwd280NvUNTPgh7fmQ65cNi6VwZ_pF0_yFWOqu3bVeJLt7h0Ndszsy9v8zehwb2TCXd7_ZmZfd7Lvg3y7b-YfX-dbO0eDoXm09flw96e-3z4ws3f9vD80s92DbHRg5p8AP8w__DDz_ff5_vCasbLQWgkXrbLcidWDsMCKlRQdntKYKe55rkrAlVeRiFOuIhox0FlB7K9YmrBGIp3Ujp2IOx2X2kI5Heqy68Z0b62X3DBM1Yk6ek55ZCseiSSCGCCJ7IRRnf1QxDPGXZgPtNqfrTdjzFbT1X5RpL5pLz9qtvR7Lf25eRYut4yLkwfktjG9sb6Z3AE3fiOaxW_Fb6evwJw |
linkProvider | ISSN International Centre |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=%EB%B3%B5%ED%95%A9+%EB%AA%85%EC%82%AC%EA%B5%AC+%ED%95%A9%EC%84%B1+%EB%B0%A9%EB%B2%95%EC%9D%84+%EC%A0%81%EC%9A%A9%ED%95%9C+%ED%9A%A8%EA%B3%BC%EC%A0%81%EC%9D%B8+%EB%8F%84%EC%84%9C+%EB%B3%B8%EB%AC%B8+%EC%A3%BC%EC%A0%9C%EC%96%B4+%EC%B6%94%EC%B6%9C&rft.jtitle=%ED%95%9C%EA%B5%AD%EC%BB%B4%ED%93%A8%ED%84%B0%EC%A0%95%EB%B3%B4%ED%95%99%ED%9A%8C%EB%85%BC%EB%AC%B8%EC%A7%80%2C+22%283%29&rft.au=%EC%95%88%ED%9D%AC%EC%A0%95&rft.au=%EA%B9%80%EA%B8%B0%EC%9B%90&rft.au=%EA%B9%80%EC%8A%B9%ED%9B%88&rft.date=2017-03-01&rft.pub=%ED%95%9C%EA%B5%AD%EC%BB%B4%ED%93%A8%ED%84%B0%EC%A0%95%EB%B3%B4%ED%95%99%ED%9A%8C&rft.issn=1598-849X&rft.eissn=2383-9945&rft.spage=107&rft.epage=113&rft_id=info:doi/10.9708%2Fjksci.2017.22.03.107&rft.externalDBID=n%2Fa&rft.externalDocID=oai_kci_go_kr_ARTI_1349503 |
thumbnail_l | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=1598-849X&client=summon |
thumbnail_m | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=1598-849X&client=summon |
thumbnail_s | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=1598-849X&client=summon |
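
The abstract describes a TF-IDF variant in which noun frequencies are counted separately by grammatical role (subject, object, other) instead of being summed as a single count. The record does not give the paper's formula, so the following is only a minimal sketch of that idea. The role weights, the function names `role_weighted_tf` and `thematic_words`, and the sample data are all hypothetical, and the input is assumed to be already segmented into (term, role) pairs by some upstream noun phrase recognizer.

```python
import math
from collections import Counter

# Hypothetical role weights: the abstract only says that subject, object,
# and other roles are counted separately; these values are assumptions.
ROLE_WEIGHTS = {"subject": 2.0, "object": 1.5, "other": 1.0}


def role_weighted_tf(terms):
    """Sum a role-dependent weight for each (term, role) occurrence in one book."""
    tf = Counter()
    for term, role in terms:
        tf[term] += ROLE_WEIGHTS.get(role, ROLE_WEIGHTS["other"])
    return tf


def thematic_words(books, book_id, top_k=5):
    """Rank the terms of one book by role-weighted TF times plain IDF."""
    n_books = len(books)
    # Document frequency: number of books in which each term appears.
    df = Counter()
    for terms in books.values():
        df.update({term for term, _ in terms})
    tf = role_weighted_tf(books[book_id])
    scores = {t: w * math.log(n_books / df[t]) for t, w in tf.items()}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Toy corpus (invented for illustration): each book is a list of
# (single noun or compound noun, grammatical role) pairs.
books = {
    "A": [("interest rate", "subject"), ("market", "object"),
          ("interest rate", "object"), ("company", "other")],
    "B": [("market", "subject"), ("company", "object")],
    "C": [("company", "subject"), ("supply chain", "object")],
}

print(thematic_words(books, "A", top_k=2))
```

In this toy corpus the compound noun "interest rate" occurs only in book A and appears there in subject and object roles, so it outranks "market", which also appears in book B, and "company", which appears in every book and gets an IDF of zero.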