Complex Events Processing on Live News Events Using Apache Kafka and Clustering Techniques

Bibliographic Details
Published in International Journal of Intelligent Information Technologies, Vol. 17, no. 1, pp. 1-14
Main Authors Lakkad, Aditya Kamleshbhai; Bhadaniya, Rushit Dharmendrabhai; Shah, Vraj Nareshkumar; Lavanya K.
Format Journal Article
Language English
Published Hershey: IGI Global, 01.01.2021
ISSN 1548-3657
EISSN 1548-3665
DOI 10.4018/IJIIT.2021010103

Abstract The explosive growth of news content generated worldwide, coupled with the expansion of online media and rapid access to data, has made monitoring and screening news tedious. There is a growing need for a model that can process, break down, and organize the main content to extract interpretable information, explicitly identifying topics and content-driven groupings of articles. This paper proposes an automated approach to analyzing heterogeneous news through complex event processing (CEP) and machine learning (ML) algorithms. News content is first streamed using Apache Kafka and stored in Apache Druid, then processed by a blend of natural language processing (NLP) and unsupervised ML techniques.
Audience Academic
AuthorAffiliation Vellore Institute of Technology, India
Copyright © 2021, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
Discipline Computer Science
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
License CC BY 3.0 (http://creativecommons.org/licenses/by/3.0/deed.en_US)
SubjectTerms Algorithms
Computational linguistics
Language processing
Machine learning
Methods
Natural language interfaces
Streaming media