Real-Time Data Processing and Distributed System Optimization with Kafka And Cassandra

Bibliographic Details
Published in: International Research Journal on Advanced Engineering and Management (IRJAEM) Vol. 3; no. 8; pp. 2698-2704
Main Author: Fnu Pawan Kumar
Format: Journal Article
Language: English
Published: 11.08.2025
ISSN: 2584-2854
DOI: 10.47392/IRJAEM.2025.0424

Summary: The rapid expansion of data volume across industries has intensified the need for real-time data processing and optimization strategies. Distributed systems must now handle diverse workloads while remaining both efficient and scalable. Kafka and Cassandra have emerged as dominant technologies for streaming and storing high-throughput, low-latency data in real-time analytics pipelines. This review analyzes the role of Apache Kafka in data ingestion and stream processing and the role of Apache Cassandra in distributed data storage and database management. It also discusses how the two systems interact and the advantages of integrating them to improve responsiveness, fault tolerance, and data consistency in distributed systems. The review examines middleware and stream pipelines, use cases, and recent applications in environmental monitoring, healthcare, and power systems. Lastly, by synthesizing previous work from recent conference papers and journal articles, it outlines methodologies, tools, and architecture patterns for real-time data processing and system optimization using Kafka and Cassandra.