Parallelization of <inline-formula><tex-math notation="LaTeX">Top_{k}</tex-math></inline-formula> Algorithm Through a New Hybrid Recommendation System for Big Data in Spark Cloud Computing Framework

Bibliographic Details
Published in	IEEE Systems Journal, Vol. 15, no. 4, pp. 4876-4886
Main Authors	El Handri, Kaoutar; Idrissi, Abdellah
Format	Journal Article
Language	English
Published	IEEE 01.12.2021
Subjects	<inline-formula><tex-math notation="LaTeX">Top_{k}</tex-math></inline-formula>; Big Data; Collaboration; Collaborative filtering (CF); Filtering; Funk singular value decomposition (SVD); Machine learning; Machine learning algorithms; Multiple criteria decision aiding (MCDA); Parallel algorithms; Query processing; Recommender systems; Skyline; Spark
ISSN	1932-8184 1937-9234
DOI	10.1109/JSYST.2020.3019368

Summary:	In the era of big data, parallel <inline-formula><tex-math notation="LaTeX">Top_{k}</tex-math></inline-formula> query processing in information retrieval has received increasing attention from both industry and academia. Such queries allow users to retrieve the most useful data objects from a set of choices. The problem becomes harder when <inline-formula><tex-math notation="LaTeX">Top_{k}</tex-math></inline-formula> is applied to multidimensional data and extensive data analytics. In this article, we provide a novel parallel algorithm for a distributed recommender system based on the Apache Spark platform. The approach implements multicriteria decision aiding and a dominating query approach, run using matrix factorization and a singular value decomposition (SVD)-based model as a sophisticated machine learning technique. At the same time, it applies the resilient distributed datasets paradigm in cloud computing, which presents a favorable environment for big data management. Extensive experimental results in terms of accuracy and scalability indicate the new algorithm's advantage over other <inline-formula><tex-math notation="LaTeX">Top_{k}</tex-math></inline-formula> algorithms. Our recommender system based on the proposed algorithm achieved high precision (62%-82%, depending on the data) under the evaluation metrics commonly used in recommendation systems, which supports the positive effect of the Spark framework and the SVD-based model.
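
Illustration (not from the article): the summary mentions a dominating query approach for ranking items across several criteria. The sketch below shows one common reading of a top-k dominating query in plain Python, under the assumption that higher values are better on every criterion: each item is scored by how many other items it dominates, and the k best-scoring items are returned. The function names, the sample items, and their ratings are hypothetical, and this is not the authors' parallel algorithm; it omits the SVD-based rating prediction and the Spark distribution described in the summary.

```python
from typing import Dict, List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b on every criterion
    and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def top_k_dominating(items: Dict[str, Sequence[float]], k: int) -> List[Tuple[str, int]]:
    """Score each item by the number of other items it dominates,
    then return the k highest-scoring (name, score) pairs.
    Ties are broken by item name so the result is deterministic."""
    scores = {
        name: sum(
            dominates(vec, other)
            for other_name, other in items.items()
            if other_name != name
        )
        for name, vec in items.items()
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical ratings on three criteria (made up for illustration only).
items = {
    "A": (4.5, 4.0, 3.5),
    "B": (4.0, 3.5, 3.0),
    "C": (3.0, 4.5, 2.0),
    "D": (2.5, 3.0, 1.5),
}

result = top_k_dominating(items, 2)
print(result)  # [('A', 2), ('B', 1)]
```

This version compares every pair of items, so its cost grows quadratically with the number of items; a distributed setting would typically split the comparisons across partitions and merge the partial counts.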