Segmenting with big data analytics and Python: A quantitative exploratory analysis of household savings

Bibliographic Details
Published in Technological forecasting & social change Vol. 191; p. 122431
Main Authors Cuomo, Maria Teresa, Tortora, Debora, Colosimo, Ivan, Ricciardi Celsi, Lorenzo, Genovino, Cinzia, Festa, Giuseppe, La Rocca, Michele
Format Journal Article
Language English
Published Elsevier Inc 01.06.2023
ISSN 0040-1625
DOI 10.1016/j.techfore.2023.122431

Summary: According to the national balance sheets of the most advanced economies, Italian private households rank among the wealthiest and least indebted in Europe, despite a recent sharp decline in per capita net wealth. Recently, the COVID-19 outbreak caused a new surge in household savings worldwide, particularly in advanced economies, including Italy. This study underlines that advanced analytics tools, information on household saving behaviour, and big data analytics may support data-driven decision-making in managing complex relationships in the financial arena. More specifically, using exploratory and predictive analyses based on big data analytics and machine learning, the study aims to provide extensive customer profiling in the Italian household savings sector. A profiling of household savings was defined using the information provided by big data analysis. In the first phase of the study, the hardware and software requirements for data processing were considered. Data collection followed the extract, transform, load (ETL) process. The contribution of this study lies in the data analytics results obtained on a dataset covering the purchasing behaviour of almost 20 million postal savers. The clustering algorithm is highly efficient and scales well to large datasets. K-means clustering can be implemented within the MapReduce computational framework. Therefore, the overall procedure proposed here can be readily extended to big data using parallel computing and software implementing MapReduce, such as Hadoop and Spark.
•Underlining household saving behaviour with an exploratory analysis
•Profiling household savings across 20 million clients in Italy
•Creating a novel method for clustering the savings market
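
The summary notes that K-means clustering can be implemented within the MapReduce framework. The following is a minimal single-process sketch of that idea in plain Python, not the authors' implementation: the map step assigns each point to its nearest centroid, and the reduce step averages the points assigned to each centroid. The function names, the two-feature toy data, and the starting centroids are all hypothetical and chosen only for illustration; in Hadoop or Spark the map step would run in parallel over data partitions.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every point."""
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield idx, p


def reduce_phase(pairs, centroids):
    """Reduce: average the points emitted under each centroid index."""
    sums = {}
    counts = defaultdict(int)
    for idx, p in pairs:
        if idx in sums:
            sums[idx] = [s + x for s, x in zip(sums[idx], p)]
        else:
            sums[idx] = list(p)
        counts[idx] += 1
    # A centroid that received no points keeps its previous position.
    return [
        tuple(s / counts[i] for s in sums[i]) if i in sums else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans(points, centroids, max_iter=100, tol=1e-9):
    """Repeat map and reduce until the centroids stop moving."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift <= tol:
            break
    return centroids


# Hypothetical toy data: two made-up numeric features per saver.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids = kmeans(points, [(1.0, 1.0), (9.0, 9.5)])
print(centroids)
```

Because the reduce step only needs per-cluster sums and counts, each partition could pre-aggregate its own partial sums before they are combined, which is what makes the procedure suited to parallel execution.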