XtraLibD: Detecting Irrelevant Third-Party Libraries in Java and Python Applications

Bibliographic Details
Published in: Evaluation of Novel Approaches to Software Engineering, Vol. 1556, pp. 132-155
Main Authors: Kapur, Ritu; Rao, Poojith U.; Dewam, Agrim; Sodhi, Balwinder
Format: Book Chapter
Language: English
Published: Switzerland, Springer International Publishing AG, 2022
Springer International Publishing
Series: Communications in Computer and Information Science
ISBN: 303096647X
9783030966478
ISSN: 1865-0929
1865-0937
DOI: 10.1007/978-3-030-96648-5_7

Summary: Software development involves the use of multiple Third-Party Libraries (TPLs). However, irrelevant libraries present in a software application’s distributable often lead to excessive consumption of resources such as CPU cycles, memory, and mobile devices’ battery power. Therefore, the identification and removal of unused TPLs present in an application are desirable. We present a rapid, storage-efficient, obfuscation-resilient method to detect irrelevant TPLs in Java and Python applications. The novel aspects of our approach are: i) computing a vector representation of a .class file using a model that we call Lib2Vec, which is trained using the Paragraph Vector Algorithm; ii) converting a .class file to a normalized form via semantics-preserving transformations before using it to train the Lib2Vec models; and iii) an eXtraLibrary Detector (XtraLibD), developed and tested with 27 different language-specific Lib2Vec models. These models were trained using different parameters and >30,000 .class and >478,000 .py files taken from >100 different Java libraries and 43,711 Python packages available at MavenCentral.com and Pypi.com, respectively. XtraLibD achieves an accuracy of 99.48% with an F1 score of 0.968 and outperforms the existing tools, viz., LibScout, LiteRadar, and LibD, with accuracy improvements of 74.5%, 30.33%, and 14.1%, respectively. Compared with LibD, XtraLibD achieves a response time improvement of 61.37% and a storage reduction of 87.93% (99.85% over JIngredient). Our program artifacts are available at https://www.doi.org/10.5281/zenodo.5179747.
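The pipeline described in the summary (normalize a file, turn it into a vector, compare it against vectors of known library files) can be illustrated with a minimal sketch. This is not the authors' implementation: it works on Python source text rather than .class files, and it substitutes plain token n-gram counts with cosine similarity for a trained Lib2Vec (Paragraph Vector) model. The names `normalize`, `vectorize`, `cosine`, `best_match`, the keyword list, and the 0.9 threshold are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny keyword list; every other name is treated
# as an identifier that an obfuscator could rename.
KEYWORDS = {"def", "return", "for", "in", "if", "else", "while", "import"}
IDENT = re.compile(r"[A-Za-z_]\w*")
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def normalize(source):
    """Map identifiers to ID and numbers to NUM so renaming leaves tokens unchanged."""
    tokens = []
    for tok in TOKEN.findall(source):
        if IDENT.fullmatch(tok) and tok not in KEYWORDS:
            tokens.append("ID")
        elif tok.isdigit():
            tokens.append("NUM")
        else:
            tokens.append(tok)
    return tokens


def vectorize(tokens, n=3):
    """Count token n-grams (a stand-in for a learned Lib2Vec embedding)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(source, reference, threshold=0.9):
    """Return (library name or None, best similarity score) for a source file."""
    vec = vectorize(normalize(source))
    scored = [(cosine(vec, ref_vec), name) for name, ref_vec in reference.items()]
    score, name = max(scored, default=(0.0, None))
    return (name, score) if score >= threshold else (None, score)


# Illustrative reference set built from two made-up "library" files.
lib_a = "def add(a, b):\n    return a + b\n"
lib_b = "def walk(items):\n    for item in items:\n        if item:\n            return item\n"
reference = {
    "lib_a": vectorize(normalize(lib_a)),
    "lib_b": vectorize(normalize(lib_b)),
}

# The same code as lib_a with every identifier renamed.
obfuscated = "def x1(q, r):\n    return q + r\n"
print(best_match(obfuscated, reference))
```

Because identifiers are replaced before vectorization, the renamed file maps to the same token sequence as `lib_a` and is matched to it, which is the intuition behind obfuscation resilience. A learned embedding, as used in the paper, would additionally tolerate structural changes that exact n-gram counts do not.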