A Study of Call Graph Construction for JVM-Hosted Languages

Call graphs have many applications in software engineering, including bug-finding, security analysis, and code navigation in IDEs. However, the construction of call graphs requires significant investment in program analysis infrastructure. An increasing number of programming languages compile to the...

Full description

Saved in:
Bibliographic Details
Published inIEEE transactions on software engineering Vol. 47; no. 12; pp. 2644 - 2666
Main Authors Ali, Karim, Lai, Xiaoni, Luo, Zhaoyi, Lhotak, Ondrej, Dolby, Julian, Tip, Frank
Format Journal Article
LanguageEnglish
Published New York IEEE 01.12.2021
IEEE Computer Society
Subjects
Online AccessGet full text
ISSN0098-5589
1939-3520
2326-3881
1939-3520
DOI10.1109/TSE.2019.2956925

Cover

More Information
Summary:Call graphs have many applications in software engineering, including bug-finding, security analysis, and code navigation in IDEs. However, the construction of call graphs requires significant investment in program analysis infrastructure. An increasing number of programming languages compile to the Java Virtual Machine (JVM), and program analysis frameworks such as WALA and SOOT support a broad range of program analysis algorithms by analyzing JVM bytecode. This approach has been shown to work well when applied to bytecode produced from Java code. In this paper, we show that it also works well for diverse other JVM-hosted languages: dynamically-typed functional Scheme, statically-typed object-oriented Scala, and polymorphic functional OCaml. Effectively, we get call graph construction for these languages for free, using existing analysis infrastructure for Java, with only minor challenges to soundness. This, in turn, suggests that bytecode-based analysis could serve as an implementation vehicle for bug-finding, security analysis, and IDE features for these languages. We present qualitative and quantitative analyses of the soundness and precision of call graphs constructed from JVM bytecodes for these languages, and also for Groovy, Clojure, Python, and Ruby. However, we also show that implementation details matter greatly. In particular, the JVM-hosted implementations of Groovy, Clojure, Python, and Ruby produce very unsound call graphs, due to the pervasive use of reflection, invokedynamic instructions, and run-time code generation. Interestingly, the dynamic translation schemes employed by these languages, which result in unsound static call graphs, tend to be correlated with poor performance at run time.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:0098-5589
1939-3520
2326-3881
1939-3520
DOI:10.1109/TSE.2019.2956925