Scalable Compositional Static Taint Analysis for Sensitive Data Tracing on Industrial Micro-Services

Bibliographic Details
Published in IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice (Online), pp. 110-121
Main Authors Zhong, Zexin, Liu, Jiangchao, Wu, Diyu, Di, Peng, Sui, Yulei, Liu, Alex X., Lui, John C. S.
Format Conference Proceeding
Language English
Published IEEE 01.05.2023
ISSN 2832-7659
DOI 10.1109/ICSE-SEIP58684.2023.00015

Abstract In recent years, demand has grown for sensitive data tracing in industrial microservices, with use cases ranging from change of governance and data breach detection to data consistency validation. As an information tracking technique, taint analysis is widely used to address these demands. This paper shares our experience in developing a scalable static taint analyzer for sensitive data tracing in large-scale industrial microservices. Although several taint analyzers have been proposed for Java applications, our experiments show that existing approaches are inefficient and/or ineffective (with low recall or precision) when analyzing large-scale industrial microservices. Instead, we present CFTaint, a compositional field-based taint analyzer, to address the challenges posed by popular microservices running in industrial Fintech applications. CFTaint improves scalability by using a fast compositional function summary, which summarizes the data propagation of each function during the on-the-fly taint analysis. CFTaint also uses a novel field-based algorithm that analyzes taint propagation based on specified sensitive fields to reduce false negatives. This field-based algorithm maximizes the soundness of our approach even when taint tracking is performed on an unsound call graph. Furthermore, we propose an efficient code transformation method to model the behaviours of containers, which allows our analysis to trace data propagation in a container environment. Experiments on numerous production microservices demonstrate CFTaint's high recall (96.09%) and precision (93.51% for tracing sensitive data) with a short analysis time (121.73 seconds).
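The idea of a compositional function summary combined with field-based tracking can be illustrated with a minimal sketch. This is not CFTaint's implementation: the toy statement format, the `Function` class, the `summarize` and `analyze` helpers, and the field names `User.idNumber` and `User.nickname` are all assumptions made for illustration. The sketch handles only straight-line code and only flows into return values. Each function is summarized once (callees before callers), and call sites reuse the callee's summary instead of re-analyzing its body; taint is keyed by field name rather than by object instance.

```python
from dataclasses import dataclass

# Toy IR (hypothetical), one tuple per statement:
#   ("assign", dst, src)          dst = src
#   ("load", dst, field_name)     dst = <some object>.field_name
#   ("call", dst, callee, args)   dst = callee(*args)
#   ("return", src)


@dataclass
class Function:
    name: str
    params: list
    body: list


def summarize(fn, summaries):
    """Compute which origins may reach fn's return value.

    An origin is ("param", index) or ("field", name). Callees must
    already have an entry in `summaries`.
    """
    origins = {p: {("param", i)} for i, p in enumerate(fn.params)}
    reaches_return = set()
    for stmt in fn.body:
        kind = stmt[0]
        if kind == "assign":
            _, dst, src = stmt
            origins[dst] = set(origins.get(src, set()))
        elif kind == "load":
            _, dst, field_name = stmt
            # Field-based: the origin is the field itself, not an instance.
            origins[dst] = {("field", field_name)}
        elif kind == "call":
            _, dst, callee, args = stmt
            out = set()
            # Apply the callee's summary instead of re-analyzing its body.
            for origin in summaries[callee]:
                if origin[0] == "param":
                    out |= origins.get(args[origin[1]], set())
                else:
                    out.add(origin)
            origins[dst] = out
        elif kind == "return":
            reaches_return |= origins.get(stmt[1], set())
    return reaches_return


def analyze(functions_bottom_up, sensitive_fields):
    """Return, per function, the sensitive fields that may be returned."""
    summaries = {}
    for fn in functions_bottom_up:
        summaries[fn.name] = summarize(fn, summaries)
    return {
        name: {f for kind, f in s if kind == "field" and f in sensitive_fields}
        for name, s in summaries.items()
    }


identity = Function("identity", ["x"], [("return", "x")])
get_id = Function("get_id", [], [
    ("load", "t", "User.idNumber"),
    ("call", "r", "identity", ["t"]),
    ("return", "r"),
])
get_name = Function("get_name", [], [
    ("load", "t", "User.nickname"),
    ("return", "t"),
])

result = analyze([identity, get_id, get_name], {"User.idNumber"})
```

Here `result` maps `get_id` to the sensitive field `User.idNumber`, while `identity` and `get_name` map to empty sets, because `identity` only forwards its parameter and `User.nickname` is not listed as sensitive.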
Author_xml – sequence: 1
  givenname: Zexin
  surname: Zhong
  fullname: Zhong, Zexin
  email: zhongzexin.zzx@antgroup.com
  organization: University of Technology Sydney,Sydney,Australia
– sequence: 2
  givenname: Jiangchao
  surname: Liu
  fullname: Liu, Jiangchao
  email: jiangchao.ljc@antgroup.com
  organization: Ant Group,Hangzhou,China
– sequence: 3
  givenname: Diyu
  surname: Wu
  fullname: Wu, Diyu
  email: wudiyu.wdy@antgroup.com
  organization: Ant Group,Hangzhou,China
– sequence: 4
  givenname: Peng
  surname: Di
  fullname: Di, Peng
  email: dipeng.dp@antgroup.com
  organization: Ant Group,Hangzhou,China
– sequence: 5
  givenname: Yulei
  surname: Sui
  fullname: Sui, Yulei
  email: yulei.sui@uts.edu.au
  organization: University of New South Wales,Sydney,Australia
– sequence: 6
  givenname: Alex X.
  surname: Liu
  fullname: Liu, Alex X.
  email: alexliu@antgroup.com
  organization: Ant Group,Hangzhou,China
– sequence: 7
  givenname: John C. S.
  surname: Lui
  fullname: Lui, John C. S.
  email: cslui@cse.cuhk.edu.hk
  organization: Chinese University of Hong Kong,HongKong,China
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/ICSE-SEIP58684.2023.00015
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
EISBN 9798350300376
EISSN 2832-7659
EndPage 121
ExternalDocumentID 10172563
Genre orig-research
GrantInformation_xml – fundername: Ant Group
  funderid: 10.13039/100018735
IsPeerReviewed false
IsScholarly true
Language English
PageCount 12
PublicationCentury 2000
PublicationDate 2023-May
PublicationDateYYYYMMDD 2023-05-01
PublicationDate_xml – month: 05
  year: 2023
  text: 2023-May
PublicationDecade 2020
PublicationTitle IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice (Online)
PublicationTitleAbbrev ICSE-SEIP
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 110
SubjectTerms Analytical models
Codes
Containers
Java
micro-services
Microservice architectures
Production
program analysis
Scalability
taint analysis
Title Scalable Compositional Static Taint Analysis for Sensitive Data Tracing on Industrial Micro-Services
URI https://ieeexplore.ieee.org/document/10172563