Scalable Compositional Static Taint Analysis for Sensitive Data Tracing on Industrial Micro-Services
| Published in | IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice (Online), pp. 110-121 |
|---|---|
| Main Authors | , , , , , , | 
| Format | Conference Proceeding | 
| Language | English | 
| Published | IEEE, 01.05.2023 |
| ISSN | 2832-7659 | 
| DOI | 10.1109/ICSE-SEIP58684.2023.00015 | 
| Summary: | In recent years, there has been an increasing demand for sensitive data tracing in industrial microservices, with applications ranging from change of governance and data breach detection to data consistency validation. As an information tracking technique, taint analysis is widely used to address these demands. This paper shares our experience in developing a scalable static taint analyzer for sensitive data tracing in large-scale industrial microservices. Although several taint analyzers have been proposed for Java applications, our experiments show that existing approaches are inefficient and/or ineffective (in terms of low recall or precision) when analyzing large-scale industrial microservices. Instead, we present CFTaint, a compositional field-based taint analyzer, to address these challenges for popular microservices running in industrial Fintech applications. CFTaint improves scalability by using a fast compositional function summary, which summarizes the data propagation of each function during the on-the-fly taint analysis. CFTaint also uses a novel field-based algorithm that analyzes taint propagation based on specified sensitive fields to reduce false negatives. Our field-based algorithm maximizes the soundness of our approach even when the taint tracking is performed on an unsound call graph. Furthermore, we propose an efficient code transformation method to model the behaviours of containers, which allows our analysis to trace data propagation in a container environment. Experiments on numerous production microservices demonstrate the high recall (96.09%) and precision (93.51%) of CFTaint for tracing sensitive data, with a short analysis time (121.73 seconds). |
|---|---|
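
The summary mentions compositional function summaries without showing what one looks like. The following is a minimal Python sketch of the general idea only, not CFTaint's implementation: each function is analyzed once, its summary records which parameters can flow to its return value, and call sites reuse that summary instead of re-analyzing the callee. The toy IR, the `Function` class, the `summarize` helper, and the example program are all hypothetical names introduced for illustration. The sketch assumes straight-line, non-recursive code and does not model the field-based algorithm or container behaviour described in the record.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


class Function:
    """A function in a toy IR (hypothetical, for illustration only).

    Statements are tuples:
      ("assign", dst, src)            dst = src
      ("call", dst, callee, [args])   dst = callee(*args)
      ("return", var)                 return var
    """

    def __init__(self, name: str, params: List[str], body: List[Tuple]):
        self.name = name
        self.params = params
        self.body = body


def summarize(name: str,
              program: Dict[str, Function],
              cache: Dict[str, FrozenSet[int]]) -> FrozenSet[int]:
    """Return the indices of parameters whose taint may reach the return value.

    Summaries are memoized in `cache`, so each callee is analyzed only once.
    Assumes the program is non-recursive and each body is straight-line code.
    """
    if name in cache:
        return cache[name]
    func = program[name]
    # flows[v] = parameter indices that may flow into variable v
    flows: Dict[str, Set[int]] = {p: {i} for i, p in enumerate(func.params)}
    reaches_return: Set[int] = set()
    for stmt in func.body:
        kind = stmt[0]
        if kind == "assign":
            _, dst, src = stmt
            flows[dst] = set(flows.get(src, set()))
        elif kind == "call":
            _, dst, callee, args = stmt
            # Reuse the callee's summary instead of re-analyzing its body.
            callee_summary = summarize(callee, program, cache)
            out: Set[int] = set()
            for idx in callee_summary:
                out |= flows.get(args[idx], set())
            flows[dst] = out
        elif kind == "return":
            reaches_return |= flows.get(stmt[1], set())
    cache[name] = frozenset(reaches_return)
    return cache[name]


# Hypothetical example program.
PROGRAM = {
    "mask": Function("mask", ["x"], [("assign", "y", "x"), ("return", "y")]),
    "const": Function("const", ["x"], [("return", "c")]),
    "handler": Function("handler", ["user", "note"], [
        ("call", "a", "mask", ["user"]),
        ("call", "b", "const", ["note"]),
        ("return", "a"),
    ]),
}

cache: Dict[str, FrozenSet[int]] = {}
handler_summary = summarize("handler", PROGRAM, cache)
```

Here `handler_summary` contains only index 0: taint on `user` can reach the return value of `handler` through `mask`, while `note` is dropped by `const`. A real analyzer would also need to handle branches, loops, recursion, heap fields, and sinks other than the return value.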