A/B testing: A systematic literature review


Bibliographic Details
Published in The Journal of systems and software Vol. 211; p. 112011
Main Authors Quin, Federico; Weyns, Danny; Galster, Matthias; Silva, Camila Costa
Format Journal Article
Language English
Published Elsevier Inc 01.05.2024
ISSN 0164-1212
1873-1228
DOI 10.1016/j.jss.2024.112011

Summary: A/B testing, also referred to as online controlled experimentation or continuous experimentation, is a form of hypothesis testing where two variants of a piece of software are compared in the field from an end user’s point of view. A/B testing is widely used in practice to enable data-driven decision making for software development. While a few studies have explored different facets of research on A/B testing, no comprehensive study has been conducted on the state of the art in A/B testing. Such a study is crucial to provide a systematic overview of the field and to drive future research forward. To address this gap, this paper reports the results of a systematic literature review that analyzed primary studies. The research questions focused on the subject of A/B testing, how A/B tests are designed and executed, what roles stakeholders have in this process, and the open challenges in the area. Analysis of the extracted data shows that the main targets of A/B testing are algorithms, visual elements, and workflows and processes. Single classic A/B tests are the dominant type of test, primarily based on hypothesis tests. Stakeholders have three main roles in the design of A/B tests: concept designer, experiment architect, and setup technician. The primary types of data collected during the execution of A/B tests are product/system data, user-centric data, and spatio-temporal data. The dominant uses of the test results are feature selection, feature rollout, continued feature development, and subsequent A/B test design. Stakeholders have two main roles during A/B test execution: experiment coordinator and experiment assessor. The main reported open problems relate to the enhancement of proposed approaches and their usability. From our study we derived three interesting lines for future research: strengthening the adoption of statistical methods in A/B testing, improving the process of A/B testing, and enhancing the automation of A/B testing.
• We consolidate 143 studies on software engineering aspects of A/B testing.
• We present the different roles stakeholders take in A/B test design and execution.
• A/B testing has gained traction in fields like embedded and cyber–physical systems.
• Despite automation trends, A/B testing still requires significant human involvement.