Theoretical Analysis of Git Bisect


Bibliographic Details
Published in	Algorithmica, Vol. 86, no. 5, pp. 1365-1399
Main Authors	Courtiel, Julien; Dorbec, Paul; Lecoq, Romain
Format	Journal Article
Language	English
Published	New York: Springer US, 01.05.2024; Springer Nature B.V.; Springer Verlag
Subjects	Algorithm Analysis and Problem Complexity; Algorithms; Apexes; Approximation; Computer Science; Computer Systems Organization and Communication Networks; Data Structures and Algorithms; Data Structures and Information Theory; Discrete Mathematics; Graph theory; Mathematical analysis; Mathematics of Computing; Queries; Regression; Theory of Computation; Version control; Worst-case complexity; Approximation algorithm; Version control system; Regression search; Graph algorithm
ISSN	0178-4617 1611-3349 0302-9743 1432-0541
DOI	10.1007/s00453-023-01194-0

Summary:	In this paper, we consider the problem of finding a regression in a version control system (VCS), such as git. The set of versions is modelled by a directed acyclic graph (DAG) where vertices represent versions of the software, and arcs are the changes between different versions. We assume that somewhere in the DAG, a bug was introduced, which persists in all of its subsequent versions. It is possible to query a vertex to check whether the corresponding version carries the bug. Given a DAG and a bugged vertex, the Regression Search Problem consists in finding the first vertex containing the bug in a minimum number of queries in the worst-case scenario. This problem is known to be NP-complete. We study the algorithm used in git to address this problem, known as git bisect. We prove that in a general setting, git bisect can use an exponentially larger number of queries than an optimal algorithm. We also consider the restriction where all vertices have indegree at most 2 (i.e. where merges are made between at most two branches at a time in the VCS), and prove that in this case, git bisect is a $\frac{1}{\log_2(3/2)}$-approximation algorithm, and that this bound is tight. We also provide a better approximation algorithm for this case. Finally, we give an alternative proof of the NP-completeness of the Regression Search Problem, via a variation with bounded indegree.
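
The query model in the summary can be illustrated with a small sketch. It is not taken from the paper or from git's source. It assumes the greedy rule usually attributed to git bisect: among the remaining candidates, query the vertex whose number of candidate ancestors is closest to half of the candidate set. The names `parents`, `is_bugged`, `ancestors` and `bisect_search` are hypothetical and chosen for illustration only.

```python
from math import log2

# Approximation ratio quoted in the summary for indegree at most 2.
RATIO = 1 / log2(3 / 2)  # roughly 1.71


def ancestors(parents, v, candidates):
    """Return v together with all of its ancestors that are still candidates."""
    seen = set()
    stack = [v]
    while stack:
        u = stack.pop()
        if u in seen or u not in candidates:
            continue
        seen.add(u)
        stack.extend(parents[u])
    return seen


def bisect_search(parents, bugged, is_bugged):
    """Greedy halving search for the vertex that introduced the bug.

    parents   -- dict mapping every vertex to the list of its parents (assumed input format)
    bugged    -- a vertex already known to carry the bug
    is_bugged -- callable answering one query: does this vertex carry the bug?
    Returns (faulty_vertex, number_of_queries).
    """
    # The faulty vertex is the bugged vertex or one of its ancestors.
    candidates = ancestors(parents, bugged, set(parents))
    queries = 0
    while len(candidates) > 1:
        n = len(candidates)

        def score(v):
            # Number of candidates eliminated in the worse of the two outcomes.
            a = len(ancestors(parents, v, candidates))
            return min(a, n - a)

        # sorted() only makes tie-breaking deterministic (assumed tie rule).
        best = max(sorted(candidates), key=score)
        anc = ancestors(parents, best, candidates)
        queries += 1
        if is_bugged(best):
            candidates = anc               # bug was introduced at best or earlier
        else:
            candidates = candidates - anc  # best and its ancestors are clean
    return next(iter(candidates)), queries


# Usage: a diamond-shaped history a -> {b, c} -> d where the bug appears in c.
diamond = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
fault = "c"
result = bisect_search(
    diamond, "d", lambda v: fault in ancestors(diamond, v, set(diamond))
)
```

The sketch recomputes ancestor sets at every step, which is simple but quadratic. It is meant to show the query model, not an efficient implementation.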