StaTDS library: Statistical tests for Data Science

Bibliographic Details
Published in	Neurocomputing (Amsterdam) Vol. 595; p. 127877
Main Authors	Luna, Christian; Moya, Antonio R.; Luna, José María; Ventura, Sebastián
Format	Journal Article
Language	English
Published	Elsevier B.V., 28.08.2024
Subjects	Data science comparison; Python; Statistical tests
ISSN	0925-2312, 1872-8286
DOI	10.1016/j.neucom.2024.127877

More Information
Summary:	In Data Science, there is a continual demand for statistical comparisons to identify the best-performing algorithms. Finding a software tool that can run multiple tests on different Data Science experiments without relying on additional libraries is a challenge. This paper introduces StaTDS, an open-source library and web application implemented in pure Python and designed to analyze, test, and compare Data Science algorithms. StaTDS implements all of its statistical tests without external dependencies, which supports its durability and protects it from dependencies that are later deprecated outside its control. With support for a wide variety of statistical tests (24 in total), StaTDS surpasses existing libraries dedicated to statistical testing. The library also includes tests that help users decide between parametric and non-parametric tests, such as assessments of normality and homoscedasticity. This platform-independent library is available on GitHub under the GNU General Public License.
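The choice between parametric and non-parametric tests mentioned in the summary can be illustrated with a minimal sketch. It does not use StaTDS or reproduce its API: the function name `choose_test_family`, its parameters, and the p-values are hypothetical, and the rule assumed here (use parametric tests only when neither normality nor homoscedasticity is rejected at level `alpha`) is a common convention rather than something the record states.

```python
def choose_test_family(normality_p_values, homoscedasticity_p_value, alpha=0.05):
    """Return "parametric" only if no assumption check rejects at level alpha.

    Hypothetical helper for illustration; not part of StaTDS.
    """
    # Normality is rejected for a sample when its p-value falls below alpha.
    all_normal = all(p >= alpha for p in normality_p_values)
    # Equal variances are rejected when the homoscedasticity p-value falls below alpha.
    equal_variances = homoscedasticity_p_value >= alpha
    return "parametric" if all_normal and equal_variances else "non-parametric"


# Made-up p-values for three algorithms, for illustration only.
print(choose_test_family([0.41, 0.22, 0.03], 0.60))  # one normality check rejects
print(choose_test_family([0.41, 0.22, 0.35], 0.60))  # no check rejects
```

In practice, the p-values would come from whichever normality and homoscedasticity tests the analyst runs on each algorithm's results.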