Towards fair decentralized benchmarking of healthcare AI algorithms with the Federated Tumor Segmentation (FeTS) challenge

Bibliographic Details
Published in	Nature Communications Vol. 16; no. 1; pp. 6274 - 20
Main Authors	Zenk, Maximilian, Pati, Sarthak, Foley, Patrick, Zimmerer, David, Isensee, Fabian, Kassem, Hasan, Thakur, Siddhesh, Kushibar, Kaisar, Lekadir, Karim, Jiang, Meirui, Yang, Hongzheng, Paetzold, Johannes C., Pawar, Kamlesh, Chen, Zhaolin, Tuladhar, Anup, Souza, Raissa, Maurya, Akansh, Anand, Vikas Kumar, Ganesh, Chandan, Wagner, Ben, Reddy, Divya, Das, Yudhajit, Madhuranthakam, Ananth J., Ren, Jianxun, Siomos, Vasilis, Yan, Yonghong, Li, Zhaopei, Milesi, Alexandre, Nguyen, Quoc D., Huang, Tsung-Ming, Ma, Jun, Singh, Har Shwinder H., Xia, Yong, Dobko, Mariia, Carré, Alexandre, Alam, Saruar, Shah, Nameeta, Sako, Chiharu, Calabrese, Evan, Rudie, Jeffrey, Agzarian, Marc, Kozubek, Michal, Michálek, Jan, Keřkovský, Miloš, Kopřivová, Tereza, Pinho, Marco C., Holcomb, James, Metz, Marie, Lee, Matthew D., Lui, Yvonne W., Yadav, Ipsa, Kumar, Neeraj, Bhuvaneshwar, Krithika, Sayah, Anousheh, Bencheqroun, Camelia, Colen, Rivka R., Kotrotsou, Aikaterini, Sahm, Felix, Choi, Yoon Seong, Lee, Seung-Koo, Chang, Jong Hee, Landman, Bennett, Chotai, Silky, Lee, Joonsang, Jeraj, Robert, Xavier Falcão, Alexandre, Lucio, Diego R., Radojewski, Piotr, Meier, Raphael, Wiest, Roland, Pichler, Josef, Necker, Georg, Meckel, Stephan, Mendoza, Cristobal, Baek, Stephen, Ismael, Heba, Allen, Bryan, Zacharaki, Evangelia I., Wen, Ning, Vallières, Martin, Lepage, Martin, Morón, Fanny, Mandel, Jacob, Liem, Spencer, Alexandre, Gregory S., Lombardo, Joseph, Odafe-Oyibotha, Olubunmi, Shu’aibu Hikima, Mustapha, Fu, Eric, Thompson, John A., Simpson, Amber L., Cutler, Danielle, Moraes, Fabio Y., Boss, Michael A., Schmidt, Kendall, Apgar, Charles, Kirby, Justin S., Albrecht, Jake, Karargyris, Alexandros, Maier-Hein, Klaus
Format	Journal Article
Language	English
Published	London: Nature Publishing Group UK, 08.07.2025; Nature Publishing Group; Nature Portfolio
Subjects	631/114/1305; 631/114/2397; 631/67/2321; 631/67/2322; 692/699/67/2321; Algorithms; Artificial Intelligence; Benchmarking - methods; Benchmarks; Brain cancer; Brain Neoplasms - diagnostic imaging; Brain research; Brain tumors; Collaboration; Data integrity; Datasets; Failure modes; Federated learning; Glioma; Health care; Humanities and Social Sciences; Humans; Image acquisition; Image analysis; Image processing; Image Processing, Computer-Assisted - methods; Learning; Machine learning; Magnetic Resonance Imaging; Medical imaging; Medical prognosis; multidisciplinary; Neuroimaging; Performance evaluation; Science; Science (multidisciplinary); Segmentation; Tumors
ISSN	2041-1723
DOI	10.1038/s41467-025-60466-1

Summary:	Computational competitions are the standard for benchmarking medical image analysis algorithms, but they typically use small curated test datasets acquired at a few centers, leaving a gap to the reality of diverse multicentric patient data. To narrow this gap, the Federated Tumor Segmentation (FeTS) Challenge represents a paradigm for real-world algorithmic performance evaluation. The FeTS Challenge is a competition to benchmark (i) federated learning aggregation algorithms and (ii) state-of-the-art segmentation algorithms across multiple international sites. Weight aggregation and client selection techniques were compared using a multicentric brain tumor dataset in realistic federated learning simulations, showing benefits of adaptive weight aggregation and efficiency gains through client sampling. Quantitative performance evaluation of state-of-the-art segmentation algorithms on data distributed internationally across 32 institutions yielded good generalization on average, although the worst-case performance revealed data-specific modes of failure. Similar multi-site setups can help validate the real-world utility of healthcare AI algorithms in the future. Federated learning (FL) algorithms have emerged as a promising solution to train models for healthcare imaging across institutions while preserving privacy. Here, the authors describe the Federated Tumor Segmentation (FeTS) Challenge for the decentralized benchmarking of FL algorithms and the evaluation of healthcare AI algorithm generalizability in real-world cancer imaging datasets.