The Federated Tumor Segmentation (FeTS) challenge

International challenges have become the standard for the validation of biomedical image analysis methods. We argue, though, that the actual performance of even the winning algorithms on "real-world" clinical data often remains unclear, as the data included in these challenges are usually acquired in very controlled settings at few institutions. The seemingly obvious solution of simply collecting increasingly more data from more geographically distinct institutions in such challenges does not scale well, due to privacy, ownership, and technical hurdles.

The Federated Tumor Segmentation (FeTS) challenge 2021 is the first challenge ever proposed for federated learning in medicine, and it intends to address these hurdles for both the creation and the evaluation of tumor segmentation models. Specifically, the FeTS 2021 challenge uses clinically acquired, multi-institutional MRI scans from the BraTS 2020 challenge, as well as from various remote independent institutions included in the collaborative network of a real-world federation (https://www.fets.ai).

The FeTS challenge focuses on the construction and evaluation of a consensus model for the segmentation of intrinsically heterogeneous (in appearance, shape, and histology) brain tumors, namely gliomas [1]. Compared to the BraTS 2020 challenge, the ultimate goal of FeTS is 1) the creation of a consensus segmentation model that has gained knowledge from data of multiple institutions without pooling their data together (i.e., by retaining the data within each institution), and 2) the evaluation of segmentation models in such a federated configuration (i.e., in the wild).

The FeTS 2021 challenge is structured in two explicit tasks:

  • Task 1 ("Federated Training") aims at effective weight aggregation methods for the creation of a consensus model given a pre-defined segmentation algorithm for training, while also (optionally) accounting for network outages.
  • Task 2 ("Federated Evaluation") aims at robust segmentation algorithms, evaluated during the testing phase on unseen datasets from various remote independent institutions of the collaborative network of the fets.ai federation.

Participants are free to choose whether they want to focus on one or both tasks.

The clinical relevance and importance of the FeTS challenge lie in addressing the privacy, legal, bureaucratic, and ownership concerns raised by the current paradigm of multi-site collaboration through data sharing.

The official challenge design document can be found here.

Feel free to send any communication related to the MICCAI FeTS 2021 challenge to: challenge@fets.ai


(All deadlines are at 23:59 Eastern Time.)

  • 21 May: Training phase (release of the training data and associated ground truth).
  • 14 Jun: Validation phase (release of the validation data; ground truth remains hidden).
  • 19 Jul: Submission of short paper and prediction algorithm (incl. model weights).
  • 20 Jul - 27 Aug: Testing phase (evaluation by the organizers, only for methods with submitted papers).
  • 3 Sep: Top-ranked methods contacted to prepare their oral presentation at MICCAI.
  • 1 Oct (PM): Announcement of the top 3 ranked teams at MICCAI FeTS 2021.
  • 10 Oct: Submission deadline for extended LNCS papers (12-14 pages).
  • 24 Oct: Reviewers' feedback.
  • 10 Nov: Camera-ready paper submission.
  • 15 Dec: Summarizing meta-analysis manuscript.

Task 1 ('Federated Training'): Weight Aggregation in Collaborative Learning for Glioma Segmentation

Description

The specific focus of this task is to identify the best way to aggregate the knowledge coming from segmentation models trained at the individual institutions, rather than to identify the best segmentation method. More specifically, the focus is on the methodological components specific to federated learning (e.g., aggregation, client selection, training-per-round), and not on the development of segmentation algorithms (which is the focus of the BraTS challenge).

Provided Infrastructure

To facilitate this task, an existing infrastructure for federated tumor segmentation using federated averaging is provided to all participants, indicating the exact places where participants are allowed and expected to make changes. This infrastructure can be found on GitHub at: https://github.com/FETS-AI/Challenge/tree/main/Task_1

Specific instructions are given to the participants on the parts/functions they need to modify in order to alter the federated algorithm in the following ways:

  • The aggregation function used to fuse the collaborator model updates.
  • Which collaborators are chosen to train in each federated round.
  • The training parameters for each federated round.
  • The validation metrics to be computed each round (which can then be used as inputs to the other functions).

The primary goal of this task is the aggregation of the local segmentation models, given a partitioning of the data that follows their real-world distribution, as sketched below.
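To make the customization points above concrete, the sketch below shows one possible shape for the first hook: plain federated averaging, weighted by local training-set size. The function name and signature are illustrative assumptions, not the repository's exact interface; the Task 1 notebook defines the actual entry points.

```python
def weighted_average_aggregation(collaborator_updates, collaborator_num_examples):
    """Fuse collaborator model updates by dataset-size-weighted averaging.

    collaborator_updates: list of dicts mapping tensor name -> numpy array
    collaborator_num_examples: list of local training-set sizes (ints)

    NOTE: this signature is a hypothetical stand-in for the hook interface
    defined in the FeTS Task 1 repository.
    """
    total = float(sum(collaborator_num_examples))
    fractions = [n / total for n in collaborator_num_examples]
    aggregated = {}
    for name in collaborator_updates[0]:
        # Weighted sum of each tensor across all participating collaborators.
        aggregated[name] = sum(f * update[name]
                               for f, update in zip(fractions, collaborator_updates))
    return aggregated
```

Variations on this theme (e.g., robust or adaptive weighting, or weighting informed by the per-round validation metrics) are exactly the design space this task explores.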

License Conformance

By participating and submitting your contribution to the FeTS 2021 challenge for review and evaluation during the testing/ranking phase, you confirm that your code is released under a license conforming to one of the following standards: Apache 2.0, BSD-style, or MIT.

Performance Evaluation

The evaluation metrics considered for this task are listed below; a sketch of the two ranking metrics follows the list.

  1. Dice Similarity Coefficient
  2. Hausdorff Distance - 95th percentile
  3. Communication cost during model training, i.e., the time budget (the product of the bytes sent/received and the number of federated rounds)
  4. Sensitivity (this will not be used for ranking purposes)
  5. Specificity (this will not be used for ranking purposes)
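For reference, the two ranking metrics can be computed along the following lines for a single binary tumor region. This is a minimal sketch using NumPy and SciPy; the organizers' evaluation pipeline may differ in details such as surface extraction and the handling of empty masks.

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dice(pred, gt):
    """Dice similarity coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 1.0 if denom == 0 else 2.0 * (pred & gt).sum() / denom

def hd95(pred, gt, spacing=(1.0, 1.0, 1.0)):
    """95th-percentile symmetric Hausdorff distance between mask surfaces (mm).

    Assumes both masks are non-empty; `spacing` is the voxel size in mm.
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    surf_pred = pred & ~binary_erosion(pred)  # boundary voxels of the prediction
    surf_gt = gt & ~binary_erosion(gt)        # boundary voxels of the ground truth
    # Distance of each surface voxel to the nearest surface voxel of the other mask.
    d_pred_to_gt = distance_transform_edt(~surf_gt, sampling=spacing)[surf_pred]
    d_gt_to_pred = distance_transform_edt(~surf_pred, sampling=spacing)[surf_gt]
    return float(np.percentile(np.hstack([d_pred_to_gt, d_gt_to_pred]), 95))
```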

Task 2 ('Federated Evaluation'): Federated Evaluation of Glioma Segmentation Methods In The Wild

Description

In this task, we seek to showcase the feasibility of scaling up the concept of challenges by implementing a "Phase 2" challenge with a federated evaluation environment. Specifically, during training the participants will be asked to explore the effects of data partitioning and of distribution shifts across contributing sites, towards finding tumor segmentation algorithms that generalize to data acquired at institutions that did not contribute to the training data. After training, all participating algorithms will be evaluated in a distributed way on data from multiple institutions of the first real-world federation reported at www.fets.ai that have graciously accepted to be part of the FeTS challenge; these data are always retained within each institution's server.

On the one hand, this allows circumventing many common obstacles, such as data-privacy issues. On the other hand, federated designs come with their own set of challenges and particularities. This "real-world semantic segmentation challenge" is set up to demonstrate that these obstacles can feasibly be addressed, hopefully providing a blueprint for similar future "phase 2" challenge endeavors. From a methodological perspective, the main goal of this task is to identify segmentation algorithms that are robust to unknown and realistic distribution shifts between the training/validation and test data.

License Conformance

By participating and submitting your contribution to the FeTS 2021 challenge for review and evaluation during the testing/ranking phase, you confirm that your code is released under a license conforming to one of the following standards: Apache 2.0, BSD-style, or MIT.

Performance Evaluation

The evaluation metrics considered for this task are:

  1. Dice Similarity Coefficient
  2. Hausdorff Distance - 95th percentile
  3. Sensitivity (this will not be used for ranking purposes)
  4. Specificity (this will not be used for ranking purposes)

Training Data availability (May 21). Register to download the co-registered, skull-stripped, and annotated training data.

Validation Data availability (June 15). An independent set of validation scans will be made available to the participants in June, with the intention of allowing them to assess the generalizability of their methods on unseen data, via CBICA's Image Processing Portal (IPP). Note that this set may not reflect the out-of-distribution generalization targeted in Task 2. The FeTS Challenge leaderboard will be available through a link from this page. Validation outputs submitted for placement on the leaderboard must come from a model trained using the run_challenge_experiment function, as shown in the Jupyter notebook Challenge/Task_1/FeTS_Challenge.ipynb of the FeTS Competition Supporting Code Repository. In addition, the model must be trained within the maximum simulated time of one week, using the institution_split_csv_filename: partitioning_2.csv (see the README under Challenge/Task_1 for further details on simulated time and the partitioning csvs).
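Concretely, a leaderboard-eligible run might look like the call below. Only run_challenge_experiment and institution_split_csv_filename='partitioning_2.csv' are taken from the instructions above; the import path and all other argument and hook names are assumptions to be checked against Challenge/Task_1/FeTS_Challenge.ipynb.

```python
from fets_challenge import run_challenge_experiment  # assumed import path

# my_aggregation_function, my_collaborator_selection, and my_hyper_parameters
# are placeholders for the participant-defined Task 1 hooks.
final_model = run_challenge_experiment(
    aggregation_function=my_aggregation_function,
    choose_training_collaborators=my_collaborator_selection,
    training_hyper_parameters_for_round=my_hyper_parameters,
    institution_split_csv_filename='partitioning_2.csv',  # mandated partitioning
    rounds_to_train=20,  # keep the run within the one-week simulated-time budget
)
```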

Short Paper submission deadline (July 19). Participants will have to evaluate their methods on the training and validation datasets, and submit a short paper (8-10 LNCS pages, together with the "LNCS Consent to Publish" form) describing their method and results to the BrainLes CMT submission system, making sure to choose FeTS as the "Track". Please ensure that you include the appropriate citations, mentioned at the bottom of the "Data" section. This unified scheme should allow for appropriate preliminary comparisons and for the creation of the pre- and post-conference proceedings. Participants are allowed to submit longer papers to the MICCAI 2021 BrainLes Workshop, by choosing "BrainLes" as the "Track". FeTS papers will be part of the BrainLes workshop proceedings distributed by Springer LNCS. All paper submissions should use the LNCS template, available both in LaTeX and in MS Word format, directly from Springer (link here).

Testing Phase (July 20 to August 27). The test scans are not made available to the participating teams. Instead, the organizers will evaluate the submitted contributions of all participants that submitted a short paper and an appropriate version of their algorithm, as described in each task. Participants that have not submitted a short paper and the copyright form will not be evaluated.

Oral Presentations. The top-ranked participants will be contacted by September 3 to prepare slides for orally presenting their method during the FeTS satellite event at MICCAI 2021, on Oct. 1.

Announcement of Final Results (Oct 1). The final rankings will be reported during the FeTS 2021 challenge, which will run in conjunction with MICCAI 2021.

Post-conference LNCS paper (Oct 10). All participating methods are invited to extend their papers to 11-14 pages for inclusion in the LNCS proceedings of the BrainLes Workshop.

Joint post-conference journal paper. All participating teams will be invited to contribute to the joint manuscript summarizing the results of FeTS 2021, which will be submitted to a high-impact journal in the field. To be included in this manuscript, a participating team needs to take part in all phases of at least one of the FeTS tasks.

To register for participation and get access to the FeTS 2021 data, you can follow the instructions given at the "Registration/Data Request" section below.

Ample multi-institutional, routine, clinically acquired, pre-operative baseline multi-parametric Magnetic Resonance Imaging (mpMRI) scans of radiographically appearing glioblastoma (GBM) are provided as the training and validation data for the FeTS 2021 challenge. The data partitioning according to acquisition origin will also be provided for the training data of the challenge. Specifically, the datasets used in the FeTS 2021 challenge are the subset of GBM cases from the BraTS 2020 challenge. Ground truth reference annotations are created and approved by expert board-certified neuroradiologists for every subject included in the training, validation, and testing datasets, to quantitatively evaluate the performance of the participating algorithms.

Validation data will be released on June 14, through an email pointing to the accompanying leaderboard. This will allow participants to obtain preliminary results on unseen data and report them in their submitted papers (due on July 19), in addition to their cross-validated results on the training data and a description of the implemented algorithm. The ground truth of the validation data will not be provided to the participants, but multiple submissions to the online evaluation platform (CBICA's IPP) will be allowed.

Finally, the algorithms of the participating teams with a valid short-paper submission will be evaluated by the challenge organizers on the same test data, which will not be made available to the participants. Please note that the testing data comprise a subset of the BraTS 2020 testing data, as well as data offered by independent, geographically distinct institutions participating in the FeTS federation. The top-ranked participating teams will be invited by September 3 to prepare their slides for a short oral presentation of their method during the FeTS 2021 challenge.

Imaging Data Description

All FeTS mpMRI scans are available as NIfTI files (.nii.gz) and describe a) native (T1), b) post-contrast T1-weighted (T1Gd), c) T2-weighted (T2), and d) T2 Fluid Attenuated Inversion Recovery (T2-FLAIR) volumes. They were acquired with different clinical protocols and various scanners from multiple institutions, mentioned as data contributors here.
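For example, a subject's four modalities can be loaded as NumPy arrays with nibabel. The subject ID and file-name pattern below follow the usual BraTS convention and are assumptions to verify against the released archive.

```python
import nibabel as nib

subject = "FeTS21_Training_001"  # hypothetical subject ID
volumes = {mod: nib.load(f"{subject}/{subject}_{mod}.nii.gz").get_fdata()
           for mod in ("t1", "t1ce", "t2", "flair")}  # assumed file suffixes
print({mod: vol.shape for mod, vol in volumes.items()})
```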

All the imaging datasets have been segmented manually, by one to four raters following the same annotation protocol, and their annotations were approved by experienced neuroradiologists. Annotations comprise the GD-enhancing tumor (ET, label 4), the peritumoral edematous/invaded tissue (ED, label 2), and the necrotic tumor core (NCR, label 1), as described both in the BraTS 2012-2013 TMI paper and in the latest BraTS summarizing paper. The provided data are distributed after pre-processing, i.e., co-registered to the same anatomical template, interpolated to the same resolution (1 mm^3), and skull-stripped.
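Given these label conventions, the nested regions conventionally evaluated in BraTS-style challenges (whole tumor, tumor core, enhancing tumor) can be derived as below; the label values are those stated above, while the region grouping follows the cited BraTS papers.

```python
import numpy as np

def brats_regions(seg):
    """Derive binary evaluation regions from a label map with
    NCR = 1, ED = 2, ET = 4 (background = 0)."""
    return {
        "whole_tumor":     np.isin(seg, (1, 2, 4)),  # NCR + ED + ET
        "tumor_core":      np.isin(seg, (1, 4)),     # NCR + ET
        "enhancing_tumor": seg == 4,                 # ET only
    }
```

Each returned mask can be fed directly into per-region metrics such as the Dice and HD95 sketches shown under Task 1.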

Non-Imaging Data Description

All imaging data will be accompanied by a comma-separated values (.csv) file containing the data partitioning according to acquisition origin for each pseudo-identified imaging dataset, to further facilitate research on FL.

Furthermore, since FeTS leverages the BraTS 2020 data, information on overall survival (OS), defined in days, is included in a .csv file with correspondences to the pseudo-identifiers of the imaging data. The .csv file also includes the age of the patients, as well as the resection status. Note that all these data are complementary and not required for the FeTS challenge.
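As an illustration, the partitioning file can be grouped by site with pandas. The file and column names used here are hypothetical and must be checked against the headers of the released .csv files.

```python
import pandas as pd

partitions = pd.read_csv("partitioning_1.csv")  # hypothetical file name
# "Partition_ID" and "Subject_ID" are assumed column names.
cases_per_site = partitions.groupby("Partition_ID")["Subject_ID"].count()
print(cases_per_site)
```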

Use of Data Beyond FeTS

Participants are NOT allowed to use additional public and/or private data (from their own institutions) to extend the provided data. This restriction ensures a fair comparison among the participating methods.

Data Usage Agreement / Citations

You are free to use and/or refer to the FeTS challenge and datasets in your own research, provided that you always cite the following three manuscripts:

[1] S. Pati, U. Baid, M. Zenk, B. Edwards, M. Sheller, G. A. Reina, et al., "The Federated Tumor Segmentation (FeTS) Challenge", arXiv preprint arXiv:2105.05874 (2021)

[2] G. A. Reina, A. Gruzdev, P. Foley, O. Perepelkina, M. Sharma, I. Davidyuk, et al., "OpenFL: An open-source framework for Federated Learning", arXiv preprint arXiv:2105.06413 (2021)

[3] S. Bakas, H. Akbari, A. Sotiras, M. Bilello, M. Rozycki, J.S. Kirby, et al., "Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features", Nature Scientific Data, 4:170117 (2017) DOI: 10.1038/sdata.2017.117


Additionally, the manuscript below contains results of a simulated study directly related to the FeTS challenge.

[4] M. J. Sheller, B. Edwards, G. A. Reina, J. Martin, S. Pati, A. Kotrotsou, et al., "Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data", Nature Scientific Reports, 10:12598 (2020). DOI: 10.1038/s41598-020-69250-1


Finally, the following are data citations directly referring to the TCGA-GBM and TCGA-LGG collections used as part of the FeTS dataset.

[5] S. Bakas, H. Akbari, A. Sotiras, M. Bilello, M. Rozycki, J. Kirby, et al., "Segmentation Labels and Radiomic Features for the Pre-operative Scans of the TCGA-GBM collection", The Cancer Imaging Archive, 2017. DOI: 10.7937/K9/TCIA.2017.KLXWJJ1Q 

[6] S. Bakas, H. Akbari, A. Sotiras, M. Bilello, M. Rozycki, J. Kirby, et al., "Segmentation Labels and Radiomic Features for the Pre-operative Scans of the TCGA-LGG collection", The Cancer Imaging Archive, 2017. DOI: 10.7937/K9/TCIA.2017.GJQ7R0EF


Note: Use of the FeTS datasets for creating and submitting benchmark results for publication on MLPerf.org is considered non-commercial use. It is further acceptable to republish results published on MLPerf.org, as well as to create unverified benchmark results consistent with the MLPerf.org rules in other locations. Please note that you should always adhere to the FeTS data usage guidelines, cite the aforementioned publications appropriately, and comply with the terms of use required by MLPerf.org.

Challenge data may be used for all purposes, provided that the challenge is appropriately referenced using the citations given at the bottom of this section.

To register and request the training and the validation data of the FeTS 2021 challenge, please follow the steps below. Please note that i) the training data include ground truth annotations, ii) the validation data do not include annotations, and iii) the testing data are not available to either the challenge participants or the public.

  1. Create an account in CBICA's Image Processing Portal (ipp.cbica.upenn.edu) and wait for its approval. Note that a confirmation email will be sent, so make sure that you also check your Spam folder. The approval process requires a manual review of the account details and might take 3-4 days to complete.
  2. Once your IPP account is approved, log in to ipp.cbica.upenn.edu and click on the application "FeTS 2021: Registration", under the "MICCAI FeTS 2021" group.
  3. Fill in the requested details and press "Submit Job".
  4. Once your request is recorded, you will receive an email pointing to the "results" of your submitted job. Log in to IPP and access the "Results.zip" file, which contains the file "REGISTRATION_STATUS.txt" with the links to download the FeTS 2021 data. The training data include, for each subject, the 4 structural modalities, the ground truth segmentation labels, and accompanying text information on the source institution, whereas the validation data include only the 4 modalities.

Please note that you are expected to use CBICA's IPP to evaluate your method against the ground truth labels of the validation datasets. 


Please remember to cite the three manuscripts listed under "Data Usage Agreement / Citations" above when using the FeTS challenge and datasets in your own research.

Organizing Committee

(in alphabetical order, except lead organizers)

Data Contributors

Clinical Evaluators and Annotation Approvers

  • Michel Bilello, MD, Ph.D., UPenn, Philadelphia, PA, USA
  • Suyash Mohan, MD, Ph.D., UPenn, Philadelphia, PA, USA

Awards Sponsor

Acknowledgements

  • Chiharu Sako, Ph.D. (Data Analyst, CBICA, UPenn, PA, USA), for her invaluable assistance in the datasets' organization.
  • G. Anthony Reina, M.D., Intel AI