MICCAI BraTS 2017: Evaluation | Section for Biomedical Image Analysis (SBIA) | Perelman School of Medicine at the University of Pennsylvania

Multimodal Brain Tumor Segmentation Challenge 2017


Evaluation Framework

In this year's challenge, two reference standards are used for the two tasks of the challenge: 1) manual segmentation labels of tumor sub-regions, and 2) clinical data of overall survival.

For the segmentation task, and for consistency with the configuration of the previous BraTS challenges, we will use the "Dice score" and the "Hausdorff distance". Expanding upon this evaluation scheme, in BraTS'17 we will also use the metrics of "Sensitivity" and "Specificity", which make it possible to determine potential over- or under-segmentation of the tumor sub-regions by participating methods. Since the BraTS'12-'13 data are a subset of the BraTS'17 test data, we will also calculate performance on the '12-'13 data to allow a comparison against the performance reported in the BraTS TMI reference paper.
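As an illustration of the four segmentation metrics named above, the following is a minimal sketch and not the challenge's official evaluation code. It assumes each tumor sub-region is given as a binary mask, uses NumPy, and measures the Hausdorff distance in voxel units over all foreground voxels; the function names `overlap_metrics` and `hausdorff_distance` are hypothetical. The brute-force distance computation is only practical for small masks.

```python
import numpy as np


def overlap_metrics(pred, ref):
    """Dice, sensitivity and specificity for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    tp = np.count_nonzero(pred & ref)    # voxels correctly labeled as tumor
    fp = np.count_nonzero(pred & ~ref)   # over-segmentation
    fn = np.count_nonzero(~pred & ref)   # under-segmentation
    tn = np.count_nonzero(~pred & ~ref)  # voxels correctly labeled as background
    # Assumption: an empty denominator is scored as a perfect 1.0.
    dice = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 1.0
    specificity = tn / (tn + fp) if (tn + fp) else 1.0
    return {"dice": dice, "sensitivity": sensitivity, "specificity": specificity}


def hausdorff_distance(pred, ref):
    """Symmetric Hausdorff distance (in voxels) between two binary masks."""
    p = np.argwhere(np.asarray(pred, dtype=bool))
    r = np.argwhere(np.asarray(ref, dtype=bool))
    if len(p) == 0 or len(r) == 0:
        return float("nan")  # undefined when either mask is empty
    # Pairwise Euclidean distances between all foreground voxel coordinates.
    d = np.linalg.norm(p[:, None, :] - r[None, :, :], axis=-1)
    # Largest nearest-neighbor distance, taken in both directions.
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))


if __name__ == "__main__":
    pred = [[1, 1, 0, 0], [0, 0, 0, 0]]
    ref = [[0, 1, 1, 0], [0, 0, 0, 0]]
    print(overlap_metrics(pred, ref))
    print(hausdorff_distance(pred, ref))
```

Low sensitivity points to under-segmentation (missed tumor voxels), while low specificity points to over-segmentation (background labeled as tumor), which is why the two metrics complement the Dice score.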

For the task of survival prediction, we are considering two evaluation schemes, based on classification and regression principles. For the classification-based evaluation, we will divide the provided data into three groups based on survival, i.e., long-survivors (e.g., >15 months), short-survivors (e.g., <10 months), and mid-survivors (e.g., between 10 and 15 months), and then evaluate the predictions of the participating teams based on Accuracy (i.e., the number of correctly classified patients over all patients). For the regression-based evaluation, we intend to use the Mean Squared Error to evaluate the predictions in a pairwise manner. Finally, for the post-conference analysis, we will conduct an additional Kaplan-Meier analysis (including hazard ratios and p-values) for each participating method.
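The two survival-prediction schemes can be sketched as follows. This is an illustrative example only: the 10- and 15-month cut-offs are the example thresholds given above, not confirmed final values, survival is assumed to be expressed in months, and the function names are hypothetical.

```python
def survival_class(months, short_below=10.0, long_above=15.0):
    """Map a survival time in months to 'short', 'mid' or 'long'."""
    if months < short_below:
        return "short"
    if months > long_above:
        return "long"
    return "mid"  # between the two thresholds, inclusive


def classification_accuracy(predicted_months, true_months):
    """Fraction of patients whose predicted survival group matches the true one."""
    if len(predicted_months) != len(true_months) or not true_months:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(
        survival_class(p) == survival_class(t)
        for p, t in zip(predicted_months, true_months)
    )
    return correct / len(true_months)


def mean_squared_error(predicted_months, true_months):
    """Mean of squared differences between paired predictions and true values."""
    if len(predicted_months) != len(true_months) or not true_months:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(predicted_months, true_months)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    predicted = [8, 12, 20, 16]
    actual = [9, 14, 11, 18]
    print(classification_accuracy(predicted, actual))
    print(mean_squared_error(predicted, actual))
```

In this toy input, the third patient is predicted as a long-survivor but is a mid-survivor, so the accuracy drops while the squared error for that patient dominates the regression score.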

For both tasks, we will announce a 3-week evaluation period (1–21 August), during which participants will be able to request the date on which the test data is released to them. (Note that the evaluation period has been extended to 4 weeks, until the 27th of August.) Each team should analyze the test data using its local computing infrastructure and submit its segmentation results 48 hours later through CBICA's Image Processing Portal (IPP).

**Fig. 2: Methods evaluation from previous BraTS benchmarks.** Hausdorff scores for two tumor sub-regions. Black squares indicate the mean scores, which were used here to rank the methods. The Hausdorff distances are reported on a logarithmic scale. (Figure from the BraTS reference paper.)

Feel free to send any communication related to the BraTS challenge to brats2017@cbica.upenn.edu