MICCAI BraTS 2018: Evaluation | Section for Biomedical Image Analysis (SBIA) | Perelman School of Medicine at the University of Pennsylvania

Multimodal Brain Tumor Segmentation Challenge 2018

Evaluation Framework

In this year's challenge, two reference standards are used for the two tasks of the challenge: 1) manual segmentation labels of tumor sub-regions, and 2) clinical data of overall survival.

For the segmentation task, and for consistency with the configuration of the previous BraTS challenges, we will use the "Dice score" and the "Hausdorff distance". Expanding upon this evaluation scheme, in BraTS'18 we will also use the metrics of "Sensitivity" and "Specificity", which make it possible to determine potential over- or under-segmentation of the tumor sub-regions by participating methods. Since the BraTS'12-'13 data are subsets of the BraTS'18 test data, we will also calculate performance on the '12-'13 data to allow for a comparison against the performances reported in the BraTS TMI reference paper.
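
As an illustration of the four segmentation metrics named above, the following is a minimal sketch using only the Python standard library. It is not the challenge's official evaluation code. The function names, the flat 0/1 label sequences, and the handling of empty regions are assumptions made for this example. The text does not state which variant of the Hausdorff distance is used, so the sketch implements the classical (maximum) symmetric form over two point sets, such as the coordinates of boundary voxels.

```python
from math import dist


def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for two equal-length binary label sequences."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = len(pred) - tp - fp - fn
    return tp, fp, fn, tn


def dice(pred, truth):
    """Dice score: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    # Assumption: if both masks are empty, treat the overlap as perfect.
    return 2 * tp / denom if denom else 1.0


def sensitivity(pred, truth):
    """Fraction of true tumor voxels that were labeled as tumor."""
    tp, _, fn, _ = confusion_counts(pred, truth)
    return tp / (tp + fn) if (tp + fn) else 1.0


def specificity(pred, truth):
    """Fraction of true background voxels that were labeled as background."""
    _, fp, _, tn = confusion_counts(pred, truth)
    return tn / (tn + fp) if (tn + fp) else 1.0


def hausdorff(points_a, points_b):
    """Classical symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(dist(s, d) for d in dst) for s in src)
    return max(directed(points_a, points_b), directed(points_b, points_a))


# Toy example: six voxels of one tumor sub-region, flattened to 0/1 labels.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
print(dice(pred, truth), sensitivity(pred, truth), specificity(pred, truth))

# Toy example: two small point sets standing in for region boundaries.
print(hausdorff([(0, 0), (1, 0)], [(0, 0), (4, 0)]))
```

Read together, a low sensitivity points toward under-segmentation (tumor voxels were missed), while a low specificity points toward over-segmentation (background was labeled as tumor). The brute-force Hausdorff loop is quadratic in the number of points and is meant only for illustration on small inputs.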

For the task of survival prediction, two evaluation schemes are considered. First, for ranking the participating teams, evaluation will be based on the classification of subjects as long-survivors (e.g., >15 months), short-survivors (e.g., <10 months), and mid-survivors (e.g., between 10 and 15 months). Predictions of the participating teams will be assessed based on accuracy (i.e., the number of correctly classified patients) with respect to this grouping. Note that participants are expected to provide predicted survival status only for subjects with resection status of GTR (i.e., Gross Total Resection). For post-challenge analyses, we will also compare both the mean and median squared error of survival time predictions.
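
The two survival evaluation schemes can be sketched as follows, again using only the Python standard library and not reproducing any official scoring script. The function names are hypothetical, survival times are assumed to be given in months, and the 10- and 15-month cut-offs are the example thresholds quoted above. How subjects exactly at a cut-off are grouped is not specified in the text; this sketch places them in the mid-survivor class. Accuracy is returned as a fraction of subjects rather than a raw count.

```python
from statistics import mean, median


def survival_class(months, short_below=10.0, long_above=15.0):
    """Map a survival time in months to 'short', 'mid', or 'long'."""
    if months < short_below:
        return "short"
    if months > long_above:
        return "long"
    # Assumption: values exactly at a threshold fall into the mid class.
    return "mid"


def classification_accuracy(pred_months, true_months):
    """Fraction of subjects whose predicted class matches the true class."""
    if len(pred_months) != len(true_months) or not true_months:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(
        survival_class(p) == survival_class(t)
        for p, t in zip(pred_months, true_months)
    )
    return correct / len(true_months)


def squared_error_summary(pred_months, true_months):
    """Return (mean, median) of the per-subject squared errors."""
    errors = [(p - t) ** 2 for p, t in zip(pred_months, true_months)]
    return mean(errors), median(errors)


# Toy example with four GTR subjects (values are made up for illustration).
pred = [8, 12, 20, 14]
true = [9, 16, 18, 11]
print(classification_accuracy(pred, true))
print(squared_error_summary(pred, true))
```

The median squared error is less affected by a few subjects with very large prediction errors than the mean, which is one reason to report both.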

For both tasks, we will announce a 3-week evaluation period (30 July–20 August), during which participants will be able to request the date on which the test data are released to them. Note that each team should analyze the test data using their local computing infrastructure and submit their results 48 hours later through CBICA's Image Processing Portal (IPP).

**Fig.2: Methods evaluation from previous BraTS benchmarks.** Hausdorff scores for two tumor sub-regions. Black squares indicate the mean scores, which were used here to rank the methods. The Hausdorff distances are reported on a logarithmic scale. (Figure from the BraTS reference paper.)

Feel free to send any communication related to the BraTS challenge to brats2018@cbica.upenn.edu.