A comprehensive dataset of annotated brain metastasis MR images with clinical and radiomic data
Beatriz Ocaña-Tienda, Julián Pérez-Beteta, José D. Villanueva-García, José A. Romero-Rosales, David Molina-García, Yannick Suter, Beatriz Asenjo, David Albillo, Ana Ortiz de Mendivil, Luis A. Pérez-Romasanta, Elisabet González-Del Portillo, Manuel Llorente, Natalia Carballo, Fátima Nagib-Raya, Maria Vidal-Denis, Belén Luque, Mauricio Reyes, Estanislao Arana, Víctor M. Pérez-García
As mathematicians in oncology, we understand the importance of data in developing realistic models that can provide useful information to medical professionals. However, finding quality data can be challenging.
Brain metastases (BMs) are a common type of intracranial tumor with an incidence rate comparable to that of prostate and lung cancer. Despite their frequency, BMs have been under-studied, particularly in mathematical modeling. Fewer than ten published mathematical studies address this type of tumor, which leaves a significant research gap.
To address this gap, the Mathematical Oncology Laboratory (MOLAB) collaborated with several medical institutions to create the OpenBTAI BMs dataset. This dataset contains 637 high-resolution imaging studies from 75 patients with 260 BM lesions, alongside clinical data, segmentations of 593 BMs, and morphological measurements and radiomic features for each segmented lesion.
Every lesion was semi-automatically segmented. First, the tumors were automatically delineated using a gray-level threshold chosen to identify the contrast-enhancing tumoral volume. Since automatic segmentation is not always accurate, the MOLAB team then reviewed and corrected each segmentation, slice by slice, using a brushing/pixel-removing tool. In addition, every segmentation was cross-checked by a radiologist. The step-by-step process used to identify and isolate the tumor from the surrounding tissues in the magnetic resonance images (MRIs) is illustrated in the accompanying figure.
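To make the automatic first step more concrete, here is a minimal Python sketch of threshold-based delineation on a synthetic volume. It is an illustration under stated assumptions, not the tool the MOLAB team used. The function names (threshold_segment, mask_volume_mm3), the seed voxel, the 6-connectivity rule, and the voxel size are all hypothetical choices made for the example. The manual slice-by-slice correction and the radiologist's cross-check are not modeled.

```python
from collections import deque

import numpy as np


def threshold_segment(volume, threshold, seed):
    """Return a boolean mask of voxels >= threshold that are connected to `seed`.

    Hypothetical helper: uses 6-connectivity (face neighbors only) and a
    breadth-first flood fill starting from the seed voxel.
    """
    above = volume >= threshold
    mask = np.zeros(volume.shape, dtype=bool)
    if not above[seed]:
        return mask  # seed is below the threshold, so nothing is segmented

    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    mask[seed] = True
    queue = deque([seed])
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in offsets:
            n = (z + dz, y + dy, x + dx)
            inside = all(0 <= c < s for c, s in zip(n, volume.shape))
            if inside and above[n] and not mask[n]:
                mask[n] = True
                queue.append(n)
    return mask


def mask_volume_mm3(mask, voxel_size_mm):
    """Volume of a binary mask: voxel count times the volume of one voxel."""
    return float(mask.sum() * np.prod(voxel_size_mm))


# Synthetic example: a bright 3x3x3 "lesion" plus one isolated bright voxel.
image = np.zeros((5, 5, 5))
image[1:4, 1:4, 1:4] = 100.0
image[0, 0, 0] = 100.0  # bright but not face-connected to the lesion

lesion = threshold_segment(image, threshold=50.0, seed=(2, 2, 2))
volume = mask_volume_mm3(lesion, voxel_size_mm=(1.0, 1.0, 1.0))
```

Restricting the mask to the region connected to a seed keeps other bright structures out of the lesion, which is one reason a plain threshold still needs manual review on real images.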
We have used this data to find a growth law for untreated and treated BMs and to develop a mathematical biomarker able to distinguish radiation necrosis, a frequent adverse event after radiotherapy, from tumor recurrence. Wondering how to apply this data further? The potential is vast. For mathematicians, the dataset enables the development and validation of predictive and prognostic models with clinical relevance, evaluation of disease status, and improved treatment planning methods. For those interested in AI, the dataset can be used for automatic brain tumor detection and lesion segmentation.
The OpenBTAI BMs dataset is only the beginning. In the coming months, we will release a second dataset containing over 1000 segmented lesions, as well as a glioblastoma dataset with more than 400 patients.