SwinME: Swin Transformer V2-Based Framework for Multimodal Brain Tumor Segmentation
Abstract
Accurate and efficient brain tumor segmentation from Magnetic Resonance Imaging
(MRI) is a critical component in neuro-oncology, as it directly supports diagnosis,
treatment planning, and disease progression assessment. This requirement is particularly
significant in pediatric gliomas, where early and precise clinical intervention can
substantially influence patient outcomes. Despite its importance, manual tumor
delineation remains a time-consuming and subjective process, highly dependent on expert
availability and prone to inter-observer variability. These limitations restrict scalability
and reliability, especially when dealing with high-resolution three-dimensional MRI data
and in healthcare environments with limited resources. Consequently, there is a strong
need for automated segmentation methods that are both accurate and computationally
efficient.
This thesis proposes a novel hybrid deep learning framework, termed SwinME-UNETR
3D, for automated brain tumor segmentation. The proposed approach integrates the
hierarchical attention mechanism of Swin Transformer V2 with a multi-scale
enhancement strategy (SwinME) and the volumetric modeling capability of the UNETR
architecture. The framework is designed to capture long-range spatial dependencies while
preserving fine-grained local features that are essential for precise tumor boundary
delineation. Multi-scale feature enhancement and an enhanced transformer module are
incorporated to strengthen feature representation across different spatial resolutions,
enabling effective handling of tumor heterogeneity and complex anatomical structures
present in three-dimensional MRI volumes.
The proposed method is evaluated using the BraTS 2023 dataset, which comprises
multimodal MRI scans including T1, T1 post-contrast (T1ce), T2, and FLAIR sequences.
Segmentation performance is assessed on clinically relevant tumor subregions, namely
whole tumor (WT), tumor core (TC), and enhancing tumor (ET), using the Dice Similarity
Coefficient and sensitivity. Experimental results demonstrate robust and
consistent performance, achieving an average Dice score of 0.91 on the validation set.
Qualitative analysis further confirms strong alignment with expert annotations and high
volumetric consistency. These findings indicate that transformer-based architectures
augmented with multi-scale enhancement provide an effective solution for automated
brain tumor segmentation and hold strong potential for future clinical decision-support
applications.
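
As an illustration of the evaluation described above, the following is a minimal sketch of how the Dice Similarity Coefficient and sensitivity could be computed per tumor subregion. It is not the thesis's evaluation code. The label convention (0 = background, 1 = necrotic core, 2 = edema, 3 = enhancing tumor), the names `REGIONS` and `region_scores`, and the choice to score an empty region as 1.0 are all assumptions made for this example. Volumes are represented as flattened sequences of voxel labels.

```python
from typing import Iterable, Set, Tuple

# Assumed label convention (not stated in the abstract):
# 0 = background, 1 = necrotic core, 2 = edema, 3 = enhancing tumor.
REGIONS = {
    "WT": {1, 2, 3},  # whole tumor: all tumor labels
    "TC": {1, 3},     # tumor core: necrotic core + enhancing tumor
    "ET": {3},        # enhancing tumor only
}


def region_scores(
    pred: Iterable[int], truth: Iterable[int], labels: Set[int]
) -> Tuple[float, float]:
    """Return (Dice, sensitivity) for one subregion of a flattened volume."""
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        in_pred = p in labels
        in_truth = t in labels
        if in_pred and in_truth:
            tp += 1          # voxel correctly marked as part of the region
        elif in_pred:
            fp += 1          # predicted in region, but not in ground truth
        elif in_truth:
            fn += 1          # in ground truth, but missed by the prediction

    dice_den = 2 * tp + fp + fn
    sens_den = tp + fn
    # Assumption: a region absent from both masks counts as a perfect score.
    dice = 2 * tp / dice_den if dice_den else 1.0
    sensitivity = tp / sens_den if sens_den else 1.0
    return dice, sensitivity


# Toy example with eight voxels (hypothetical data, for illustration only).
truth = [0, 1, 2, 3, 3, 0, 2, 1]
pred = [0, 1, 2, 3, 0, 0, 2, 2]

scores = {name: region_scores(pred, truth, labels)
          for name, labels in REGIONS.items()}
for name, (dice, sens) in scores.items():
    print(f"{name}: Dice={dice:.3f}, sensitivity={sens:.3f}")
```

Averaging the per-region Dice values over all validation cases would yield a single summary score of the kind reported above, although the exact averaging procedure used in the thesis is not specified in the abstract.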
