TY - EJOU AU - Mohammed, Isra AU - Musa, Mohamed Elhafiz M. AU - Elbashir, Murtada K. AU - Mostafa, Ayman Mohamed AU - Adam, Amin Ibrahim AU - Mahmood, A. AU - Faggad, Areeg S. TI - Addressing Class Overlap in Sonic Hedgehog Medulloblastoma Molecular Subtypes Classification Using Under-Sampling and SVD-Enhanced Multinomial Regression T2 - Computers, Materials \& Continua PY - 2025 VL - 84 IS - 2 SN - 1546-2226 AB - Sonic Hedgehog Medulloblastoma (SHH-MB) is one of the four primary molecular subgroups of Medulloblastoma. It is estimated to be responsible for nearly one-third of all MB cases. Using transcriptomic and DNA methylation profiling techniques, new developments in this field determined four molecular subtypes for SHH-MB. SHH-MB subtypes show distinct DNA methylation patterns that allow their discrimination from overlapping subtypes and predict clinical outcomes. Class overlapping occurs when two or more classes share common features, making it difficult to distinguish them as separate. Using the DNA methylation dataset, a novel classification technique is presented to address the issue of overlapping SHH-MB subtypes. Penalized multinomial regression (PMR), Tomek links (TL), and singular value decomposition (SVD) were all smoothly integrated into a single framework. SVD and group lasso improve computational efficiency, address the problem of high-dimensional datasets, and clarify class distinctions by removing redundant or irrelevant features that might lead to class overlap. As a method to eliminate the issues of decision boundary overlap and class imbalance in the classification task, TL enhances dataset balance and increases the clarity of decision boundaries through the elimination of overlapping samples. Using fivefold cross-validation, our proposed method (TL-SVDPMR) achieved a remarkable overall accuracy of almost 95% in the classification of SHH-MB molecular subtypes. The results demonstrate the strong performance of the proposed classification model among the various SHH-MB subtypes given a high average of the area under the curve (AUC) values. Additionally, the statistical significance test indicates that TL-SVDPMR is more accurate than both SVM and random forest algorithms in classifying the overlapping SHH-MB subtypes, highlighting its importance for precision medicine applications. Our findings emphasized the success of combining SVD, TL, and PMR techniques to improve the classification performance for biomedical applications with many features and overlapping subtypes. KW - Class overlap; SHH-MB molecular subtypes; under-sampling; singular value decomposition; penalized multinomial regression; DNA methylation profiles DO - 10.32604/cmc.2025.063880