Paper · Nature ML · 06-13

Decision processes in 3D structural MRI schizophrenia classification evaluated with saliency maps

Introduction

Psychiatric disorders such as schizophrenia, depression, or anxiety disorders are characterized by high heterogeneity in symptoms and widespread structural as well as functional alterations of the brain[1](https://www.nature.com/articles/s41598-026-57667-z#ref-CR1 "Tandon, R. et al. The schizophrenia syndrome, circa 2024: What we know and how that informs its nature. Schizophr Res. 264, 1–28 (2024)."),[2](https://www.nature.com/articles/s41598-026-57667-z#ref-CR2 "McCutcheon, R. A., Keefe, R. S. E. & McGuire, P. K. Cognitive impairment in schizophrenia: aetiology, pathophysiology, and treatment. Mol. Psychiatry. 28, 1902–1918 (2023)."). Neuroimaging, especially (functional) magnetic resonance imaging ((f)MRI), provides information about those alterations and fosters insight into the pathologies of the disorders. Quantified alterations might also serve as biomarkers in a clinical setting[3](https://www.nature.com/articles/s41598-026-57667-z#ref-CR3 "Abi-Dargham, A. et al. Candidate biomarkers in psychiatric disorders: state of the field. World Psychiatry. 22, 236–262 (2023)."). Advances in the application of machine learning (ML) techniques to neuroimaging data show promise in clinical decision support for diagnosis or prognosis, therapy decision, and treatment development[4](https://www.nature.com/articles/s41598-026-57667-z#ref-CR4 "Chen, Z. S. et al. Modern views of machine learning for precision psychiatry. Patterns 3, 100602 (2022)."),[5](https://www.nature.com/articles/s41598-026-57667-z#ref-CR5 "Chen, J., Patil, K. R., Yeo, B. T. T. & Eickhoff, S. B. Leveraging Machine Learning for Gaining Neurobiological and Nosological Insights in Psychiatric Research. Biol. Psychiatry. 93, 18–28 (2023)."). Classical ML approaches for these tasks usually process extracted features based on biomarkers or other expert knowledge[6](https://www.nature.com/articles/s41598-026-57667-z#ref-CR6 "Zang, J. et al. Effects of Brain atlases and machine learning methods on the discrimination of schizophrenia patients: A multimodal MRI study. Front. Neurosci. 15, 697168 (2021)."),[7](https://www.nature.com/articles/s41598-026-57667-z#ref-CR7 "Tavakoli, H., Rostami, R., Shalbaf, R. & Nazem-Zadeh, M. R. Diagnosis of schizophrenia and its subtypes using MRI and machine learning. Brain Behav. 15, e70219 (2025)."). Deep learning (DL) models circumvent this a priori selection by incorporating feature selection mechanisms operating on the data either in a rich, high-dimensional feature space or in its original form[8](https://www.nature.com/articles/s41598-026-57667-z#ref-CR8 "Di Camillo, F. et al. Magnetic resonance imaging–based machine learning classification of schizophrenia spectrum disorders: a meta-analysis. Psychiatry Clin. Neurosci. 78, 732–743 (2024)."),[9](https://www.nature.com/articles/s41598-026-57667-z#ref-CR9 "Zhang, J. et al. Detecting schizophrenia with 3D structural brain MRI using deep learning. Sci. Rep. 13, 14433 (2023)."),[10](https://www.nature.com/articles/s41598-026-57667-z#ref-CR10 "Smucny, J., Shi, G. & Davidson, I. Deep learning in neuroimaging: overcoming challenges with emerging approaches. Front Psychiatry 13, 912600 (2022)."). Recent improvements in the field of deep learning, e.g., specialized convolutional neural network (CNN) architectures for medical images, enable effective detection of complex structural alterations[8](https://www.nature.com/articles/s41598-026-57667-z#ref-CR8 "Di Camillo, F. et al. Magnetic resonance imaging–based machine learning classification of schizophrenia spectrum disorders: a meta-analysis. Psychiatry Clin. Neurosci. 78, 732–743 (2024)."),[11](https://www.nature.com/articles/s41598-026-57667-z#ref-CR11 "Sadeghi, et al. An overview of artificial intelligence techniques for diagnosis of schizophrenia based on magnetic resonance imaging modalities: Methods, challenges, and future works. Comput. Biol. Med. 105554, 105554 (2022)."),[12](https://www.nature.com/articles/s41598-026-57667-z#ref-CR12 "Rakić, M., Cabezas, M., Kushibar, K., Oliver, A. & Lladó, X. Improving the detection of autism spectrum disorder by combining structural and functional MRI information. NeuroImage Clin. 102181, 102181 (2020)."),[13](https://www.nature.com/articles/s41598-026-57667-z#ref-CR13 "Sarveswaran, T. & Rajangam, V. An ensemble approach using multidimensional convolutional neural networks in wavelet domain for schizophrenia classification from sMRI data. Sci. Rep. 1025710257. (2025)."). However, those methods come with their own impediments. The complexity of deep learning architectures makes them data-greedy, requiring large data sets for training. Medical image data sets are several orders of magnitude smaller than the image data sets that brought deep-learning-based image analysis its breakthrough[14](https://www.nature.com/articles/s41598-026-57667-z#ref-CR14 "Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, vol. 25 (Curran Associates, Inc., 2012)."). Counter-intuitively, small data sets can yield very good performance, which is frequently a sign of overfitting to the specific data set[12](https://www.nature.com/articles/s41598-026-57667-z#ref-CR12 "Rakić, M., Cabezas, M., Kushibar, K., Oliver, A. & Lladó, X. Improving the detection of autism spectrum disorder by combining structural and functional MRI information. NeuroImage Clin. 102181, 102181 (2020)."),[15](https://www.nature.com/articles/s41598-026-57667-z#ref-CR15 "Eitel, F., Schulz, M.-A., Seiler, M., Walter, H. & Ritter, K. Promises and pitfalls of deep neural networks in neuroimaging-based psychiatric research. Exp. Neurol. 113608, 113608 (2021)."). This problem can be addressed with transfer learning, i.e., using a larger, less specific data set or employing weights from related fields of application for training and only re-adjusting some of the downstream layers[16](https://www.nature.com/articles/s41598-026-57667-z#ref-CR16 "Zhuang, F. et al. A comprehensive survey on transfer learning. Proc. IEEE 76, 43–76 (2021).").

Still, classical DL models lack an innate ability to explain their decision processes or require a specific, ante-hoc model design to capture human-understandable information and high-quality data patterns alike. In order to illuminate the behavior of pre-existing models, post-hoc explainable artificial intelligence (XAI) methods allow the clarification of decisions in DL-based clinical decision support systems. Although those methods are known to strengthen the trust of all stakeholders in such systems, post-hoc XAI methods are occasionally used but not yet standard practice in medical DL research[17](https://www.nature.com/articles/s41598-026-57667-z#ref-CR17 "Allgaier, J., Mulansky, L., Draelos, R. L. & Pryss, R. How does the model make predictions? A systematic literature review on the explainability power of machine learning in healthcare. Artif. Intell. Med. 102616, 102616 (2023)."),[18](https://www.nature.com/articles/s41598-026-57667-z#ref-CR18 "Qian, J., Li, H., Wang, J. & He, L. Recent advances in explainable artificial intelligence for magnetic resonance imaging. Diagnostics 13, 1571 (2023)."). When they are deployed, one of the most commonly used techniques is the image-based Gradient-Weighted Class Activation Mapping (Grad-CAM)[19](https://www.nature.com/articles/s41598-026-57667-z#ref-CR19 "Selvaraju, R. R. et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In IEEE International Conference on Computer Vision (ICCV), 618–626 (IEEE Xplore, 2017). https://doi.org/10.1109/ICCV.2017.74"). This local method generates saliency maps for individual test images with respect to a specific output class. Two recent studies used Grad-CAM saliency maps to evaluate the plausibility of their schizophrenia classifiers[20](https://www.nature.com/articles/s41598-026-57667-z#ref-CR20 "Hu, M. et al. Structural and diffusion MRI based schizophrenia classification using 2D pretrained and 3D naive convolutional neural networks. Schizophr. Res. 341, 330–341 (2022)."),[21](https://www.nature.com/articles/s41598-026-57667-z#ref-CR21 "Wen, Y. et al. Bridging structural MRI with cognitive function for individual level classification of early psychosis via deep learning. Front Psychiatry 13, 1075564 (2023)."). The advantage of Grad-CAM is its explanation at the individual level, which provides transparent and intuitive information in a diagnostic or prognostic clinical setting. These local explanations, however, cannot provide reliable information on recurring, consistent patterns in the dataset, let alone in the disease as a whole. To derive this kind of generalized explanation, information across multiple patient maps has to be aggregated.
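To make the Grad-CAM idea concrete, the following minimal sketch shows only the combination step that turns feature maps and their gradients into one saliency map. It is an illustration under stated assumptions, not the implementation used in this study: the function name `grad_cam`, the flattened voxel layout, and the toy numbers are all hypothetical, and the activations and gradients would in practice come from a chosen convolutional layer of a trained network.

```python
from typing import List


def grad_cam(activations: List[List[float]], gradients: List[List[float]]) -> List[float]:
    """Combine feature maps into one Grad-CAM saliency map.

    activations[k][i]: value of feature map k at (flattened) voxel i.
    gradients[k][i]:   gradient of the class score with respect to that value.
    """
    n_voxels = len(activations[0])
    # Channel weight = gradient averaged over all spatial positions.
    weights = [sum(g) / len(g) for g in gradients]
    cam = []
    for i in range(n_voxels):
        value = sum(w * a[i] for w, a in zip(weights, activations))
        cam.append(max(0.0, value))  # ReLU keeps only evidence for the class
    return cam


# Hypothetical toy input: two feature maps over four voxels.
acts = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 3.0, 1.0]]
grads = [[0.5, 0.5, 0.5, 0.5], [-1.0, -1.0, -1.0, -1.0]]
toy_cam = grad_cam(acts, grads)
```

Because the map is computed per image and per output class, each call explains one decision only, which is why an aggregation step across subjects is needed for dataset-level statements.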

This work explores the necessity of transparent clinical AI decision support and evaluates the practicality of XAI methods, more specifically saliency maps derived with Grad-CAM, for providing this transparency. Furthermore, we offer an approach to derive neuroanatomical biomarker candidates of a psychiatric disorder across patient saliency maps (Fig. 1). In the first stage (classification), we train and evaluate seven DL architectures frequently used in the field of medical image processing to separate 3D MR images of schizophrenia patients from healthy controls. Sequence 1 (Seq1, inspired by VGG16[22](https://www.nature.com/articles/s41598-026-57667-z#ref-CR22 "Simonyan, K. & Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In 3rd International Conference on Learning Representations (ICLR 2015) 1–14 (Computational and Biological Learning Society, San Diego, 2015).")) and OhNet[23](https://www.nature.com/articles/s41598-026-57667-z#ref-CR23 "Oh, J., Oh, B. L., Lee, K. U., Chae, J. H. & Yun, K. Identifying schizophrenia using structural MRI with a deep learning algorithm. Front Psychiatry 11, 481509 (2020).") were trained from scratch. Med3D[24](https://www.nature.com/articles/s41598-026-57667-z#ref-CR24 "Chen, S., Ma, K. & Zheng, Y. Med3D: Transfer Learning for 3D Medical Image Analysis. (2019). http://arxiv.org/abs/1904.00625"), BrainID[25](https://www.nature.com/articles/s41598-026-57667-z#ref-CR25 "Liu, P., Puonti, O., Hu, X. & Alexander, D. C. Brain-ID: Learning Contrast-Agnostic Anatomical Representations for Brain Imaging. In Computer Vision – ECCV 2024 (ed. Leonardis, A.) 322–340 (Springer Nature Switzerland, 2024). https://doi.org/10.1007/978-3-031-73254-6_19."), RiekeNet[26](https://www.nature.com/articles/s41598-026-57667-z#ref-CR26 "Rieke, J., Eitel, F., Weygandt, M., Haynes, J.-D. & Ritter, K. Visualizing Convolutional Networks for MRI-Based Diagnosis of Alzheimer’s Disease. In Understanding and Interpreting Machine Learning in Medical Image Computing Applications (ed. Stoyanov, D.) 24–31 (Springer International Publishing, 2018). https://doi.org/10.1007/978-3-030-02628-8_3."), Mixed Convolution Network (Mixed Conv)[27](https://www.nature.com/articles/s41598-026-57667-z#ref-CR27 "Tran, D. et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6450–6459 (2018). https://doi.org/10.1109/CVPR.2018.00675"), and ResNet18[27](https://www.nature.com/articles/s41598-026-57667-z#ref-CR27 "Tran, D. et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6450–6459 (2018). https://doi.org/10.1109/CVPR.2018.00675") are publicly available architectures pre-trained on diverse training sets. We hypothesize that classifiers extracting physiological correlates of the disorder should localize anatomically plausible features from the images and eventually converge across architectures. In the second stage (local explanation), we evaluate the performance of the architectures with regard to the classification task, and their plausibility based on quantitative metrics derived from Grad-CAM saliency maps. The most suitable classifiers based on these evaluations are then selected for further investigation. In the third stage (global explanation), we first derive locations that differ robustly between clinical groups based on statistical comparisons within classification architectures. Second, we intersect those locations to derive robust regions across classifiers. Mapping the locations derived from the individual classifiers as well as their intersections onto the corresponding brain areas provides candidate regions for schizophrenia pathology and potential anatomical biomarkers. This approach constitutes a general method that allows the transition between local saliency map explanations and a global statistical evaluation indicating brain areas relevant to psychiatric biomarkers.

Results

Fig. 1

Three-stage process to explain the decision process in diagnosis classifiers with Grad-CAM and derive neuroanatomical underpinnings of the disorder. (1) Classification: training of seven different DL architecture types. Sequence 1 (Seq1, inspired by VGG16[22](https://www.nature.com/articles/s41598-026-57667-z#ref-CR22 "Simonyan, K. & Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In 3rd International Conference on Learning Representations (ICLR 2015) 1–14 (Computational and Biological Learning Society, San Diego, 2015).")) and OhNet[23](https://www.nature.com/articles/s41598-026-57667-z#ref-CR23 "Oh, J., Oh, B. L., Lee, K. U., Chae, J. H. & Yun, K. Identifying schizophrenia using structural MRI with a deep learning algorithm. Front Psychiatry 11, 481509 (2020).") were trained from scratch. Med3D[24](https://www.nature.com/articles/s41598-026-57667-z#ref-CR24 "Chen, S., Ma, K. & Zheng, Y. Med3D: Transfer Learning for 3D Medical Image Analysis. (2019). http://arxiv.org/abs/1904.00625"), BrainID, RiekeNet, Mixed Conv, and ResNet18 are publicly available architectures pre-trained on diverse training sets ranging from human motion clips (Mixed Conv, ResNet) over mixed or synthesized medical imaging modalities (Med3D, BrainID) to sMRI data for Alzheimer’s classification (RiekeNet). (2) Local explanation: plausibility check for all architecture types with evaluation of classification performance and three subject-specific Grad-CAM metrics (visual saliency map inspection, center of mass (CoM) deviation analysis, and mass accuracy as an estimation of Grad-CAM accuracy). (3) Global explanation: derivation of robust differences between patient and control class saliency maps for the two most promising network types. Detection of stable regions across subject Grad-CAMs within each architecture and across all architectures. _sMRI_ structural magnetic resonance imaging; _ReLU_ rectified linear unit; _Grad-CAM_ gradient-weighted class activation mapping; _ROC_ receiver-operating characteristic.

Classification performance

In the first stage (Fig. 1, classification), we trained seven commonly used CNN architectures for MRI image classification on structural T1-weighted MR images of 192 adult brains (101 schizophrenia patients and 91 healthy control subjects) from the MCIC collection[28](https://www.nature.com/articles/s41598-026-57667-z#ref-CR28 "Gollub, R. L. et al. The MCIC collection: A shared repository of multi-modal, multi-site brain image data from a clinical investigation of schizophrenia. Neuroinformatics 388, 367–388 (2013)."). The models were trained with 5-fold cross-validation, either from scratch (Seq1, OhNet) or with transfer learning (Med3D10, BrainID, RiekeNet, MixedConv, ResNet18). On average, all classifiers exhibited a stratified classification accuracy well above 70% (Fig. 2a) and area-under-the-curve (AUC) scores of more than 0.75 (Fig. 2b), without significant differences in accuracy between architectures (_F_(6, 28) = 0.371, _p_ = 0.891). These classification accuracies are well within the range expected from an ML classifier based on structural MRI data[8](https://www.nature.com/articles/s41598-026-57667-z#ref-CR8 "Di Camillo, F. et al. Magnetic resonance imaging–based machine learning classification of schizophrenia spectrum disorders: a meta-analysis. Psychiatry Clin. Neurosci. 78, 732–743 (2024).").
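The reported degrees of freedom are consistent with a one-way ANOVA over seven architectures with five cross-validation folds each: 7 − 1 = 6 between groups and 7 × 5 − 7 = 28 within groups. The sketch below illustrates this bookkeeping. It assumes a plain one-way ANOVA on per-fold accuracies (the exact test setup is not spelled out in the text above), and the accuracy values are hypothetical placeholders, not results from this study.

```python
from typing import List, Tuple


def one_way_anova(groups: List[List[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical fold accuracies (NOT the study's values): 7 architectures x 5 folds.
toy_accuracies = [[0.70 + 0.01 * ((a + f) % 5) for f in range(5)] for a in range(7)]
f_stat, df_between, df_within = one_way_anova(toy_accuracies)
```

A small _F_ value, as reported above, means that the variation between architectures is small relative to the variation between folds within an architecture.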

Fig. 2

Classification performance for all architecture types. (a) highest network accuracy per stratification run and (b) average receiver operating characteristic (ROC) curve with associated area under the curve (AUC), both collected in a 5-fold stratification process.

Local explanation: Individual saliency map evaluation

In order to make the decision process in the classifiers visible and quantifiable, we generate saliency maps with Grad-CAM for the individual test images. The intensity of a voxel in the saliency map scales with its contribution to the model’s decision. Two metrics are generated to quantify the plausibility of the classifier’s decision process. Mass accuracy (MA) is a ratio relating the accumulated saliency intensity within the brain to that outside the brain area[18](https://www.nature.com/articles/s41598-026-57667-z#ref-CR18 "Qian, J., Li, H., Wang, J. & He, L. Recent advances in explainable artificial intelligence for magnetic resonance imaging. Diagnostics 13, 1571 (2023).") (Fig. 3). We use the whole brain as the area of interest and do not constrain the location to areas known to be affected in schizophrenia, in order to avoid a bias towards the current literature. A high portion of attention outside of the brain area points towards overfitting or inconsistency of the learned feature representation. The center of mass (CoM) is the intensity-weighted average of the voxel positions of a saliency map. In order to describe the position of each group’s center in image space, the average vector length and its standard deviation (STD) over the test set are considered (Table 1). Uniformly distributed saliency would yield average CoMs close to the image center. Furthermore, a low standard deviation of the CoM across subjects’ saliency maps indicates convergence across persons in each clinical group and is hence preferred over broadly scattered saliency. The average CoM is used in conjunction with a qualitative assessment of the averaged saliency maps per CNN architecture since, e.g., two areas located point-symmetrically around the image center would also yield a central CoM.
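The two metrics described above can be sketched in a few lines. This is a minimal illustration, not the code used in this study: it assumes that mass accuracy is expressed as the fraction of total saliency inside a binary brain mask (one possible reading of the ratio), that saliency values are non-negative, and that the CoM is reported as an offset from the geometric image center. The function names and the toy volume are hypothetical.

```python
import math
from itertools import product
from typing import List, Tuple

Volume = List[List[List[float]]]  # indexed as volume[x][y][z]


def mass_accuracy(saliency: Volume, brain_mask: Volume) -> float:
    """Fraction of the total saliency that lies inside the brain mask (mask > 0)."""
    inside = total = 0.0
    for x, plane in enumerate(saliency):
        for y, row in enumerate(plane):
            for z, value in enumerate(row):
                total += value
                if brain_mask[x][y][z] > 0:
                    inside += value
    return inside / total if total > 0 else 0.0


def center_of_mass_offset(saliency: Volume) -> Tuple[Tuple[float, ...], float]:
    """Intensity-weighted mean voxel position relative to the image center, and its length."""
    shape = (len(saliency), len(saliency[0]), len(saliency[0][0]))
    total = 0.0
    weighted = [0.0, 0.0, 0.0]
    for x, y, z in product(*(range(n) for n in shape)):
        value = saliency[x][y][z]
        total += value
        for axis, coord in enumerate((x, y, z)):
            weighted[axis] += value * coord
    if total <= 0:
        raise ValueError("saliency map has no mass")
    center = [(n - 1) / 2 for n in shape]
    offset = tuple(w / total - c for w, c in zip(weighted, center))
    return offset, math.sqrt(sum(o * o for o in offset))


# Hypothetical 3 x 3 x 3 toy map: half of the saliency sits on the single brain voxel.
toy = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
toy[1][1][1] = 1.0
toy[2][1][1] = 1.0
mask = [[[0] * 3 for _ in range(3)] for _ in range(3)]
mask[1][1][1] = 1
toy_ma = mass_accuracy(toy, mask)
toy_offset, toy_length = center_of_mass_offset(toy)
```

Averaging the vector length over the test subjects of one clinical group and taking its standard deviation would then give the per-group summary values of the kind reported in Table 1.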

Fig. 3

Mass accuracy for all architectures, separated for the two clinical groups. Per architecture type, the Grad-CAM generation was conducted with the best performing example network and was based on at least 17 correctly classified, unseen test set images in every group. Data points outside the median ± 1.5*IQR are depicted as outliers.

Table 1 Average center of mass (µ, relative to image center) and its standard deviation (σ) within the two clinical groups schizophrenia (SZ) and control (C) for all architectures. Two-sided _t_-tests compare the vector lengths of the CoMs, measured from the corresponding image midpoint, between the schizophrenia and control groups (bold: significant at _p_ < 0.05).

Architecture selection for the global explanation stage is based on three criteria. First, the model’s attention should be mainly concentrated on the target, i.e., the brain area. Therefore, we accepted at most 50% of the attention being focused on non-biological image structures. As the networks were constructed with a single output node, the memorization of a single set of features is enforced during the learning procedure. We therefore accepted diverging performances for the two groups and were more lenient with the control group. The architectures Sequence 1, OhNet, Med3D, and RiekeNet fail to meet this criterion (Fig. 3) and are eliminated from further consideration. Second, the between-subject STD of the CoM should be at least an order of magnitude smaller than the image resolution (cf. Supplementary Table 1) to indicate convergence across subjects (Table 1). The three remaining architectures fulfil this criterion. Third, clearly distinguishable CoM distributions might indicate specific features for each group, which favours Sequence 1, Med3D10, BrainID, and ResNet (Table 1, bold) but excludes Mixed Conv. Taken together, the saliency maps generated by BrainID (Fig. 4a, b) and ResNet18 (Fig. 5a, b) remain as the ones plausible enough to warrant further study of their global characteristics.

Global explanation: Consistent saliency maps between groups within architectures

The third stage aims to identify consistent brain areas where high network saliency differs significantly between the two clinical groups. First, the two remaining architectures, BrainID (Fig. 4) and ResNet18 (Fig. 5), are considered separately. Brain areas in which saliency differs between the schizophrenia group (cf. Figs. 4a and 5a) and the control group (cf. Figs. 4b and 5b) indicate candidate regions for schizophrenia pathology (cf. Figs. 4c and 5c). For both architectures, we found one contiguous cluster that survived the correction for multiple comparisons (cf. Figs. 4d and 5d).
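The group comparison can be pictured as a mass-univariate test: one two-sample statistic per voxel across the subject saliency maps. The sketch below computes such an uncorrected _t_-map and a plain Bonferroni adjustment. It is an illustration only: the Welch variant of the _t_-statistic is an assumption (the test variant is not specified above), the function names and toy data are hypothetical, and threshold-free cluster enhancement, which a dedicated neuroimaging package would provide, is not reproduced here.

```python
import math
from statistics import mean, variance
from typing import List, Sequence


def voxelwise_t(patients: Sequence[Sequence[float]],
                controls: Sequence[Sequence[float]]) -> List[float]:
    """Welch t-statistic per voxel; rows are subjects, columns are flattened voxels."""
    t_map = []
    for voxel in range(len(patients[0])):
        a = [subject[voxel] for subject in patients]
        b = [subject[voxel] for subject in controls]
        se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
        # A voxel without variance in both groups carries no test statistic.
        t_map.append((mean(a) - mean(b)) / se if se > 0 else 0.0)
    return t_map


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each p-value by the number of tests, capped at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical toy data: three subjects per group, two voxels.
toy_patients = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
toy_controls = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
toy_t = voxelwise_t(toy_patients, toy_controls)
```

Positive values mark voxels with higher average saliency in the patient group, which corresponds to the direction of the contrast shown in the _t_-maps.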

Fig. 4

Saliency maps and statistical derivations of the BrainID classifier, overlaid on an exemplary patient image. Average network attention maps in the (a) patient and (b) control groups. (c) Uncorrected _t_-maps (two-sided) contrasting schizophrenia and control groups. (d) Negative log _p_-value maps after threshold-free cluster enhancement (TFCE)[29](https://www.nature.com/articles/s41598-026-57667-z#ref-CR29 "Smith, S. M. & Nichols, T. E. Threshold-free cluster enhancement: Addressing problems of smoothing, threshold dependence and localisation in cluster inference. NeuroImage 98, 83–98 (2009).") _p_-value correction and additional Bonferroni correction on the remaining voxels. The one contiguous cluster comprises 100,743 voxels. All maps were derived with image input of size 128 × 128 × 128 voxels and thereby result in saliency depictions of the same size. Note that all maps are rather coarse due to the convolutions in the classification networks.

Fig. 5

Saliency maps and statistical derivations of the ResNet18 classifier, overlaid on an exemplary patient image. Average network attention maps in the (a) patient and (b) control groups. (c) Uncorrected _t_-maps (two-sided) contrasting schizophrenia and control groups. (d) Negative log _p_-value maps after TFCE _p_-value correction and additional Bonferroni correction on the remaining voxels. The one contiguous cluster comprises 342,354 voxels. All maps were derived with image input of size 64 × 64 × 64 voxels and thereby result in saliency depictions of the same size. Note that all maps are rather coarse due to the convolutions in the classification networks.

Mapping the voxels with the top 1% _t_-values of these clusters to the anatomical regions of the AAL atlas[30](https://www.nature.com/articles/s41598-026-57667-z#ref-CR30 "Tzourio-Mazoyer, B. et al. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage 289, 273–289 (2002).") reveals predominantly frontal regions with a dominance of right-sided regions for both architectures (Fig. 6), even though the regions with the highest coverage for BrainID are dominated by left-sided regions. BrainID bilaterally includes further cortical and subcortical regions, while ResNet18 is restricted to frontal and subcortical regions in the right hemisphere. Subcortical regions are restricted to the right side in both architectures. The two architectures correspond in superior and inferior frontal regions of the right hemisphere.
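The mapping step can be sketched as follows: rank the cluster voxels by _t_-value, keep the top fraction, and look up each retained voxel in an atlas label volume that is aligned to the same space. The code below is a hedged illustration, not the pipeline of this study. In particular, it assumes that the 2% threshold refers to a region's share of the top voxels, which is only one possible reading; the function name, parameters, and toy labels are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def top_voxel_regions(t_values: Sequence[float], labels: Sequence[int],
                      top_fraction: float = 0.01,
                      min_share: float = 0.02) -> Dict[int, float]:
    """Share of the top-t voxels that falls into each atlas region.

    t_values[i] and labels[i] describe the same voxel; label 0 means 'no region'.
    """
    n_top = max(1, math.ceil(top_fraction * len(t_values)))
    ranked = sorted(range(len(t_values)), key=lambda i: t_values[i], reverse=True)
    counts = Counter(labels[i] for i in ranked[:n_top] if labels[i] != 0)
    shares = {region: count / n_top for region, count in counts.items()}
    # Assumption: regions holding less than min_share of the top voxels are dropped.
    return {region: share for region, share in shares.items() if share >= min_share}


# Hypothetical toy input: ten voxels, atlas labels 1 and 2, top half retained.
toy_t_values = [float(i) for i in range(10)]
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 2, 0]
toy_regions = top_voxel_regions(toy_t_values, toy_labels, top_fraction=0.5, min_share=0.3)
```

With real data, the integer labels would be translated to AAL region names through the atlas lookup table.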

Fig. 6

AAL atlas[30](https://www.nature.com/articles/s41598-026-57667-z#ref-CR30 "Tzourio-Mazoyer, B. et al. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage 289, 273–289 (2002).") regions (in MNI152 space[31](https://www.nature.com/articles/s41598-026-57667-z#ref-CR31 "Fonov, V. et al. Unbiased average age-appropriate atlases for pediatric studies. NeuroImage 54, 313–327 (2011).")) associated with the brain areas of the most robust difference between clinical groups for (a) BrainID and (b) ResNet18. For an association, at least one voxel of the top 1% voxels within the cluster derived after multiple comparison correction (Figs. 4d and 5d) must be present in the atlas region. Regions are thresholded at 2%.

Global explanation: Intersection across architectures

Consensual saliency across architectures potentially indicates similar learned features and thus strengthens the possibility of having found a disorder-relevant brain area. Even though some of the anatomical regions associated with the top 1% of voxels within each architecture overlap (Fig. 6), we do not find any top 1% voxels overlapping across the two architectures. Therefore, we relax our threshold for the intersection analysis and consider all voxels in the one contiguous cluster of each classifier (Figs. 4d and 5d). From this overlap, we again consider only the top 1% of voxels (Fig. 7a). Mapping those voxels to the AAL atlas again reveals predominantly frontal regions and a dominance of the right hemisphere (Fig. 7b). The two regions that are mapped for all three variants, i.e., the two individual architectures and the intersection, are the right superior frontal gyrus and the triangular part of the right inferior frontal gyrus.
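The relaxed intersection can be sketched as a set operation on voxel coordinates followed by a second top-fraction selection. This is a minimal illustration under explicit assumptions: both clusters are taken to be already resampled to a common space (the two networks operate on different input sizes), and shared voxels are ranked by the weaker of their two _t_-values, a choice made here for illustration because the ranking criterion within the overlap is not stated above. All names and toy values are hypothetical.

```python
from typing import Dict, Set, Tuple

Voxel = Tuple[int, int, int]


def top_overlap(t_a: Dict[Voxel, float], t_b: Dict[Voxel, float],
                top_fraction: float = 0.01) -> Set[Voxel]:
    """Intersect two significant clusters, then keep the strongest shared voxels.

    t_a and t_b map voxel coordinates in a COMMON space to the t-values of each cluster.
    """
    shared = set(t_a) & set(t_b)
    if not shared:
        return set()
    # Assumption: a shared voxel is only as strong as its weaker t-value.
    ranked = sorted(shared, key=lambda v: min(t_a[v], t_b[v]), reverse=True)
    n_top = max(1, round(top_fraction * len(ranked)))
    return set(ranked[:n_top])


# Hypothetical toy clusters with two shared voxels.
cluster_a = {(0, 0, 0): 5.0, (1, 0, 0): 3.0, (2, 0, 0): 4.0}
cluster_b = {(1, 0, 0): 6.0, (2, 0, 0): 2.0, (3, 0, 0): 9.0}
toy_shared_top = top_overlap(cluster_a, cluster_b, top_fraction=0.5)
```

Ranking by the minimum favours voxels that are strong in both classifiers; averaging the two _t_-values would be an equally defensible alternative.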

Fig. 7

Intersection of both network types. (a) Overlap of areas (in MNI152 space) with higher saliency in schizophrenia than control across the architectures ResNet18 and BrainID. (b) AAL atlas regions associated with regions displayed in a. For an association, at least one voxel of the top 1% voxels intersecting both architectures must be present in the atlas region. Regions are thresholded at 2%.

Discussion

In this three-stage approach, we demonstrate the necessity and feasibility of transparency in the decision process of DL-architectures for image-based decision support in psychiatry. Furthermore, we demonstrate the value of a local explainability method, which is a helpful tool for enriching individual decisions made by a DL-based clinical decision support system, in deriving global anatomical biomarkers for a psychiatric disorder such as schizophrenia.

Seven models, all based on DL architectures frequently used for medical image analysis and adapted for the analysis of 3D MRI images, achieve good classification performance with areas under the curve ranging from 0.75 to 0.85. The achieved accuracies correspond well to those of other ML models based on anatomical MRI for the task of schizophrenia diagnosis[8](https://www.nature.com/articles/s41598-026-57667-z#ref-CR8 "Di Camillo, F. et al. Magnetic resonance imaging–based machine learning classification of schizophrenia spectrum disorders: a meta-analysis. Psychiatry Clin. Neurosci. 78, 732–743 (2024)."). However, the evaluation of their saliency maps obtained with Grad-CAM reveals that the plausibility of the models varies considerably, with some models even basing their decisions primarily on areas outside of the brain.

Saliency maps are tools most often used for qualitative evaluation and visualisation of the decision process of classification models, whereas their quantitative evaluation is often considered unsuitable and is scarcely conducted[32](https://www.nature.com/articles/s41598-026-57667-z#ref-CR32 "Tjoa, E. & Guan, C. A survey on explainable artificial intelligence (XAI): Toward medical XAI. IEEE Trans. Neural Netw. Learn. Syst. 4813, 4793–4813 (2021)."),[33](https://www.nature.com/articles/s41598-026-57667-z#ref-CR33 "Zhang, Y. et al. Grad-CAM helps interpret the deep learning models trained to classify multiple sclerosis types using clinical brain magnetic resonance imaging. J. Neurosci. Methods 109098, 109098 (2021)."). In image classification tasks in which the face validity of the produced saliency map can be easily assessed visually or _via_ a well-defined ground truth area of interest, a quantitative evaluation might be either unnecessary or straightforward. The anatomical changes in schizophrenia, in contrast, are subtle, distributed[34](https://www.nature.com/articles/s41598-026-57667-z#ref-CR34 "van Erp, T. G. M. et al. Cortical brain abnormalities in 4474 individuals with schizophrenia and 5098 control subjects via the Enhancing Neuro Imaging Genetics Through Meta Analysis (ENIGMA) consortium. Biol. Psychiatry 84, 644–654 (2018)."),[35](https://www.nature.com/articles/s41598-026-57667-z#ref-CR35 "Picó-Pérez, M. et al. Multimodal meta-analysis of structural gray matter, neurocognitive and social cognitive fMRI findings in schizophrenia patients. Psychol. Med. 52, 614–624 (2022)."),[36](https://www.nature.com/articles/s41598-026-57667-z#ref-CR36 "Dabiri, M. et al. Neuroimaging in schizophrenia: A review article. Front Neurosci 16, 1042814 (2022)."),[37](https://www.nature.com/articles/s41598-026-57667-z#ref-CR37 "Keshavan, M. S. et al. Neuroimaging in schizophrenia. Neuroimaging Clin. N Am. 30, 73–83 (2020)."),[38](https://www.nature.com/articles/s41598-026-57667-z#ref-CR38 "Howes, O. D., Cummings, C., Chapman, G. E. & Shatalina, E. Neuroimaging in schizophrenia: an overview of findings and their implications for synaptic changes. Neuropsychopharmacol. Off Publ Am. Coll. Neuropsychopharmacol. 48, 151–167 (2023)."), and still an ongoing matter of research. A voxel-level ground truth is therefore not available. The quantitative metrics developed in this study are based on the rather coarse “region of interest” that includes the whole brain area, on the assumption that the saliency should not be uniformly distributed, and on the assumption that similar brain regions are affected in the majority of patients. Our approach can be generalized to other disorders characterized by subtle, complex brain alterations, as is typical for many psychiatric disorders. For biomarker discovery, further analyses that detect confined clusters of intensity can be a valuable addition to the analyses presented in this study.

The anatomical regions most frequently identified in reviews or meta-analyses of neuroimaging studies on schizophrenia patients[34](https://www.nature.com/articles/s41598-026-57667-z#ref-CR34 "van Erp, T. G. M. et al. Cortical brain abnormalities in 4474 individuals with schizophrenia and 5098 control subjects via the Enhancing Neuro Imaging Genetics Through Meta Analysis (ENIGMA) consortium. Biol. Psychiatry 84, 644–654 (2018)."),[35](https://www.nature.com/articles/s41598-026-57667-z#ref-CR35 "Picó-Pérez, M. et al. Multimodal meta-analysis of structural gray matter, neurocognitive and social cognitive fMRI findings in schizophrenia patients. Psychol. Med. 52, 614–624 (2022)."),[36](https://www.nature.com/articles/s41598-026-57667-z#ref-CR36 "Dabiri, M. et al. Neuroimaging in schizophrenia: A review article. Front Neurosci 16, 1042814 (2022)."),[37](https://www.nature.com/articles/s41598-026-57667-z#ref-CR37 "Keshavan, M. S. et al. Neuroimaging in schizophrenia. Neuroimaging Clin. N Am. 30, 73–83 (2020)."),[38](https://www.nature.com/articles/s41598-026-57667-z#ref-CR38 "Howes, O. D., Cummings, C., Chapman, G. E. & Shatalina, E. Neuroimaging in schizophrenia: an overview of findings and their implications for synaptic changes. Neuropsychopharmacol. Off Publ Am. Coll. Neuropsychopharmacol. 48, 151–167 (2023).") are frontal[39](https://www.nature.com/articles/s41598-026-57667-z#ref-CR39 "Mubarik, A. & Tohid, H. Frontal lobe alterations in schizophrenia: a review. Trends Psychiatry Psychother. 38, 198–206 (2016)."), temporal[40](https://www.nature.com/articles/s41598-026-57667-z#ref-CR40 "Kaur, A. et al. Structural and functional alterations of the temporal lobe in schizophrenia: A literature review. Cureus 12, e11177 (2020)."),[41](https://www.nature.com/articles/s41598-026-57667-z#ref-CR41 "Ohi, K. et al. Structural alterations of the superior temporal gyrus in schizophrenia: Detailed subregional differences. Eur. Psychiatry. 35, 25–31 (2016)."), and subcortical[42](https://www.nature.com/articles/s41598-026-57667-z#ref-CR42 "Okada, N. et al. Subcortical volumetric alterations in four major psychiatric disorders: a mega-analysis study of 5604 subjects and a volumetric data-driven approach for classification. Mol. Psychiatry. 28, 5206–5216 (2023).") regions. For the data set used in this study, frontal, temporal, and insular grey matter reduction was assessed with voxel-based morphometry[28](https://www.nature.com/articles/s41598-026-57667-z#ref-CR28 "Gollub, R. L. et al. The MCIC collection: A shared repository of multi-modal, multi-site brain image data from a clinical investigation of schizophrenia. Neuroinformatics 388, 367–388 (2013)."). In line with this, frontal regions are highlighted most prominently by the saliency maps of both our classifiers as well as their intersection. Beyond the right superior frontal gyrus and the triangular part of the right inferior frontal gyrus, however, the specific regions vary across classifiers. While the saliency map of BrainID highlights left superior and right temporal regions as well, the map of ResNet18 emphasizes the right insula and putamen. The insula has been discussed for its role in the progression of schizophrenia[43](https://www.nature.com/articles/s41598-026-57667-z#ref-CR43 "Kittleson, A. R., Woodward, N. D., Heckers, S. & Sheffield, J. M. The insula: Leveraging cellular and systems-level research to better understand its roles in health and schizophrenia. Neurosci. Biobehav Rev. 160, 105643 (2024)."),[44](https://www.nature.com/articles/s41598-026-57667-z#ref-CR44 "Kittleson, A. R. et al. A 2-year longitudinal investigation of insula subregional volumes in early psychosis. https://doi.org/10.1101/2024.11.25.24317916 (2024).") and has also been highlighted in another study utilizing saliency maps[20](https://www.nature.com/articles/s41598-026-57667-z#ref-CR20 "Hu, M. et al. Structural and diffusion MRI based schizophrenia classification using 2D pretrained and 3D naive convolutional neural networks. Schizophr. Res. 341, 330–341 (2022)."). The basal ganglia and related subcortical structures are affected in several psychiatric disorders, with the caudate nucleus and putamen, the regions highlighted in our results, being specifically involved in schizophrenia[42](https://www.nature.com/articles/s41598-026-57667-z#ref-CR42 "Okada, N. et al. Subcortical volumetric alterations in four major psychiatric disorders: a mega-analysis study of 5604 subjects and a volumetric data-driven approach for classification. Mol. Psychiatry. 28, 5206–5216 (2023)."). Indeed, the right caudate has been found to be enlarged in schizophrenia patients[34](https://www.nature.com/articles/s41598-026-57667-z#ref-CR34 "van Erp, T. G. M. et al. Cortical brain abnormalities in 4474 individuals with schizophrenia and 5098 control subjects via the Enhancing Neuro Imaging Genetics Through Meta Analysis (ENIGMA) consortium. Biol. Psychiatry 84, 644–654 (2018)."),[42](https://www.nature.com/articles/s41598-026-57667-z#ref-CR42 "Okada, N. et al. Subcortical volumetric alterations in four major psychiatric disorders: a mega-analysis study of 5604 subjects and a volumetric data-driven approach for classification. Mol. Psychiatry. 28, 5206–5216 (2023)."). The correspondence of the saliency maps from this study to findings in the current body of literature confirms the plausibility of our results and supports the approach introduced.

The saliency maps generated for this study lack precision and regional focus. This becomes especially apparent for the architecture types that were not analyzed further. One reason for this problem might be the insufficient performance and generalization of the classifiers. The small sample size can be one contributing factor to the inadequate generalization ability. However, even a high model performance does not guarantee that a model has captured a genuine underlying relationship[45](https://www.nature.com/articles/s41598-026-57667-z#ref-CR45 "Zhang, C., Bengio, S., Hardt, M., Recht, B. & Vinyals, O. Understanding deep learning requires rethinking generalization. In International Conference on Learning Representations (2017)."). Consistent with these findings, our CNN experiments achieve very similar accuracy scores despite producing highly divergent saliency maps for most models. Hence, implausible saliency maps might indicate memorization of the dataset.

Based on the lessons from the current study, future work might strive for region-focused saliency map generation by improving the classification performance, by constructing networks that additionally capture regions of interest (e.g., segmentation networks), or by including saliency-map-related metrics in the training cost function. Due to the inherent smoothing of the Grad-CAM method, the brain region mapping procedure conducted here might also be imprecise and thereby lead to slight distortions in region coverage.

To conclude, the generalizable approach employed in this work is a first step to enable the identification of regions of high relevance during the classification of pathologies by transitioning from local saliency explanations to accumulated global information. The anatomical findings of this study converge with findings of classical imaging studies on schizophrenia patients, giving the approach plausibility.

Methods

Data

The data used in this study was obtained from the MCIC collection[28](https://www.nature.com/articles/s41598-026-57667-z#ref-CR28 "Gollub, R. L. et al. The MCIC collection: A shared repository of multi-modal, multi-site brain image data from a clinical investigation of schizophrenia. Neuroinformatics 388, 367–388 (2013).") in July 2019. The collection contains structural T1-weighted MR images of 158 adult SCZ patients and 169 HC matched for demographics, age, and sex. Four research sites were involved in the data collection process from 2004 to 2006. All subjects provided informed consent to participate in the study, which was approved by the human research committees at each of the sites. Patients had to be diagnosed with SCZ conforming to the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition (DSM-IV)[46](https://www.nature.com/articles/s41598-026-57667-z#ref-CR46 "American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders: DSM-IV (American Psychiatric Association, 1994)."). We included only data from sites A, C, and D because the images originating from site B were not publicly released due to IRB restrictions. Furthermore, the data from ten subjects failed transformation to BIDS format[47](https://www.nature.com/articles/s41598-026-57667-z#ref-CR47 "Gorgolewski, K. J. et al. BIDS apps: Improving ease of use, accessibility, and reproducibility of neuroimaging data analysis methods. PLoS Comput. Biol. 13, e1005209 (2017).") due to missing meta-data, leaving a subset of 101 SCZ and 91 HC for this study (Table 2).

Table 2 Sample characteristics of the MRI data set per class (Schizophrenia, Control).

The brain images were skull-stripped and registered to MNI space with 1 mm³ isotropic voxels using the Nipype toolbox developed by Gorgolewski et al.[48](https://www.nature.com/articles/s41598-026-57667-z#ref-CR48 "Gorgolewski, K. et al. Nipype: A flexible, lightweight and extensible neuroimaging data processing framework in python. Front Neuroinform. 5, 13 (2011)."). Image intensities below the 1st and above the 99th percentile were removed, empty image slices were deleted, and the image intensity distribution was normalized to the range −1 to 1. Since training time and complexity rise with increasing image size, the images were down-scaled to 2 mm³ isotropic resolution, resulting in 64³ voxels per image. If pretrained network weights were used, the corresponding network requirements were recreated in order to mimic the original training circumstances as far as possible and to increase transferability of the pretrained network weights (Supplementary Table 1).
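
The intensity handling and down-scaling described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the pipeline used in the study: it assumes that "removing" extreme intensities means clipping at the 1st and 99th percentile, it replaces the unspecified down-scaling method with block averaging, it skips skull-stripping, registration, and empty-slice removal, and the function name `preprocess` and the synthetic input are hypothetical.

```python
import numpy as np

def preprocess(volume, factor=2):
    """Clip intensity outliers, rescale to [-1, 1], and downsample a 3D volume."""
    vol = np.asarray(volume, dtype=np.float64)
    # Assumption: extreme intensities are clipped at the 1st/99th percentile.
    lo, hi = np.percentile(vol, [1, 99])
    vol = np.clip(vol, lo, hi)
    # Linearly map the clipped intensity range onto [-1, 1].
    vol = 2.0 * (vol - lo) / (hi - lo) - 1.0
    # Assumption: downsample by averaging non-overlapping factor^3 blocks.
    d, h, w = (s - s % factor for s in vol.shape)
    vol = vol[:d, :h, :w]
    vol = vol.reshape(d // factor, factor, h // factor, factor, w // factor, factor)
    return vol.mean(axis=(1, 3, 5))

rng = np.random.default_rng(0)
scan = rng.normal(100.0, 20.0, size=(128, 128, 128))  # synthetic stand-in for a registered scan
out = preprocess(scan)
print(out.shape, out.min() >= -1.0, out.max() <= 1.0)
```

With a 128³ input and a factor of two, the sketch yields a 64³ volume whose values stay within the normalized range.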

Deep learning architectures

During this study, seven architecture types with and without pretrained weights were trained to distinguish schizophrenia patients from a control group. Sequence 1, a 3D CNN three convolutional blocks deep and inspired by the VGG-16 architecture[22](https://www.nature.com/articles/s41598-026-57667-z#ref-CR22 "Simonyan, K. & Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In 3rd International Conference on Learning Representations (ICLR 2015) 1–14 (Computational and Biological Learning Society, San Diego, 2015)."), was designed and trained from scratch (Supplementary Images 1). The second fully trained network, OhNet, is a reimplementation of one of the best performing 3D DL architectures in the field of schizophrenia classification[23](https://www.nature.com/articles/s41598-026-57667-z#ref-CR23 "Oh, J., Oh, B. L., Lee, K. U., Chae, J. H. & Yun, K. Identifying schizophrenia using structural MRI with a deep learning algorithm. Front Psychiatry 11, 481509 (2020).") with an adapted single output layer. In all pretrained architectures, the final network layer was replaced with two additional dense layers for information processing. The networks Med3D[24](https://www.nature.com/articles/s41598-026-57667-z#ref-CR24 "Chen, S., Ma, K. & Zheng, Y. Med3D: Transfer Learning for 3D Medical Image Analysis. (2019). http://arxiv.org/abs/1904.00625"), a pretrained ResNet-10 3D adaptation for medical image analysis, and RiekeNet[26](https://www.nature.com/articles/s41598-026-57667-z#ref-CR26 "Rieke, J., Eitel, F., Weygandt, M., Haynes, J.-D. & Ritter, K. Visualizing Convolutional Networks for MRI-Based Diagnosis of Alzheimer’s Disease. In Understanding and Interpreting Machine Learning in Medical Image Computing Applications (ed. Stoyanov, D.) 24–31 (Springer International Publishing, 2018). https://doi.org/10.1007/978-3-030-02628-8_3."), a four-layer network for Alzheimer’s detection, were fine-tuned without any further adaptations. In the case of BrainID fine-tuning, the U-Net encoder of the network was used, as suggested by the authors for any downstream task requiring brain feature extraction. To use the video processing networks MixedConv[27](https://www.nature.com/articles/s41598-026-57667-z#ref-CR27 "Tran, D. et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6450–6459 (2018). https://doi.org/10.1109/CVPR.2018.00675") and ResNet[27](https://www.nature.com/articles/s41598-026-57667-z#ref-CR27 "Tran, D. et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6450–6459 (2018). https://doi.org/10.1109/CVPR.2018.00675"), the first convolutional layer had to be replaced. As a suitable initialization, unadjusted weights were set to the average of the original convolutional weights. All architectures result in a single output node.

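One possible reading of the weight initialization for the replaced first convolutional layer is sketched below. It rests on assumptions the text does not confirm: that the video networks expect three input channels while the MRI volumes have one, and that "the average of the original convolutional weights" means averaging over the input-channel axis. The array shapes and the name `average_input_channels` are hypothetical.

```python
import numpy as np

def average_input_channels(weights):
    """Collapse the input-channel axis of 3D conv weights by averaging.

    weights: array of shape (out_channels, in_channels, kT, kH, kW)
    returns: array of shape (out_channels, 1, kT, kH, kW)
    """
    return weights.mean(axis=1, keepdims=True)

rng = np.random.default_rng(0)
rgb_weights = rng.normal(size=(64, 3, 3, 7, 7))  # hypothetical pretrained first-layer weights
gray_weights = average_input_channels(rgb_weights)
print(gray_weights.shape)
```

Under these assumptions, the new single-channel kernels keep the spatial filter shapes learned during pretraining, so the later pretrained layers receive features of a familiar kind.
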
Deep learning classifier training

During the training process, the network performance was enhanced as much as possible while keeping the training course as robust as possible. Every architecture type was trained with an AdamW optimizer using a batch size of nine images. If needed, the default weight decay was tuned to achieve a smoother loss progression. When applicable, learning rate and number of epochs were taken from the original publications and subsequently adapted manually. Every CNN architecture was trained for at least 20 epochs, until the validation loss converged. When pretrained weights were available, transfer learning was applied. Fine-tuning of pretrained network layers did not prove beneficial and hence was not conducted. As a network regularization, dropout within the last fully connected layers was used. The dropout ratio was increased until a decrease in the validation accuracy was registered. A detailed record of all hyperparameters can be found in Supplementary Table 2 of the supplemental material. Validation accuracies were measured using a stratified five-fold cross validation with an 80%/20% train/validation split. For each architecture, the model trained on the best performing fold was selected for the subsequent local and global analysis stages.
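
The stratified five-fold split can be illustrated with the standard library alone. The sketch below is an assumption about how such a split might be built (shuffle each class, then deal indices round-robin into folds); the function name `stratified_kfold` and the seed are hypothetical, and only the class counts are taken from the sample description.

```python
import random

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs that keep class proportions per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin into the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    for f in range(k):
        val = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, val

labels = [1] * 101 + [0] * 91  # 101 SCZ (1) and 91 HC (0)
splits = list(stratified_kfold(labels))
for train, val in splits:
    print(len(train), len(val), sum(labels[i] for i in val))
```

Each subject lands in exactly one validation fold, and every fold holds roughly 20% of each class, which matches the 80%/20% split per fold.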

Local explanation: saliency map generation

The saliency maps generated in this work were obtained by extending the original 2D Grad-CAM method proposed by Selvaraju et al.[19](https://www.nature.com/articles/s41598-026-57667-z#ref-CR19 "Selvaraju, R. R. et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In IEEE International Conference on Computer Vision (ICCV), 618–626 (IEEE Xplore, 2017). https://doi.org/10.1109/ICCV.2017.74") to our 3D MRI images. The generated saliency maps are class selective and applicable without further architecture adaptation, and therefore suitable for comparing and aggregating saliency information across a variety of CNN architectures. This is a prerequisite for subsequent model selection as well as for extracting the required global explanations. Since all networks examined in this study are constructed with a single output node, the control class is captured as a so-called counterfactual explanation. For this study, no normalization or intensity scaling of the generated saliency map values was applied. The resulting saliency maps were rescaled to the input image size using nearest neighbor interpolation. The saliency maps used for the further analysis were generated from the validation set of the best-performing fold for each respective CNN architecture. Since saliency maps based on incorrect image classifications might highlight regions that do not support a correct prediction, images with incorrect classification were generally excluded from the analysis.

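The core Grad-CAM computation carries over to three dimensions by averaging gradients over all three spatial axes. The NumPy sketch below assumes that the activations and gradients of the target layer have already been extracted from a network; the random arrays, the function names, and the upsampling factor are hypothetical. The counterfactual map follows the gradient negation described in the Grad-CAM paper, which may differ in detail from the implementation used in this study.

```python
import numpy as np

def grad_cam_3d(activations, gradients, counterfactual=False):
    """Grad-CAM on a 3D feature map.

    activations, gradients: arrays of shape (channels, D, H, W) holding the
    target layer output and the gradient of the output score w.r.t. it.
    """
    if counterfactual:
        gradients = -gradients  # highlight evidence against the modeled class
    # Channel weights: global average of the gradients over all spatial positions.
    weights = gradients.mean(axis=(1, 2, 3))
    # Weighted sum over channels, then ReLU to keep positive evidence only.
    cam = np.tensordot(weights, activations, axes=(0, 0))
    return np.maximum(cam, 0.0)

def upsample_nearest(cam, factor):
    """Nearest-neighbor upsampling by an integer factor along each axis."""
    for axis in range(cam.ndim):
        cam = np.repeat(cam, factor, axis=axis)
    return cam

rng = np.random.default_rng(0)
acts = rng.random((8, 4, 4, 4))        # stand-in activations
grads = rng.normal(size=(8, 4, 4, 4))  # stand-in gradients
cam_scz = upsample_nearest(grad_cam_3d(acts, grads), 16)
cam_hc = upsample_nearest(grad_cam_3d(acts, grads, counterfactual=True), 16)
print(cam_scz.shape, cam_hc.shape)
```

No normalization is applied to the maps, in keeping with the description above. Because the counterfactual map is the rectified negative of the same weighted sum, a voxel can be non-zero in at most one of the two maps.
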
Local explanation: saliency map evaluation

To assess the plausibility of individual decisions, several metrics were obtained. In a first visual assessment, individual saliency maps were averaged per CNN architecture and per class. A plausible saliency map should be concentrated on the brain area of the image. Ideally, the attention should be focused on confined areas to indicate the differences in the localization of attention between the schizophrenia and the control group.

In addition to the visual examination of the generated saliency maps, quantitative metrics such as the mass accuracy (MA) and the center of mass (CoM) were applied. The MA ascertains that plausible classification predictions are based on voxels located within the brain area. A high concentration of attention in other regions, e.g., the image borders, would suggest the presence of an unidentified bias. For the calculation of the MA, the saliency map is compared with a ground truth[18](https://www.nature.com/articles/s41598-026-57667-z#ref-CR18 "Qian, J., Li, H., Wang, J. & He, L. Recent advances in explainable artificial intelligence for magnetic resonance imaging. Diagnostics 13, 1571 (2023)."). The metric relates the amount of attention within the area of the ground truth to the sum of attention outside the region of interest. A conservative mask around the brain area including padding was chosen as a ground truth to account for attention blurring caused by the filter and pooling sizes of the last convolutional layer targeted by the Grad-CAM.
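
A small worked example may make the MA more concrete. The sketch assumes a common formulation, namely the share of the total saliency mass that falls inside the mask; the exact normalization used in the study is not spelled out here, and the toy mask and the name `mass_accuracy` are hypothetical.

```python
import numpy as np

def mass_accuracy(saliency, mask):
    """Fraction of the total saliency mass that lies inside the boolean mask."""
    saliency = np.asarray(saliency, dtype=np.float64)
    total = saliency.sum()
    if total == 0:
        return float("nan")  # undefined for an all-zero map
    return float(saliency[mask].sum() / total)

# Toy volume: a padded box stands in for the conservative brain mask.
mask = np.zeros((8, 8, 8), dtype=bool)
mask[1:7, 1:7, 1:7] = True

inside = np.zeros((8, 8, 8))
inside[3:5, 3:5, 3:5] = 1.0         # all attention inside the mask
border = np.zeros((8, 8, 8))
border[0, :, :] = 1.0               # all attention on one image border

print(mass_accuracy(inside, mask))  # 1.0
print(mass_accuracy(border, mask))  # 0.0
```

Values close to 1 indicate attention concentrated on the brain area, whereas values close to 0 point to attention on the background or image borders.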

As a second evaluation metric, the CoM of each saliency map was calculated in order to assess the differences between schizophrenia and control groups. When looking at a convex-shaped area of attention or a multicentered attention map, the CoM cannot capture the true nature of the distribution. Within a homogeneously spread attention map, the CoM would be concentrated in the image center. For this reason, the calculated CoM cannot be interpreted as a position of high network attention. Based on the aforementioned metrics, two CNN architectures were selected for further evaluation.
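
The two limitations of the CoM named above can be reproduced on toy maps. The following sketch is illustrative only; the helper `center_of_mass` and the 9³ toy volumes are hypothetical.

```python
import numpy as np

def center_of_mass(saliency):
    """Saliency-weighted mean voxel coordinate of a non-negative map."""
    saliency = np.asarray(saliency, dtype=np.float64)
    total = saliency.sum()
    grids = np.indices(saliency.shape)
    return tuple(float((g * saliency).sum() / total) for g in grids)

uniform = np.ones((9, 9, 9))
print(center_of_mass(uniform))   # (4.0, 4.0, 4.0): the image center

two_blobs = np.zeros((9, 9, 9))
two_blobs[1, 4, 4] = 1.0
two_blobs[7, 4, 4] = 1.0
com = center_of_mass(two_blobs)
print(com, two_blobs[4, 4, 4])   # (4.0, 4.0, 4.0) 0.0
```

In the second case the CoM falls exactly between the two attention centers, on a voxel that receives no attention at all, which is why the CoM is used here to compare groups rather than to locate attention.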

Global explanations: regions of stable network attention

Though saliency maps provide a good local explanation for individual decisions, they do not provide any insight into systematic, recurring features of the disease. Based on the assumption that structural brain alterations in schizophrenia would influence the classifiers' decisions and thereby cause stable attention patterns in the networks' saliency maps, we conducted a voxel-wise, two-sided t-test. The testing strategy thereby captures the global differences of saliency between schizophrenia patient images and control subjects.
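
A voxel-wise two-sample test statistic can be computed for all voxels at once by treating the subject axis as the sample axis. The sketch below uses Welch's t statistic as an assumption (the text does not state whether equal variances were assumed), works on synthetic maps with a planted group difference, and stops at the statistic itself; turning it into corrected p-values is the subject of the next step.

```python
import numpy as np

def voxelwise_welch_t(group_a, group_b):
    """Welch's t statistic per voxel for two stacks of maps (subjects first)."""
    a = np.asarray(group_a, dtype=np.float64)
    b = np.asarray(group_b, dtype=np.float64)
    var_a = a.var(axis=0, ddof=1) / a.shape[0]
    var_b = b.var(axis=0, ddof=1) / b.shape[0]
    return (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(var_a + var_b)

rng = np.random.default_rng(0)
scz_maps = rng.normal(0.0, 1.0, size=(20, 6, 6, 6))  # synthetic patient maps
hc_maps = rng.normal(0.0, 1.0, size=(18, 6, 6, 6))   # synthetic control maps
scz_maps[:, 2:4, 2:4, 2:4] += 3.0                    # planted group difference

t_map = voxelwise_welch_t(scz_maps, hc_maps)
peak = np.unravel_index(np.argmax(np.abs(t_map)), t_map.shape)
print(t_map.shape, peak)
```

The largest absolute t value lies inside the small cube where the group difference was planted, which is the kind of stable, group-level pattern the test is meant to expose.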

In order to define areas of high consistency and reliability, the multiple-comparison problem was tackled with a threshold-free cluster enhancement (TFCE) correction[29](https://www.nature.com/articles/s41598-026-57667-z#ref-CR29 "Smith, S. M. & Nichols, T. E. Threshold-free cluster enhancement: Addressing problems of smoothing, threshold dependence and localisation in cluster inference. NeuroImage 98, 83–98 (2009).") using 20,000 iterations. The resulting p-map was additionally Bonferroni corrected with an alpha of 0.0001 and reduced to the biggest connected cluster of voxels to ensure significance. Significant voxels are matched to locally corresponding AAL atlas regions[30](https://www.nature.com/articles/s41598-026-57667-z#ref-CR30 "Tzourio-Mazoyer, B. et al. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage 289, 273–289 (2002)."). The number of hits is counted per region. During hit counting, no minimal number of hits per region was set. Because the Grad-CAM signal originates from a convolution, the information gained through the method is not precisely localized. Consequently, not only the region associated with the actual voxel position was considered, but matches of neighboring voxels were also counted proportionately. This second hit is evenly distributed over all voxel-adjacent regions. For a more region-size sensitive interpretation, the hit coverage per region was calculated.
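
The second correction stage and the reduction to the biggest connected cluster can be sketched as follows. The example assumes that the Bonferroni correction divides alpha by the number of voxels tested and that clusters are defined by 6-connectivity (shared faces); neither detail is stated in the text. The TFCE step itself is not reproduced, and the toy p-map is hypothetical.

```python
import numpy as np
from collections import deque

def largest_cluster(mask):
    """Keep only the largest 6-connected cluster of True voxels."""
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros_like(mask)
    best = []
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    for start in zip(*np.nonzero(mask)):
        if seen[start]:
            continue
        seen[start] = True
        queue, cluster = deque([start]), []
        while queue:  # breadth-first flood fill from the seed voxel
            voxel = queue.popleft()
            cluster.append(voxel)
            for step in steps:
                nb = tuple(v + s for v, s in zip(voxel, step))
                inside = all(0 <= c < n for c, n in zip(nb, mask.shape))
                if inside and mask[nb] and not seen[nb]:
                    seen[nb] = True
                    queue.append(nb)
        if len(cluster) > len(best):
            best = cluster
    out = np.zeros_like(mask)
    for voxel in best:
        out[voxel] = True
    return out

p_map = np.ones((6, 6, 6))
p_map[1:3, 1:3, 1:3] = 1e-9   # a cluster of 8 significant voxels
p_map[5, 5, 5] = 1e-9         # an isolated significant voxel
alpha = 0.0001
significant = p_map < alpha / p_map.size  # assumed Bonferroni: alpha / number of tests
cluster = largest_cluster(significant)
print(int(significant.sum()), int(cluster.sum()))  # 9 8
```

The isolated voxel passes the threshold but is discarded by the cluster step, so only the contiguous block remains for the atlas mapping.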

In the search for reliable biomarkers, consensual regions with high and stable network attention would suggest a higher probability of a true underlying correlation within the data. Therefore, the most stable regions of high network attention after the two-stage error correction were intersected and mapped to the associated AAL atlas brain regions.
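
Intersecting the significant voxels of two classifiers and summarizing the result per atlas region might look like the sketch below. It uses a toy label volume in place of the AAL atlas, hypothetical region names, and plain hit counting without the proportional neighbor hits described earlier.

```python
import numpy as np

def region_coverage(mask, atlas, names):
    """Return {region: (hits, coverage)} for labeled regions touched by the mask."""
    result = {}
    for label, name in names.items():
        region = atlas == label
        hits = int(np.logical_and(mask, region).sum())
        if hits:
            result[name] = (hits, hits / int(region.sum()))
    return result

# Toy atlas with two hypothetical regions of 32 voxels each.
atlas = np.zeros((4, 4, 4), dtype=int)
atlas[:2] = 1
atlas[2:] = 2
names = {1: "region_A", 2: "region_B"}

mask_model_1 = np.zeros((4, 4, 4), dtype=bool)
mask_model_1[0:3, 0:2, 0:2] = True
mask_model_2 = np.zeros((4, 4, 4), dtype=bool)
mask_model_2[1:4, 0:2, 0:2] = True

consensus = mask_model_1 & mask_model_2  # voxels significant for both classifiers
print(region_coverage(consensus, atlas, names))
# {'region_A': (4, 0.125), 'region_B': (4, 0.125)}
```

Reporting coverage next to the raw hit count keeps large regions from dominating the ranking simply because they contain more voxels.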

Data availability

The MRI data used in this study is available from the MCIC collection upon request via COINS data sharing website (https://www.nitrc.org/projects/coins/). All pretrained network weights are provided as open-source models by the authors of the cited publications.

References

1. Tandon, R. et al. The schizophrenia syndrome, circa 2024: What we know and how that informs its nature. _Schizophr Res._ 264, 1–28 (2024).
2. McCutcheon, R. A., Keefe, R. S. E. & McGuire, P. K. Cognitive impairment in schizophrenia: aetiology, pathophysiology, and treatment. _Mol. Psychiatry_ 28, 1902–1918 (2023).
3. Abi-Dargham, A. et al. Candidate biomarkers in psychiatric disorders: state of the field. _World Psychiatry_ 22, 236–262 (2023).
4. Chen, Z. S. et al. Modern views of machine learning for precision psychiatry. _Patterns_ 3, 100602 (2022).
5. Chen, J., Patil, K. R., Yeo, B. T. T. & Eickhoff, S. B. Leveraging Machine Learning for Gaining Neurobiological and Nosological Insights in Psychiatric Research. _Biol. Psychiatry_ 93, 18–28 (2023).
6. Zang, J. et al. Effects of Brain atlases and machine learning methods on the discrimination of schizophrenia patients: A multimodal MRI study. _Front. Neurosci._ 15, 697168 (2021).
7. Tavakoli, H., Rostami, R., Shalbaf, R. & Nazem-Zadeh, M. R. Diagnosis of schizophrenia and its subtypes using MRI and machine learning. _Brain Behav._ 15, e70219 (2025).
8. Di Camillo, F. et al. Magnetic resonance imaging–based machine learning classification of schizophrenia spectrum disorders: a meta-analysis. _Psychiatry Clin. Neurosci._ 78, 732–743 (2024).
9. Zhang, J. et al. Detecting schizophrenia with 3D structural brain MRI using deep learning. _Sci. Rep._ 13, 14433 (2023).
10. Smucny, J., Shi, G. & Davidson, I. Deep learning in neuroimaging: overcoming challenges with emerging approaches. _Front Psychiatry_ 13, 912600 (2022).
11. Sadeghi, et al. An overview of artificial intelligence techniques for diagnosis of schizophrenia based on magnetic resonance imaging modalities: Methods, challenges, and future works. _Comput. Biol. Med._ 105554, 105554 (2022).
12. Rakić, M., Cabezas, M., Kushibar, K., Oliver, A. & Lladó, X. Improving the detection of autism spectrum disorder by combining structural and functional MRI information. _NeuroImage Clin._ 102181, 102181 (2020).
13. Sarveswaran, T. & Rajangam, V. An ensemble approach using multidimensional convolutional neural networks in wavelet domain for schizophrenia classification from sMRI data. _Sci. Rep._ 1025710257 (2025).
14. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In _Advances in Neural Information Processing Systems_, vol. 25 (Curran Associates, Inc., 2012).
15. Eitel, F., Schulz, M.-A., Seiler, M., Walter, H. & Ritter, K. Promises and pitfalls of deep neural networks in neuroimaging-based psychiatric research. _Exp. Neurol._ 113608, 113608 (2021).
16. Zhuang, F. et al. A comprehensive survey on transfer learning. _Proc. IEEE_ 76, 43–76 (2021).
17. Allgaier, J., Mulansky, L., Draelos, R. L. & Pryss, R. How does the model make predictions? A systematic literature review on the explainability power of machine learning in healthcare. _Artif. Intell. Med._ 102616, 102616 (2023).
18. Qian, J., Li, H., Wang, J. & He, L. Recent advances in explainable artificial intelligence for magnetic resonance imaging. _Diagnostics_ 13, 1571 (2023).
19. Selvaraju, R. R. et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In _IEEE International Conference on Computer Vision_ (ICCV), 618–626 (IEEE Xplore, 2017). https://doi.org/10.1109/ICCV.2017.74
20. Hu, M. et al. Structural and diffusion MRI based schizophrenia classification using 2D pretrained and 3D naive convolutional neural networks. _Schizophr. Res._ 341, 330–341 (2022).
21. Wen, Y. et al. Bridging structural MRI with cognitive function for individual level classification of early psychosis via deep learning. _Front Psychiatry_ 13, 1075564 (2023).
22. Simonyan, K. & Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In _3rd International Conference on Learning Representations_ (ICLR 2015) 1–14 (Computational and Biological Learning Society, San Diego, 2015).
23. Oh, J., Oh, B. L., Lee, K. U., Chae, J. H. & Yun, K. Identifying schizophrenia using structural MRI with a deep learning algorithm. _Front Psychiatry_ 11, 481509 (2020).
24. Chen, S., Ma, K. & Zheng, Y. Med3D: Transfer Learning for 3D Medical Image Analysis. (2019). http://arxiv.org/abs/1904.00625
25. Liu, P., Puonti, O., Hu, X. & Alexander, D. C. Brain-ID: Learning Contrast-Agnostic Anatomical Representations for Brain Imaging. In _Computer Vision – ECCV 2024_ (ed. Leonardis, A.) 322–340 (Springer Nature Switzerland, 2024). https://doi.org/10.1007/978-3-031-73254-6_19
26. Rieke, J., Eitel, F., Weygandt, M., Haynes, J.-D. & Ritter, K. Visualizing Convolutional Networks for MRI-Based Diagnosis of Alzheimer’s Disease. In _Understanding and Interpreting Machine Learning in Medical Image Computing Applications_ (ed. Stoyanov, D.) 24–31 (Springer International Publishing, 2018). https://doi.org/10.1007/978-3-030-02628-8_3
27. Tran, D. et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition. In _2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition_, 6450–6459 (2018). https://doi.org/10.1109/CVPR.2018.00675
28. Gollub, R. L. et al. The MCIC collection: A shared repository of multi-modal, multi-site brain image data from a clinical investigation of schizophrenia. _Neuroinformatics_ 388, 367–388 (2013).
29. Smith, S. M. & Nichols, T. E. Threshold-free cluster enhancement: Addressing problems of smoothing, threshold dependence and localisation in cluster inference. _NeuroImage_ 98, 83–98 (2009).
30. Tzourio-Mazoyer, B. et al. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. _NeuroImage_ 289, 273–289 (2002).
31. Fonov, V. et al. Unbiased average age-appropriate atlases for pediatric studies. _NeuroImage_ 54, 313–327 (2011).
32. Tjoa, E. & Guan, C. A survey on explainable artificial intelligence (XAI): Toward medical XAI. _IEEE Trans. Neural Netw. Learn. Syst._ 4813, 4793–4813 (2021).
33. Zhang, Y. et al. Grad-CAM helps interpret the deep learning models trained to classify multiple sclerosis types using clinical brain magnetic resonance imaging. _J. Neurosci. Methods_ 109098, 109098 (2021).
34. van Erp, T. G. M. et al. Cortical brain abnormalities in 4474 individuals with schizophrenia and 5098 control subjects via the Enhancing Neuro Imaging Genetics Through Meta Analysis (ENIGMA) consortium. _Biol. Psychiatry_ 84, 644–654 (2018).
35. Picó-Pérez, M. et al. Multimodal meta-analysis of structural gray matter, neurocognitive and social cognitive fMRI findings in schizophrenia patients. _Psychol. Med._ 52, 614–624 (2022).
36. Dabiri, M. et al. Neuroimaging in schizophrenia: A review article. _Front Neurosci_ 16, 1042814 (2022).
37. Keshavan, M. S. et al. Neuroimaging in schizophrenia. _Neuroimaging Clin. N Am._ 30, 73–83 (2020).
38. Howes, O. D., Cummings, C., Chapman, G. E. & Shatalina, E. Neuroimaging in schizophrenia: an overview of findings and their implications for synaptic changes. _Neuropsychopharmacol. Off Publ Am. Coll. Neuropsychopharmacol._ 48, 151–167 (2023).
39. Mubarik, A. & Tohid, H. Frontal lobe alterations in schizophrenia: a review. _Trends Psychiatry Psychother._ 38, 198–206 (2016).
40. Kaur, A. et al. Structural and functional alterations of the temporal lobe in schizophrenia: A literature review. _Cureus_ 12, e11177 (2020).
41. Ohi, K. et al. Structural alterations of the superior temporal gyrus in schizophrenia: Detailed subregional differences. _Eur. Psychiatry_ 35, 25–31 (2016).
42. Okada, N. et al. Subcortical volumetric alterations in four major psychiatric disorders: a mega-analysis study of 5604 subjects and a volumetric data-driven approach for classification. _Mol. Psychiatry_ 28, 5206–5216 (2023).
43. Kittleson, A. R., Woodward, N. D., Heckers, S. & Sheffield, J. M. The insula: Leveraging cellular and systems-level research to better understand its roles in health and schizophrenia. _Neurosci. Biobehav Rev._ 160, 105643 (2024).
44. Kittleson, A. R. et al. A 2-year longitudinal investigation of insula subregional volumes in early psychosis. https://doi.org/10.1101/2024.11.25.24317916 (2024).
45. Zhang, C., Bengio, S., Hardt, M., Recht, B. & Vinyals, O. Understanding deep learning requires rethinking generalization. In _International Conference on Learning Representations_ (2017).
46. American Psychiatric Association. _Diagnostic and Statistical Manual of Mental Disorders: DSM-IV_ (American Psychiatric Association, 1994).
47. Gorgolewski, K. J. et al. BIDS apps: Improving ease of use, accessibility, and reproducibility of neuroimaging data analysis methods. _PLoS Comput. Biol._ 13, e1005209 (2017).
48. Gorgolewski, K. et al. Nipype: A flexible, lightweight and extensible neuroimaging data processing framework in python. _Front Neuroinform._ 5, 13 (2011).

Funding

Open Access funding enabled and organized by Projekt DEAL. No funding.

Author information

Author notes

These authors contributed equally: Alexandra Reichenbach and Alexander Windberger.

Authors and Affiliations

Center for Machine Learning, Heilbronn University, Heilbronn, Germany

Julia Jelitzki & Alexandra Reichenbach

Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany

Julia Jelitzki & Alexandra Reichenbach

Faculty of Informatics, Heilbronn University, Heilbronn, Germany

Alexander Windberger

Authors

Julia Jelitzki
Alexandra Reichenbach
Alexander Windberger

Contributions

All authors designed the research; J.J. performed the research, analyzed the data, prepared the figures, and wrote the first draft of the manuscript; A.R. and A.W. supervised the research and edited the manuscript. All authors reviewed the final manuscript.

Corresponding author

Correspondence to Alexandra Reichenbach.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Jelitzki, J., Reichenbach, A. & Windberger, A. Decision processes in 3D structural MRI schizophrenia classification evaluated with saliency maps. _Sci Rep_ 16, 18362 (2026). https://doi.org/10.1038/s41598-026-57667-z

Received: 23 June 2025

Accepted: 09 June 2026

Published: 13 June 2026

Version of record: 13 June 2026

DOI: https://doi.org/10.1038/s41598-026-57667-z
