Interpretability

Interpretability Via Grad-CAMs, Eye-Tracking, and Odds Ratios

Interpretability refers to the ability to explain the mechanisms behind an AI system's decision-making so that clinicians can build trust in such systems. In our work, we have achieved interpretability via the following three methods:

  1. We visualized the regions of importance in medical images that AI models use to make classifications. Such visualization techniques include Gradient-Weighted Class Activation Maps (Grad-CAMs); an example Grad-CAM heatmap of a full OCT report is shown in the accompanying figure (red/yellow colors highlight the regions most important for classification, while blue/violet colors mark the least important). A minimal Grad-CAM sketch appears after this list. Read more here: Thakoor, K.A., Li, X., Tsamis, E., Sajda, P. and Hood, D.C. “Enhancing the Accuracy of Glaucoma Detection from OCT Probability Maps using Convolutional Neural Networks”. In 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 2036‑2040, 2019.
  2. We compared the eye movements of clinical experts diagnosing from medical images with the concepts/image regions used by deep learning systems to make classifications. When AI concepts align with expert eye movements, interpretability is enhanced and trust between human experts and AI systems grows; when they disagree, the AI has the potential to reveal novel features useful for clinical diagnosis. Going forward, we aim to train AI systems with expert eye movements to constrain and inform such systems, enhancing their efficiency, accuracy, and interpretability. A sketch of one way to compare fixation maps with Grad-CAM heatmaps appears after this list. Read more here: Thakoor, K., Koorathota, S., Hood, D., Sajda, P. “Robust and Interpretable Convolutional Neural Networks to Detect Glaucoma in Optical Coherence Tomography Images.” IEEE Transactions on Biomedical Engineering, 68(8), pp. 2456‑2466, August 2021. Early Access: https://ieeexplore.ieee.org/document/9286420, 8 December 2020.
  3. Using Fisher's exact test and 3D CNNs optimized for multi-modal input (OCT and OCTA), we classified the presence or absence of five key features associated with late stages of AMD. We then ranked the strength of association of these five clinical features with the presence of late-stage AMD (non-neovascular, or 'dry', AMD and neovascular, or 'wet', AMD). Experts and AI aligned across all five features when evaluating the strength of association with the occurrence of dry AMD. For wet AMD occurrence, however, experts and AI agreed only on the strength of association of CNV (Choroidal Neovascularization). The disagreement between AI and experts on the remaining features suggests the potential to discover new AMD features of importance, and motivates future studies on AI-based segmentation/localization of features beyond global detection of feature presence/absence. A sketch of the odds-ratio computation via Fisher's exact test appears after this list. Read more here: Thakoor, K.A., Yao, J., Bordbar, D., Moussa, O., Lin, W., Sajda, P., Chen, R. “A Multimodal Deep Learning System to Distinguish Late Stages of AMD and to Compare Expert vs. AI Ocular Biomarkers.” Scientific Reports, 12(1), pp. 1‑11, 2022.
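
As a concrete illustration of method 1, here is a minimal Grad-CAM sketch in PyTorch. The backbone (resnet18), hooked layer (layer4), and input size are illustrative assumptions; the papers above used CNNs tailored to OCT probability maps and reports.

```python
# Minimal Grad-CAM sketch (PyTorch). Model, layer, and input shape are
# illustrative assumptions, not the exact setup from the papers above.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None)  # stand-in CNN backbone
model.eval()

activations, gradients = {}, {}

def fwd_hook(module, inp, out):
    activations["feat"] = out.detach()        # feature maps from the hooked layer

def bwd_hook(module, grad_in, grad_out):
    gradients["feat"] = grad_out[0].detach()  # gradients w.r.t. those feature maps

# Hook the last convolutional block (the choice of layer is an assumption).
model.layer4.register_forward_hook(fwd_hook)
model.layer4.register_full_backward_hook(bwd_hook)

x = torch.randn(1, 3, 224, 224)              # placeholder for an OCT report image
logits = model(x)
class_idx = int(logits.argmax(dim=1))
logits[0, class_idx].backward()              # gradient of the predicted class score

# Grad-CAM: weight each feature map by its spatially averaged gradient, then ReLU.
weights = gradients["feat"].mean(dim=(2, 3), keepdim=True)   # (1, C, 1, 1)
cam = F.relu((weights * activations["feat"]).sum(dim=1))     # (1, h, w)
cam = F.interpolate(cam.unsqueeze(1), size=x.shape[2:],
                    mode="bilinear", align_corners=False)    # upsample to input size
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)     # normalize to [0, 1]
```

The normalized map can then be overlaid on the input image with a red-to-blue colormap to produce heatmaps like the one described above.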
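
For method 2, one simple way to quantify agreement between a model's Grad-CAM heatmap and an expert's eye movements is to smooth the fixations into a density map and compare the two maps directly. The metrics below (Pearson correlation and an IoU over the most salient pixels) and the smoothing width are assumptions for illustration, not necessarily the measures used in the paper.

```python
# Sketch: compare a Grad-CAM map with an expert fixation map.
# Fixation coordinates and similarity metrics are illustrative assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.stats import pearsonr

H, W = 224, 224
cam = np.random.rand(H, W)                    # placeholder Grad-CAM map in [0, 1]

# Build a fixation density map from hypothetical (row, col) fixation points.
fixations = [(60, 80), (100, 140), (62, 85)]
fix_map = np.zeros((H, W))
for r, c in fixations:
    fix_map[r, c] += 1.0
fix_map = gaussian_filter(fix_map, sigma=15)  # approximate foveal spread
fix_map /= fix_map.max() + 1e-8

# Metric 1: Pearson correlation between the flattened maps.
rho, _ = pearsonr(cam.ravel(), fix_map.ravel())

# Metric 2: IoU of the top 20% most salient pixels in each map.
cam_mask = cam >= np.quantile(cam, 0.8)
fix_mask = fix_map >= np.quantile(fix_map, 0.8)
iou = (cam_mask & fix_mask).sum() / (cam_mask | fix_mask).sum()

print(f"correlation: {rho:.3f}, top-20% IoU: {iou:.3f}")
```

High agreement on real data would indicate that the CNN attends to the regions experts fixate; low agreement flags either a model shortcut or a potentially novel diagnostic feature.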
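
For method 3, the association between a binary biomarker and a binary disease label can be quantified with Fisher's exact test on a 2x2 contingency table, which also yields the odds ratio used to rank feature-disease associations. The counts below are made up for illustration; the study used expert labels and CNN predictions for each of the five biomarkers.

```python
# Sketch: rank a biomarker's association with wet AMD via Fisher's exact test.
# The 2x2 counts are made up for illustration only.
from scipy.stats import fisher_exact

# Rows: feature present / absent; columns: wet AMD present / absent.
table = [[30, 5],
         [10, 55]]
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4g}")
```

Repeating this test per biomarker, once with expert-assigned labels and once with CNN predictions, gives two rankings of association strength whose agreement can be compared as described above.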