Beyond Prediction Accuracy: Target-Space Recovery Profiles for Evaluating Model-Brain Alignment
This paper introduces a framework for evaluating alignment between artificial vision models and the human visual cortex that goes beyond scalar prediction accuracy. Using repeated fMRI data from the Natural Scenes Dataset, the authors decompose brain response spaces into reproducible dimensions and measure which of these dimensions are recovered by model predictions. A key finding is that pretrained and randomly initialized models can achieve similar prediction accuracy while showing distinct recovery profiles, revealing that accuracy alone can mask fundamental model-brain mismatches. The framework also enables brain-to-brain comparisons as a diagnostic human reference baseline.
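To make the idea of a recovery profile concrete, here is a minimal sketch on synthetic data. It is not the authors' pipeline: the summary does not say how the reproducible dimensions are obtained, so the sketch assumes they are the top singular vectors of the repeat-averaged responses. The names `recovery_profile`, `column_corr`, and `make_toy_data` are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def column_corr(a, b):
    """Pearson correlation between matching columns of a and b."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    norms = np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0)
    return (a * b).sum(axis=0) / norms


def recovery_profile(rep1, rep2, pred, k):
    """Per-dimension reproducibility and recovery (assumed definitions).

    rep1, rep2: two repeats of measured responses, shape (stimuli, voxels).
    pred: model-predicted responses, same shape.
    k: number of target-space dimensions to keep.
    """
    mean_resp = (rep1 + rep2) / 2.0
    centered = mean_resp - mean_resp.mean(axis=0)
    # Assumption: target-space axes are the top-k right singular vectors.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axes = vt[:k].T  # shape (voxels, k)
    # Reproducibility: do the two repeats agree along each axis?
    reproducibility = column_corr(rep1 @ axes, rep2 @ axes)
    # Recovery: does the prediction track the measured response along each axis?
    recovery = column_corr(pred @ axes, mean_resp @ axes)
    return reproducibility, recovery


def make_toy_data(seed=0, n_stimuli=200, n_voxels=20):
    """Synthetic responses driven by two latent signals of unequal strength."""
    rng = np.random.default_rng(seed)
    latents = rng.standard_normal((n_stimuli, 2))
    loadings, _ = np.linalg.qr(rng.standard_normal((n_voxels, 2)))
    strengths = np.array([5.0, 2.0])
    signal = (latents * strengths) @ loadings.T
    rep1 = signal + 0.5 * rng.standard_normal(signal.shape)
    rep2 = signal + 0.5 * rng.standard_normal(signal.shape)
    full_pred = signal  # a model that captures both latent signals
    # A model that captures only the stronger latent signal.
    partial_pred = (latents[:, :1] * strengths[0]) @ loadings[:, :1].T
    return rep1, rep2, full_pred, partial_pred


if __name__ == "__main__":
    rep1, rep2, full_pred, partial_pred = make_toy_data()
    repro, rec_full = recovery_profile(rep1, rep2, full_pred, k=2)
    _, rec_part = recovery_profile(rep1, rep2, partial_pred, k=2)
    print("reproducibility:", repro)
    print("recovery (full model):", rec_full)
    print("recovery (partial model):", rec_part)
```

In this toy setup the partial model tracks the first dimension but not the second, which is the kind of per-dimension difference a single accuracy score would not show.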
Related events (8)
Joint Energy-Based Models Reveal a Generative-Discriminative Sweet Spot for Human-Aligned Vision
Researchers use Joint Energy-Based Models (JEMs) to isolate the effect of learning objective—independent of architecture, scale, and data—on human alignment in visual representations. By varying a single mixing coefficient between discriminative and generative training, they evaluate models across six human-alignment benchmarks and find that alignment peaks at intermediate points on the generative-discriminative continuum rather than at either extreme. The results suggest that hybrid objectives combining categorical structure from discriminative learning with input-structure sensitivity from generative learning yield the most human-like visual behavior. This challenges the framing of generative vs. discriminative as a binary choice for building human-aligned vision systems.
Label-Free Bias Identification in Vision Models via Gradient Probes on Concept Decompositions
This paper introduces a post-hoc, label-free method for identifying spurious correlations in frozen vision classifiers without requiring bias annotations, group labels, or retraining. The approach applies non-negative matrix factorization to intermediate activations to extract interpretable concept vectors, then ranks them using a gradient-based bias estimator derived from misclassified examples. On Colored MNIST, Waterbirds, and CelebA benchmarks, the method recovers known spurious cues and improves worst-group accuracy by up to 17.9 percentage points on Waterbirds by suppressing top-ranked concepts at inference time. Notably, the method surfaces decision-relevant directions that do not always coincide with annotated attributes, offering both an auditing tool and a debiasing handle for deployed models.
Topo-Omni: Topographic multimodal model discovers functionally selective brain regions consistent with human neuroimaging
Researchers introduce Topo-Omni, a topographic multimodal model that jointly represents visual, auditory, and language/cognitive processing on a single contiguous in-silico cortical sheet, built by fine-tuning a pretrained foundation model with a spatial smoothness objective. The model develops clusters consistent with human neuroimaging data, and driving or suppressing clusters selectively biases or impairs perception in ways that parallel human intervention studies. The authors use the model to screen for novel cortical networks in-silico and validate discoveries — including natural landscape and animal networks — in human neuroimaging data. The work bridges deep learning architectures and computational neuroscience, offering testable hypotheses about cortical organization.
Meta Introduces TRIBE v2: Predictive Foundation Model for Human Brain Activity
Meta AI has released TRIBE v2, a foundation model that predicts high-resolution fMRI brain activity in response to visual, auditory, and language stimuli. Trained on data from over 700 healthy volunteers, it achieves a 70x resolution increase over comparable models and supports zero-shot generalization to new subjects, languages, and tasks. The release includes model weights, codebase, a research paper, and an interactive demo under a CC BY-NC license. Meta positions the work as a bridge between neuroscience and AI development, enabling hypothesis testing without requiring human subjects in every experiment.
MC Dropout uncertainty estimation masks sub-region calibration failures in brain tumour segmentation
An arXiv preprint evaluates Monte Carlo Dropout for voxel-level uncertainty estimation in glioma segmentation on 126 BraTS21 patients, comparing a pretrained SegResNet with a locally trained UNet-Res. While global uncertainty-error alignment is strong (AUROC ~0.97), the study finds that UNet-Res exhibits near-zero entropy and an expected calibration error (ECE) of 0.915 on the enhancing tumour sub-region despite a Dice of only 0.714, a severe miscalibration that standard Dice and AUROC metrics do not reveal. The paper argues that sub-region-specific calibration assessment is necessary for clinical safety and cannot be replaced by aggregate metrics alone.
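For readers unfamiliar with the calibration metric mentioned above, the following is a minimal sketch of the standard equal-width-bin expected calibration error. The summary does not state the paper's binning or how voxels were selected, so the bin count and the toy inputs are assumptions, and `expected_calibration_error` is a hypothetical helper name.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-size-weighted average of |accuracy - mean confidence|.

    confidences: predicted probabilities in [0, 1] for the chosen class.
    correct: booleans, True where the prediction was right.
    n_bins: number of equal-width confidence bins (assumed, not from the paper).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        raise ValueError("at least one prediction is required")

    # Assign each prediction to a bin; a confidence of 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # Toy case: very confident predictions that are mostly wrong.
    overconfident = expected_calibration_error([0.99] * 10, [True] + [False] * 9)
    print(f"overconfident ECE: {overconfident:.2f}")
```

A model that is nearly always certain but often wrong scores a high ECE here, which is the pattern the summary describes for the enhancing tumour sub-region.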
MedFocus: Causal Visual Attribution Framework for Chest X-ray Reasoning in Large Vision-Language Models
This paper addresses the faithfulness of visual attribution methods in Large Vision-Language Models (LVLMs) applied to chest X-ray (CXR) analysis. The authors develop a causal evaluation framework using counterfactual editing to verify whether expert-annotated regions are causally responsible for model predictions, testing 11 attribution methods across six open-source LVLMs. Finding that existing attribution methods frequently fail to identify the actual visual evidence used by models, they propose MedFocus, a concept-based attribution method using unbalanced optimal transport to localize anatomical regions and measure their causal effect on outputs. MedFocus substantially outperforms prior methods and provides spatial, concept-level, and token-level attributions.
Phase diagram framework for choosing between cross-modal alignment and prediction in multimodal learning
A new arXiv preprint develops a unified linear framework to determine when cross-modal alignment (CA) versus cross-modal prediction (CP) is the better objective for multimodal representation learning. Under a spiked signal-plus-noise model, the authors derive separation ratios that expose complementary failure modes for each paradigm, producing a four-regime phase diagram (Both, CA only, CP only, Neither). A data-driven procedure lets practitioners locate their dataset in this diagram using a small labeled subsample before committing to training. Experiments on synthetic data, stereo-vision, image-caption pairs, and astrophysical data validate the framework, including a 'Neither' regime where cross-modal training is actively harmful.
VLMs May Not Globally Enhance Human Alignment over LLMs During Natural Reading
This paper compares matched LLM and VLM pairs in a text-only setting to isolate the effect of multimodal training history on human-like language processing. Using whole-cortex fMRI and eye-tracking data from natural reading, the authors find that multimodal pretraining does not confer a uniform global advantage in human alignment. However, VLMs show selective advantages when sentences contain stronger visual semantic content, with converging evidence from both neural and behavioral measures. The findings suggest language-internal representations remain the primary driver of human text processing alignment.

