paper
Who Needs Labels? Adapting Vision Foundation Models With the Metadata You Already Have
More like this (12)
Vision-Language Models
Soft Label Supervision
LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories
Scaling LLM Reasoning from Minimal Labels: A Semi-Supervised Framework with a Lightweight Verifier
MLSkip: Data Skipping for ML Filters via Lightweight Metadata
Gaze Heads: How VLMs Look at What They Describe
Gaze Heads: How VLMs Look at What They Describe
Visual Geometry Foundation Models
DSR Foundation Model
Phantoms and Disclosures: a Causal Framework for Auditing Synthetic Data
Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories
Analytics-Everywhere-Lab
Recent events (1)
FINO: Label-free adaptation of vision foundation models using metadata in scientific domains
Researchers propose FINO, a self-supervised method for adapting vision foundation models to specialized scientific domains without task labels, using metadata as a guidance signal instead. The approach combines a standard self-supervised objective with flexible handling of both discrete and continuous metadata to preserve informative factors while suppressing spurious ones. Evaluated across subcellular fluorescence microscopy, Earth observation, wildlife monitoring, and medical imaging, FINO outperforms both unsupervised domain adaptation and fully supervised fine-tuning, including domain-specific state-of-the-art models.
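The summary above does not spell out FINO's actual objective, so the following is only a generic sketch of how a self-supervised loss could be combined with a metadata term that handles both discrete and continuous fields. Every name here (`combined_loss`, `cross_entropy`, `squared_error`, `weight`) is hypothetical, and the choice of cross-entropy for discrete fields and squared error for continuous ones is an assumption, not the authors' formulation. The sketch covers only the "use metadata as a guidance signal" part and does not attempt to model how spurious factors are suppressed.

```python
import math
from typing import Sequence, Tuple


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Cross-entropy for one discrete metadata field (e.g. a site or sensor ID)."""
    m = max(logits)  # subtract the max before exponentiating, for stability
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[target]


def squared_error(pred: float, target: float) -> float:
    """Squared error for one continuous metadata field (e.g. an acquisition depth)."""
    return (pred - target) ** 2


def combined_loss(
    ssl_loss: float,
    discrete_terms: Sequence[Tuple[Sequence[float], int]],
    continuous_terms: Sequence[Tuple[float, float]],
    weight: float = 0.5,
) -> float:
    """Hypothetical total: self-supervised loss plus a weighted metadata term.

    ssl_loss is assumed to come from any standard self-supervised objective;
    it is passed in as a plain number to keep the sketch self-contained.
    """
    meta = sum(cross_entropy(logits, t) for logits, t in discrete_terms)
    meta += sum(squared_error(p, t) for p, t in continuous_terms)
    return ssl_loss + weight * meta


# Usage: one discrete field with two classes, one continuous field.
total = combined_loss(
    ssl_loss=1.0,
    discrete_terms=[([0.0, 0.0], 0)],
    continuous_terms=[(0.5, 1.0)],
    weight=0.5,
)
print(round(total, 4))
```

Treating the two metadata types with separate per-field losses keeps the sketch agnostic to how many fields of each kind a dataset provides; the relative weighting would need tuning in practice.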