paper
Data Selection Through Iterative Self-Filtering for Vision-Language Settings
paper · active · provisional
data-selection-through-iterative-self-filtering-for-vision-language-settings-bfa887b3 · 1 event · first seen 43h ago
Aliases: Data Selection Through Iterative Self-Filtering for Vision-Language Settings
More like this (12)
Leveraging Audio-LLMs to Filter Speech-to-Speech Training Data
contrastive vision-language pretraining
TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies
Reroute, Don't Remove: Recoverable Visual Token Routing for Vision-Language Models
Vision-Language Models
visual language model
Modeling Complex Behaviors: Multi-Personality Composition and Dynamic Switching in Vision-Language Models
RECALL: Recovery Experience Collection for Active Lifelong Learning in Vision-Language-Action Models
LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories
Connecting Speech to Words through Images
Training-Free Semantic Correction for Autoregressive Visual Models
vision-language grounding
Recent events (1)
Self-Filtering: Iterative bootstrapped data selection for vision-language model training
Researchers propose Self-Filtering, a bootstrapped data curation method for vision-language models in which a CLIP model iteratively trains on and re-selects its own training data. The approach alternates between filtering high-confidence clean samples and preserving distributional diversity, without requiring curated reference datasets or pre-trained external models. Experiments show downstream performance improvements over standard noisy training pipelines.
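
The alternation described above can be illustrated with a small sketch. This is not the paper's implementation: the summary gives no scoring rule, thresholds, or diversity mechanism, so everything here is an assumption. Confidence is taken to be image-text cosine similarity, diversity is approximated by randomly retaining a share of lower-scoring pairs, and `train_and_embed`, `keep_frac`, and `diversity_frac` are hypothetical names introduced only for illustration.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def select_indices(
    scores: Sequence[float],
    keep_frac: float,
    diversity_frac: float,
    rng: random.Random,
) -> List[int]:
    """Keep the top-scoring pairs plus a random share of the remainder.

    ASSUMPTION: random retention stands in for "preserving distributional
    diversity"; the source does not say how diversity is preserved.
    """
    n_keep = round(keep_frac * len(scores))
    n_diverse = round(diversity_frac * n_keep)
    n_confident = n_keep - n_diverse
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    confident = order[:n_confident]
    rest = order[n_confident:]
    diverse = rng.sample(rest, min(n_diverse, len(rest)))
    return sorted(confident + diverse)


def self_filter(
    n_pairs: int,
    train_and_embed: Callable[[List[int]], Tuple[List[Vector], List[Vector]]],
    rounds: int = 3,
    keep_frac: float = 0.5,
    diversity_frac: float = 0.2,
    seed: int = 0,
) -> List[int]:
    """Alternate between training on the current subset and re-selecting it.

    `train_and_embed` is a hypothetical callable: it trains a model on the
    given indices and returns image and text embeddings for the FULL pool,
    so pairs rejected in one round can be re-admitted in a later one.
    """
    rng = random.Random(seed)
    selected = list(range(n_pairs))  # round 0: start from all noisy pairs
    for _ in range(rounds):
        image_embs, text_embs = train_and_embed(selected)
        scores = [cosine(i, t) for i, t in zip(image_embs, text_embs)]
        selected = select_indices(scores, keep_frac, diversity_frac, rng)
    return selected
```

In this sketch the model that scores the data is the same one trained on the previous selection, which is the sense in which the procedure needs no curated reference set or external scorer. A real pipeline would replace the stubbed `train_and_embed` with actual contrastive training and encoding.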