ImageNet100

dataset · active · provisional · imagenet100-ce140319 · 1 event · first seen 7h ago

Aliases: ImageNet100

Recent events (1)

5 · arXiv · cs.LG · 7h ago

Large-scale benchmarking finds dataset distillation methods fail to outperform coresets on ImageNet-scale tasks

A new arXiv paper critically evaluates seven state-of-the-art dataset distillation (DD) methods against coreset selection (CS) strategies using standardized protocols on ImageNet-1K, ImageNet100, and ImageNette. The results show that some DD methods fail to beat random subsets, and that state-of-the-art DD approaches perform comparably to or worse than coresets on large-scale datasets while incurring substantially higher construction costs. The paper also finds that coresets cover the original data distribution better in terms of representativeness and diversity, challenging the prevailing assumption that synthetic samples are inherently more expressive than real-data subsets.