MLSkip: Data Skipping for ML Filters via Lightweight Metadata
MLSkip: Data skipping for ML filter predicates using Parquet metadata and neural network verification
MLSkip introduces data skipping for ML-based filter predicates in databases, a problem that traditional min-max pruning does not address. The approach combines Parquet's existing min-max metadata with neural network verification techniques to prune row groups that cannot contain qualifying rows. On the TPC-H and TPC-DS benchmarks with ReLU architectures, the method achieves 27.4% average pruning effectiveness for low-selectivity filters, rising to 38.31% with a proposed 2D convex hull metadata structure, yielding a 1.07× end-to-end speedup in DuckDB over PyTorch.
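To make the idea of pruning with min-max metadata concrete, here is a minimal sketch. It is not taken from the paper: the summary does not say which verification technique MLSkip uses, so the sketch substitutes plain interval bound propagation, one of the simplest sound methods for ReLU networks. The predicate form `model(x) > threshold`, the function names, and the toy weights are all assumptions made for illustration. Each row group's per-column minimum and maximum define a box; if the upper bound of the network output over that box cannot exceed the threshold, no row in the group can qualify and the group can be skipped.

```python
from typing import List, Sequence, Tuple

# One dense layer: (weights[out][in], bias[out]). Hypothetical representation.
Layer = Tuple[List[List[float]], List[float]]


def affine_bounds(lo: Sequence[float], hi: Sequence[float],
                  weights: List[List[float]], bias: List[float]):
    """Propagate an axis-aligned box [lo, hi] through y = W x + b."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        l = h = b
        for w, x_lo, x_hi in zip(row, lo, hi):
            # A non-negative weight keeps the bound order; a negative one swaps it.
            if w >= 0:
                l += w * x_lo
                h += w * x_hi
            else:
                l += w * x_hi
                h += w * x_lo
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi


def output_bounds(lo: Sequence[float], hi: Sequence[float], layers: List[Layer]):
    """Sound (possibly loose) output bounds of a ReLU MLP over the input box."""
    for i, (weights, bias) in enumerate(layers):
        lo, hi = affine_bounds(lo, hi, weights, bias)
        if i < len(layers) - 1:  # ReLU on hidden layers only
            lo = [max(0.0, v) for v in lo]
            hi = [max(0.0, v) for v in hi]
    return lo, hi


def can_skip(col_min: Sequence[float], col_max: Sequence[float],
             layers: List[Layer], threshold: float = 0.0) -> bool:
    """True if no point in the row group's box can satisfy model(x) > threshold."""
    _, hi = output_bounds(col_min, col_max, layers)
    return hi[0] <= threshold


# Toy model (assumed): relu(x0 - x1) + relu(x1 - x0) - 3, i.e. |x0 - x1| - 3.
layers: List[Layer] = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-3.0]),
]

# Row group A: both columns in [0, 1]; the output upper bound is -1, so skip.
print(can_skip([0.0, 0.0], [1.0, 1.0], layers))   # True
# Row group B: x0 in [0, 10], x1 in [0, 1]; the upper bound is 8, so it must be read.
print(can_skip([0.0, 0.0], [10.0, 1.0], layers))  # False
```

The bounds are conservative: a row group is skipped only when the bound proves that no row can pass, so looser bounds cost pruning opportunities but never correctness. For row group A the true maximum output is -2, while the propagated bound is -1, which is still enough to skip it. A box also ignores correlations between columns, which is one plausible reason a tighter two-dimensional summary, such as the convex hull structure mentioned above, could prune more.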