
TPC-DS

benchmark · active · provisional · tpc-ds-aa6da011 · 1 event · first seen 14d ago

Aliases: TPC-DS

Recent events (1)

4 · arXiv · cs.LG · 14d ago

MLSkip: Data skipping for ML filter predicates using Parquet metadata and neural network verification

MLSkip introduces data skipping for ML-based filter predicates in databases, a case that traditional min-max pruning does not handle. The approach combines Parquet's existing min-max metadata with neural network verification techniques to prune row groups that cannot satisfy the predicate. On the TPC-H and TPC-DS benchmarks with ReLU architectures, the method reports 27.4% average pruning effectiveness for low-selectivity filters, rising to 38.31% with a proposed 2D convex hull metadata structure, and a 1.07× end-to-end speedup in DuckDB over PyTorch.
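
To make the pruning idea concrete, here is a minimal sketch using interval bound propagation, one common neural network verification technique. The summary does not say which verification method MLSkip uses, so this is an illustration under stated assumptions, not the paper's implementation. The function names, the toy weights, and the predicate form `model(x) > threshold` are all hypothetical. The per-column min and max of a row group define a box of possible inputs; if the network's upper output bound over that box cannot exceed the threshold, no row in the group can qualify and the group can be skipped.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Layer = Tuple[Matrix, List[float]]  # (weights[out][in], bias[out])


def affine_bounds(weights: Matrix, bias: Sequence[float],
                  lo: Sequence[float], hi: Sequence[float]):
    """Propagate an input box [lo, hi] through y = W x + b."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        low = high = b
        for w, x_lo, x_hi in zip(row, lo, hi):
            if w >= 0:
                low += w * x_lo
                high += w * x_hi
            else:
                # A negative weight flips which endpoint is extreme.
                low += w * x_hi
                high += w * x_lo
        out_lo.append(low)
        out_hi.append(high)
    return out_lo, out_hi


def output_bounds(layers: Sequence[Layer],
                  lo: Sequence[float], hi: Sequence[float]):
    """Sound (possibly loose) output bounds of a ReLU MLP over a box."""
    lo, hi = list(lo), list(hi)
    for i, (weights, bias) in enumerate(layers):
        lo, hi = affine_bounds(weights, bias, lo, hi)
        if i < len(layers) - 1:  # ReLU on hidden layers only
            lo = [max(0.0, v) for v in lo]
            hi = [max(0.0, v) for v in hi]
    return lo, hi


def can_skip_row_group(layers: Sequence[Layer],
                       col_min: Sequence[float], col_max: Sequence[float],
                       threshold: float) -> bool:
    """True if no row in the box can satisfy model(x) > threshold."""
    _, hi = output_bounds(layers, col_min, col_max)
    return hi[0] <= threshold


# Hypothetical toy model: 2 input columns, 2 hidden ReLU units, 1 output.
toy_model: List[Layer] = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),
    ([[1.0, 1.0]], [0.0]),
]

# Row-group statistics as they might be read from min-max metadata.
skip_a = can_skip_row_group(toy_model, [0.0, 0.0], [1.0, 1.0], threshold=2.0)
skip_b = can_skip_row_group(toy_model, [0.0, 0.0], [4.0, 4.0], threshold=2.0)
```

Because the bounds are conservative, a `True` result is always safe to act on, while `False` only means the group could not be ruled out and must still be scanned. A box over per-column min and max ignores correlations between columns, which is one plausible reason a tighter 2D structure such as a convex hull could prune more.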