Entity · benchmark

OpenML

benchmark · active · openml-65780468 · 1 event · first seen May 19, 2026

Aliases: OpenML

Co-occurring entities

Q-statistic · Greedy Ensemble Selection · Friedman-Nemenyi Test · Cascade Stacking · Tabular Foundation Models (TFMs)

More like this (12)

Core ML · Concrete ML · OpenMed · MLX · OpenVINO · Open Compute Project · ONNX · MLsys 2026 · ICML · MAML · OLMo · OpenMLE

Recent events (1)

5 · arXiv · cs.AI · May 19, 2026

Ensembling Tabular Foundation Models: A Diversity Ceiling and a Calibration Trap

This paper benchmarks six ensemble strategies across six tabular foundation models (TFMs) on 153 OpenML classification tasks and finds that ensembling provides minimal gains over the best single TFM. The best ensemble strategy, two-level cascade stacking, improves accuracy by only 0.18% at 253× the compute cost. A key finding is that stacking with a logistic-regression meta-learner improves accuracy while severely degrading calibration as measured by log-loss, because sharpening class boundaries destroys the probability estimates. The authors recommend greedy ensemble selection as the practical default.
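As a rough illustration of the recommended default, greedy ensemble selection can be sketched as forward selection with replacement over the base models' validation probabilities. The summary does not describe the paper's exact procedure, so this is a Caruana-style sketch under stated assumptions: the function names, the log-loss objective, the round limit, and the synthetic stand-in "models" are all hypothetical, and no OpenML data is used.

```python
import numpy as np


def log_loss(y_true, proba, eps=1e-12):
    """Mean negative log-likelihood of the true class."""
    p = np.clip(proba[np.arange(len(y_true)), y_true], eps, 1.0)
    return float(-np.mean(np.log(p)))


def greedy_ensemble_selection(val_probas, y_val, n_rounds=10):
    """Forward selection with replacement (assumed Caruana-style variant).

    val_probas: list of (n_samples, n_classes) arrays, one per base model.
    Returns (weights, validation log-loss of the averaged ensemble).
    """
    counts = np.zeros(len(val_probas), dtype=int)
    running_sum = np.zeros_like(val_probas[0], dtype=float)
    best_loss = np.inf
    for round_idx in range(n_rounds):
        # Score the ensemble that would result from adding each candidate.
        losses = [
            log_loss(y_val, (running_sum + p) / (round_idx + 1))
            for p in val_probas
        ]
        pick = int(np.argmin(losses))
        if losses[pick] >= best_loss:
            break  # stop once no candidate improves validation log-loss
        best_loss = losses[pick]
        counts[pick] += 1
        running_sum += val_probas[pick]
    return counts / counts.sum(), best_loss


# Synthetic stand-ins for base models (not real TFM outputs).
rng = np.random.default_rng(0)
n, k = 300, 3
y_val = rng.integers(0, k, size=n)


def fake_model(noise):
    """Noisy softmax predictions around the true labels (illustrative only)."""
    logits = 2.0 * np.eye(k)[y_val] + rng.normal(scale=noise, size=(n, k))
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


val_probas = [fake_model(s) for s in (1.0, 1.5, 2.0)]
weights, ens_loss = greedy_ensemble_selection(val_probas, y_val)
single_losses = [log_loss(y_val, p) for p in val_probas]
print("weights:", weights)
print("ensemble log-loss:", ens_loss, "best single:", min(single_losses))
```

Because the first round always picks the best single model and later rounds are accepted only if they lower the validation log-loss, the selected ensemble is never worse than the best single model on the validation set. Selecting on log-loss rather than accuracy is a choice made here to keep calibration in the objective; it is not taken from the paper.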

Evaluation and Benchmarking · Enterprise Deployment Patterns · Q-statistic · Greedy Ensemble Selection · Friedman-Nemenyi Test