Almanac
benchmark

Spider 2.0

benchmark · active · provisional · spider-2-0-b599f8b2 · 1 event · first seen 6d ago

Aliases: Spider 2.0

Recent events (1)

5 · arXiv · cs.AI · 6d ago

TAHOE: Error-driven hint learning system substantially improves Text-to-SQL on Spider 2.0

TAHOE is a Text-to-SQL system that treats prompt optimization as a dynamic data management problem: it builds a structured Hint Bank from compiler, execution, and user feedback without updating model parameters. On the Spider 2.0-Snow benchmark with GPT-5.5, it raises the pass rate from 61.95% to 79.42% and achieves 100% Snowflake syntax compliance, while reducing compiler-feedback rounds from 2.79 to 0.12. The learned Hint Bank also transfers to weaker models, yielding a 19.7 percentage-point gain on Doubao-2.0-lite. The approach targets the production deployment gap between Text-to-SQL prototypes and real-world database environments with strict dialects and large schemas.
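To make the Hint Bank idea concrete, here is a minimal Python sketch of one way such a store could work. It is not TAHOE's implementation: the `Hint` and `HintBank` classes, their fields, the substring-matching retrieval, and the sample error text are all assumptions made for illustration. The only figures taken from the summary are the two pass rates, used at the end to compute the gain in percentage points.

```python
from dataclasses import dataclass, field


@dataclass
class Hint:
    """One learned hint (hypothetical structure, not from the paper)."""
    trigger: str   # substring of a feedback message this hint addresses
    advice: str    # natural-language guidance to inject into the prompt
    source: str    # "compiler", "execution", or "user"
    hits: int = 0  # number of times the hint has been retrieved


@dataclass
class HintBank:
    """A toy hint store; the model's parameters are never touched."""
    hints: list[Hint] = field(default_factory=list)

    def learn(self, trigger: str, advice: str, source: str) -> Hint:
        # Skip exact duplicates so repeated errors do not bloat the bank.
        for h in self.hints:
            if h.trigger == trigger and h.advice == advice:
                return h
        hint = Hint(trigger, advice, source)
        self.hints.append(hint)
        return hint

    def retrieve(self, feedback: str) -> list[Hint]:
        # Assumed retrieval rule: case-insensitive substring match.
        matched = [h for h in self.hints if h.trigger.lower() in feedback.lower()]
        for h in matched:
            h.hits += 1
        return matched

    def build_prompt(self, question: str, hints: list[Hint]) -> str:
        # Prepend the selected hints to the question as plain text.
        lines = [f"- {h.advice}" for h in hints]
        return "Hints:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


# Pass-rate gain reported in the summary, in percentage points.
pass_rate_gain = round(79.42 - 61.95, 2)

if __name__ == "__main__":
    bank = HintBank()
    bank.learn(
        "invalid identifier",
        "Wrap mixed-case column names in double quotes.",
        "compiler",
    )
    # Illustrative feedback string, not a logged error from the benchmark.
    found = bank.retrieve("SQL compilation error: invalid identifier 'OrderId'")
    print(bank.build_prompt("How many orders were placed in 2023?", found))
    print(f"Pass-rate gain: {pass_rate_gain} percentage points")
```

In this sketch, hints learned from one failed query are reused on later prompts, which is the mechanism the summary credits for cutting compiler-feedback rounds. A production system would presumably need a richer retrieval scheme than substring matching.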