NP-Hard

technique · active · np-hard-739234e1 · 1 event · first seen 1mo ago

Aliases: NP-Hard

Recent events (1)

5 · Hugging Face Blog · 1mo ago

NPHardEval Leaderboard: Benchmarking LLM Reasoning via Computational Complexity Classes

The NPHardEval leaderboard evaluates large language models on reasoning tasks grouped by computational complexity class (P, NP-complete, and NP-hard), giving a structured framework for assessing algorithmic reasoning. The benchmark refreshes its problem instances dynamically to mitigate data contamination, a persistent weakness of static benchmarks. Results are hosted on Hugging Face and are intended to show how frontier models differ systematically as problems get computationally harder.
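To make the idea of dynamically refreshed problems concrete, here is a minimal Python sketch. It is an illustration only and is not NPHardEval's actual generator or scoring code. It assumes a small 0/1 knapsack task, and the function names (`make_knapsack_instance`, `optimal_value`, `score_answer`) are hypothetical. Each new seed yields a fresh instance, so a memorized answer to an earlier instance is of no use, while the checker stays the same.

```python
import random
from itertools import combinations

def make_knapsack_instance(seed, n_items=6):
    """Build a small 0/1 knapsack instance deterministically from a seed.

    Hypothetical generator: changing the seed stands in for a benchmark refresh.
    """
    rng = random.Random(seed)
    # Each item is a (weight, value) pair.
    items = [(rng.randint(1, 10), rng.randint(1, 20)) for _ in range(n_items)]
    capacity = sum(w for w, _ in items) // 2
    return items, capacity

def optimal_value(items, capacity):
    """Brute-force the best achievable value (exponential; tiny instances only)."""
    best = 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            if sum(w for w, _ in subset) <= capacity:
                best = max(best, sum(v for _, v in subset))
    return best

def score_answer(items, capacity, chosen_indices):
    """Return True if the chosen item indices are feasible and optimal."""
    if len(set(chosen_indices)) != len(chosen_indices):
        return False  # an item may be used at most once
    if any(i < 0 or i >= len(items) for i in chosen_indices):
        return False  # index out of range
    weight = sum(items[i][0] for i in chosen_indices)
    value = sum(items[i][1] for i in chosen_indices)
    return weight <= capacity and value == optimal_value(items, capacity)
```

In this sketch a model's reply would be parsed into a list of item indices and passed to `score_answer`; regenerating the instance with a different seed each evaluation round plays the role of the benchmark refresh.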