FrontisAI
frontisai-7c12a11b · 1 event · first seen 4h ago
Aliases: FrontisAI
Recent events (1)
EnterpriseClawBench: A benchmark for enterprise agents derived from real workplace sessions
Researchers introduce EnterpriseClawBench, an enterprise agent benchmark constructed from proprietary real-world workplace sessions, yielding 852 reproducible tasks with fixtures, prompts, role classes, skill subclasses, and semantic rubrics. Because the sessions contain internal enterprise content, the benchmark data is not publicly released; the reusable contribution is the construction and evaluation protocol. The best evaluated configuration (Codex with GPT-5.5) scores only 0.663, indicating substantial headroom. The paper argues that enterprise agent evaluation must report harness-model combinations, artifact delivery, visual quality, cost, runtime, and skill-transfer behavior rather than collapsing results into a single score.
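
To make the reporting argument concrete, here is a minimal Python sketch of what keeping results separate per harness-model combination could look like. It is not the paper's schema or tooling: the `TaskResult` fields, the `summarize` helper, and the `harness-a` / `model-x` labels are all hypothetical, and the numbers are placeholders rather than benchmark results. Visual quality and skill transfer are left out only to keep the sketch short.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TaskResult:
    """One task run. Field names are hypothetical, not the paper's schema."""
    harness: str
    model: str
    rubric_score: float       # semantic rubric score, assumed to lie in [0, 1]
    artifact_delivered: bool  # whether the requested artifact was handed back
    cost_usd: float
    runtime_s: float


def summarize(results):
    """Group runs by (harness, model) and report each dimension separately."""
    groups = defaultdict(list)
    for r in results:
        groups[(r.harness, r.model)].append(r)
    return {
        key: {
            "n_tasks": len(runs),
            "mean_rubric_score": mean(r.rubric_score for r in runs),
            "artifact_delivery_rate": mean(
                1.0 if r.artifact_delivered else 0.0 for r in runs
            ),
            "total_cost_usd": sum(r.cost_usd for r in runs),
            "mean_runtime_s": mean(r.runtime_s for r in runs),
        }
        for key, runs in groups.items()
    }


# Placeholder values for illustration only; these are not results from the paper.
runs = [
    TaskResult("harness-a", "model-x", 0.5, True, 0.25, 60.0),
    TaskResult("harness-a", "model-x", 1.0, False, 0.75, 120.0),
    TaskResult("harness-b", "model-x", 0.5, True, 0.5, 30.0),
]
report = summarize(runs)
for combo, metrics in report.items():
    print(combo, metrics)
```

Keying the report on the (harness, model) pair means the same model run under two harnesses produces two rows, and a high rubric score with a low artifact delivery rate stays visible instead of being averaged away.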