Entity · benchmark

SWE-Explore

benchmark · active · swe-explore-074f8545 · 1 event · first seen Jun 8, 2026

Aliases: SWE-Explore

Co-occurring entities

SWE-bench

More like this (12)

SWE-Interact, SWE-Pro, SWE-1.7, Open-SWE, SWE-Perf, FrontierSWE, SWE-Agent, SWE-Smith, DeepSWE, SWE-fficiency, SWE-Gym, SWE-bench

Recent events (1)

5 · arXiv · cs.CL · Jun 8, 2026

SWE-Explore: New benchmark isolates repository exploration capability in coding agents

SWE-Explore is a new benchmark targeting repository exploration as a distinct, fine-grained capability of coding agents, separate from end-to-end task resolution. It covers 848 issues across 10 programming languages and 203 open-source repositories, with line-level ground truth derived from successful agent trajectories. Evaluation across retrieval methods, coding agents, and specialized localizers finds that agentic explorers outperform classical retrieval, and that line-level coverage and efficient ranking remain the key differentiators at the frontier. The benchmark addresses a gap in SWE-bench-style evaluations that treat task resolution as a binary outcome.
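
To make "line-level coverage" and "ranking" concrete, here is a minimal sketch of how such scores could be computed. The summary does not give the benchmark's actual metric definitions, so the function names (`line_coverage`, `coverage_at_k`), the `(file, line)` representation, and the sample data are all illustrative assumptions rather than SWE-Explore's real scoring code.

```python
from typing import List, Set, Tuple

# Hypothetical representation: a location is (file path, 1-based line number).
Line = Tuple[str, int]

def line_coverage(predicted: List[Line], truth: Set[Line]) -> float:
    """Fraction of ground-truth lines that appear anywhere in the prediction."""
    if not truth:
        return 0.0
    return len(truth & set(predicted)) / len(truth)

def coverage_at_k(predicted: List[Line], truth: Set[Line], k: int) -> float:
    """Coverage counting only the first k entries of a ranked prediction."""
    return line_coverage(predicted[:k], truth)

# Made-up example data, not taken from the benchmark.
truth = {("src/parser.py", 10), ("src/parser.py", 11), ("src/utils.py", 42)}
ranked = [
    ("src/parser.py", 10),  # relevant
    ("README.md", 1),       # irrelevant
    ("src/utils.py", 42),   # relevant
    ("src/parser.py", 99),  # irrelevant
]

full = line_coverage(ranked, truth)      # 2 of 3 ground-truth lines found
top1 = coverage_at_k(ranked, truth, 1)   # 1 of 3 found in the first result
top3 = coverage_at_k(ranked, truth, 3)   # 2 of 3 found in the first three
```

Under these assumptions, an explorer that places relevant lines earlier reaches a given coverage at a smaller `k`, which is one way ranking efficiency could separate systems whose overall coverage is similar.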

Evaluation and Benchmarking · Agent and Tool Ecosystem · SWE-Explore · SWE-bench