Almanac
organization

Korea Proprietary AI Foundation Model Program

organization · active · provisional · korea-proprietary-ai-foundation-model-program-bffffc49 · 1 event · first seen 15d ago

Aliases: Korea Proprietary AI Foundation Model Program

Recent events (1)

5 · arXiv · cs.CL · 15d ago

K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts

K-BrowseComp is a new 400-problem benchmark for evaluating web-browsing agents in Korean-language contexts. It consists of a 300-problem manually validated subset and a 100-problem adversarially constructed synthetic split. Frontier models including GPT-5.5, DeepSeek-V4-Pro, and GLM-5.1 achieve only 30–46% on the verified subset, a significant drop from English BrowseComp performance, while Korean proprietary models score 0–10%. The benchmark exploits the asymmetry between the difficulty of creating a problem and the difficulty of solving it. The adversarial synthetic split caps the strongest model at 26%, positioning K-BrowseComp as a targeted stress test for agentic web-browsing capability.