Entity · product

pandapower

product · active · pandapower-4003aa33 · 1 event · first seen Jun 1, 2026

Aliases: pandapower

Co-occurring entities

Meta Llama 3.1 405B · Alibaba · Qwen3-Coder-480B-A35B-Instruct · API knowledge boundary probing · Meta · PowerCodeBench · proactive documentation injection

More like this (12)

power distribution · PandaOmics · PowerCodeBench · power-law scaling · G*Power · datacenter power delivery hierarchy · Digital Pantheon · page-agent · lightningpixel · TurboPuffer · power oversubscription · pydantic-ai

Recent events (1)

5 · arXiv · cs.CL · Jun 1, 2026

PowerCodeBench: Knowledge Boundary Probing and Intervention for LLM-Based Power System Code Generation

This paper introduces PowerCodeBench, an execution-validated benchmark for evaluating LLMs on power-system simulation code generation with the pandapower library. The authors find that failures are dominated by API-knowledge boundary errors (hallucinated function names, misused parameters) rather than by reasoning failures, and they propose a boundary-aware intervention that combines API demand estimation with targeted documentation injection. Evaluated on 2,000 tasks across ten open-weight models (1.5B–480B) and four commercial APIs, the intervention improves accuracy by 32–56 points while using only 41% of the baseline prompt-token cost. Open-weight models in the 70B–120B range match commercial mid-tier accuracy, with Llama-3.1-405B and Qwen3-Coder-480B leading.

Evaluation and Benchmarking · Open Weights Progress · pandapower · Meta Llama 3.1 405B · Alibaba · +7 more
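
To make the summary's two ideas concrete, here is a minimal Python sketch of what an API-knowledge boundary check and targeted documentation injection might look like. It is not the paper's method: the summary does not describe how demand estimation is done, so this sketch naively estimates demand by parsing a draft solution. The registry `API_DOCS`, the helper names, and the doc snippets are all hypothetical; only the six pandapower function names used as keys are real. `add_load` is a made-up stand-in for a hallucinated call.

```python
import ast

# Hypothetical mini-registry for this sketch: real pandapower function names
# mapped to short doc snippets written here for illustration only.
API_DOCS = {
    "create_empty_network": "create_empty_network(): return a new, empty network.",
    "create_bus": "create_bus(net, vn_kv, ...): add a bus with nominal voltage vn_kv (kV).",
    "create_ext_grid": "create_ext_grid(net, bus, ...): attach an external grid to a bus.",
    "create_load": "create_load(net, bus, p_mw, ...): add a load with active power p_mw (MW).",
    "create_line": "create_line(net, from_bus, to_bus, length_km, std_type, ...): add a line.",
    "runpp": "runpp(net, ...): run an AC power flow on the network.",
}


def called_api_names(code, alias="pp"):
    """Return the sorted, de-duplicated names called as `<alias>.<name>(...)`."""
    names = set()
    for node in ast.walk(ast.parse(code)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == alias
        ):
            names.add(node.func.attr)
    return sorted(names)


def split_by_boundary(names):
    """Split names into those inside the registry and those outside it."""
    known = [n for n in names if n in API_DOCS]
    unknown = [n for n in names if n not in API_DOCS]
    return known, unknown


def build_prompt(task, demanded):
    """Prepend doc snippets only for the demanded names found in the registry."""
    docs = [API_DOCS[n] for n in demanded if n in API_DOCS]
    return "\n".join(["Relevant API documentation:", *docs, "", "Task: " + task])


# A draft solution as an LLM might produce it; it is parsed, never executed.
draft = """
import pandapower as pp
net = pp.create_empty_network()
b = pp.create_bus(net, vn_kv=20.0)
pp.add_load(net, b, p_mw=1.0)
pp.runpp(net)
"""

known, unknown = split_by_boundary(called_api_names(draft))
# Assumption: the out-of-registry call was meant to be create_load, so its
# documentation is injected alongside the calls the draft got right.
prompt = build_prompt("Add a 1 MW load and run a power flow.", known + ["create_load"])
```

In this sketch `unknown` holds the calls that fall outside the registry, and the prompt carries documentation only for the functions the task appears to need rather than the whole library reference, which is the intuition behind injecting fewer prompt tokens.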