5 · arXiv cs.LG (Machine Learning) · 22d ago

SchGen: LLM-Based PCB Schematic Generation via Semantic Code Representations

SchGen is presented as the first large language model system capable of generating editable PCB schematics from natural-language descriptions. The approach introduces a semantically grounded code representation that replaces verbose, geometry-heavy schematic formats with relative placement and pin-name-based wiring primitives, reframing the problem as a semantics-driven matching task. A large-scale dataset was constructed via a human-agent collaborative pipeline converting open-source hardware designs into the new representation. Experiments show SchGen outperforms alternative representations and larger general-purpose LLMs on wire connectivity accuracy and functional correctness.
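To make the idea of pin-name-based wiring concrete, here is a minimal sketch. It does not reproduce SchGen's actual representation or its evaluation metric. It assumes a hypothetical format in which every wire is a pair of `Component.PIN` names, groups those pairs into electrical nets with union-find, and compares a predicted wire list against a reference. The names `WIRES`, `build_nets`, and `net_match_ratio` are illustrative and do not come from the paper.

```python
from collections import defaultdict

# Hypothetical wiring primitives: each wire joins two "Component.PIN" names.
# No coordinates are involved; connectivity is stated purely by pin name.
WIRES = [
    ("U1.VCC", "C1.1"),
    ("C1.1", "R1.1"),
    ("U1.GND", "C1.2"),
    ("R1.2", "D1.A"),
]


def build_nets(wires):
    """Group pin names into electrical nets using union-find."""
    parent = {}

    def find(pin):
        parent.setdefault(pin, pin)
        while parent[pin] != pin:
            parent[pin] = parent[parent[pin]]  # path halving
            pin = parent[pin]
        return pin

    for a, b in wires:
        root_a, root_b = find(a), find(b)
        parent[root_a] = root_b  # merge the two nets

    nets = defaultdict(set)
    for pin in list(parent):
        nets[find(pin)].add(pin)
    return {frozenset(pins) for pins in nets.values()}


def net_match_ratio(predicted, reference):
    """Fraction of reference nets reproduced exactly (illustrative metric only)."""
    ref_nets = build_nets(reference)
    if not ref_nets:
        return 1.0
    return len(build_nets(predicted) & ref_nets) / len(ref_nets)
```

Because nets are compared as sets of pin names, the order in which wires are written does not affect the score, which is one plausible reason a name-based representation is easier to check than a geometry-based one.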

Frontier Model Releases · Agent and Tool Ecosystem · semantic code representation · human-agent collaborative pipeline · wire connectivity accuracy · PCB schematic generation · SchGen

Related guides (2)

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race From Language to Action

Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer

Related events (8)

4 · arXiv · cs.CL · 1mo ago

LLM-Based Grammar Adaptation for Metamodel-Grammar Co-Evolution in Model-Driven Engineering

This paper proposes using LLMs to automate grammar adaptation when metamodels evolve in model-driven engineering, replacing tedious manual work and outperforming rule-based methods. Evaluated on six real-world Xtext DSLs using Claude Sonnet 4.5, ChatGPT 5.1, and Gemini 3, all three LLMs achieved 100% adaptation consistency on test DSLs versus 62-84% for rule-based approaches. A longitudinal study on QVTo showed LLMs successfully reused learned adaptations across all evolution steps without manual editing. However, on large-scale grammars (EAST-ADL, 297 rules), LLM adaptation consistency dropped well below 90%, revealing a scalability limitation.

Agent and Tool Ecosystem · Xtext · Claude Sonnet 4.5 · QVTo

6 · Hugging Face Blog · 1mo ago

StarCoder: A State-of-the-Art LLM for Code

Hugging Face and ServiceNow released StarCoder, a large language model for code trained on permissively licensed data from The Stack dataset. The model targets code generation, completion, and understanding tasks and is positioned as an open-weights alternative to proprietary code models. The release includes model weights, training details, and an associated technical report.

Open Weights Progress · Agent and Tool Ecosystem · ServiceNow AI · BigCode · The Stack v2

5 · arXiv · cs.CL · 19d ago

PowerCodeBench: Knowledge Boundary Probing and Intervention for LLM-Based Power System Code Generation

This paper introduces PowerCodeBench, an execution-validated benchmark for evaluating LLMs on power-system simulation code generation using the pandapower library. The authors find that failures are dominated by API-knowledge boundary errors (hallucinated function names, misused parameters) rather than reasoning failures, and propose a boundary-aware intervention that combines API demand estimation with targeted documentation injection. Evaluated across ten open-weight models (1.5B–480B) and four commercial APIs on 2,000 tasks, the intervention improves accuracy by 32–56 points while using only 41% of the baseline prompt-token cost. Open-weight models in the 70B–120B range match commercial mid-tier accuracy, with Llama-3.1-405B and Qwen3-Coder-480B leading.

Evaluation and Benchmarking · Open Weights Progress · pandapower · Meta Llama 3.1 405B · Alibaba

4 · arXiv · cs.AI · 10d ago

SECDA-DSE: LLM-guided design space exploration for FPGA accelerator generation

SECDA-DSE is a framework that integrates LLMs into the SECDA hardware-software co-design ecosystem to automate design space exploration (DSE) of FPGA-based AI accelerators. The system combines a structured architecture candidate generator with an LLM Stack using retrieval-augmented generation and chain-of-thought prompting, plus an iterative feedback loop. Evaluation demonstrates end-to-end synthesis and execution of three accelerator designs on real FPGA hardware, with results showing the approach captures kernel-specific compute/memory trade-offs while reducing manual design effort.

Training Infrastructure · Agent and Tool Ecosystem · chain-of-thought prompting · SECDA-DSE · Retrieval-Augmented Generation

6 · arXiv · cs.CL · 9d ago

ModSleuth: Agentic system audits invisible dependency graphs in modern LLM training pipelines

Researchers introduce ModSleuth, an agentic system that recursively reconstructs LLM dependency graphs from public artifacts, recovering 1,060 source-verified dependencies across four major LLM releases. The system formalizes direct and indirect dependencies and operation-centered relationships to handle fragmented, inconsistent documentation. Applied at scale, the resulting graphs expose multi-hop license obligations, train-evaluation coupling, and discrepancies between released and training-time artifacts, all issues that are practically invisible to manual auditing.

Evaluation and Benchmarking · AI Safety Research · ModSleuth · Which Models Are Our Models Built On? Auditing Invisible Dependencies in Modern LLMs

5 · arXiv · cs.AI · 11d ago

FASE: Fast Adaptive Semantic Entropy for uncertainty quantification in multi-agent code generation

Researchers introduce Fast Adaptive Semantic Entropy (FASE), a metric for approximating functional correctness in LLM-generated code using minimum spanning trees of structural and semantic dissimilarity graphs, replacing costly LLM-driven equivalence checks. Evaluated on HumanEval and BigCodeBench with Qwen3-Embedding-8B, FASE achieves a 25% improvement in Spearman correlation and a 19% increase in ROC AUC over prior semantic entropy methods. It also requires only ~0.3% of the runtime cost of traditional semantic entropy approaches, making it practical for real-world multi-agent workflows.

Evaluation and Benchmarking · Agent and Tool Ecosystem · Qwen3 Embedding · Fast Adaptive Semantic Entropy · BigCodeBench

5 · OpenAI Blog · 1mo ago

A Hazard Analysis Framework for Code Synthesis Large Language Models

OpenAI published a hazard analysis framework specifically targeting code synthesis LLMs, addressing the safety and risk dimensions of models that generate executable code. The framework likely identifies threat categories, failure modes, and mitigation strategies relevant to deploying code-generating AI systems. This represents an early structured attempt to apply safety engineering methodology to a specific LLM capability domain. The work is relevant to both AI safety research and enterprise deployment considerations for coding assistants.

AI Safety Research · Agent and Tool Ecosystem · hazard analysis framework · code synthesis LLMs · OpenAI

6 · Hugging Face Blog · 1mo ago

CodeGemma - Google's Official Code-Focused LLM Release

Google has released CodeGemma, a family of code-specialized large language models, announced via the Hugging Face blog. CodeGemma builds on the Gemma model family and is targeted at code generation and understanding tasks. The release represents Google's continued push into open-weights code LLMs to compete with models like Code Llama and DeepSeek Coder.

Frontier Model Releases · Open Weights Progress · Gemma · Code Llama · Google