Almanac
product

Tree-sitter

product · active · provisional · tree-sitter-52a82573 · 1 event · first seen 5d ago

Aliases: Tree-sitter

Recent events (1)

6 · arXiv · cs.CL · 5d ago

Weave of Formal Thought: Sound-and-complete constrained decoding with learned latent syntax for code LLMs

The paper introduces Weave of Formal Thought (WoFT), a framework combining a formally sound-and-complete constrained decoder for code generation with a latent-variable fine-tuning method that teaches LLMs to interleave grammar non-terminals during generation. The constrained decoder extends generalized LR (GLR) parsing with speculative lexing to handle context-sensitive lexing and maximal-munch tokenization, addressing gaps in prior constrained-decoding work. A reweighted wake-sleep (RWS) fine-tuning objective on StarCoder2-3B achieves a 14.3% relative reduction in per-token cross-entropy over a text-only SFT baseline on Python, suggesting that explicit structural scaffolding recovers information lost in flat autoregressive training.
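To make the maximal-munch rule mentioned above concrete, here is a minimal Python sketch of a longest-match lexer. It is not the paper's decoder: the token set, the names `TOKEN_SPEC` and `lex`, and the tie-breaking rule (earlier entries win on equal length) are all assumptions chosen for illustration. It only shows why a lexer cannot commit to a token until it sees what follows, which is the difficulty a token-by-token constrained decoder has to work around.

```python
import re

# Hypothetical toy token set for illustration; not the grammar from the paper.
# Order matters only for ties: on equal match length the earlier entry wins.
TOKEN_SPEC = [
    ("KEYWORD", r"if|else|def"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"[0-9]+"),
    ("EQEQ", r"=="),
    ("ASSIGN", r"="),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in TOKEN_SPEC]


def lex(text):
    """Tokenize `text` with the maximal-munch (longest-match) rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # skip whitespace between tokens
            continue
        best = None
        for name, regex in COMPILED:
            match = regex.match(text, pos)
            # Keep the strictly longest match; ties go to the earlier rule.
            if match and (best is None or len(match.group()) > len(best[1])):
                best = (name, match.group())
        if best is None:
            raise ValueError(f"no token matches at position {pos}")
        tokens.append(best)
        pos += len(best[1])
    return tokens


if __name__ == "__main__":
    print(lex("if x = 1"))
    print(lex("ifx == 1"))
```

In this sketch, the prefix `if` lexes as a keyword in the first input but becomes part of the identifier `ifx` in the second, and `=` becomes `==` once a second character arrives. A decoder that emits text piece by piece therefore has to keep more than one lexing hypothesis open for an unfinished token.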