6arXiv cs.CL (Computation and Language)·2d ago

Negation-capable fuzzy logic FFN replacement yields interpretable grammatical licensing detectors in transformers

Researchers propose replacing the standard transformer feed-forward sublayer with explicit fuzzy set operations (intersection and set-difference), creating a negation-capable FFN (NC-FFN) whose hidden units carry interpretable logical form. At 125M scale on OpenWebText, NC-FFN matches GELU baseline perplexity while remaining legible by construction. Adding soft sequence quantifiers with learned forgetting rates remedies the architecture's deficits on grammatical licensing and yields units that fire detectably on grammatical licensors (comparatives, passive participles, negative-polarity items) without dictionary learning. The work advances mechanistic interpretability by providing a parameter-neutral architecture whose computations are readable as grammatical mechanisms.

Evaluation and Benchmarking LAMBADA OpenWebText Explicit Fuzzy Logic in the Feed-Forward Layer: Self-Forgetting Quantifiers Discover Legible Grammatical-Licensing Detectors NC-FFN
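
To make the fuzzy set operations named in the summary concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the product t-norm for intersection, the `a * (1 - b)` form for set-difference, and the decaying running maximum used as a "soft exists" quantifier are standard fuzzy-logic choices assumed for illustration. The function names and the `forget` parameter are hypothetical.

```python
# Illustrative sketch only: operators and names are assumptions,
# not taken from the NC-FFN paper.

def fuzzy_and(a: float, b: float) -> float:
    """Fuzzy intersection of two memberships in [0, 1] (product t-norm)."""
    return a * b


def fuzzy_diff(a: float, b: float) -> float:
    """Fuzzy set-difference "a and not b"; this is where negation enters."""
    return a * (1.0 - b)


def soft_exists(memberships: list[float], forget: float) -> list[float]:
    """Running soft existential quantifier over a sequence.

    At each position the state is the larger of the current membership
    and the previous state scaled by `forget` (a rate in [0, 1]).
    With forget=1.0 the state never decays; with forget=0.0 it only
    reflects the current position.
    """
    state = 0.0
    out = []
    for m in memberships:
        state = max(m, forget * state)
        out.append(state)
    return out


if __name__ == "__main__":
    # A hypothetical "licensor seen" signal that fires at position 1 and then fades.
    print(soft_exists([0.0, 1.0, 0.0, 0.0], forget=0.5))
    # A unit that is on when feature a is present and feature b is absent.
    print(fuzzy_diff(1.0, 0.0), fuzzy_diff(1.0, 1.0))
```

In this sketch, a unit reading `fuzzy_diff(soft_exists(licensor), soft_exists(blocker))` would stay legible as "a licensor occurred recently and no blocker did", which is the kind of readable logical form the summary describes.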

Related guides (1)

Evaluation and Benchmarking · Topic guide

AI Evaluation and Benchmarking: From Leaderboards to the Limits of Measurement


Related events (8)

6arXiv · cs.CL·Jun 5, 2026·source ↗

NF-CoT: Latent reasoning with normalizing flows preserves autoregressive LLM advantages

Researchers propose NF-CoT, a latent reasoning framework that replaces discrete chain-of-thought token streams with continuous intermediate states modeled by normalizing flows embedded inside an LLM backbone. The approach uses a TARFlow-style normalizing flow head alongside the standard language model head, enabling exact likelihoods, KV-cache-compatible left-to-right decoding, and policy-gradient optimization in latent space. On code-generation benchmarks, NF-CoT improves pass rates over both explicit CoT and prior latent-reasoning baselines while reducing intermediate reasoning cost. The work addresses a key limitation of existing latent reasoning methods, which typically sacrifice probabilistic tractability or autoregressive compatibility.

Inference Economics Alignment and RLHF TARFlow NF-CoT Latent Reasoning with Normalizing Flows

3arXiv · cs.CL·Jun 16, 2026·source ↗

Revisiting LLM systematicity in negation understanding via in-context learning

A new arXiv preprint analyzes how well large language models handle negation from two angles: behavioral systematicity (whether models correctly recognize negation expressions and scope) and representational systematicity (whether function vectors can be reliably constructed from in-context examples). Results show LLMs partially succeed at negation cue recognition via in-context learning but struggle with scope recognition, with performance varying by output format. Function vectors can be composed for cue extraction but are harder to extract for scope recognition tasks.

Evaluation and Benchmarking Revisiting the Systematicity in Negation in the Era of In-Context Learning

5arXiv · cs.CL·Jun 24, 2026·source ↗

FMLM+ introduces Posterior Refinement for fast non-autoregressive language generation

Researchers introduce FMLM+, a framework combining Flow Map Language Models with masking-style noise schedules to enable joint sequence generation with per-token global consistency scoring. The key contribution is Posterior Refinement, an inference-time self-correction strategy that matches discrete baseline performance with 32x fewer neural function evaluations (NFEs). The approach improves the speed-quality tradeoff over both Masked Diffusion Models and standard FMLMs across multiple benchmarks, addressing longstanding factorization error problems in non-autoregressive generation.

Frontier Model Releases Inference Economics Posterior Refinement Flow Map Language Models FMLM+ +2 more

5arXiv · cs.AI·Jun 17, 2026·source ↗

Fixed-Point Reasoning Model (FPRM): Stable looped Transformers with adaptive compute via fixed-point halting

Researchers introduce FPRM, a Transformer-based Fixed-Point Reasoning Model that uses fixed-point convergence as a halting mechanism in looped architectures, addressing signal propagation problems through pre-norm layers and residual scaling. Looped architectures provide inductive bias for compositional reasoning, but suffer from depth-induced signal degradation when halting is deferred; FPRM resolves this while enabling compute to scale with task difficulty. The model is evaluated on Sudoku, Maze, state-tracking, and ARC-AGI benchmarks. This contributes to the growing body of work on adaptive-compute and iterative-refinement architectures for reasoning.

Evaluation and Benchmarking Fixed-Point Reasoning Model Fixed-Point Reasoners: Stable and Adaptive Deep Looped Transformers ARC-AGI

4arXiv · cs.CL·4d ago·source ↗

Multi-stage explainability framework translates transformer speech models into clinical cognitive impairment narratives

A new arXiv preprint proposes a framework for making transformer-based speech cognitive impairment detection clinically interpretable by combining SHAP token attribution, linguistic feature analysis, and a four-stage LLM reasoning pipeline using LLaMA-3.1-70B-Instruct. The system is built on the SpeechCARE-Adaptive Gating Network multimodal model (F1=72.11% on NIA PREPARE) and maps outputs to four cognitive-linguistic dimensions. Physician evaluation on 70 samples showed strong alignment with clinical profiles and a System Usability Scale score of 82/100, suggesting practical clinical workflow integration potential.

Evaluation and Benchmarking AI Safety Research NIA PREPARE Llama 3.3 70B Instruct SpeechCARE-Adaptive Gating Network +3 more

6arXiv · cs.CL·Jun 25, 2026·source ↗

Weave of Formal Thought: Sound-and-complete constrained decoding with learned latent syntax for code LLMs

The paper introduces Weave of Formal Thought (WoFT), a framework combining a formally sound-and-complete constrained decoder for code generation with a latent-variable fine-tuning method that teaches LLMs to interleave grammar non-terminals during generation. The constrained decoder extends generalized LR (GLR) parsing with speculative lexing to handle context-sensitive lexing and maximal-munch tokenization, addressing gaps in prior constrained-decoding work. A reweighted wake-sleep (RWS) fine-tuning objective on StarCoder2-3B achieves a 14.3% relative reduction in per-token cross-entropy over a text-only SFT baseline on Python, suggesting that explicit structural scaffolding recovers information lost in flat autoregressive training.

Evaluation and Benchmarking Agent and Tool Ecosystem Generalized LR Parsing Tree-sitter Weave of Formal Thought +2 more

6Hugging Face Blog·May 19, 2026·source ↗

Making LLMs lighter with AutoGPTQ and transformers

Hugging Face announces native integration of AutoGPTQ into the transformers library, enabling 4-bit quantized inference for large language models. The integration allows users to load and run GPTQ-quantized models directly through the standard transformers API with minimal code changes. This lowers the hardware barrier for deploying LLMs by significantly reducing VRAM requirements while maintaining competitive performance.

Open Weights Progress Inference Economics Transformers Hugging Face AutoGPTQ +2 more

6arXiv · cs.CL·2d ago·source ↗

LOTUS: Looped Transformers bridge latent and explicit chain-of-thought reasoning at 3B scale

Researchers introduce LOTUS (Looped Transformers with parallel supervision on latents), a latent chain-of-thought method that processes reasoning steps in hidden states rather than decoded tokens. LOTUS is claimed to be the first latent-CoT approach to match explicit CoT performance at the 3B parameter scale, while reducing thought-phase latency by 2.5x–6.9x. The method uses a looped (recurrent-depth) Transformer backbone with parallel cross-entropy supervision on gold CoT-step tokens at each latent position, and the latent space is shown to be interpretable by projecting through the base LM head to recover reasoning steps.

Evaluation and Benchmarking Inference Economics Chain-of-Thought Reasoning LOTUS Bridging the Gap Between Latent and Explicit Reasoning with Looped Transformers
