Almanac
← Events
4arXiv cs.CL (Computation and Language)·24d ago

Gumbel Machine: Counterfactual Student Writing Generation via Gumbel Noise Steering

The paper introduces the Gumbel Machine, a modular framework for generating counterfactual text that improves student writing while preserving similarity to the original. Central to the approach is β-Hindsight control, a controlled decoding algorithm that uses Gumbel noise as a tunable similarity mechanism during LLM generation. Experiments on student writing datasets show the method produces outputs that are both rubric-consistent and close to the reference text. The approach is positioned as more flexible and practically applicable than prior domain-specific counterfactual generation methods.

Related guides (1)

Related events (8)

5arXiv · cs.CL·25d ago·source ↗

QUIET: Multi-Blank Cascaded Story Cloze Benchmark for LLM Creative Generation

QUIET (Quality Understanding via Interlocked Evaluation Testing) is a new benchmark designed to evaluate LLM creative generation capability rather than discriminative recognition, addressing limitations of benchmarks like Story Cloze Test and HellaSwag. The benchmark places 10-20 blanks with explicit content constraints and cascade dependencies into complete stories, requiring open-ended generation rather than multiple-choice selection. Scoring uses an information-theoretic automated protocol operationalizing a 'calibrated surprise' framework: score = satisfy * (1 + lambda * surprise), combining constraint satisfaction with a surprise measure, enabling objective automated evaluation without human graders or LLM-as-Judge subjectivity.

5Hugging Face Blog·1mo ago·source ↗

Assisted Generation: a new direction toward low-latency text generation

Hugging Face introduces assisted generation (speculative decoding) as a practical technique for reducing LLM inference latency. The approach uses a smaller draft model to propose token candidates that a larger model then verifies in parallel, enabling multiple tokens to be accepted per forward pass. The blog post explains the mechanism and demonstrates integration into the Hugging Face Transformers library.

4Hugging Face Blog·1mo ago·source ↗

Generating Human-level Text with Contrastive Search in Transformers

Hugging Face introduces contrastive search, a decoding strategy for autoregressive language models that aims to produce more coherent and human-like text compared to standard methods like beam search or nucleus sampling. The technique works by balancing a model's confidence in its next-token prediction against a contrastive penalty that discourages repetitive or degenerate outputs. The blog post describes integration of contrastive search into the Hugging Face Transformers library, making it accessible to practitioners.

4Hugging Face Blog·1mo ago·source ↗

Improving Prompt Consistency with Structured Generations

This Hugging Face blog post examines how structured generation outputs can improve consistency in LLM evaluation pipelines. It explores techniques for constraining model outputs to specific formats, reducing variability in prompt-based assessments. The post addresses a practical challenge in evaluation workflows where inconsistent response formats degrade measurement reliability.

7Openai Blog·1mo ago·source ↗

Finding GPT-4's Mistakes with GPT-4: CriticGPT

OpenAI has developed CriticGPT, a GPT-4-based model trained to write critiques of ChatGPT outputs, helping human trainers identify errors during RLHF. The system is designed to address a core scalable oversight challenge: human raters often miss subtle mistakes in long or complex model outputs. CriticGPT-assisted trainers outperformed unassisted trainers in catching model errors, suggesting a path toward more reliable RLHF pipelines.

4arXiv · cs.AI·9d ago·source ↗

Gamified writing experiment studies when humans adopt AI suggestions vs. maintain creative autonomy

A preprint from arXiv introduces 'Nonslop,' a gamified writing experiment with 74 participants designed to study authentic human preferences in AI-assisted creative writing. The system deliberately inverts the helpful-assistant pattern by disincentivizing AI suggestion acceptance, simulating a dystopian framing to reveal genuine user behavior rather than default compliance. The study analyzes when users choose creative autonomy versus accepting AI assistance across different task types and response characteristics. Findings bear on questions of individual voice, authenticity, and the tension between efficiency and human expression in LLM-augmented writing.

5arXiv · cs.CL·15d ago·source ↗

Counterfactual context revision framework for auditing LLM-based stance simulation in online discussions

Researchers introduce a counterfactual context revision framework to audit how LLMs simulate individual users' stances in online discussions. By applying controlled text-only and multimodal (meme-based) revisions to conversational contexts, they measure how readily simulated stances shift in response to semantically independent changes. Results show effective and robust stance transitions across both revision types and polarization-preference mechanisms, raising concerns about whether LLM simulations reflect genuine user-specific beliefs or are highly context-sensitive artifacts. The work contributes an evaluation framework and highlights risks of using LLMs to model online opinion dynamics.

5arXiv · cs.CL·22d ago·source ↗

LLUMI: Fine-Tuning Open-Source LLMs for Mental Health Writing Assistance Using Reddit Community Feedback

LLUMI is a two-component system (a generation model and an improvement model) designed to provide mental health writing assistance using smaller open-source LLMs hosted in privacy-preserving, on-premise environments. The system leverages Reddit community endorsement signals (upvotes/downvotes) to construct preference pairs for SFT and DPO training, then further aligns outputs via human evaluation across readability, empathy, connection, actionability, and safety dimensions. Results show LLUMI achieves performance comparable to proprietary GPT-based models on linguistic and human evaluations, suggesting community-derived preference signals can substitute for expensive expert labeling in sensitive domains.