Almanac
← Events
5OpenAI Blog·1mo ago

New AI classifier for indicating AI-written text

OpenAI launched a classifier designed to distinguish between AI-generated and human-written text. The tool was positioned as an aid for detecting content produced by large language models. OpenAI acknowledged limitations including unreliability on short texts and non-English content, and noted the classifier should not be used as a sole decision-making tool.

Related guides (3)

Related events (8)

5arXiv · cs.CL·12d ago·source ↗

SV-Detect: AI-generated text detection via steering vectors in representation space

SV-Detect proposes a method for detecting machine-generated text by extracting steering vectors from the hidden representations of a frozen language model, constructing layer-wise directions that separate human from AI-written text. A lightweight classifier trained on projection features achieves strong performance both in-distribution and under distribution shift across domains, source models, and editing attacks like polishing and rewriting. The approach reframes AI-text detection as a representation-space probing problem, with interpretation analyses showing the learned directions capture stylistic cues beyond surface features.

5arXiv · cs.LG·15d ago·source ↗

OpAI-Bench: Benchmark for detecting AI text across progressive human-AI co-editing workflows

Researchers introduce OpAI-Bench, a benchmark for studying AI-text detection across progressive human-to-AI document revision workflows, covering document, sentence, token, and span granularities. Starting from human-written documents, the benchmark constructs nine sequentially revised versions per sample under five AI edit operations and varying AI coverage levels across four domains. Key findings include that mixed-authorship intermediate versions are often harder to detect than fully human or heavily AI-edited endpoints, revealing non-monotonic detection patterns absent from existing benchmarks. The work addresses a gap in AI-text detection research as real-world documents increasingly result from iterative human-AI co-editing rather than pure generation.

6Openai Blog·1mo ago·source ↗

AI-Written Critiques Help Humans Notice Flaws in Summaries

OpenAI trained critique-writing models to identify flaws in AI-generated summaries, finding that human evaluators catch significantly more errors when assisted by model-generated critiques. A key finding is that scale improves critique-writing ability more than summary-writing ability. The work is framed as a step toward using AI to assist human oversight of AI systems on difficult tasks, relevant to scalable oversight research.

5arXiv · cs.CL·12d ago·source ↗

Adversarial methodology improves detection of AI-generated social bot content

Researchers introduce an adversarial framework that simulates malicious actors impersonating real social media users to generate training data for AI-content detection. The approach produces a multilingual, cross-platform dataset of paired human and AI-generated messages. Models trained on this adversarial data significantly outperform existing content-based bot detection systems on out-of-distribution real-world data.

5arXiv · cs.AI·2d ago·source ↗

Multi-domain benchmark for detecting AI-generated text-rich images from GPT-Image-2

Researchers introduce a new benchmark of 8,602 images across six categories (commercial posters, infographics, academic posters, receipts, tables, UI screenshots) specifically for detecting AI-generated text-rich images produced by OpenAI's GPT-Image-2. Five zero-shot detectors are evaluated, revealing highly domain-dependent performance and severe sensitivity to JPEG compression even in the strongest conventional detector. A multimodal VLM is also explored as a detector, showing promise but limitations on structured formats. The work highlights a gap in existing benchmarks that focus on object-centric rather than text-layout-centric images.

6Google Deepmind Blog·1mo ago·source ↗

SynthID Detector — a new portal to help identify AI-generated content

Google DeepMind announced SynthID Detector, a new web portal unveiled at Google I/O 2025 that allows users to check whether content was generated by AI. The tool extends the existing SynthID watermarking system, which embeds imperceptible signals into AI-generated text, images, audio, and video. The portal is intended to help people verify the provenance of online content at scale.

4arXiv · cs.CL·29d ago·source ↗

Image-Semantic Guided Detection of AI-Generated Modern Chinese Poetry Using MLLMs

This paper proposes a multimodal detection method for identifying AI-generated modern Chinese poetry by incorporating images that reflect poetic content alongside text. The approach uses example-driven prompting to integrate meaning, imagery, and emotional cues from images as a complement to textual analysis. A Gemini-based detector using this method achieves 85.65% Macro-F1, outperforming both plain-text LLM baselines and the traditional RoBERTa detector. The work extends AI-generated content detection research into a domain—modern Chinese poetry—previously unaddressed by prior studies.

6Hugging Face Blog·1mo ago·source ↗

Introducing SynthID Text

Hugging Face published a blog post introducing SynthID Text, Google DeepMind's watermarking technique for AI-generated text. The method embeds imperceptible signals into LLM outputs by modifying token sampling distributions, enabling detection of AI-generated content without degrading text quality. The post likely covers integration with Hugging Face's transformers library, making the technique accessible to the broader ML community.