Entity · paper

Learning to Summarize with Human Feedback

paper · active · learning-to-summarize-with-human-feedback-8be1a186 · 1 event · first seen May 20, 2026

Aliases: Learning to Summarize with Human Feedback

Co-occurring entities

Reinforcement Learning from Human Feedback · OpenAI

More like this (12)

Reinforcement Learning from Human Feedback
Recursive Summarization
A Human-in-the-Loop Corpus for LLM-Based Simplification of Scientific Summaries
Less is More: Quality-Aware Training Data Selection for Scientific Summarization
A Tree-of-Thoughts Inspired Hybrid Approach for Legal Case Judgement Summarization using LLMs
Detect, Unlearn, Restore: Defending Text Summarization Models Against Data Poisoning
Personalized Evaluation as Learning
A Training-Free Mixture-of-Agents Framework for Multi-Document Summarization using LLMs and Knowledge Graphs
clinical text summarization
Reinforcement Learning with Metacognitive Feedback
hierarchical summarization
Reward Learning from Comparisons

Recent events (1)

6 · OpenAI Blog · May 20, 2026

Learning to Summarize with Human Feedback

OpenAI published research applying reinforcement learning from human feedback (RLHF) to summarization. Human labelers compare pairs of model-written summaries, a reward model is trained to predict which summary they prefer, and the summarization model is then fine-tuned with reinforcement learning against that reward model. Models trained this way outperformed models trained only with supervised learning on reference summaries. The paper is an early, foundational contribution to the RLHF methodology that later became central to aligning large language models.
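One common way to turn human preference comparisons into a training signal is a pairwise logistic loss on a reward model's scores. The sketch below is illustrative only and is not taken from this entry: the function names `preference_loss` and `mean_preference_loss` are hypothetical, and the scores are plain floats standing in for the outputs of a learned reward model.

```python
import math


def preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the human-preferred summary wins.

    Assumes a logistic (Bradley-Terry style) model:
    P(preferred wins) = sigmoid(reward_preferred - reward_rejected).
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed in a
    # numerically stable way for both large positive and negative margins.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_preference_loss(pairs):
    """Average the loss over (preferred_score, rejected_score) pairs."""
    return sum(preference_loss(p, r) for p, r in pairs) / len(pairs)


# Hypothetical reward-model scores for three human comparisons.
example_pairs = [(1.2, 0.3), (0.1, 0.4), (2.0, -1.0)]
average_loss = mean_preference_loss(example_pairs)
```

The loss shrinks as the reward model scores the preferred summary further above the rejected one, and grows when it ranks them the wrong way round. In an RLHF pipeline, the trained reward model would then supply the reward for the reinforcement learning step.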

Evaluation and Benchmarking · Alignment and RLHF · Reinforcement Learning from Human Feedback · OpenAI · Learning to Summarize with Human Feedback