paper
Learning to Summarize with Human Feedback
paper · active
learning-to-summarize-with-human-feedback-8be1a186 · 1 event · first seen 28d ago
Aliases: Learning to Summarize with Human Feedback
More like this (12)
Reinforcement Learning from Human Feedback
Recursive Summarization
Personalized Evaluation as Learning
A Training-Free Mixture-of-Agents Framework for Multi-Document Summarization using LLMs and Knowledge Graphs
clinical text summarization
hierarchical summarization
Reward Learning from Comparisons
Learning to Reason by Analogy via Retrieval-Augmented Reinforcement Fine-Tuning
Fine-tuning GPT-2 from Human Preferences
Retell AI
Watch, Remember, Reason: Human-View Video Understanding with MLLMs
Multi-Turn Evaluation of Deep Research Agents Under Process-Level Feedback
Recent events (1)
Learning to Summarize with Human Feedback
OpenAI published research applying reinforcement learning from human feedback (RLHF) to train language models for improved summarization quality. The work demonstrated that models trained with human preference signals outperform those trained purely on supervised objectives for summarization tasks. This paper is an early foundational contribution to the RLHF methodology that later became central to aligning large language models.
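To make "human preference signals" concrete, here is a minimal sketch of the pairwise loss commonly used to fit a reward model from comparisons between two summaries. The summary above does not give the paper's exact objective, so the logistic (Bradley-Terry style) form and the function name `preference_loss` are assumptions for illustration, not a reproduction of the authors' code.

```python
import math


def preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the human-preferred summary wins.

    Assumes a logistic model: P(preferred wins) = sigmoid(r_pref - r_rej).
    Both inputs are scalar scores from a (hypothetical) reward model.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(margin)) == log(1 + exp(-margin)).
    # Branch on the sign so exp() never receives a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss shrinks as the reward model scores the preferred summary higher.
agree = preference_loss(2.0, 0.0)     # model agrees with the human label
disagree = preference_loss(0.0, 2.0)  # model disagrees with the human label
```

In an RLHF pipeline of the kind described, a reward model fitted this way would then supply the training signal for a reinforcement learning step on the summarization policy; that step is not sketched here.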