technique

human response time

technique · active · provisional · human-response-time-9b311b7e · 1 event · first seen 19d ago

Aliases: human response time

Co-occurring entities

Transformers · In-Context Reward Adaptation · Reinforcement Learning from Human Feedback · in-context learning

More like this (12)

item response theory · prompt sensitivity · agent-task efficiency · inference-time intervention · Human-Vehicle Interaction Benchmark · human alignment (neural/behavioral) · HumanEvalFIM · OpenAI Responses API · human uncertainty alignment · Human Activity Recognition (HAR) · Real-Time Clustering · human red teaming

Recent events (1)

6 · arXiv · cs.LG · 19d ago

In-Context Reward Adaptation for Robust Preference Modeling

This paper proposes In-Context Reward Adaptation (ICRA), a transformer-based framework that infers reward structure from a small set of preference demonstrations at inference time, without retraining. The key finding is that standard transformers are asymptotically biased relative to the ground-truth rewards, and that incorporating human response time as an auxiliary signal resolves this limitation and enables generalization to unseen preference domains. The approach targets a core limitation of static RLHF reward models, which fail to handle heterogeneous or shifting human value distributions.
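As a rough illustration of why response time can carry information that binary choices alone do not, the sketch below uses a symmetric drift-diffusion model (DDM) with unit noise. The DDM is an assumption made here for illustration; the summary above does not say which response-time model ICRA uses, and the function names (`choice_prob`, `mean_response_time`, `drift_from_choice_and_rt`) are hypothetical. Under this assumed model, the drift plays the role of the reward gap between two options, and Wald's identity gives drift = barrier × E[choice] / E[response time].

```python
import math


def choice_prob(drift: float, barrier: float) -> float:
    """P(option A is chosen) in a symmetric DDM with unit noise,
    start point 0 and absorbing barriers at +barrier / -barrier."""
    return 1.0 / (1.0 + math.exp(-2.0 * drift * barrier))


def mean_response_time(drift: float, barrier: float) -> float:
    """Expected decision time under the same assumed DDM."""
    if abs(drift) < 1e-12:
        # Driftless limit: expected exit time of Brownian motion from (-a, a).
        return barrier ** 2
    return (barrier / drift) * math.tanh(drift * barrier)


def drift_from_choice_and_rt(mean_choice: float, mean_rt: float, barrier: float) -> float:
    """Recover the drift (assumed reward gap) from the mean signed choice
    (+1 for A, -1 for B) and the mean response time, via Wald's identity."""
    return barrier * mean_choice / mean_rt


if __name__ == "__main__":
    barrier = 1.5  # hypothetical decision threshold
    for true_gap in (1.0, 2.0, 4.0):
        p_a = choice_prob(true_gap, barrier)
        rt = mean_response_time(true_gap, barrier)
        recovered = drift_from_choice_and_rt(2.0 * p_a - 1.0, rt, barrier)
        print(f"gap={true_gap:.1f}  P(A)={p_a:.4f}  E[RT]={rt:.3f}  recovered={recovered:.3f}")
```

In this toy setting the choice probabilities saturate near 1 as the reward gap grows, so choices alone barely separate a large gap from a very large one, while the mean response time keeps shrinking and lets the gap be recovered.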

Evaluation and Benchmarking · Alignment and RLHF · Transformers · In-Context Reward Adaptation · Reinforcement Learning from Human Feedback