Conformal Decision Theory

technique · active · conformal-decision-theory-8794f69c · 1 event · first seen May 28, 2026

Aliases: Conformal Decision Theory

Co-occurring entities

Calibrated Collective Oversight (CCO) · Attainable Utility Preservation · SWE-bench · MACHIAVELLI

More like this (12)

Conformal Prediction · Bayesian decision theory · Conformal Risk Control · Mondrian Conformal Prediction · Cost-Sensitive Conformal Prediction and Human-in-the-Loop Abstention for Imbalanced High-Stakes Decision Support: A Multi-Domain Benchmark · Judge Response Theory · Decision Transformer · Stereotypes-to-Decisions (S2D) · item response theory · Beyond Aggregate Risk: Role-Stratified Conformal Risk Control for LLM Tool Calls · decision-content decoupled reinforcement learning · Representation-Conditioned Diffusion Models

Recent events (1)

7 · arXiv · cs.AI · May 28, 2026

Calibrated Collective Oversight (CCO): Scalable Oversight with Finite-Time Statistical Guarantees

This paper introduces Calibrated Collective Oversight (CCO), a framework for maintaining human oversight of agentic AI systems that may exceed human capabilities. CCO aggregates diverse scoring functions into a conservatism penalty inspired by Attainable Utility Preservation, then calibrates this penalty online via Conformal Decision Theory to ensure undesirable outcomes stay below a user-specified threshold with finite-time bounds and no distributional assumptions. Evaluated on a modified SWE-bench (adversarially misaligned agent) and MACHIAVELLI (ethical violations), CCO allows weaker overseers to constrain stronger agents while preserving reward, with empirical violation rates closely matching specified targets.
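The online calibration step described in the summary can be illustrated with a toy sketch. This is not the paper's implementation: it assumes a single scalar penalty weight, a simple additive update of the kind used in online conformal methods (raise the penalty after a violation, lower it otherwise), and a simulated environment. The names update_penalty and simulate, the step size eta, and the risk model are all hypothetical choices made for illustration.

```python
import random


def update_penalty(lam, loss, target, eta):
    """One online calibration step (assumed update rule, not from the paper).

    The penalty weight rises when the observed loss exceeds the target
    violation rate and falls otherwise. It is clipped at zero.
    """
    return max(0.0, lam + eta * (loss - target))


def simulate(target=0.1, eta=0.05, steps=5000, seed=0):
    """Toy environment: each step offers a risky action and a safe fallback.

    The risky action has reward 1 and a risk score u in [0, 1]; it turns out
    to be harmful with probability u. The safe fallback has reward 0 and is
    never harmful. The agent takes the risky action only if its reward
    exceeds the penalty lam * u.
    """
    rng = random.Random(seed)
    lam = 0.0
    violations = 0
    for _ in range(steps):
        risk = rng.random()
        take_risky = 1.0 > lam * risk
        # A violation occurs only if the risky action is taken and is harmful.
        loss = 1.0 if (take_risky and rng.random() < risk) else 0.0
        violations += int(loss)
        lam = update_penalty(lam, loss, target, eta)
    return violations / steps, lam


if __name__ == "__main__":
    rate, lam = simulate()
    print(f"empirical violation rate: {rate:.3f}, final penalty: {lam:.2f}")
```

In this toy setting the long-run violation rate tracks the target because the sum of (loss - target) over all steps is bounded by the final penalty divided by eta, so the average gap shrinks as the number of steps grows. Whether CCO uses this exact update or sign convention is not stated in the summary above.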

Evaluation and Benchmarking · AI Safety Research · Calibrated Collective Oversight (CCO) · Attainable Utility Preservation · Conformal Decision Theory