Almanac
4 · arXiv cs.AI (Artificial Intelligence) · 13d ago

Paper proposes Transferability and Predictability metrics to extend ISO 26262 for autonomous vehicles

An arXiv preprint proposes decomposing the ISO 26262 'Controllability' concept into two measurable sub-dimensions, Transferability (AV handoff to fallback mechanisms) and Predictability (external agents' ability to anticipate AV behavior), to make functional safety standards applicable to driverless SAE Level 4/5 systems. The authors provide a mathematical framework for quantifying Predictability and introduce a designed-versus-achievable gap to distinguish architectural claims from scene-conditioned fallback capability. The proposed metrics are designed to align with both ISO 26262 and ISO/PAS 21448 (SOTIF), making safety claims falsifiable and traceable across operational design domains.

Related events (8)

4 · arXiv · cs.AI · 12d ago

IA-VQC-DPC: Intervention-aware quantum predictive control with safety attribution for learned policies

A new arXiv preprint introduces Intervention-Aware Variational Quantum Differentiable Predictive Control (IA-VQC-DPC), a framework that trains variational quantum circuit policies under a primal-dual intervention budget to penalize over-reliance on downstream safety filters (Control-Barrier-Function projections). The work also proposes a safety-attribution protocol that decomposes trajectory corrections into policy-level versus filter-level contributions, enabling measurement of whether a policy has genuinely learned safe behavior or is merely being silently repaired by its safety layer. Experiments on BOPTEST building-control emulators show the quantum policy achieves significantly lower pre-filter violations than a matched classical policy at equal parameter budget, with a notable negative result: a learned energy head is only safe when paired with a distribution-aware runtime guard.

6 · arXiv · cs.CL · 24d ago

Activation Steering for Synthetic Safety Data Generation: Diversity as a Critical Quality Axis

This paper investigates whether activation steering (AS) can generate high-quality synthetic training data for downstream safety detection classifiers, filling a gap in the literature. Across 4 safety concepts × 2 models × 4 steering methods, the authors find that AS-generated data outperforms prompt-generated data on 3 of 4 concepts, but only 41 of 136 configurations succeed, indicating a narrow effective regime. The study introduces sample- and set-level diversity as a previously absent quality axis, finding that higher steering strength reduces diversity and that the harmonic mean of success, coherence, and diversity correlates more reliably with downstream AUROC than prior metrics alone. The results provide a practical heuristic for practitioners tuning AS hyperparameters for safety data generation.
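
The combined score mentioned above can be illustrated with a minimal sketch. It assumes the three component scores (success, coherence, diversity) are already normalized to the range 0 to 1; the function names are hypothetical and this is not the authors' implementation. The point it shows is that a harmonic mean is dragged down by its weakest component, so a configuration with high success but collapsed diversity scores poorly.

```python
def harmonic_mean(scores):
    """Harmonic mean of non-negative scores; 0.0 if any score is 0."""
    if not scores:
        raise ValueError("need at least one score")
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    if any(s == 0 for s in scores):
        # The harmonic mean tends to 0 as any component tends to 0.
        return 0.0
    return len(scores) / sum(1.0 / s for s in scores)


def combined_quality(success, coherence, diversity):
    """Hypothetical helper: one quality number from three axes in [0, 1]."""
    return harmonic_mean([success, coherence, diversity])


if __name__ == "__main__":
    # High success and coherence cannot compensate for low diversity.
    print(combined_quality(1.0, 1.0, 0.25))
    print(combined_quality(0.5, 0.5, 0.5))
```

An arithmetic mean of the first example would be 0.75; the harmonic mean is lower, which is why it penalizes one-sided configurations.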

5 · arXiv · cs.LG · 2d ago

Predictability as a Fine-Grained Privacy Metric Complementary to Differential Privacy

A new arXiv preprint introduces 'privacy via predictability,' a framework that measures privacy leakage as the incremental gain in an attacker's ability to predict sensitive information after observing an algorithm's output, conditioned on the attacker's prior knowledge. The authors show predictability and differential privacy are generally incomparable, but that predictability implies mutual-information DP in worst-case regimes. They develop a generalized method of moments framework for asymptotic analysis and derive a predictability-calibrated output perturbation scheme for empirical risk minimization. The work positions predictability as a complementary, finer-grained alternative to DP for settings where attacker knowledge and query families can be specified.

6 · arXiv · cs.AI · 1mo ago

Lost in Fog: Sensor Perturbations Expose Reasoning Fragility in Driving VLAs

This paper presents a controlled robustness study of Vision-Language-Action (VLA) models in autonomous driving, evaluating Alpamayo R1 (10B parameters) across ~18,000 inference trials under eight sensor perturbation types including noise, lighting extremes, and fog. The key finding is that Chain-of-Causation (CoC) reasoning consistency is a high-fidelity proxy for trajectory reliability: when CoC explanations change post-perturbation, trajectory deviation spikes 5.3× (r=0.99 across attack types). Enabling CoC generation is associated with an 11.8% average improvement in trajectory accuracy, and degradation under noise is approximately linear (R²=0.957), while standard preprocessing defenses offer only marginal benefit.

6 · arXiv · cs.AI · 19d ago

SafeSteer: Localized On-Policy Distillation for Efficient Safety Alignment

SafeSteer proposes a safety alignment method that targets only 'safety tokens' in the output distribution rather than applying global fine-tuning, arguing that safety features are inherently sparse. It constructs a safety teacher via activation steering, then restricts a reverse KL penalty to selected safety tokens during training. The approach achieves strong safety performance across seven benchmarks with minimal capability degradation, requiring only 100 harmful samples, less than 1% of the data used by prior baselines.
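
A rough sketch of what a reverse KL penalty restricted to selected tokens could look like, in plain Python. Everything here is an assumption for illustration: "safety tokens" are treated as sequence positions flagged by a boolean mask, the per-position distributions are given as plain probability lists, and the names `reverse_kl` and `masked_reverse_kl` are hypothetical. How the paper actually selects tokens or weights the penalty is not stated in the summary.

```python
import math


def reverse_kl(student, teacher, eps=1e-12):
    """KL(student || teacher) for two discrete distributions.

    'Reverse' here means the expectation is taken under the student,
    as in on-policy distillation. eps guards against log(0).
    """
    return sum(
        p * math.log((p + eps) / (q + eps))
        for p, q in zip(student, teacher)
        if p > 0
    )


def masked_reverse_kl(student_dists, teacher_dists, safety_mask):
    """Average reverse KL over positions where safety_mask is True.

    Positions outside the mask contribute nothing, so the student is
    left unconstrained there (assumed reading of 'localized').
    """
    selected = [
        reverse_kl(s, t)
        for s, t, keep in zip(student_dists, teacher_dists, safety_mask)
        if keep
    ]
    if not selected:
        return 0.0
    return sum(selected) / len(selected)


if __name__ == "__main__":
    student = [[0.5, 0.5], [0.5, 0.5]]
    teacher = [[0.9, 0.1], [0.5, 0.5]]
    # Only the first position is flagged as a safety token.
    print(masked_reverse_kl(student, teacher, [True, False]))
```

With the mask set to all False the penalty is zero, which matches the idea that non-safety tokens are not pulled toward the teacher.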

5 · arXiv · cs.LG · 19d ago

Verifiable Belief-Space Neural Safety Filters for Interactive Robotics via Conformal Prediction

This paper proposes an algorithmic framework to certify high-probability safety of belief-space safety filters (BeliefSF) in interactive robotics, addressing the challenge that neural approximations and runtime inference errors make formal guarantees difficult. The approach uses conformal prediction focused on regions where inference is reliable, preserving standard sample complexity while certifying a less conservative filter. Evaluation on a simulated human-vehicle interaction benchmark demonstrates the method produces significantly more permissive safety guarantees than a standard conformal prediction baseline.
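
For context, the standard split conformal prediction baseline that the paper compares against can be sketched as follows. This is the generic textbook procedure, not the paper's reliability-focused variant: given n calibration scores and a miscoverage level alpha, the threshold is the k-th smallest score with k = ceil((n + 1)(1 - alpha)). The function name `conformal_threshold` and the example scores are hypothetical.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Split conformal threshold at miscoverage level alpha.

    Under exchangeability, a fresh score falls at or below the returned
    value with probability at least 1 - alpha. If the calibration set
    is too small for the requested level, the threshold is infinite.
    """
    n = len(cal_scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    if not 0 < alpha < 1:
        raise ValueError("alpha must be in (0, 1)")
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(cal_scores)[k - 1]


if __name__ == "__main__":
    scores = [0.3, 0.1, 0.9, 0.5, 0.7, 0.2, 0.8, 0.4, 0.6]
    print(conformal_threshold(scores, alpha=0.5))
```

A larger calibration set or a looser alpha yields a smaller threshold, which is the sense in which a certified filter becomes less conservative.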

5 · arXiv · cs.AI · 2d ago

Distributionally robust optimization framework for probabilistic runtime verification of AI agents

A new arXiv preprint introduces a sound and efficient framework for verifying probabilistic security policies for AI agents operating in complex digital environments, addressing limitations of prior Datalog-based approaches that assumed deterministic policies or predicate independence. The method uses distributionally robust optimization to compute sound upper bounds on policy violation probability without requiring independence assumptions between predicates. Evaluated on benchmarks for terminal and tool-calling agents, the approach outperforms prior art on the security-utility trade-off.

4 · OpenAI Blog · 1mo ago

Generalizing from Simulation: OpenAI Sim-to-Real Robotics Transfer

OpenAI published results on sim-to-real transfer for robot controllers, demonstrating that policies trained entirely in simulation can be deployed on physical robots and respond to unplanned environmental changes. The work represents a shift from open-loop to closed-loop control systems in robotics. This is a 2017 research milestone predating current frontier model work but relevant to the historical trajectory of OpenAI's robotics program.