4 · arXiv cs.LG (Machine Learning) · 1mo ago

The Privacy Price of Tail-Risk Learning: Effective Tail Sample Size in Differentially Private CVaR Optimization

This paper characterizes how differential privacy affects the statistical complexity of CVaR (Conditional Value at Risk) optimization. It shows that the effective sample size governing private tail-risk learning is εnτ rather than n, where ε is the privacy parameter, n is the number of samples, and τ is the tail mass. Complete minimax rates are derived for scalar estimation and finite classes under pure DP, with lower bounds extending to approximate DP. For convex Lipschitz learning, the CVaR-specific privacy cost necessarily scales as 1/(εnτ), with dimension dependence inherited from private stochastic convex optimization. The results reduce private CVaR learning to its canonical hard subproblem: private learning on Θ(nτ) tail records.

AI Safety Research · Differential Privacy · Approximate DP · Private Stochastic Convex Optimization · CVaR (Conditional Value at Risk)
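
To make the quantities in the summary above concrete, here is a minimal sketch in Python. It assumes the standard Rockafellar-Uryasev formulation of CVaR with tail mass τ, which may differ from the paper's exact convention, and it only evaluates the sample-size quantities nτ and εnτ named in the summary. It is not a differentially private algorithm, and the function names are hypothetical.

```python
def empirical_cvar(losses, tau):
    """Empirical CVaR at tail mass tau via the Rockafellar-Uryasev objective.

    Assumed (standard) formulation, not taken from the paper:
        CVaR_tau = min over theta of  theta + (1 / (tau * n)) * sum(max(x - theta, 0))
    The objective is piecewise linear and convex in theta with breakpoints at
    the sample values, so it is enough to search over the samples themselves.
    """
    if not 0.0 < tau <= 1.0:
        raise ValueError("tau must lie in (0, 1]")
    if not losses:
        raise ValueError("losses must be non-empty")
    n = len(losses)

    def objective(theta):
        excess = sum(max(x - theta, 0.0) for x in losses)
        return theta + excess / (tau * n)

    return min(objective(theta) for theta in losses)


def tail_sample_sizes(n, tau, epsilon):
    """Return the two sample-size quantities named in the summary.

    tail_records:      n * tau, the expected number of records in the tail.
    private_effective: epsilon * n * tau, the effective sample size under
                       privacy parameter epsilon, as stated in the summary.
    """
    tail_records = n * tau
    return {
        "tail_records": tail_records,
        "private_effective": epsilon * tail_records,
    }


if __name__ == "__main__":
    losses = [float(i) for i in range(1, 11)]  # 1.0, 2.0, ..., 10.0
    # With tau = 0.2 this is the mean of the two largest losses.
    print(empirical_cvar(losses, tau=0.2))
    print(tail_sample_sizes(n=10_000, tau=0.05, epsilon=0.5))
```

With n = 10,000, τ = 0.05 and ε = 0.5, only about 500 records fall in the tail and the effective private sample size is about 250, which illustrates why the summary's rates depend on εnτ rather than n.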

Related guides (1)

AI Safety Research · Topic guide

AI Safety Research: From Lab Policies to Real-World Flashpoints


Related events (8)

5 · arXiv · cs.LG · 46h ago

Predictability as a Fine-Grained Privacy Metric Complementary to Differential Privacy

A new arXiv preprint introduces 'privacy via predictability,' a framework that measures privacy leakage as the incremental gain in an attacker's ability to predict sensitive information after observing an algorithm's output, conditioned on the attacker's prior knowledge. The authors show predictability and differential privacy are generally incomparable, but that predictability implies mutual-information DP in worst-case regimes. They develop a generalized method of moments framework for asymptotic analysis and derive a predictability-calibrated output perturbation scheme for empirical risk minimization. The work positions predictability as a complementary, finer-grained alternative to DP for settings where attacker knowledge and query families can be specified.

Evaluation and Benchmarking · AI Safety Research · Differential Privacy · Generalized Method of Moments · Predictability as a Fine-Grained Measure for Privacy

5 · arXiv · cs.LG · 26d ago

Perturbation Theory for Spherical Hellinger-Kantorovich Flows with Differential Privacy Guarantees

This paper develops a perturbation theory for Spherical Hellinger-Kantorovich (SHK) gradient flows, which couple transport and reaction dynamics and coincide with birth-death Langevin dynamics. The authors derive dimension-free bounds on log-likelihood ratios and Rényi/KL divergences when two potentials differ, quantifying how perturbations propagate over time. These results are applied to differential privacy: the likelihood-ratio control yields explicit Pure-DP guarantees for SHK-based samplers implementing the exponential mechanism, while KL bounds provide Approximate-DP certificates. A utility bound is also derived that separates intrinsic exponential-mechanism suboptimality from finite-time sampling error.

AI Safety Research · Alignment and RLHF · Differential Privacy · KL Divergence · Spherical Hellinger-Kantorovich geometry · +4 more

5 · arXiv · cs.LG · 46h ago

Optimal deterministic multicalibration achieved, resolving open problem on randomization necessity

A new arXiv preprint resolves an open problem in multicalibration theory by constructing a minimax-optimal multicalibration algorithm that outputs a deterministic predictor, achieving the same O(ε⁻³) sample complexity previously only attainable by randomized predictors. The result extends to outcome indistinguishability, deterministic omnipredictors, and panpredictors with optimal sample complexity, resolving multiple open problems from recent works. Multicalibration is a fairness and reliability property requiring calibration to hold across reweighted subgroups, making this relevant to trustworthy ML research.

Evaluation and Benchmarking · AI Safety Research · outcome indistinguishability · Optimal Deterministic Multicalibration and Omniprediction · multicalibration

6 · arXiv · cs.LG · 4d ago

RING attack exploits differential privacy to amplify backdoor success in federated learning

A new arXiv paper challenges the assumption that differential privacy (DP) inherently protects federated learning (FL) against backdoor attacks, demonstrating that DP's noise mechanism actually masks the statistical signatures that defenses rely on to detect malicious updates. The authors propose RING, an attack that exploits this masking effect by having compromised clients collaboratively craft adversarial perturbations that reconstruct a strong backdoor signal at aggregation time. Evaluated across four datasets against six state-of-the-art defenses, RING achieves a 90.3% average attack success rate under moderate privacy budgets, up to 26x better than baselines. Proposed countermeasures incur significant utility trade-offs, exposing a fundamental security gap in DP-FL deployments.

AI Safety Research · RING Your Privacy My Cloak: Backdoor Attacks on Differentially Private Federated Learning

4 · arXiv · cs.LG · 15d ago

TailLoR: Spectral-domain continual learning via protected principal components

TailLoR is a new parameter-efficient finetuning method for continual learning that uses the singular value decomposition of pre-trained weights as a fixed reference frame, applying low-rank updates only to the singular value matrix. A soft spectral penalty discourages updates aligned with dominant singular directions, reducing catastrophic interference while routing adaptation into long-tail spectral coordinates. The approach targets the forgetting problem in continual learning through a principled spectral lens.

Open Weights Progress · TailLoR: Protecting Principal Components in Parameter-Efficient Continual Learning · TailLoR

7 · Google DeepMind Blog · 1mo ago

VaultGemma: The world's most capable differentially private LLM

Google DeepMind introduces VaultGemma, a large language model trained from scratch with differential privacy (DP), and describes it as the most capable DP-trained model to date. The announcement positions VaultGemma as a significant advance in privacy-preserving AI, combining strong utility with formal privacy guarantees. The blog post is brief and likely precedes a more detailed technical disclosure.

Open Weights Progress · AI Safety Research · Differential Privacy · Gemma · Google DeepMind · +2 more

5 · arXiv · cs.AI · 1mo ago

CARV: Compute-Aware Variance Reduction for Diffusion Teacher Gradient Estimation

CARV is a hierarchical Monte Carlo estimation framework that reduces gradient variance when using frozen pretrained diffusion models as teachers in downstream pipelines such as text-to-3D distillation and data attribution. The approach amortizes expensive upstream computation (rendering, simulation, encoding) over cheap diffusion-noise resamples, augmented by timestep importance sampling and stratified-inverse-CDF construction. In text-to-3D experiments, CARV delivers 2–3× effective compute multipliers; in single-step distillation, it cuts gradient variance by an order of magnitude but does not improve FID, revealing that MC variance is not the bottleneck in that regime.

Inference Economics · Multimodal Progress · Model Distillation · CARV · importance sampling · +4 more

4 · arXiv · cs.LG · 3d ago

SDE approximation for TD learning with linear features under Markovian noise

A new arXiv preprint replaces the classical ODE description of linear TD(0) learning with a stochastic differential equation (SDE) approximation that accounts for Markovian sampling noise. The model separates contraction dynamics governed by the projected Bellman operator from the influence of Markovian long-run covariance, providing a theoretical explanation for the constant-stepsize error floor. The work is a theoretical contribution to the foundations of reinforcement learning policy evaluation.

Alignment and RLHF · TD(0) · A Diffusion Approximation for Temporal-Difference Learning with Linear Features under Markovian Noise
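
The constant-stepsize error floor mentioned in the TD(0) summary above can be seen in a toy simulation. The sketch below is not the preprint's SDE model. It assumes a hypothetical two-state Markov chain with one-hot features (so the linear TD fixed point equals the true value function), and every parameter value is chosen only for illustration. With a fixed stepsize, the iterates settle near the exact values but keep fluctuating with non-vanishing variance.

```python
import random


def exact_values(gamma=0.9, stay=0.9):
    """Solve (I - gamma * P) v = r for the toy chain with rewards r = [1, 0].

    P = [[stay, 1 - stay], [1 - stay, stay]], so I - gamma * P = [[a, b], [b, a]].
    """
    a = 1.0 - gamma * stay
    b = -gamma * (1.0 - stay)
    det = a * a - b * b
    return [a / det, -b / det]


def td0_two_state(alpha=0.05, gamma=0.9, stay=0.9, steps=200_000, seed=0):
    """Run TD(0) with a constant stepsize along a single Markovian trajectory.

    Returns the per-state mean and variance of the iterates after a burn-in.
    """
    rng = random.Random(seed)
    reward = [1.0, 0.0]
    theta = [0.0, 0.0]  # one-hot features: theta[s] is the value estimate of s
    state = 0
    burn_in = steps // 4
    sums = [0.0, 0.0]
    squares = [0.0, 0.0]
    count = 0
    for t in range(steps):
        # Markovian sampling: the next state depends on the current state.
        next_state = state if rng.random() < stay else 1 - state
        td_error = reward[state] + gamma * theta[next_state] - theta[state]
        theta[state] += alpha * td_error
        if t >= burn_in:
            count += 1
            for i in range(2):
                sums[i] += theta[i]
                squares[i] += theta[i] ** 2
        state = next_state
    mean = [s / count for s in sums]
    variance = [q / count - m * m for q, m in zip(squares, mean)]
    return mean, variance


if __name__ == "__main__":
    mean, variance = td0_two_state()
    print("exact values      :", exact_values())
    print("mean of iterates  :", mean)
    print("iterate variance  :", variance)  # stays positive: the error floor
```

Shrinking `alpha` in this toy setup should shrink the reported variance, which is the qualitative behaviour an error floor tied to the stepsize would predict.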