Entity · technique

TD(0)

technique · active · td-0--8c801ed4 · 1 event · first seen Jun 17, 2026

Aliases: TD(0)

Co-occurring entities

A Diffusion Approximation for Temporal-Difference Learning with Linear Features under Markovian Noise

More like this (12)

DTD · TDW-MAT · TTT-E2E · TTT-Discover · TTT-Discover · T²MLR · LC-DP · TPC-DS · DualDPO · Twin Delayed DDPG · System 0 · nDCG@10

Recent events (1)

4 · arXiv · cs.LG · Jun 17, 2026

SDE approximation for TD learning with linear features under Markovian noise

A new arXiv preprint replaces the classical ODE description of linear TD(0) learning with a stochastic differential equation (SDE) approximation that accounts for Markovian sampling noise. The model separates contraction dynamics governed by the projected Bellman operator from the influence of Markovian long-run covariance, providing a theoretical explanation for the constant-stepsize error floor. The work is a theoretical contribution to the foundations of reinforcement learning policy evaluation.
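The constant-stepsize error floor mentioned above can be illustrated with a minimal sketch of linear TD(0) run along a single Markov-chain trajectory. Everything concrete here is an assumption made for illustration and is not taken from the preprint: the 3-state chain `P`, the rewards `R`, the 2-dimensional features `PHI`, the discount `GAMMA`, and the helper names `stationary`, `td_fixed_point`, and `error_floor`. The sketch measures the average squared distance between the iterate and the TD fixed point after a burn-in period; it does not implement the SDE approximation itself.

```python
import random

# Hypothetical 3-state Markov reward process (illustrative, not from the preprint).
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]          # transition probabilities
R = [0.0, 0.0, 1.0]               # reward collected when leaving each state
PHI = [[1.0, 0.0],
       [0.5, 0.5],
       [0.0, 1.0]]                # 2-d linear feature vector per state
GAMMA = 0.9                       # discount factor


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def stationary(iters=200):
    """Stationary distribution by power iteration (assumes an ergodic chain)."""
    n = len(P)
    d = [1.0 / n] * n
    for _ in range(iters):
        d = [sum(d[i] * P[i][j] for i in range(n)) for j in range(n)]
    return d


def td_fixed_point():
    """Solve A theta = b, with A = Phi^T D (I - gamma P) Phi and b = Phi^T D r."""
    d, n = stationary(), len(P)
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for s in range(n):
        # Expected next-state feature vector under P, then phi(s) - gamma * E[phi(s')].
        exp_next = [sum(P[s][t] * PHI[t][k] for t in range(n)) for k in range(2)]
        diff = [PHI[s][k] - GAMMA * exp_next[k] for k in range(2)]
        for i in range(2):
            b[i] += d[s] * PHI[s][i] * R[s]
            for j in range(2):
                A[i][j] += d[s] * PHI[s][i] * diff[j]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    # Cramer's rule for the 2x2 system.
    return [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]


def error_floor(alpha, steps=200_000, seed=0):
    """Mean squared distance to the TD fixed point over the second half of a run."""
    rng = random.Random(seed)
    theta_star = td_fixed_point()
    theta = [0.0, 0.0]
    s, total, count = 0, 0.0, 0
    for t in range(steps):
        # Markovian sampling: the next state depends on the current state.
        s_next = rng.choices(range(len(P)), weights=P[s])[0]
        # TD(0) update: theta += alpha * delta * phi(s).
        delta = R[s] + GAMMA * dot(PHI[s_next], theta) - dot(PHI[s], theta)
        theta = [theta[k] + alpha * delta * PHI[s][k] for k in range(2)]
        if t >= steps // 2:       # discard the first half as burn-in
            total += sum((theta[k] - theta_star[k]) ** 2 for k in range(2))
            count += 1
        s = s_next
    return total / count


if __name__ == "__main__":
    for alpha in (0.1, 0.01):
        print(alpha, error_floor(alpha))
```

In this toy setting the iterate does not settle exactly at the fixed point for a constant stepsize; it keeps fluctuating around it, and the residual error is expected to shrink as `alpha` is reduced. The stepsizes 0.1 and 0.01 are arbitrary choices for the comparison.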

Alignment and RLHF · TD(0) · A Diffusion Approximation for Temporal-Difference Learning with Linear Features under Markovian Noise