Almanac
technique

R-NaD

technique · active · provisional · r-nad-2dd7603b · 1 event · first seen 20h ago

Aliases: R-NaD

Recent events (1)

5 · arXiv · cs.AI · 20h ago

Solver-dependent Nash equilibrium selection on zero-sum polytopes: regularized methods select max-entropy members

A new arXiv preprint investigates whether different Nash equilibrium solvers systematically select different members of the Nash polytope in two-player zero-sum games. Using six analytically tractable games, including Kuhn poker, the authors find that regularized last-iterate methods (R-NaD, magnetic mirror descent) converge to the maximum-entropy Nash equilibrium, which can be interpreted as an information projection, while regret-averaging methods (CFR, CFR+, fictitious play) drift to lower-entropy solutions on the boundary of the polytope. Which equilibrium a solver selects affects performance against sub-optimal opponents in games with sequential or hidden-information structure, with implications for multi-agent AI training and game-solving pipelines.
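
To make "maximum-entropy member of the Nash polytope" concrete, here is a minimal sketch on a hypothetical toy game that is not one of the preprint's six games: matching pennies in which the row player's tails action is duplicated. The payoff matrix and the helper names (`PAYOFF`, `entropy`, `worst_case_value`, `nash_member`) are assumptions made for illustration. The code does not run R-NaD, magnetic mirror descent, or CFR; it only enumerates the row player's equilibrium strategies and compares their entropies.

```python
import math

# Hypothetical toy game (an assumption, not from the preprint): matching
# pennies with the row player's "tails" action duplicated.
# Entries are payoffs to the row player; columns are (H, T).
PAYOFF = [
    [1.0, -1.0],   # row plays H
    [-1.0, 1.0],   # row plays T
    [-1.0, 1.0],   # row plays T' (exact duplicate of T)
]

def entropy(p):
    """Shannon entropy in nats, treating 0 * log 0 as 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def worst_case_value(row_strategy):
    """Row player's payoff against a best-responding column player."""
    return min(
        sum(row_strategy[i] * PAYOFF[i][j] for i in range(len(PAYOFF)))
        for j in range(len(PAYOFF[0]))
    )

def nash_member(t):
    """In this game the row player's equilibrium strategies are exactly
    (1/2, t, 1/2 - t) for t in [0, 1/2]: a one-dimensional Nash polytope."""
    return (0.5, t, 0.5 - t)

# Sweep the polytope on a grid and keep the highest-entropy member.
grid = [0.5 * k / 100 for k in range(101)]
max_entropy_member = max((nash_member(t) for t in grid), key=entropy)

# A boundary (vertex) member: all of the "tails" mass on one duplicate.
vertex_member = nash_member(0.5)

for name, p in [("max-entropy", max_entropy_member), ("vertex", vertex_member)]:
    print(name, p, "entropy:", entropy(p), "worst case:", worst_case_value(p))
```

Both strategies guarantee the game value of 0, so a solver is free to return either. The maximum-entropy member splits the tails mass evenly across the duplicates, while the vertex puts it all on one. Maximizing entropy over the polytope is equivalent to minimizing KL divergence to the uniform strategy, which is one way to read the information-projection interpretation mentioned in the summary.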