Almanac
technique

Preference Coordinated Multi-agent Policy Optimization

technique · active · provisional · preference-coordinated-multi-agent-policy-optimization-ab98f69f · 1 event · first seen 2d ago

Aliases: Preference Coordinated Multi-agent Policy Optimization


Recent events (1)

4 · arXiv · cs.AI · 2d ago

PCMA: Learning coordinated agent-specific preferences for multi-objective multi-agent RL

A new arXiv preprint introduces Preference Coordinated Multi-agent Policy Optimization (PCMA), a method for cooperative multi-objective multi-agent reinforcement learning (MOMARL) that learns agent-specific preferences to enable complementary trade-offs across agents. The authors formulate cooperative MOMARL as a team-optimal game and provide a first-order improvement decomposition showing that preference diversity can induce team improvement. Experiments on cooperative MOMA environments and a traffic-control scenario demonstrate improvements in both performance and trade-off coordination.
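To make the idea of agent-specific preferences concrete, here is a minimal sketch assuming the simplest possible setup: each agent holds a preference vector on the probability simplex and turns a shared vector-valued reward into its own scalar reward by a weighted sum (linear scalarization). This is a generic illustration, not the PCMA algorithm itself; the summary does not say how PCMA learns or coordinates preferences. The function `scalarize`, the agent names, and all numbers are hypothetical.

```python
import numpy as np


def scalarize(vector_reward, preference):
    """Linear scalarization: weighted sum of the objectives.

    `preference` must be non-negative and sum to 1 (a point on the simplex).
    """
    r = np.asarray(vector_reward, dtype=float)
    w = np.asarray(preference, dtype=float)
    if r.shape != w.shape:
        raise ValueError("reward and preference must have the same shape")
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("preference must lie on the probability simplex")
    return float(w @ r)


# Hypothetical two-objective team reward (illustrative values only).
team_reward = np.array([2.0, -1.0])

# Hypothetical agent-specific preferences: each agent weights the
# objectives differently, so the agents pursue complementary trade-offs.
preferences = {
    "agent_0": [0.75, 0.25],
    "agent_1": [0.25, 0.75],
}

# Each agent sees its own scalar reward derived from the shared vector reward.
scalar_rewards = {
    name: scalarize(team_reward, w) for name, w in preferences.items()
}
```

With fixed weights like these, the two agents receive different scalar signals from the same team outcome. According to the summary, the method learns these preferences rather than fixing them by hand, so the dictionary above stands in for whatever the learned values would be.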