Almanac
product

AgenticRL

product · active · provisional · agenticrl-eb7a3170 · 1 event · first seen 13d ago

Aliases: AgenticRL

Recent events (1)

6 · arXiv · cs.AI · 13d ago

AgenticRL: Self-refining LLM-guided reward design and policy refinement for UAV navigation

AgenticRL is a framework in which a multimodal GPT agent automates three stages of UAV navigation learning: reward function generation, policy training with PPO (Proximal Policy Optimization), and closed-loop self-refinement. The agent evaluates each trained policy through diagnostic feedback, identifies failure modes, and iteratively refines the reward without human intervention. Across five navigation tasks, closed-loop refinement improves policy behavior by 71% relative to the initial rewards. In sim-to-real transfer, the framework reports a 91% real-world success rate and 94% sim-to-real accuracy.
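
The generate, train, diagnose, refine cycle described above can be illustrated with a toy sketch. This is not the authors' implementation: every name here (`propose_initial_reward`, `train_and_evaluate`, `refine_reward`, `closed_loop`, the reward terms, and the stopping threshold) is a hypothetical stand-in. The PPO training step and the GPT agent are replaced by simple deterministic functions so that only the control flow of the loop is shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

RewardWeights = Dict[str, float]


@dataclass
class Diagnostics:
    """Summary statistics the agent would inspect after a training run."""
    success_rate: float
    collision_rate: float


def propose_initial_reward() -> RewardWeights:
    # Hypothetical stand-in for the agent's first reward proposal.
    return {"progress": 1.0, "collision_penalty": 0.0}


def train_and_evaluate(weights: RewardWeights) -> Diagnostics:
    # Hypothetical stand-in for PPO training plus evaluation rollouts.
    # Toy model: collisions drop as the collision penalty grows.
    collision_rate = max(0.0, 0.5 - 0.1 * weights["collision_penalty"])
    return Diagnostics(success_rate=1.0 - collision_rate,
                       collision_rate=collision_rate)


def refine_reward(weights: RewardWeights, diag: Diagnostics) -> RewardWeights:
    # Hypothetical stand-in for the agent's diagnosis and reward edit:
    # if the policy still collides, penalize collisions more strongly.
    refined = dict(weights)
    if diag.collision_rate > 0.0:
        refined["collision_penalty"] += 1.0
    return refined


def closed_loop(max_rounds: int = 10,
                target_success: float = 0.85
                ) -> Tuple[RewardWeights, List[Diagnostics]]:
    """Repeat train -> diagnose -> refine until the target is met."""
    weights = propose_initial_reward()
    history: List[Diagnostics] = []
    for _ in range(max_rounds):
        diag = train_and_evaluate(weights)
        history.append(diag)
        if diag.success_rate >= target_success:
            break  # good enough, stop refining
        weights = refine_reward(weights, diag)
    return weights, history


if __name__ == "__main__":
    final_weights, history = closed_loop()
    for round_idx, diag in enumerate(history):
        print(round_idx, round(diag.success_rate, 2))
    print(final_weights)
```

In a real system, `train_and_evaluate` would wrap an actual PPO run in a simulator, and `refine_reward` would call the language-model agent with the diagnostics. The loop structure, with a round budget and a stopping criterion, is the part this sketch is meant to convey.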