Self-Refining Agentic Reinforcement Learning for Vision-Conditioned UAV Navigation

paper · active · provisional · self-refining-agentic-reinforcement-learning-for-vision-conditioned-uav-navigation-3e99400b · 1 event · first seen 14d ago

Recent events (1)

arXiv · cs.AI · 14d ago

AgenticRL: Self-refining LLM-guided reward design and policy refinement for UAV navigation

AgenticRL is a framework in which a multimodal GPT agent automates three stages of learning UAV navigation: reward function generation, policy training with PPO, and closed-loop self-refinement. The agent evaluates each trained policy through diagnostic feedback, identifies failure modes, and iteratively refines the reward without human intervention. Across five navigation tasks, closed-loop refinement is reported to improve policy behavior by 71% over the initial rewards. In sim-to-real transfer, the reported figures are a 91% real-world success rate and 94% sim-to-real accuracy.
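
The closed loop described in this summary (propose a reward, train a policy, diagnose it, refine the reward) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: `refine_loop`, `Diagnosis`, and the `toy_*` functions are hypothetical names, the toy functions are deterministic stand-ins for the multimodal agent, the PPO trainer, and the evaluation step, and the stopping threshold `target` is an assumption not taken from the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Reward = Dict[str, float]  # hypothetical representation: weights of reward terms


@dataclass
class Diagnosis:
    """Diagnostic feedback from evaluating a trained policy (hypothetical shape)."""
    success_rate: float
    failure_modes: List[str] = field(default_factory=list)


def refine_loop(propose, train, evaluate, max_rounds=5, target=0.9):
    """Run propose -> train -> evaluate until the target is met or rounds run out.

    Returns the list of (reward, diagnosis) pairs, one per round.
    """
    history = []
    reward: Optional[Reward] = None
    feedback: Optional[Diagnosis] = None
    for _ in range(max_rounds):
        reward = propose(feedback, reward)   # agent writes or revises the reward
        policy = train(reward)               # stand-in for PPO training
        feedback = evaluate(policy)          # diagnostic rollout of the policy
        history.append((reward, feedback))
        if feedback.success_rate >= target:
            break
    return history


# --- Toy stand-ins (assumptions for illustration only) ---

def toy_propose(feedback, previous):
    """First call: initial reward. Later calls: react to reported failure modes."""
    if previous is None:
        return {"progress": 1.0, "collision_penalty": 0.0}
    reward = dict(previous)  # copy so earlier rounds in the history stay unchanged
    if "collision" in feedback.failure_modes:
        reward["collision_penalty"] += 1.0
    return reward


def toy_train(reward):
    """Pretend training: the 'policy' is just the reward it was trained on."""
    return reward


def toy_evaluate(policy):
    """Pretend evaluation: a higher collision penalty yields more successes."""
    rate = min(1.0, 0.4 + 0.2 * policy["collision_penalty"])
    modes = ["collision"] if rate < 0.9 else []
    return Diagnosis(success_rate=rate, failure_modes=modes)


if __name__ == "__main__":
    for reward, diag in refine_loop(toy_propose, toy_train, toy_evaluate):
        print(reward, diag)
```

In a real setup, `propose` would wrap a call to the language-model agent and `train` would wrap a PPO run in a simulator; keeping them as injected callables lets the loop itself stay independent of either choice.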