Almanac
technique

PACT

technique · active · provisional · pact-9743f5ec · 1 event · first seen 29h ago

Aliases: PACT

Recent events (1)

4 · arXiv · cs.AI · 29h ago

PACT: Hybrid SLM deliberation architecture improves reactive RL policies in unfamiliar environments

Researchers propose PACT (Plan, Align, Commit, Think), a hybrid architecture that pairs a fast reactive RL policy with an asynchronous small language model (SLM) planner used for deliberation. The SLM generates candidate action plans and validates them via simulation before committing one to execution; a committed plan bypasses the RL policy, so the policy itself does not need retraining. Evaluated on FrozenLake configurations of increasing difficulty, PACT, using only a 2B-parameter SLM, outperforms baselines, suggesting that deliberative planning and reactive execution have complementary strengths.
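The generate, validate, commit loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the grid, the always-right reactive policy, and the hard-coded `stub_planner` standing in for the SLM are all hypothetical, the planner runs synchronously rather than asynchronously, and the environment is a deterministic FrozenLake-style map with no slipping.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical 4x4 FrozenLake-style map: S start, F frozen, H hole, G goal.
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
MOVES = {"L": (0, -1), "D": (1, 0), "R": (0, 1), "U": (-1, 0)}
Pos = Tuple[int, int]
Plan = List[str]


def step(pos: Pos, action: str) -> Pos:
    """Deterministic transition; a move off the grid leaves the agent in place."""
    dr, dc = MOVES[action]
    nr, nc = pos[0] + dr, pos[1] + dc
    if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]):
        return (nr, nc)
    return pos


def simulate(pos: Pos, plan: Plan) -> bool:
    """Validate a plan by rollout: True only if it reaches G without entering H."""
    for action in plan:
        pos = step(pos, action)
        cell = GRID[pos[0]][pos[1]]
        if cell == "H":
            return False
        if cell == "G":
            return True
    return False


def reactive_policy(pos: Pos) -> str:
    """Stand-in for a trained RL policy (assumption): always moves right."""
    return "R"


def stub_planner(pos: Pos) -> List[Plan]:
    """Stand-in for the SLM planner (assumption): returns fixed candidate plans."""
    return [list("RDDD"), list("RRDDD"), list("DDRDRR")]


def run_episode(planner: Optional[Callable[[Pos], List[Plan]]],
                max_steps: int = 20) -> bool:
    """Run one episode; a validated plan, if any, overrides the reactive policy."""
    pos: Pos = (0, 0)
    committed: Plan = []
    if planner is not None:
        for plan in planner(pos):
            if simulate(pos, plan):      # validate before committing
                committed = list(plan)   # commit the first plan that passes
                break
    for _ in range(max_steps):
        # Committed plan steps bypass the reactive policy; otherwise fall back to it.
        action = committed.pop(0) if committed else reactive_policy(pos)
        pos = step(pos, action)
        cell = GRID[pos[0]][pos[1]]
        if cell == "G":
            return True
        if cell == "H":
            return False
    return False
```

Calling `run_episode(None)` runs the reactive policy alone, and `run_episode(stub_planner)` runs it with the planner override. Because the override happens at action-selection time, the reactive policy is left unchanged, which mirrors the "without retraining" property in the summary.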