#actor-critic 05/11/2025
Train a Model-Native Agent to Internalize Planning, Memory and Tool Use with End-to-End RL
A compact neural agent learns to plan, use memory, and compose symbolic tools end-to-end with reinforcement learning, demonstrating emergent multi-step reasoning on synthetic arithmetic tasks.