Interpreting LQR through Optimal Control and Reinforcement Learning


This post explains the Linear Quadratic Regulator (LQR) from two complementary viewpoints: classical optimal control and reinforcement learning. Starting from the finite- and infinite-horizon optimal control formulation, it derives the Riccati equation and optimal feedback law, then reinterprets the same results through value functions, Q-functions, policy iteration, and value iteration. Drawing on the connection highlighted in Reinforcement Learning and Adaptive Dynamic Programming for Feedback Control, the post shows how LQR serves as a clean bridge between control theory and RL, clarifying how dynamic programming ideas underpin both frameworks.
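As a concrete illustration of the ideas above, here is a minimal sketch of value iteration for infinite-horizon discrete-time LQR: iterating the Riccati recursion from $P = 0$ until it converges to the algebraic Riccati solution, then reading off the optimal feedback gain. The system matrices are illustrative placeholders, not taken from the post.

```python
import numpy as np

# Illustrative double-integrator-like system (assumed, not from the post)
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)            # state cost x^T Q x
R = np.array([[1.0]])    # input cost u^T R u

# Value iteration: V_k(x) = x^T P_k x, iterate the Riccati map from P = 0
P = np.zeros((2, 2))
for _ in range(1000):
    # Greedy (policy-improvement) gain for the current value function
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    # Riccati update: P <- Q + A^T P (A - B K)
    P_next = Q + A.T @ P @ (A - B @ K)
    if np.max(np.abs(P_next - P)) < 1e-10:
        P = P_next
        break
    P = P_next

# Optimal feedback law u = -K x from the converged P
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```

With $Q \succ 0$ and $(A, B)$ controllable, the iteration converges to the stabilizing Riccati solution, so the closed-loop matrix $A - BK$ has all eigenvalues inside the unit circle.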

Policy Iteration

  • Results

Value Iteration

  • Results

Model-Free Q-Learning

References

  1. Reinforcement Learning and Feedback Control
  2. Understanding Connection Between PMP and HJB Equations from the Perspective of Hamilton Dynamics
  3. mjctrl & RMP2