Implementation Details of TD3-SAC-Gymnasium

20 minute read

Twin Delayed Deep Deterministic Policy Gradient (TD3) and Soft Actor-Critic (SAC) are off-policy actor-critic algorithms designed for continuous control tasks where classic DDPG can be brittle. TD3 stabilizes learning with tricks such as clipped double Q-learning, delayed policy updates, and target policy smoothing to reduce overestimation bias. SAC instead learns a stochastic policy by maximizing both task reward and entropy, encouraging robust and exploratory behaviors. This blog post explains the core components and implementation details of both algorithms. The corresponding PyTorch implementation can be found in this repository.

Categories of RL Algorithms

Before diving into TD3 and SAC, it is helpful to place them within the broader landscape of reinforcement learning (RL). In this section we briefly review two common ways of categorizing RL algorithms: (i) how they represent and optimize behavior (value-based, policy-based, or actor–critic), and (ii) how they use experience (on-policy vs off-policy). This perspective will make it clearer why TD3 and SAC are usually described as off-policy actor–critic algorithms for continuous control.

  • Value-based vs Policy-based vs Actor–Critic

    • Value-based methods.
      Value-based algorithms learn a value function, such as a state–action value \(Q(s,a)\) or state value \(V(s)\), and derive a policy by acting greedily with respect to this estimate. For example, a greedy policy is

      \[\pi(s) = \arg\max_a Q(s,a).\]

      Classic examples include tabular Q-learning and deep Q-networks (DQN). The policy is implicit in the value function; there is usually no separate set of policy parameters.

    • Policy-based methods (pure policy gradient).
      Pure policy-gradient methods parameterize the policy directly as \(\pi_\theta(a \mid s)\) and optimize the expected return

      \[J(\theta) = \mathbb{E}_{\pi_\theta}\!\left[\sum_{t=0}^{\infty} \gamma^t r_t \right]\]

      via gradients of the form

      \[\nabla_\theta J(\theta) = \mathbb{E}_{\pi_\theta}\!\big[ \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, \hat{G}_t \big],\]

      where \(\hat{G}_t\) is some return estimate. The simplest example is REINFORCE, which can work without a learned value function by using Monte Carlo returns as \(\hat{G}_t\).

    • Actor–critic methods (policy gradient with a critic).
      Actor–critic algorithms combine both ideas. An actor \(\pi_\theta(a \mid s)\) chooses actions, while a critic estimates a value function (e.g., \(V_\phi(s)\), \(Q_\phi(s,a)\), or an advantage \(A_\phi(s,a)\)) to provide low-variance learning signals for the actor:

      \[\nabla_\theta J(\theta) \approx \mathbb{E}\!\big[ \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, \hat{A}_\phi(s_t,a_t) \big].\]

      Algorithms such as A2C/A3C, TRPO (in its common actor–critic form), PPO, TD3, and SAC fall into this category: they use a policy-gradient objective together with an explicit value function learned by the critic.

  • On-policy vs Off-policy

    To explain this distinction, it is useful to separate two roles a policy can play:

    • Behavior policy \(\mu(a \mid s)\): the policy that actually acts in the environment to generate data \((s_t, a_t, r_t, s_{t+1})\).
    • Target policy \(\pi(a \mid s)\): the policy we are trying to evaluate or improve in our update rule.

    • On-policy algorithms
      On-policy methods learn about a target policy using data collected from (essentially) the same policy: \(\mu(a \mid s) \approx \pi(a \mid s).\) In practice, the behavior policy might add a bit of exploration noise, or we might update \(\pi\) slightly during training, but the algorithm is designed so that the data always come from a policy very close to the one being optimized.
      Examples:
      • SARSA learns the value of the current \(\epsilon\)-greedy policy.
      • PPO collects rollouts with an old policy \(\pi_{\text{old}}\) and then updates \(\pi_\theta\) while constraining it (via clipping or a KL penalty) so that \(\pi_\theta \approx \pi_{\text{old}}\). Old data are only used for a few epochs and then discarded.
    • Off-policy algorithms
      Off-policy methods allow the behavior and target policies to be different: \(\mu(a \mid s) \neq \pi(a \mid s).\) This makes it possible to reuse past experience, learn from old versions of the policy, or even learn from demonstrations generated by some other agent.
      Examples:
      • Q-learning and DQN typically use an \(\epsilon\)-greedy behavior policy \(\mu\), but update towards the greedy policy \(\pi_{\text{greedy}}(s) = \arg\max_a Q(s,a),\) so \(\mu \neq \pi_{\text{greedy}}\) (see the short tabular sketch after this list).
      • DDPG, TD3, and SAC store transitions in a replay buffer and train the current actor–critic using data that were collected many steps ago under different (more exploratory) policies. This clear separation between data collection and policy improvement is what makes them off-policy.
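
To make the Q-learning example above concrete, here is a minimal tabular sketch (illustrative only; the state/action sizes and helper names are made up): the behavior policy \(\mu\) is \(\epsilon\)-greedy, while the update bootstraps from the greedy target policy via the max operator.

import numpy as np

# Illustrative sketch: tabular Q-learning is off-policy because the behavior
# policy (epsilon-greedy) differs from the greedy target policy used in the update.
rng = np.random.default_rng(0)
Q = np.zeros((5, 3))                      # 5 states, 3 actions (made-up sizes)
lr, gamma, eps = 0.1, 0.99, 0.2

def behavior_action(s):
    # behavior policy mu: epsilon-greedy with respect to the current Q-table
    return int(rng.integers(3)) if rng.random() < eps else int(Q[s].argmax())

def q_learning_update(s, a, r, s_next):
    # target policy pi_greedy: bootstrap from max_a' Q(s', a'),
    # regardless of which action the behavior policy actually took
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += lr * (td_target - Q[s, a])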

In the following sections we will see that TD3 and SAC sit in the intersection of these categories: they are off-policy actor–critic algorithms that learn a critic \(Q_\phi(s,a)\) and a continuous-control actor \(\pi_\theta(a \mid s)\) using replayed experience.

Twin Delayed Deep Deterministic Policy Gradient (TD3)

Twin Delayed Deep Deterministic Policy Gradient (TD3) is a widely used algorithm for continuous-control RL that builds directly on Deep Deterministic Policy Gradient (DDPG). Like DDPG, TD3 uses an actor–critic architecture with a deterministic policy (actor) and a Q-function approximator (critic), trained off-policy using a replay buffer and target networks.

In practice, however, DDPG is notoriously brittle: small hyperparameter changes often cause the critic’s Q-values to explode and the learned policy to collapse. TD3 preserves the overall structure, but adds several targeted modifications—most notably clipped double Q-learning, delayed policy updates, and target policy smoothing—to systematically reduce value overestimation and stabilize critic learning across a wide range of tasks.

TD3 Pseudocode (OpenAI Spinning Up).

One of the issues most frequently highlighted in the TD3 paper is Q-value overestimation and the resulting critic divergence. Let’s dive deeper into this critical part, followed by the critic-network update implementation of TD3.

  • Overestimation and critic divergence

    • Overestimation bias.
      With a single critic and an actor trained to maximize \(Q\), the policy tends to pick actions where the critic’s random errors are positive, leading to systematically over-optimistic value estimates.

    • Bias compounding via bootstrapping.
      Targets use \(y = r + \gamma Q(s', \pi(s')).\) If \(Q\) is already overestimated, each TD update pushes it even higher, and the actor further exploits these inflated regions.

    • Critic divergence.
      Combined with function approximation and off-policy data (the “deadly triad”), these biased, self-referential targets can make Q-values and TD errors grow without bound instead of converging.

    • Practical effect.
      Diverging critics produce meaningless gradients, exploding losses, and unstable or collapsed policies. TD3’s core tricks explicitly tackle this loop, making the critic more conservative and its updates more stable.
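
Before turning to the TD3 critic code, a toy numeric sketch (my own illustration, not from the repository) makes the first bullet concrete: even zero-mean noise in the Q-estimates becomes a positive bias once we maximize over actions, and taking the minimum of two critics counteracts exactly this effect.

import torch

# Toy illustration of overestimation bias: all actions have true value 1.0 and
# the critic's error is zero-mean noise, yet the max over actions is biased upward.
torch.manual_seed(0)
true_q = torch.ones(5)                      # 5 actions, all equally good
noise = 0.5 * torch.randn(10_000, 5)        # zero-mean estimation error
biased_value = (true_q + noise).max(dim=1).values.mean()
print(biased_value)                         # clearly above the true max of 1.0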

def update_critic(self, obs, action, reward, next_obs, 
                  terminated_flag, logger, step):
    # Calculate the target Q value
    with torch.no_grad():
        # ! key TD3 modifications start below
        # Target policy smoothing: sample the target policy's action and add clipped noise
        noise = (torch.randn_like(action) * self.policy_noise).clamp(-self.noise_clip, self.noise_clip)
        next_action = (self.actor_target(next_obs) + noise).clamp(self.min_action, self.max_action)

        # Clipped double Q-learning: bootstrap from the minimum of the two target critics
        target_Q1, target_Q2 = self.critic_target(next_obs, next_action)
        target_Q = torch.min(target_Q1, target_Q2)
        target_Q = reward + (1-terminated_flag) * self.discount * target_Q

    # get current Q estimates
    current_Q1, current_Q2 = self.critic(obs, action)
    critic_loss = F.mse_loss(current_Q1, target_Q) +\
          F.mse_loss(current_Q2, target_Q)
    logger.log('train_critic/loss', critic_loss, step)

    # Optimize the critic
    self.critic_optimizer.zero_grad()
    critic_loss.backward()
    self.critic_optimizer.step()

For the actor update, it is worth noting that TD3 uses two distinct types of noise, both explained in more detail below. In addition, when interacting with the environment you do not want gradients for the actor’s forward pass: you are just choosing an action, not updating the network from it.

We therefore update the networks only from replayed batches. Wrapping agent.get_action() in torch.no_grad() avoids building computation graphs for these forward passes, saving memory and time and preventing accidental backprop through action sampling. Enable gradients only inside the update step.
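
As a rough sketch of what this looks like in practice (get_action, self.expl_noise, and the action bounds are assumptions about the agent interface rather than the repository's exact code):

import torch

def get_action(self, obs, explore=True):
    # Convert the observation and run the actor WITHOUT building a graph:
    # we are only acting here, not updating from this forward pass.
    obs = torch.as_tensor(obs, dtype=torch.float32).unsqueeze(0)
    with torch.no_grad():
        action = self.actor(obs).squeeze(0)
    if explore:
        # Exploration noise on the behavior policy (not the TD3 target policy noise)
        action = action + torch.randn_like(action) * self.expl_noise
    return action.clamp(self.min_action, self.max_action).cpu().numpy()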

  • Policy noise vs exploration noise
    • Exploration noise (interaction with the environment)
      • Added to the action sent to the environment during data collection.
      • Example:

        \[a_{\text{env}} = \pi_\theta(s) + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma_{\text{expl}}^{2}).\]
      • Purpose: encourage state–action space exploration so the replay buffer contains diverse trajectories.
      • Affects the behavior policy only; it does not change the TD target formula.
    • Policy noise (target policy smoothing in TD3)
      • Added inside the critic update to the target policy’s action when computing the TD target.
      • Example:

        \[\tilde{a}' = \pi_{\bar\theta}(s') + \eta, \qquad \eta \sim \mathcal{N}(0, \sigma_{\text{policy}}^{2}), \quad \eta \text{ clipped}\] \[y = r + \gamma \min_i Q_{\bar\phi_i}(s', \tilde{a}').\]
      • Purpose: smooth the target Q-function and reduce overestimation by learning a value for a local neighborhood around the policy action instead of a single sharp point.
      • Affects only training of the critic (and indirectly the actor); never used for actual actions in the environment.
def update_actor(self, obs, logger, step):
    action = self.actor(obs)
    actor_Q1, actor_Q2 = self.critic(obs, action)

    # ! only Q1 is used for the actor loss, as in the TD3 paper
    actor_loss = -actor_Q1.mean()

    logger.log('train_actor/loss', actor_loss, step)

    self.actor_optimizer.zero_grad()
    actor_loss.backward()
    self.actor_optimizer.step()

def update(self, replay_buffer, logger, step):
    obs, action, reward, next_obs, terminated_flag = replay_buffer.sample(self.batch_size)
    logger.log('train/batch_reward', reward.mean(), step)
    self.update_critic(obs, action, reward, 
                        next_obs, terminated_flag, logger, step)

    if step % self.policy_update_frequency == 0:
        self.update_actor(obs, logger, step)

        # update both target actor and target critic networks
        utils.soft_update_params(self.critic, self.critic_target, self.tau)
        utils.soft_update_params(self.actor, self.actor_target, self.tau)
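
Here, utils.soft_update_params is a small helper from the repository; it presumably performs the usual Polyak averaging of target parameters, along the lines of this sketch:

import torch
import torch.nn as nn

def soft_update_params(net: nn.Module, target_net: nn.Module, tau: float) -> None:
    # Polyak averaging: target <- tau * online + (1 - tau) * target
    with torch.no_grad():
        for param, target_param in zip(net.parameters(), target_net.parameters()):
            target_param.data.mul_(1.0 - tau).add_(tau * param.data)
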
  • TD3 Results

TD3 results on gymnasium (MuJoCo) tasks.

Stochastic Policy

In stochastic policies, the agent samples actions from a state-conditioned distribution \(\pi_\theta(\cdot\mid s)\) to enable exploration and define a valid density for entropy regularization. Soft Actor-Critic (SAC) is a stochastic, entropy-regularized actor–critic that maximizes value while encouraging high-entropy behavior. Practically, SAC samples from an unbounded Gaussian and applies a smooth \(\tanh\) squashing to satisfy action bounds; the change-of-variables rule yields a correct log-probability, so both gradients and entropy are well defined. Let’s now get into the details of these operations.

  • Gradient Estimators

    We seek gradients of an expectation where the sampled action depends on parameters \(\theta\):

    \[\nabla_\theta\,\mathbb{E}_{a\sim \pi_\theta(\cdot\mid s)}\!\big[L(s,a)\big].\]

    1. Score-function (REINFORCE):

    \[\nabla_\theta\,\mathbb{E}_{a\sim \pi_\theta}\!\big[L(s,a)\big] = \mathbb{E}_{a\sim \pi_\theta}\!\big[L(s,a)\,\nabla_\theta \log \pi_\theta(a\mid s)\big].\]

    This works for any distribution but often has high variance.

    2. Pathwise derivative (reparameterization):

    If we can write \(a = g_\theta(\varepsilon, s)\) with \(\varepsilon \sim p(\varepsilon)\) independent of \(\theta\), then

    \[\nabla_\theta \,\mathbb{E}_{\varepsilon}\!\big[\,L\!\big(s, g_\theta(\varepsilon, s)\big)\,\big] \;=\; \mathbb{E}_{\varepsilon}\!\big[\,\nabla_\theta L\!\big(s, g_\theta(\varepsilon, s)\big)\,\big].\]

    Autodiff can then backprop through the deterministic map \(g_\theta\), yielding much lower variance than score-function gradients.

    In SAC with tanh-squashed Gaussians:

    \[\varepsilon \sim \mathcal{N}(0, I),\qquad z = \mu_\theta(s) + \sigma_\theta(s)\odot \varepsilon,\qquad a = \tanh(z).\]

    So \(a = g_\theta(\varepsilon, s)\). We use the pathwise gradient for both the \(Q(s,a)\) term and the entropy term \(\alpha \log \pi(a\mid s)\) (the latter is well defined thanks to the change-of-variables formula and its Jacobian correction). The picture below shows the relevant explanation from the PyTorch documentation.

__init__.py from torch.distributions.
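
As a tiny standalone check of the pathwise estimator (a sketch with made-up variable names, not the repository's code), rsample() keeps the sample differentiable with respect to the policy parameters, whereas sample() does not:

import torch
from torch.distributions import Normal

mu = torch.zeros(2, requires_grad=True)
log_std = torch.zeros(2, requires_grad=True)
dist = Normal(mu, log_std.exp())

a = torch.tanh(dist.rsample())       # reparameterized draw: a = tanh(mu + sigma * eps)
loss = (a ** 2).sum()                # stand-in for -Q(s, a) in an actor loss
loss.backward()
print(mu.grad, log_std.grad)         # both populated: gradients flow through the sample

# dist.sample() would return values from the same distribution, but detached,
# so loss.backward() could not reach mu or log_std through the action.
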
  • Tanh Normalization

    1. Quick clarifications

    • log_prob (aka \(\log p_X(x)\)) is a number for a particular sample \(x\).

    • Entropy \(H(X)\) is not \(-\log p_X(x)\) at some \(x\). It is the expectation of \(-\log p_X(X)\) over the random variable \(X\):

    \[\boxed{\,H(X)\;=\;-\;\mathbb{E}_{X}\!\left[\log p_X(X)\right]\,}.\]
    • Change-of-variables (per sample): Let \(Y=f(X)\) be a bijection with Jacobian \(J_f(x)=\tfrac{\partial y}{\partial x}\). The identity below is used in SAC to compute the log-probability of each sampled action, and it is also what dist.log_prob() returns (a per-sample log-probability).

      \[\boxed{\,\log p_Y(y)\;=\;\log p_X(x)\;-\;\log\!\big|\det J_f(x)\big|\,,\qquad x=f^{-1}(y)\,}.\]

      The Jacobian of the transformation \(a = \tanh(z)\) is diagonal with entries \(1 - \tanh^2(z),\) so the log-determinant of the Jacobian is

      \[\log \left| \det J \right| = \sum_i \log\!\left(1 - \tanh^2(z_i)\right).\]

      Following this implementation from TensorFlow, we use a more numerically stable equivalent expression:

      \[\log\!\left(1 - \tanh(x)^2\right) = 2 \,\big( \log 2 - x - \mathrm{softplus}(-2x) \big).\]

      Both forms are mathematically identical. The stable form prevents catastrophic cancellation when \(|x|\) is large (where \(\tanh(x) \approx \pm 1\) and \(1 - \tanh(x)^2\) underflows). The code block below shows the imports and a transform class that realize the calculation above.

    import torch, math
    from torch.distributions import MultivariateNormal, TransformedDistribution, constraints
    from torch.distributions.transforms import Transform
    
    class Tanh2D(Transform):
        # Define the transformation properties
        domain = constraints.real_vector
        codomain = constraints.interval(-1.0, 1.0)
        bijective = True
        sign = +1
        def __init__(self): super().__init__(cache_size=1)
        def _call(self, x):  return x.tanh()
        def _inverse(self, y): return 0.5 * (y.log1p() - (-y).log1p())
        def log_abs_det_jacobian(self, x, y):
            # stable per-element log(1 - tanh(x)^2), summed over the action dimension
            ladj_elem = 2.0 * (math.log(2.0) - x - torch.nn.functional.softplus(-2.0 * x))
            return ladj_elem.sum(dim=-1)
    
    m = torch.tensor([0.5, -0.8])
    Sigma = torch.tensor([[1.4, 0.3],[0.3, 1.1]])
    X = MultivariateNormal(m, Sigma)
    Y = TransformedDistribution(X, [Tanh2D()])
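
      As a quick numeric sanity check (my own snippet, independent of the class above), the naive form underflows to \(-\infty\) once \(\tanh(x)\) rounds to \(\pm 1\) in float32, while the stable form stays finite:

    import torch, math
    import torch.nn.functional as F

    x = torch.tensor([1.0, 5.0, 10.0, 20.0])
    naive = torch.log(1.0 - torch.tanh(x) ** 2)                 # -inf for the larger inputs
    stable = 2.0 * (math.log(2.0) - x - F.softplus(-2.0 * x))   # finite for all inputs
    print(naive)
    print(stable)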
    

    2. Entropy under a transform (an expectation)

    Start from the definition

    \[H(Y)\;=\;-\;\mathbb{E}_{Y}\!\left[\log p_Y(Y)\right].\]

    Change variables \(Y=f(X)\), then plug the per-sample formula:

    \[\begin{aligned} H(Y) &= -\,\mathbb{E}_{X}\!\left[\log p_Y\!\big(f(X)\big)\right] \\ &= -\,\mathbb{E}_{X}\!\left[\log p_X(X) - \log\!\big|\det J_f(X)\big|\right] \\ &= \underbrace{-\,\mathbb{E}_{X}\!\left[\log p_X(X)\right]}_{H(X)} \;+\;\mathbb{E}_{X}\!\left[\log\!\big|\det J_f(X)\big|\right]. \end{aligned}\]

    So

    \[\boxed{\,H(Y)\;=\;H(X)\;+\;\mathbb{E}_{X}\!\left[\log\!\big|\det J_f(X)\big|\right]\,}.\]

    In SAC, the entropy term is the conditional entropy of the policy:

    \[\mathbb{E}_{s \sim \mathcal{D},\, a \sim \pi(\cdot \mid s)} \left[ - \log \pi(a \mid s) \right].\]

    In code, for a minibatch of states obs (size 1024) you draw actions with action = dist.rsample(), then compute log_prob = dist.log_prob(action). Averaging -log_prob over the batch is exactly a Monte-Carlo estimate of the expectation above (expectation over both the replay distribution of states and the policy’s action distribution conditioned on each state). That’s why people often log it as “entropy”.

    • In the actor loss, you do not put the minus sign; SAC minimizes

      \[\alpha \log \pi(a \mid s) - Q(s,a).\]
    • For metrics, people often report the following for readability.

      \[\text{entropy} = -\,\text{log_prob.mean()}\]

The code snippet below continues the earlier Tanh2D example: it checks the per-sample change-of-variables identity at a few points and then verifies the entropy identity \(H(Y) = H(X) + \mathbb{E}_X\!\left[\log\lvert\det J_f(X)\rvert\right]\) by Monte Carlo sampling.
xs = torch.tensor([[ 0.0,  0.0],
                   [ 1.2, -0.7],
                   [-1.5,  2.0]], dtype=torch.get_default_dtype())
ys = xs.tanh()
logpX = X.log_prob(xs)
logpY = Y.log_prob(ys)
logabsdet_dYdX = (2.0 * (math.log(2.0) - xs - torch.nn.functional.softplus(-2.0 * xs))).sum(dim=-1)
rhs = logpX - logabsdet_dYdX

print("=== 2D tanh transform: log-prob identity (3 points) ===")
for i in range(xs.size(0)):
    print(f"x={xs[i].tolist()}, y={ys[i].tolist()}")
    print(f"  log p_Y(y)                   = {logpY[i].item():.6f}")
    print(f"  log p_X(x) - log|det(dY/dX)| = {rhs[i].item():.6f}")
    print(f"  residual                     = {(logpY[i]-rhs[i]).item():.3e}")
    print()

HX = X.entropy().item()
N = 200_000
with torch.no_grad():
    x_samp = X.rsample((N,))
    ladj_elem = 2.0 * (math.log(2.0) - x_samp - torch.nn.functional.softplus(-2.0 * x_samp))
    E_logdetJ = ladj_elem.sum(dim=-1).mean().item()
    y_samp = torch.tanh(x_samp)
    HY_mc = -(Y.log_prob(y_samp)).mean().item()

HY_expected = HX + E_logdetJ
print("=== Entropy check for Y = tanh(X) in R^2 ===")
print(f"H(X) analytic                     = {HX:.6f}")
print(f"E_X[ log|det(dY/dX)| ] (MC)       = {E_logdetJ:.6f}")
print(f"H(Y) expected via identity        = {HY_expected:.6f}")
print(f"H(Y) Monte Carlo via samples      = {HY_mc:.6f}")

Soft Actor-Critic (SAC)

With the key concepts above in place, let’s break SAC down into pieces for a more detailed explanation.

SAC Pseudocode (OpenAI Spinning Up).
  • Critic Updates

    Besides the use of a target network, which is similar to DDPG, the SAC critic update involves several implementation details worth dwelling on. We go over each of them individually and attach the PyTorch implementation snippet at the end of this section.

    1) self.alpha.detach() for temperature

    In the critic update we want to update only the critic parameters \(\phi\) by minimizing the MSE between the current Q-values and a target:

    \[\mathcal{L}_{\text{critic}}(\phi) = \mathbb{E} \Bigg[ \sum_{i=1}^{2} \big( Q_{\phi}^{(i)}(s,a) - y \big)^2 \Bigg],\]

    where the target is

    \[y = r + \gamma (1-d) \underbrace{ \Big( \min_{j} Q_{\bar\phi}^{(j)}(s', a') - \alpha \log \pi_{\theta}(a' \mid s') \Big) }_{\text{soft value target}}.\]
    • \(\alpha\) (the temperature) is a separate parameter (trained by a separate loss) that should not receive gradients from the critic loss.
    • If you leave \(\alpha\) attached, then \(\nabla_{\alpha}\mathcal{L}_{\text{critic}}\) will be non-zero and the critic step will (incorrectly) push \(\alpha\).
    • self.alpha.detach() zeroes that gradient path so the critic update only modifies \(\phi\).

    You will see the same idea in the actor update: we often detach \(\alpha\) to avoid cross-talk from the actor loss into the temperature learner.

    2) Meaning of target_V

    target_V is a single-sample Monte Carlo estimate of the soft value \(V(s')\):

    \[\hat{V}(s') = \min_{j} Q_{\bar\phi}^{(j)}(s', a') - \alpha \log \pi_{\theta}(a' \mid s').\]

    The expectation over \(a' \sim \pi(\cdot \mid s')\) is approximated with one action sample (rsample()), which is standard and unbiased (with variance reduced by large batch sizes).

    • target_Q is the one-step bootstrap target \(y\) for the critic MSE: reward plus discounted target_V, masked by \((1 - \text{terminated_flag})\).
    • The double Q term (\(\min\) of two target critics) reduces positive bias / overestimation.

    Thus, the calculation uses a single action sample per next state to approximate the expectation inside \(V(s')\).

    3) Use of with torch.no_grad() when computing targets

    When building the bootstrapped target \(y\), you do not want gradients to flow into:

    • the target critic (frozen parameters \(\bar\phi\)),
    • the actor (when sampling \(a'\) for the critic target),
    • or the log_prob term.

    Using with torch.no_grad() around the target computation ensures these components are treated as constants for the purpose of the critic update.

def update_critic(self, obs, action, reward, next_obs, 
                      terminated_flag, logger, step):
    # Calculate the target Q value
    with torch.no_grad():
        dist = self.actor(next_obs)
        next_action = dist.rsample()
        log_prob = dist.log_prob(next_action).sum(-1, keepdim=True)
        target_Q1, target_Q2 = self.critic_target(next_obs, next_action)
        # !.detach() is important here
        target_V = torch.min(target_Q1, target_Q2) - self.alpha.detach() * log_prob

        target_Q = reward + (1 - terminated_flag) * self.discount * target_V
        target_Q = target_Q.detach()  # already a constant under no_grad; the extra detach() is a harmless safeguard

    # get current Q estimates
    current_Q1, current_Q2 = self.critic(obs, action)
    critic_loss = F.mse_loss(current_Q1, target_Q) +\
          F.mse_loss(current_Q2, target_Q)
    logger.log('train_critic/loss', critic_loss, step)

    # Optimize the critic
    self.critic_optimizer.zero_grad()
    critic_loss.backward()
    self.critic_optimizer.step()
  • Actor and Temperature Updates

    1. Temperature loss design and gradient intuition

    Many implementations optimize log_alpha with the loss

    \[\mathcal{L}(\log \alpha) = \mathbb{E} \Big[ \alpha \big( - \log \pi(a \mid s) - \mathcal{H}_{\text{tgt}} \big) \Big],\]

    where the term \(\big( - \log \pi(a \mid s) - \mathcal{H}_{\text{tgt}} \big)\) is detached in code.

    Taking the derivative w.r.t. \(\log \alpha\) (using \(\alpha = e^{\log \alpha}\)):

    \[\nabla_{\log \alpha} \mathcal{L} = \alpha \, \mathbb{E} \big[ - \log \pi(a \mid s) - \mathcal{H}_{\text{tgt}} \big].\]

    • If the current entropy is too low: \(H(\pi) = \mathbb{E}[- \log \pi] < \mathcal{H}_{\text{tgt}} \Rightarrow \text{bracket} < 0 \Rightarrow \text{gradient} < 0 \Rightarrow\) gradient descent increases \(\log \alpha \Rightarrow \alpha\) goes up \(\Rightarrow\) the actor puts more weight on entropy \(\Rightarrow\) entropy rises.

    • If the current entropy is too high: \(H(\pi) > \mathcal{H}_{\text{tgt}} \Rightarrow \text{bracket} > 0 \Rightarrow \text{gradient} > 0 \Rightarrow\) gradient descent decreases \(\log \alpha \Rightarrow \alpha\) goes down \(\Rightarrow\) the entropy weight weakens \(\Rightarrow\) entropy falls.

    This yields an automatic temperature that tracks the desired entropy level.
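
    A short autograd check (my own sketch) confirms the gradient expression above: with the bracket detached, the gradient with respect to \(\log \alpha\) is exactly \(\alpha\) times the bracket.

    import torch

    log_alpha = torch.zeros(1, requires_grad=True)        # alpha = exp(0) = 1.0
    bracket = torch.tensor([-0.3])                        # stand-in for -log_pi - target_entropy
    alpha_loss = (log_alpha.exp() * bracket.detach()).mean()
    alpha_loss.backward()
    print(log_alpha.grad)                                 # tensor([-0.3000]) = alpha * bracket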

    2. Use of .detach()

    tensor.detach() returns a view of the tensor with the same values but no gradient history. During backprop, gradients do not flow through a detached tensor. It is a stop-grad operation. The temperature \(\alpha\) is trained to make the actual policy entropy match a target entropy \(\mathcal{H}_{\text{tgt}}\).

    • We optimize only \(\alpha\) (often its log-parameter log_alpha), keeping the actor fixed during this step.
    • If we did not detach \((- \text{log_prob} - \text{target_entropy})\), gradients from the \(\alpha\) update would also flow back into the actor, entangling updates and destabilizing learning.
    • By calling .detach(), the entropy measurement is treated as a constant sample when updating \(\alpha\). Gradients flow only to \(\alpha\).

    (You see the same pattern in the actor loss: actor_loss = alpha.detach() * log_prob - Q.
    There we stop-grad through \(\alpha\) so the actor step does not update \(\alpha\).)

    3. -log_prob.mean() for entropy

    For a batch of \(1024\) states \(s_i\) and one action sample \(a_i \sim \pi(\cdot \mid s_i)\) per state:

    \[-\frac{1}{1024} \sum_{i=1}^{1024} \log \pi(a_i \mid s_i) \;\approx\; \mathbb{E}_{s \sim \mathcal{D}} \Big[ \mathbb{E}_{a \sim \pi(\cdot \mid s)} \big[ - \log \pi(a \mid s) \big] \Big],\]

    which is exactly a Monte-Carlo estimate of the policy entropy averaged over the replay-state distribution. (And dist.log_prob(action) already includes the tanh + scaling Jacobians, so it is the true bounded-policy log-prob.)

    Shape-wise:

    • dist.log_prob(action) → \([ \text{batch}, \text{action_dim} ]\) (factorized)
    • .sum(-1, keepdim=True) → \([ \text{batch}, 1 ]\) gives per-sample \(\log \pi(a \mid s)\)
    • .mean() over batch → scalar estimate of \(\mathbb{E}[ \log \pi(a \mid s) ]\)
      (and \(-\text{mean}\) is the entropy estimate)
    def update_actor_and_alpha(self, obs, logger, step):
        dist = self.actor(obs)
        action = dist.rsample()
        log_prob = dist.log_prob(action).sum(-1, keepdim=True)
        actor_Q1, actor_Q2 = self.critic(obs, action)
    
        actor_Q = torch.min(actor_Q1, actor_Q2)
        # !.detach() is important here
        actor_loss = (self.alpha.detach() * log_prob - actor_Q).mean()
    
        logger.log('train_actor/loss', actor_loss, step)
        logger.log('train_actor/target_entropy', self.target_entropy, step)
        logger.log('train_actor/entropy', -log_prob.mean(), step)
    
        self.actor_optimizer.zero_grad()
        actor_loss.backward()
        self.actor_optimizer.step()
    
        if self.learnable_temperature:
            self.log_alpha_optimizer.zero_grad()
            # !.detach() is important here
            alpha_loss = (self.alpha * 
                          (-log_prob - self.target_entropy).detach()).mean()
            logger.log('train_alpha/loss', alpha_loss, step)
            logger.log('train_alpha/value', self.alpha, step)
            alpha_loss.backward()
            self.log_alpha_optimizer.step()
    
  • SAC Results
SAC results on gymnasium (MuJoCo) tasks.

References

  1. Addressing Function Approximation Error in Actor-Critic Methods (TD3 Paper)
  2. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor (SAC Paper)
  3. TD3, SAC (OpenAI Spinning Up)
  4. Entropy Clearly Explained!!! & Intuitively Understanding the KL Divergence
  5. TD3, pytorch_sac.