Taxonomy of Robotic Manipulation: Inverse Kinematics (IK), Operational Space Control (OSC), Impedance Control, Riemannian Motion Policy (RMP) and Geometric Fabrics

Proximal Policy Optimization (PPO) is a reinforcement learning algorithm that stabilizes policy-gradient methods such as REINFORCE by using importance sampling to constrain how far each update can move the policy. PPO-Penalty adds an explicit KL-divergence penalty to the objective, while PPO-Clip instead clips the probability ratio, yielding a clipped surrogate objective that prevents large policy updates. In many robotics tasks, PPO is first used to train a base policy (potentially with privileged information); a deployable controller is then learned from this base policy via imitation learning, distillation, or other techniques. This blog explores PPO's core principle, with code available at repo1 and repo2.
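As a concrete illustration of the clipping variant, here is a minimal sketch of the PPO-Clip surrogate loss in PyTorch. It assumes the log-probabilities under the new and old policies and the advantage estimates are already computed; the function name and signature are illustrative, not taken from the linked repos.

```python
import torch

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective of PPO-Clip (Schulman et al., 2017)."""
    # Importance-sampling ratio r_t = pi_new(a|s) / pi_old(a|s),
    # computed in log space for numerical stability.
    ratio = torch.exp(log_probs_new - log_probs_old)
    unclipped = ratio * advantages
    # Clipping the ratio to [1 - eps, 1 + eps] removes the incentive
    # to move the policy far from the behavior policy in one update.
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # PPO maximizes the pessimistic (minimum) surrogate; negate so the
    # result can be minimized with a standard optimizer.
    return -torch.min(unclipped, clipped).mean()
```

In full implementations this objective is typically combined with a value-function loss and an entropy bonus.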

Diffusion Models

DDPM

  • Results
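While the results above are still to be filled in, a minimal training-step sketch may help fix ideas. It follows the simple objective of the DDPM paper (reference 2): draw a random timestep, noise the clean sample with the closed-form forward process x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps, and regress the added noise. `eps_model` and `alphas_cumprod` are assumed inputs, standing in for any noise-prediction network and the cumulative products of the noise schedule.

```python
import torch
import torch.nn.functional as F

def ddpm_training_loss(eps_model, x0, alphas_cumprod):
    """One DDPM training step: regress the noise added at a random timestep."""
    batch = x0.shape[0]
    # Sample one timestep per example in the batch.
    t = torch.randint(0, len(alphas_cumprod), (batch,), device=x0.device)
    # Broadcast alpha_bar_t over the non-batch dimensions.
    a_bar = alphas_cumprod[t].view(batch, *([1] * (x0.dim() - 1)))
    eps = torch.randn_like(x0)
    # Closed-form forward process: x_t = sqrt(a_bar)*x_0 + sqrt(1 - a_bar)*eps
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps
    # "Simple" (unweighted) objective from the DDPM paper.
    return F.mse_loss(eps_model(x_t, t), eps)
```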

DDIM

  • Results
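Analogously, here is a sketch of the deterministic DDIM update (eta = 0) from reference 3: the predicted noise is used to estimate x_0, and the sample is then moved to an earlier timestep without injecting fresh noise, which is what permits sampling over a short subsequence of timesteps. As before, `eps_model` and `alphas_cumprod` are assumed placeholders.

```python
import torch

@torch.no_grad()
def ddim_step(eps_model, x_t, t, t_prev, alphas_cumprod):
    """One deterministic DDIM sampling step (eta = 0)."""
    a_t = alphas_cumprod[t]
    # At the end of the chain, step toward alpha_bar = 1 (i.e., x_0).
    a_prev = alphas_cumprod[t_prev] if t_prev >= 0 else x_t.new_tensor(1.0)
    t_batch = torch.full(x_t.shape[:1], t, device=x_t.device, dtype=torch.long)
    eps = eps_model(x_t, t_batch)
    # Predict x_0 from the current sample and the predicted noise.
    x0_pred = (x_t - (1.0 - a_t).sqrt() * eps) / a_t.sqrt()
    # Deterministic move to t_prev; no stochastic term, so t_prev need
    # not be t - 1. Skipping steps is DDIM's main speedup over DDPM.
    return a_prev.sqrt() * x0_pred + (1.0 - a_prev).sqrt() * eps
```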

References

  1. The Breakthrough Behind Modern AI Image Generators - Diffusion Models Part 1
  2. Ho et al., Denoising Diffusion Probabilistic Models (DDPM paper), 2020
  3. Song et al., Denoising Diffusion Implicit Models (DDIM paper), 2020
  4. Diffusion Model Paper Explanation, PyTorch Implementation Walk-Through, and the corresponding GitHub repo
  5. diffusion-DDPM-pytorch & diffusion-DDIM-pytorch