Reinforcement learning for robotic manipulation
ROLE & KEY CONTRIBUTIONS
Solo, ongoing reinforcement-learning work on manipulation — core value-based algorithms implemented from scratch, with the environments to train them.
- Implemented Q-Learning and DQN from scratch, including replay buffer, target network, exploration schedules, and reward shaping;
- Built MuJoCo environments for a drawer-opening manipulation task;
- Stood up a working end-to-end RL pipeline, which I keep extending as the research progresses.
Overview
To understand RL algorithms at the implementation level — not just as library calls — I built Q-Learning and Deep Q-Networks (DQN) from scratch in Python and applied them to robotic-arm manipulation experiments, including drawer-opening tasks.
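To illustrate what "from scratch" involves at the tabular level, here is a minimal sketch of a Q-Learning update with epsilon-greedy action selection and a linear epsilon decay. It is not the project's code: the function names (`epsilon_at`, `select_action`, `q_update`, `make_q_table`), the hyperparameter values, and the dictionary-backed Q-table are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

# Hypothetical hyperparameters, chosen only for illustration.
ALPHA = 0.1   # learning rate
GAMMA = 0.99  # discount factor


def make_q_table(n_actions):
    """Q-table that lazily creates a zero row for each unseen state."""
    return defaultdict(lambda: [0.0] * n_actions)


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Linearly anneal epsilon from eps_start to eps_end over decay_steps."""
    fraction = min(step / decay_steps, 1.0)
    return eps_start + fraction * (eps_end - eps_start)


def select_action(q_table, state, n_actions, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = q_table[state]
    return max(range(n_actions), key=lambda a: values[a])


def q_update(q_table, state, action, reward, next_state, done):
    """One tabular Q-Learning step: Q(s,a) += alpha * (target - Q(s,a))."""
    # No bootstrapping from the next state once the episode has ended.
    bootstrap = 0.0 if done else max(q_table[next_state])
    target = reward + GAMMA * bootstrap
    q_table[state][action] += ALPHA * (target - q_table[state][action])
    return q_table[state][action]
```

A training loop would call `epsilon_at(step)` before each `select_action`, then feed the observed transition to `q_update`. States only need to be hashable, so a discretized arm configuration could be stored as a tuple.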
What's inside
- Tabular Q-Learning and DQN implemented from first principles: replay buffer, target network, epsilon-greedy exploration schedules, and reward shaping for sparse manipulation rewards.
- MuJoCo simulation environments for the manipulation tasks, connecting my mechanical-design background to the learning pipeline: the same workspace analysis used for hardware validation defines the RL task space.
- Drawer-opening as the benchmark task: a contact-rich problem where a naive reward (for example, one paid only on success) gives the agent little to learn from, and the agent must sequence reaching, grasping, and pulling.
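The DQN-specific pieces listed above can be sketched without any deep-learning framework. The block below shows a fixed-capacity replay buffer, the TD target computed from target-network Q-values, and a staged shaped reward for a drawer task. Everything in it is an assumption made for illustration, not taken from the project: the names `ReplayBuffer`, `td_targets`, and `shaped_reward`, the bonus sizes, and the 0.15 m opening threshold are hypothetical.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity FIFO store of (s, a, r, s_next, done) transitions."""

    def __init__(self, capacity, seed=None):
        self._data = deque(maxlen=capacity)  # oldest transitions drop first
        self._rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self._data.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Uniformly sample a batch without replacement."""
        return self._rng.sample(list(self._data), batch_size)

    def __len__(self):
        return len(self._data)


def td_targets(rewards, dones, next_q_target, gamma=0.99):
    """DQN targets: r + gamma * max_a' Q_target(s', a'), zeroed at terminals.

    next_q_target holds one list of per-action Q-values per transition,
    as produced by the frozen target network.
    """
    return [
        r + (0.0 if d else gamma * max(q_next))
        for r, d, q_next in zip(rewards, dones, next_q_target)
    ]


def shaped_reward(gripper_to_handle, grasped, drawer_opening, goal_opening=0.15):
    """Hypothetical staged reward: reach, then grasp, then pull (metres)."""
    reward = -gripper_to_handle  # dense reaching term
    if grasped:
        reward += 0.5  # grasp bonus
        reward += min(drawer_opening / goal_opening, 1.0)  # pulling progress
    if drawer_opening >= goal_opening:
        reward += 10.0  # sparse success bonus
    return reward
```

In a full DQN loop, the online network would be regressed toward `td_targets`, and the target network would be refreshed from the online weights every fixed number of steps. Sampling uniformly from the buffer breaks the correlation between consecutive transitions, which is the usual motivation for experience replay.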
Why it matters for my work
Mechanical designers who understand learning-based control design hardware differently: torque-transparent actuation, mechanisms whose state is observable, and structures that survive the exploration phase. This project is my bridge between the two worlds.