Learning Extrinsic Dexterity with Parameterized Manipulation Primitives

Örebro University

ED-PMP uses hierarchical RL with parameterized primitives to solve occluded grasping tasks without the need for complex manual controller design.

Abstract

Many practically relevant robot grasping problems feature a target object for which all grasps are occluded, e.g., by the environment. Single-shot grasp planning invariably fails in such scenarios. Instead, it is necessary to first manipulate the object into a configuration that affords a grasp. We solve this problem by learning a sequence of actions that utilize the environment to change the object's pose.

Concretely, we employ hierarchical reinforcement learning to combine a sequence of learned parameterized manipulation primitives. By learning the low-level manipulation policies, our approach can control the object's state by exploiting interactions between the object, the gripper, and the environment. Designing such complex behavior analytically would be infeasible under uncontrolled conditions, as an analytic approach requires accurate physical modeling of the interaction and contact dynamics. In contrast, we learn a hierarchical policy model that operates directly on depth perception data, without the need for object detection, pose estimation, or manual controller design.
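To make the hierarchy concrete, below is a minimal Python sketch of the control flow. All names are hypothetical: the primitive set, the two placeholder policies, and the 9-dimensional state layout are illustrative assumptions, not the paper's implementation.

    import numpy as np

    # Hypothetical primitive set; the paper's actual primitives may differ.
    PRIMITIVES = ["push", "flip", "pull", "grasp"]

    def high_level_policy(height_map):
        """Placeholder for the learned FCN-based DQN: returns the primitive
        and starting pixel (i.e., starting pose) with the highest Q value."""
        # One pixel-wise Q map per primitive, stacked along the first axis.
        q_maps = np.random.rand(len(PRIMITIVES), *height_map.shape)  # stand-in
        p, i, j = np.unravel_index(np.argmax(q_maps), q_maps.shape)
        return PRIMITIVES[p], (i, j)

    def low_level_policy(primitive, state):
        """Placeholder for the learned low-level DQN: maps end-effector pose
        and contact force to an incremental end-effector action."""
        return np.zeros(6)  # stand-in delta-pose action

    def run_episode(height_map, T=10):
        """One high-level decision followed by at most T low-level steps."""
        primitive, start_pixel = high_level_policy(height_map)
        state = np.zeros(9)  # assumed layout: 6D end-effector pose + 3D force
        for _ in range(T):
            action = low_level_policy(primitive, state)
            # In a real system, the action is executed on the robot and the
            # state is re-read from proprioception and force sensing here.
        return primitive, start_pixel

    run_episode(np.zeros((64, 64)))

The key design point this sketch tries to capture is that the high-level agent commits to a primitive and starting pose once, while the low-level agent closes the loop over contact feedback for up to T steps.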

We evaluate our approach on picking box-shaped objects with varying weight, shape, and friction properties from a constrained table-top workspace. Our method transfers to a real robot and successfully completes the object-picking task in 98% of experimental trials.

ED-PMP

Extrinsic Dexterity with Parameterized Manipulation Primitives


Our ED-PMP method breaks complex tasks down into sub-tasks and reduces the need for manual primitive design. It comprises a high-level and a low-level agent. High-level agent (top): the high-level agent feeds a height map into a DQN, implemented as a fully convolutional network (FCN), which outputs pixel-wise maps of Q values; each pixel corresponds to a starting pose, and each map to a manipulation primitive. Low-level agent (bottom): the low-level agent uses the current end-effector pose and contact force as the state of a DQN model and iteratively estimates a series of actions to accomplish the sub-task within a designated number of iterations, denoted T.
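A minimal sketch of the pixel-wise Q-map idea is shown below, written in PyTorch as an assumed framework (not necessarily the paper's). The layer sizes, the number of primitives, and the height-map resolution are illustrative choices, not the architecture from the paper.

    import torch
    import torch.nn as nn

    class PrimitiveQNet(nn.Module):
        """Fully convolutional Q-network sketch: one Q map per primitive,
        at the same spatial resolution as the input height map."""
        def __init__(self, num_primitives=4):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.Conv2d(32, num_primitives, kernel_size=1),  # pixel-wise Q
            )

        def forward(self, height_map):
            # height_map: (B, 1, H, W) -> Q maps: (B, num_primitives, H, W)
            return self.net(height_map)

    # Greedy selection: joint argmax over primitive index and starting pixel.
    net = PrimitiveQNet()
    q = net(torch.rand(1, 1, 64, 64))          # (1, 4, 64, 64)
    flat = q.view(1, -1).argmax(dim=1).item()  # flattened argmax
    primitive_idx = flat // (64 * 64)          # which primitive map
    pixel_idx = flat % (64 * 64)               # which starting pose (pixel)
    row, col = divmod(pixel_idx, 64)

Selecting the action as a joint argmax couples "where to act" and "which primitive to use" in a single greedy decision over the stacked Q maps.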

BibTeX


@misc{yang2023learning,
  title={Learning Extrinsic Dexterity with Parameterized Manipulation Primitives},
  author={Shih-Min Yang and Martin Magnusson and Johannes A. Stork and Todor Stoyanov},
  year={2023},
  eprint={2310.17785},
  archivePrefix={arXiv},
  primaryClass={cs.RO}
}