Hierarchical Policy for Non-prehensile Multi-object Rearrangement with Deep Reinforcement Learning and Monte Carlo Tree Search

AI-generated keywords: Non-prehensile, Multi-object, Hierarchical Policy, Monte Carlo Tree Search, Path Primitives

AI-generated Key Points

  • Hierarchical policy for non-prehensile multi-object rearrangement (NPMO)
  • NPMO is complex because the planner must decide both how each object reaches its target and the order in which objects are moved
  • High-level policy: Monte Carlo Tree Search (MCTS) algorithm with a designed policy network
  • Low-level policy: Robot plans paths using path primitives instead of single-step discrete actions
  • Experimental results show higher success rates, fewer steps, and shorter path lengths compared to state-of-the-art approaches
  • Contributions: Modeling and solving NPMO with hierarchical policy, high-level MCTS policy accelerated by a trained policy network, low-level policy using path primitives
  • Effective solution combining deep reinforcement learning techniques with Monte Carlo Tree Search

Authors: Fan Bai, Fei Meng, Jianbang Liu, Jiankun Wang, Max Q.-H. Meng

License: CC BY 4.0

Abstract: Non-prehensile multi-object rearrangement is a robotic task of planning feasible paths and transferring multiple objects to their predefined target poses without grasping. It needs to consider how each object reaches the target and the order of object movement, which significantly deepens the complexity of the problem. To address these challenges, we propose a hierarchical policy to divide and conquer for non-prehensile multi-object rearrangement. In the high-level policy, guided by a designed policy network, the Monte Carlo Tree Search efficiently searches for the optimal rearrangement sequence among multiple objects, which benefits from imitation and reinforcement. In the low-level policy, the robot plans the paths according to the order of path primitives and manipulates the objects to approach the goal poses one by one. We verify through experiments that the proposed method can achieve a higher success rate, fewer steps, and shorter path length compared with the state-of-the-art.
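
To illustrate the low-level idea of planning with path primitives rather than single-step actions, here is a minimal sketch. It is not the authors' implementation: the grid world, the primitive set `PRIMITIVES`, and the greedy selection rule in `plan_with_primitives` are all assumptions made for illustration. It only shows how one primitive choice can stand in for several unit pushes, which shortens the sequence of decisions a planner has to make.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]

# Hypothetical primitive set (an assumption, not taken from the paper):
# each primitive is a fixed run of unit pushes on a 2D grid.
PRIMITIVES: Dict[str, List[Point]] = {
    "east_3": [(1, 0)] * 3,
    "west_3": [(-1, 0)] * 3,
    "north_3": [(0, 1)] * 3,
    "south_3": [(0, -1)] * 3,
    "east_1": [(1, 0)],
    "west_1": [(-1, 0)],
    "north_1": [(0, 1)],
    "south_1": [(0, -1)],
}


def apply_primitive(pos: Point, name: str) -> Point:
    """Return the object position after executing every push in a primitive."""
    x, y = pos
    for dx, dy in PRIMITIVES[name]:
        x, y = x + dx, y + dy
    return (x, y)


def manhattan(a: Point, b: Point) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def plan_with_primitives(start: Point, goal: Point,
                         max_primitives: int = 20) -> Tuple[List[str], Point]:
    """Greedily chain primitives that bring the object closer to its goal."""
    pos, plan = start, []
    while pos != goal and len(plan) < max_primitives:
        best = min(PRIMITIVES,
                   key=lambda n: manhattan(apply_primitive(pos, n), goal))
        nxt = apply_primitive(pos, best)
        if manhattan(nxt, goal) >= manhattan(pos, goal):
            break  # no primitive improves the distance; stop
        pos = nxt
        plan.append(best)
    return plan, pos
```

In this toy setting, moving an object from (0, 0) to (4, 3) takes three primitive decisions in place of seven single-step ones. A real low-level planner would also have to account for obstacles, object orientation, and pushing dynamics, none of which are modeled here.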

Submitted to arXiv on 18 Sep. 2021

Results of the summarizing process for the arXiv paper: 2109.08973v1

This work presents a hierarchical policy for non-prehensile multi-object rearrangement (NPMO). NPMO involves planning feasible paths and transferring multiple objects to their predefined target poses without grasping. The task is hard because the planner must consider both how each object reaches its target and the order in which objects are moved. To tackle these challenges, the authors propose a hierarchical policy that divides and conquers the problem.

In the high-level policy, a Monte Carlo Tree Search (MCTS) algorithm searches for the optimal rearrangement sequence among multiple objects. This search is guided by a designed policy network, which is trained with both imitation learning and reinforcement learning. The MCTS-based approach allows for strong long-term decision-making.

In the low-level policy, the robot plans paths using path primitives, which are basic motion sequences used to push objects towards their goal poses. Unlike previous approaches that use single-step discrete actions, the proposed method reduces the depth and width of the search tree by using these path primitives.

Experimental results show that the proposed method achieves higher success rates, requires fewer steps, and produces shorter paths than state-of-the-art approaches. The contributions of this work are: modeling and solving NPMO with a hierarchical policy; a high-level MCTS policy accelerated by a policy network trained with imitation and reinforcement learning; and a low-level policy that plans paths using path primitives. Overall, this work addresses non-prehensile multi-object rearrangement by combining deep reinforcement learning with Monte Carlo Tree Search.
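
The high-level search described above can be sketched as a policy-guided MCTS over the order in which objects are moved. This is only an illustrative toy, not the paper's method: `policy_prior` is a uniform stand-in for the trained policy network, `reward` checks hypothetical precedence constraints (for example, an object sitting on another object's goal must be moved first), and the PUCT-style selection rule and its constant `c_puct` are assumptions.

```python
import math
import random
from typing import Dict, List, Tuple


class Node:
    def __init__(self, prior: float):
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0
        self.children: Dict[str, "Node"] = {}

    def q(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def policy_prior(remaining: List[str], moved: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a trained policy network: uniform prior."""
    return {obj: 1.0 / len(remaining) for obj in remaining}


def reward(order: List[str], must_precede: List[Tuple[str, str]]) -> float:
    """1.0 if every (first, second) pair is respected by the order, else 0.0."""
    pos = {obj: i for i, obj in enumerate(order)}
    return 1.0 if all(pos[a] < pos[b] for a, b in must_precede) else 0.0


def search_order(objects: List[str], must_precede: List[Tuple[str, str]],
                 simulations: int = 300, c_puct: float = 1.5,
                 seed: int = 0) -> List[str]:
    rng = random.Random(seed)
    root = Node(1.0)
    for _ in range(simulations):
        node, moved, remaining, path = root, [], list(objects), [root]
        while remaining:
            if not node.children:
                # Expansion: create one child per object still to be moved.
                for obj, p in policy_prior(remaining, moved).items():
                    node.children[obj] = Node(p)
                break
            # Selection: value estimate plus prior-weighted exploration bonus.
            total = node.visits
            obj, node = max(
                node.children.items(),
                key=lambda kv: kv[1].q()
                + c_puct * kv[1].prior * math.sqrt(total) / (1 + kv[1].visits))
            moved.append(obj)
            remaining.remove(obj)
            path.append(node)
        # Rollout: finish the sequence in random order and score it.
        tail = remaining[:]
        rng.shuffle(tail)
        value = reward(moved + tail, must_precede)
        # Backpropagation along the visited path.
        for visited in path:
            visited.visits += 1
            visited.value_sum += value
    # Read out the most-visited sequence; fall back to list order if the
    # tree was not expanded that deep.
    order, node, remaining = [], root, list(objects)
    while remaining:
        if node is not None and node.children:
            obj = max(node.children, key=lambda o: node.children[o].visits)
            node = node.children[obj]
        else:
            obj, node = remaining[0], None
        order.append(obj)
        remaining.remove(obj)
    return order
```

For example, `search_order(["A", "B", "C"], [("C", "B"), ("B", "A")])` searches for an order in which C is moved before B and B before A. In the paper's setting, the prior would come from the trained network and the value of a sequence from executing the low-level policy, rather than from a hand-written constraint check.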
Created on 09 Dec. 2023
