Workflow for completing natural-language request with metric-semantic representation of environment

Authors

  • Nguyen Van Hung Institute of Automation, Academy of Military Science and Technology
  • Truong Xuan Tung Le Quy Don Technical University
  • Le Viet Hong Institute of Automation, Academy of Military Science and Technology
  • Le Khanh Thanh Institute of Automation, Academy of Military Science and Technology

Keywords:

Natural-language request; Path planning; Task planning; Metric-semantic map; 3D scene graph.

Abstract

In mobile robotics and autonomous systems, a natural-language request can be completed by converting it into high-level and low-level tasks. Accomplishing such a request requires implementing both types of tasks, along with an efficient method to bridge them, and this problem remains open. This work presents a two-phase workflow (figure 1), consisting of Comprehension and Implementation, that uses a metric-semantic map to address it. In the Comprehension phase, also known as automated planning, the natural-language request is converted into actionable plans using semantic information from the map. These plans are then passed to the Implementation phase, where tasks such as navigation or manipulation are executed using geometric information from the map. We also conduct an experiment that illustrates how a natural-language request is carried out on a specific metric-semantic representation of the environment, namely a 3D Scene Graph, covering the complete sequence from creating the 3D Scene Graph to obtaining a feasible output path. Finally, this work highlights limitations that need to be addressed in the future to enhance the proposed workflow.
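
The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name in it (SceneGraph, comprehend, implement, handle_request, and the room labels and coordinates) is hypothetical, simple keyword matching stands in for the Comprehension phase, and Dijkstra's algorithm over Euclidean edge lengths stands in for the path planner of the Implementation phase.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str   # semantic label (assumed to be a single lowercase word)
    x: float    # metric position in metres (illustrative values)
    y: float


class SceneGraph:
    """Toy metric-semantic map: labelled nodes with positions, plus traversable edges."""

    def __init__(self):
        self.nodes = {}
        self.edges = {}

    def add_node(self, name, x, y):
        self.nodes[name] = Node(name, x, y)
        self.edges.setdefault(name, set())

    def connect(self, a, b):
        self.edges[a].add(b)
        self.edges[b].add(a)


def comprehend(request, graph):
    """Comprehension stand-in: return the first node label mentioned in the request."""
    words = request.lower().replace(".", " ").replace(",", " ").split()
    for name in graph.nodes:
        if name in words:
            return name
    return None


def implement(graph, start, goal):
    """Implementation stand-in: Dijkstra shortest path using node geometry."""
    dist = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == goal:
            break
        if d > dist.get(u, math.inf):
            continue  # stale queue entry
        for v in graph.edges[u]:
            a, b = graph.nodes[u], graph.nodes[v]
            nd = d + math.hypot(a.x - b.x, a.y - b.y)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(queue, (nd, v))
    if goal not in dist:
        return None  # goal unreachable
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def handle_request(request, graph, start):
    """Bridge the two phases: semantic goal selection, then geometric planning."""
    goal = comprehend(request, graph)
    if goal is None:
        return None
    return implement(graph, start, goal)


def build_demo_graph():
    """Hypothetical four-room layout used only for illustration."""
    g = SceneGraph()
    g.add_node("hall", 0.0, 0.0)
    g.add_node("corridor", 4.0, 0.0)
    g.add_node("kitchen", 4.0, 3.0)
    g.add_node("office", 8.0, 0.0)
    g.connect("hall", "corridor")
    g.connect("corridor", "kitchen")
    g.connect("corridor", "office")
    return g
```

In this sketch, calling handle_request("Please go to the kitchen", build_demo_graph(), "hall") selects the goal from the semantic labels and then plans over the node positions. A real system would presumably replace the keyword match with a proper language-understanding and task-planning component.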

Published

2025-04-15

Issue

Section

Electronics & Automation