Article

Graph-Based Framework on Bimanual Manipulation Planning from Cooking Recipe

1 Graduate School of Engineering Science, Osaka University, Osaka 565-0871, Japan
2 ICPS Research Center, National Institute of Advanced Industrial Science and Technology, Tsukuba 305-8560, Japan
* Author to whom correspondence should be addressed.
Robotics 2022, 11(6), 123; https://doi.org/10.3390/robotics11060123
Submission received: 14 October 2022 / Revised: 4 November 2022 / Accepted: 8 November 2022 / Published: 11 November 2022
(This article belongs to the Special Issue The State-of-the-Art of Robotics in Asia)

Abstract

It is difficult to effectively operate a dual-arm robot using only the information written in a cooking recipe. To cope with this problem, this paper proposes a graph-based approach to bimanual cooking-motion planning from a cooking recipe. In our approach, we first decompose the cooking recipe into graph elements. Then, we connect the graph elements, taking into account the attributes of their input/output nodes. If two graph elements cannot be connected to each other, we search a motion database for a graph element that can be inserted between them. Since the constructed graph includes the whole sequence of the robot's motions for the cooking task, we can generate a task sequence in which a dual-arm manipulator performs two different tasks simultaneously, one with each arm. Through an experimental study, we show that it is possible to generate robot motions from a cooking recipe and to perform the cooking motions while simultaneously moving the left and right arms.

1. Introduction

One of the ultimate goals of robotics research is to replace humans with robots in everyday household chores. This research focuses on cooking as one such chore. If a robot is to cook a meal, it must understand cooking recipes written for humans and must process ingredients appropriately based on these recipes. To deal with this problem, we propose a graph-based approach that converts a cooking recipe into an executable format for a robot and plans the robot's motion based on the recipe.
The information described in a recipe is not sufficient for a robot to perform a cooking task. To control a robot based on a cooking recipe, it is necessary to add the actions that a human performs unconsciously and that are not explicitly described in the recipe. For example, if "cut carrots" is written in a recipe, the action of "placing the carrots on the cutting board" must be executed beforehand. In addition, the action of "grasping the knife" is needed before the cutting operation. However, such instructions are usually not described in the cooking recipe.
In this study, we propose a motion planner for robotic cooking tasks that uses the information from a cooking recipe. The planner can automatically determine objects and motions that are not explicitly described in the recipe. As shown in Figure 1, our method represents a cooking task using a graph structure that can express the relationships among actions, tools, hands and objects, together with their state changes. Before a robot motion is planned, the elements of the graph structure are stored in a database named the "action library", where each graph element includes a set of arguments defining the status of a task. From the arguments of the input node, we can detect the information lacking in the cooking recipe; the lacking information is supplied by the corresponding graph element obtained by searching the database. In addition, by introducing the hand node, our planner automatically determines the bimanual coordination by assigning each motion extracted from the recipe to one of the hands.
In this paper, after reviewing related works in Section 2, we explain each element of the proposed graph structure and how to construct it from a cooking recipe in Section 3. We show the results of experiments in Section 4 and conclude in Section 5.

2. Related Works

To plan the motion of a robot performing a task, Wolfe et al. [1] proposed hierarchically structured planning problems including task and motion layers; our proposed framework can also be categorized as a task and motion planning problem. For task planning, a series of logic operations has been proposed, such as in [2,3]. For motion planning, the motion of a robot is planned using several methods, such as topological optimization [4,5] and graph search [6]. Recently, some extensions of task and motion planning have been proposed, such as planning in partially observable environments [7], symbolic methods [8], real-time planning [9] and regrasp planning [10,11].
In addition, dual-arm manipulation has been receiving attention from several researchers, such as [12,13]. In recent years, motion planning methods for dual-arm manipulation have been proposed, such as sequential planning [14,15], coordinated planning [16], assembly planning [17] and assembly planning with regrasp [18].
Some researchers have also focused on robotic cooking tasks. Yamazaki et al. [19] performed robotic cooking operations such as cutting and peeling by recognizing the foods and the cutting board. Mu et al. [20] analyzed the mechanics of the cutting operation for cooking ingredients. Yamaguchi et al. [21] built a robot that learns a pouring operation.
Recently, some studies have been conducted on robotic motion planning from linguistic information such as recipes. Inagawa et al. [22] proposed a method to analyze a cooking recipe and generate cooking behavior executable by a robot based on the obtained behavior code. Beetz et al. [23] applied natural language processing to recipes uploaded on the web and performed action planning for robots. Lisca et al. [24] proposed a framework for conducting chemical experiments from linguistic information. However, in these studies, information that is not explicitly described in the linguistic input cannot be used to generate the robot's motion. Kazhoyan et al. [25], in contrast, prepared reasoning rules written as action descriptions and used them to supply information not contained in the language. In addition, large-scale recipe data have been used to train RNN-based inference models [26]. Paulius et al. [27,28,29] proposed a representation framework called FOON (Functional Object-Oriented Network) to model the relationship between actions and objects and the state changes of objects in a task.
On the other hand, we have proposed a framework for task and motion planning of a dual-arm manipulator from a cooking recipe composed of a food image and cooking instructions. In our previous research [30], we explained the entire structure of the robot motion planner and detailed the recognition of ingredients from the food image and the coordination of the two arms. This paper focuses on the motion planning framework based on the graph structure, which solves the above problem of planning a robot's cooking task from a cooking recipe. We detail the method for generating the graph element, named the functional unit, and for adding the actions not explicitly described in the recipe. Our graph-based method clearly visualizes the action sequence, and several kinds of information can be obtained from the graph structure. If two actions are parallel to each other, they can be executed by the two arms at the same time. In addition, the connection condition of two functional units is clearly defined using the input/output arguments. To add the missing information, instead of preparing reasoning rules, our method uses graph elements named functional units with placeholders. With our graph definition, the condition for completion of the missing information can also be clearly defined by using the input/output arguments.

3. Proposed Method

In this paper, we represent a cooking task using a graph structure including the actions associated with objects and the state changes of objects. This graph structure enables us to identify actions and objects that are necessary for executing a cooking task but are not described in a cooking recipe. Our proposed planner generates a sequence of cooking task motions while complementing it with actions that are not explicitly described in a cooking recipe. The graph structure used in this study extends the FOON (Functional Object-Oriented Network) [27,28] to handle tools like cooking utensils and to consider the robot states. Section 3.1 describes the network structure used in this study and Section 3.2 describes the procedure for creating a graph based on a cooking recipe.

3.1. Graph Structure

The graph structure used in this study is a directed graph consisting of edges and three types of nodes, i.e., the object, hand and motion nodes. The nodes have attributes and can represent their state change in the manipulation task. This network structure is created by connecting multiple functional units where a functional unit denotes the smallest unit of the graph structure, representing the information contained in a single action. A functional unit also consists of edges and three types of nodes. In the following, we first describe the details of the nodes and functional units included in the graph structure.

3.1.1. Object Node

An object node represents an object involved in an action. The types of objects in cooking include ingredients, seasonings, tools and containers. An object node includes the attributes of type, name, place and state (see Table 1). If two instances of the same object type have different attributes, they are treated as different object nodes. This makes it possible to determine the actions needed for executing the task. For example, as shown in Figure 2, it is possible to distinguish between "a knife that exists in a storage location" and "a knife that is being grasped by the robot". The knife needed for the cutting operation can be specified as the "grasped knife" on the right side of Figure 2. If the knife is in the storage area, we can determine that a grasping action is required before the cutting action.
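As an illustration only (not the authors' implementation), an object node can be expressed as a small value type whose equality is defined over all four attributes. A minimal Python sketch, with class and attribute names of our own choosing:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ObjectNode:
    """An object involved in an action (cf. Table 1).
    Two nodes are equal only if all four attributes match."""
    type: str   # Food, Seasoning, Tool or Container
    name: str   # e.g., Potato, Knife
    place: str  # e.g., Storage space, Hand
    state: str  # e.g., Whole, Clean

# The same knife in two places yields two distinct nodes (Figure 2),
# telling the planner that a grasping action is still required.
knife_stored = ObjectNode("Tool", "Knife", "Storage space", "Clean")
knife_grasped = ObjectNode("Tool", "Knife", "Hand", "Clean")
assert knife_stored != knife_grasped
```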

3.1.2. Hand Node

A hand node is assigned to each robot hand used in the cooking task. It includes the usage status of the robot hand as an attribute; specifically, it holds the names of the objects grasped by the hand. With these attributes, it is possible to determine the task sequence according to the number of robot arms used. For example, we can plan the motion of a robot stir-frying food while holding a spatula in one hand and a pan in the other.

3.1.3. Motion Node

A motion node represents an action. In a cooking task, there are two types of motions: main motion and sub-motion (Table 2). The main motion includes cooking motions like cut, pour and boil. These actions are usually described in a cooking recipe. The sub-motion is used to prepare for a cooking motion, e.g., pick and place, grasp and release. This node does not have any attributes.

3.1.4. Functional Unit

A functional unit consists of a motion node, hand nodes and object nodes, where the object and hand nodes work as inputs and outputs of the motion node. The attributes of the input object and hand nodes are updated according to the changes caused by the execution of the cooking motion. The functional unit thus represents the objects needed for executing the motion and their changes during its execution. An example for the operation "cut potato in half" is shown in Figure 3. To execute this operation, the cutting board, the potato and the knife need to appear among the input object and hand nodes. Figure 3 also shows that, as the motion is executed, the states of the potato and the knife change (highlighted in yellow).
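To make the structure concrete, the following sketch (our own illustration; node and field names are assumptions, not the paper's code) combines the three node types into a functional unit for "cut potato in half", mirroring Figure 3:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ObjectNode:            # same shape as the earlier sketch
    type: str
    name: str
    place: str
    state: str

@dataclass(frozen=True)
class HandNode:
    name: str                # e.g., "R-Hand"
    holding: tuple = ()      # names of the objects currently grasped

@dataclass(frozen=True)
class MotionNode:
    name: str                # e.g., "cut (half)"
    kind: str                # "main" (in the recipe) or "sub" (added)

@dataclass
class FunctionalUnit:
    """Smallest unit of the graph: input nodes -> motion -> output nodes."""
    motion: MotionNode
    inputs: list
    outputs: list

# "Cut potato in half" (Figure 3): the potato becomes cut and the
# knife becomes dirty, while the hand keeps holding the knife.
cut_half = FunctionalUnit(
    motion=MotionNode("cut (half)", "main"),
    inputs=[ObjectNode("Food", "Potato", "Cutting board", "Whole"),
            ObjectNode("Tool", "Knife", "Hand", "Clean"),
            HandNode("R-Hand", holding=("Knife",))],
    outputs=[ObjectNode("Food", "Potato", "Cutting board", "Cut"),
             ObjectNode("Tool", "Knife", "Hand", "Dirty"),
             HandNode("R-Hand", holding=("Knife",))],
)
```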

3.1.5. Discussion

Unlike FOON, our proposed network structure can take into account tools such as cooking utensils, as well as the possible orders of actions, by considering the attributes of the hand nodes. For example, consider the case where the knife or the cutting board is in a storage location when the robot has to perform a cutting operation. In such a case, our method can include actions such as "grasping the knife" and "preparing the cutting board" in the task plan (Figure 4).
In addition, the execution order is determined based on the status of the robot hands. In the example shown in Figure 4, multiple execution sequences exist, but by taking into account the status of the robot hands we can determine the execution order: grasping the knife should be carried out after picking and placing the potato, because the hand must not be grasping anything while executing the pick and place.

3.2. Graph Construction

By using the graph structure described in Section 3.1, we can identify the actions and objects missing from a cooking recipe. We store functional units for the basic operations used in cooking in a database. If missing actions or objects are identified, we search the database and make adequate modifications to a stored functional unit; this modified functional unit is then used to add the missing actions and objects. In this section, we first describe the motion database and then explain the procedure for creating a network while adding the missing actions and objects.

3.2.1. Motion Database

In this subsection, we explain the motion databases (DBs) used in this study. We use two types of motion DBs, one for main motions and one for sub-motions, referred to as the main motion DB and the sub-motion DB, respectively. Each DB stores functional units whose parameters have not been set yet; we call these the functional units with placeholders. By assigning adequate parameters to the arguments, an appropriate functional unit can be created.
The main motion DB stores the functional units with placeholders corresponding to the main motion, i.e., cooking motion explicitly described in a cooking recipe. The type of object required, included in a functional unit, is determined for each main motion. For example, the objects required for the “pouring motion” are the “ingredient” to be poured, the “container” in which the ingredient is placed before pouring and the “container” into which the ingredient is poured. As shown in this example, even if an object cannot be uniquely defined, the type of object can be defined for each operation and objects that cannot be uniquely defined can be expressed as functional units with placeholders as shown in Figure 5 where placeholders are marked in yellow. Such functional units with placeholders are constructed for each main motion and stored in the main motion DB.
In the sub-motion DB, functional units for each sub-motion, i.e., each motion which is not described in the cooking recipe, are stored. Sub-motions are related to the handling of objects and include "Pick & Place", "Grasp", "Release", etc. The types of objects required for these motions can also be defined, and they can be expressed as functional units with placeholders, as shown in Figure 6, where the placeholders are marked in yellow. Such functional units with placeholders are constructed for each sub-motion and stored in the sub-motion DB.
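As a minimal sketch of how such placeholders might be filled (the template format and helper below are our assumptions, not the paper's data format), a stored unit can carry placeholder strings that are substituted once the recipe names the concrete objects:

```python
from copy import deepcopy

# Hypothetical placeholder template for the "pour" main motion (cf.
# Figure 5); "<...>" strings are placeholders to be substituted.
POUR_TEMPLATE = {
    "motion": ("pour", "main"),
    "inputs": [("Food", "<ingredient>", "<container_from>", "<state>"),
               ("Container", "<container_from>", "Hand", None),
               ("Container", "<container_to>", "Work space", None)],
    "outputs": [("Food", "<ingredient>", "<container_to>", "<state>"),
                ("Container", "<container_from>", "Hand", None),
                ("Container", "<container_to>", "Work space",
                 "<ingredient> inside")],
}

def instantiate(template, bindings):
    """Create a concrete functional unit by replacing placeholders."""
    unit = deepcopy(template)
    def fill(value):
        if isinstance(value, str):
            for key, concrete in bindings.items():
                value = value.replace(key, concrete)
        return value
    for part in ("inputs", "outputs"):
        unit[part] = [tuple(fill(v) for v in node) for node in unit[part]]
    return unit

unit = instantiate(POUR_TEMPLATE, {"<ingredient>": "Potato",
                                   "<state>": "Cut",
                                   "<container_from>": "Cutting board",
                                   "<container_to>": "Bowl"})
```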

3.2.2. Graph Connection

In this subsection, we explain the procedure for creating a whole graph structure to achieve the cooking task from a recipe. Figure 7 and Figure 8 show the outline of the creation procedure using the motion DB.
The input is a sentence described in a cooking recipe. First, it is parsed to obtain the relations between objects and motions (Figure 7➀), where the i-th motion obtained here is denoted as the main motion (i) (i = 1, 2, …, N). Then, we search for the functional unit with placeholders stored in the main motion DB corresponding to the main motion (i), substitute the objects and motion extracted from the recipe for its arguments and create a functional unit (Figure 7➁). The next step is to recognize the objects initially placed in the robot's working environment. We obtain the name, state and place of each object and create the corresponding object node. Then, a functional unit is created in which the object nodes obtained here become the output nodes (the input and motion nodes are assumed to be empty) (Figure 7➂). Finally, the functional units obtained here are connected based on the merging algorithm to complete the network (Figure 7➃). The details of this merging algorithm are described in the following paragraphs.
The flowchart of the network connection is shown in Figure 9. In the connection algorithm, we first create a list of input nodes (Figure 9(A)), where the elements of this list are the inputs of the functional unit for the main motion (i) (abbreviated as the functional unit (i)) (i = 1, …, N). Then, we create a list of output nodes (Figure 9(B)), where the elements of this list are the output nodes of the functional units to be connected to the functional unit (i). For example, when a robot starts a cooking task, the elements of the output node list are the outputs of the functional unit whose input includes the objects' initial configuration.
Next, the elements of the input/output node lists are compared with the input/output of functional units to see if they have the same names and attributes (Figure 9(C)). For example, if the elements of the output node list match the input of the functional unit (j) as shown in Figure 10, we connect the functional unit (j) to the network structure by deleting these elements from the output node list and appending the output of the functional unit (j) to the output node list (Figure 9(C-1)).
Furthermore, if the elements of the output node list do not match the input of any functional unit for a main motion (abbreviated as a main-functional unit), as shown in Figure 11, we create a new functional unit for a sub-motion (abbreviated as a sub-functional unit) and try to connect it to the network structure (Figure 9(C-2)). We substitute the elements of the output node list for the arguments of a sub-functional unit stored in the sub-motion DB. Then, the sub-functional unit is connected to the network structure by deleting the elements of the output node list and appending the output nodes of the sub-functional unit to the output node list (Figure 9(B)). The functional units are iteratively connected by comparing the elements of the input/output node lists with the inputs/outputs of functional units (Figure 9(C)). At each connection, we check whether the input node list is empty (Figure 9(D)). If it is empty, i.e., all the input nodes of the main motion are combined with functional units, the actions necessary to execute the main motion have been added and the main motion is ready to be executed. Finally, the order in which to execute each motion is determined, taking into account the number of available arms (Figure 9(E)). An executable sequence may not exist, e.g., if an arm is grasping an object and cannot perform any other action; in that case, we create and combine sub-functional units that eliminate such situations to obtain an executable task plan (Figure 9(E-1)).
After the above process is performed for each main motion (i) (i = 1, 2, …, N), we obtain a whole network structure that represents the cooking task.
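The loop of Figure 9 can be condensed into the following sketch (our own simplification, assuming nodes compare by name and attributes and that a helper can always build a sub-functional unit producing a missing node; it omits the arm-availability check of Figure 9(E)):

```python
def connect(main_units, initial_outputs, make_sub_unit):
    """Condensed version of the merging loop (Figure 9).

    main_units:      functional units extracted from the recipe, in order
    initial_outputs: object nodes recognized in the initial work space
    make_sub_unit:   assumed helper returning a sub-functional unit
                     (e.g. Pick & Place, Grasp) producing a missing node
    """
    graph = []                        # connected units, in execution order
    out_list = list(initial_outputs)  # output node list (Figure 9(B))
    for unit in main_units:
        in_list = list(unit["inputs"])            # input node list (A)
        while in_list:                            # until the list is empty (D)
            node = in_list.pop()
            if node in out_list:                  # names/attributes match (C-1)
                out_list.remove(node)
            else:                                 # no match: add a sub-unit (C-2)
                sub = make_sub_unit(node, out_list)
                for consumed in sub["inputs"]:
                    if consumed in out_list:
                        out_list.remove(consumed)
                out_list.extend(sub["outputs"])
                graph.append(sub)
                in_list.append(node)              # retry; the sub-unit provides it
        out_list.extend(unit["outputs"])
        graph.append(unit)
    return graph
```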

3.3. Bimanual Task Planning

We introduce a dual-arm robot to efficiently perform the cooking task. Introducing hand nodes into the functional units enables us to consider the number of hands used for each cooking action and to plan the cooking task with multiple hands performing multiple tasks in parallel. In order to reduce the execution time, we plan the cooking task while taking into account whether two actions can be performed simultaneously with two different hands. An example is shown in Figure 12; below, we explain how to determine which hand to use and how to execute multiple tasks simultaneously.
First, the hand to be used for each action is determined by considering the distance from the hand to the object to be grasped. Figure 13 shows an example where the objects placed in the right and left storage spaces are grasped by the R-Hand and the L-Hand, respectively. If there are multiple candidates, the hand to be used is determined by evaluating the length of the trajectory needed to execute the action; in Figure 13, the objects placed in the work space are grasped by the hand with the shorter trajectory.
From the graph structure, we can extract two motion nodes that can be performed simultaneously. If two different hands can be assigned to these two actions, we plan them to be executed simultaneously.
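The hand-selection rule can be summarized in a short sketch (the area names, position format and trajectory-length callback are our assumptions, not the paper's interface):

```python
import math

def assign_hand(obj_area, obj_pos, hand_pos, trajectory_length=None):
    """Choose the hand for a grasp, following the rule of Figure 13:
    objects in a storage space go to the hand on that side; for objects
    in the shared work space, compare the two candidate trajectories
    (or, lacking a planner callback, plain hand-object distances)."""
    if obj_area == "storage_right":
        return "R-Hand"
    if obj_area == "storage_left":
        return "L-Hand"
    hands = ("R-Hand", "L-Hand")
    if trajectory_length is not None:            # planned path length
        return min(hands, key=lambda h: trajectory_length(h, obj_pos))
    return min(hands, key=lambda h: math.dist(hand_pos[h], obj_pos))
```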

3.4. Motion Planning

In this section, we describe the motion planning method that realizes the planned task sequence. Before executing the cooking task, we define multiple grasping poses that stably grasp each tool, as shown in Figure 14. When executing the cooking task, we use a vision sensor to detect the pose of the tool. Once we obtain it, we select the highest-priority grasping pose for which the IK is solvable; the priorities of the grasping poses are set heuristically. After the grasping pose is determined, the trajectory of the robot is planned using RRT-connect.
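A sketch of this pose-selection step, using 4x4 homogeneous transforms (the `ik_solvable` predicate stands in for the robot's IK solver and is an assumption on our part):

```python
import numpy as np

def select_grasp_pose(world_T_tool, grasp_candidates, ik_solvable):
    """Return the wrist pose of the highest-priority solvable grasp.

    world_T_tool:     detected tool pose (4x4 homogeneous transform)
    grasp_candidates: tool-frame grasp poses, ordered by heuristic priority
    ik_solvable:      assumed predicate wrapping the arm's IK solver
    """
    for tool_T_grasp in grasp_candidates:         # highest priority first
        world_T_wrist = world_T_tool @ tool_T_grasp
        if ik_solvable(world_T_wrist):
            return world_T_wrist                  # plan RRT-connect to this pose
    return None                                   # no reachable grasp found

# e.g., a single hypothetical candidate 20 cm above the tool origin:
top_grasp = np.eye(4)
top_grasp[2, 3] = 0.20
```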

4. Experiment

To verify the effectiveness of our approach, we conducted experiments using a real dual-arm manipulator.

4.1. Experimental Setup

As a cooking recipe, we consider two simple Japanese sentences that are equivalent to "Cut potatoes in half and pour them into a bowl" and "Heat oil in a frying pan over high heat and fry pork" in English. We used CaboCha [31] to obtain the dependency structure of the Japanese sentences written in the recipe. In this experiment, we prepared a work space and a storage space in a kitchen environment, and we assume that the objects are stored in the left or right storage space before the cooking task starts (Figure 15). We use two UR-3 robot arms, each with a Robotiq 2-finger 85 mm gripper attached at the tip. The UR-3 has 6 DOF, can reach an area of 500 mm radius and has a payload of 3 kg; for more detailed specifications, please refer to the website of UR robots [32]. Since the two robot arms have independent controllers, we realized their coordination approximately: when the two arms move at the same time, we send independent motion commands to both arms simultaneously. By using RRT-connect, we can specify the via points between the initial and final configurations; the trajectory between two via points is determined by the native function provided by the controller of the UR-3 robot.
We fabricated the cutting board, spatula, knife and stove ourselves using a 3D printer.
We attached an ArUco marker to each tool. The pose of a tool is detected by capturing an image of its ArUco marker with a 2D RGB camera (Figure 16). We prepared eight grasping poses for the cutting board and one grasping pose for each of the other tools. The relative pose of the ingredients with respect to the tools is assumed to be known. After detecting the position/orientation of the tool, the target wrist pose can be obtained; the manipulator is then commanded using open-loop control.
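For reference, such tool-pose detection can be sketched with OpenCV's contrib ArUco module (classic API, OpenCV < 4.7); the marker dictionary, marker size and camera intrinsics below are placeholders, since the paper does not specify them:

```python
import cv2
import numpy as np

MARKER_LENGTH = 0.03                         # marker side length [m]; assumed
CAMERA_MATRIX = np.array([[600.0, 0.0, 320.0],
                          [0.0, 600.0, 240.0],
                          [0.0, 0.0, 1.0]])  # placeholder intrinsics
DIST_COEFFS = np.zeros(5)                    # assume negligible distortion
ARUCO_DICT = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

def detect_tool_pose(image):
    """Return (rvec, tvec) of the first detected marker in the camera
    frame, or None if no marker is visible."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    corners, ids, _ = cv2.aruco.detectMarkers(gray, ARUCO_DICT)
    if ids is None:
        return None
    rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
        corners, MARKER_LENGTH, CAMERA_MATRIX, DIST_COEFFS)
    return rvecs[0], tvecs[0]
```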

4.2. Results

We first show some examples of functional units and the corresponding robot motions. Figure 17 shows two main motions (cut in half and pour) and four sub-motions (two pick and place operations, a grasp and a release). When cutting a potato in half (a), the cutting board is placed in the work space with the potato on it. Through cutting, the state of the potato changes from whole to cut, the state of the knife changes from clean to dirty and the hand keeps holding the knife. On the other hand, by grasping the knife (e), its place changes from "stand for knife" to "hand".
We then generated a graph structure corresponding to each sentence. The graph structure corresponding to "Cut potatoes in half and pour them into a bowl" is shown in Figure 18, where the motion nodes of the functional units stored in the main motion DB are cut (half) and pour. To realize the cooking task, we need five functional units stored in the sub-motion DB, whose motion nodes are three pick and place operations, a grasp and a release. Here, the pick and place of the potato and the grasping of the knife are performed in parallel. Figure 19 shows the motion of the robot performing this cooking task.
The graph structure corresponding to "Heat oil in a frying pan over high heat and fry pork" is shown in Figure 20, where the motion nodes of the functional units stored in the main motion DB are pour, turn on, heat (high heat) and stir fry. To realize the cooking task, we need two functional units stored in the sub-motion DB, whose motion nodes are pick and place and grasp. Figure 21 shows the robot's motions while performing this cooking task. As a result, the robot was able to cook autonomously, and we confirmed the effectiveness of our method for motion planning from cooking recipes consisting of simple sentences. In both cases, the calculation time of the motion planning is between 1 and 2 min.

5. Conclusions

In this paper, we proposed a method for planning a cooking task using a dual-arm robot by representing the cooking task as a graph structure. Our proposed method can deal with actions and objects that are not explicitly described in the cooking recipe and automatically adds them to the task contents. We applied the proposed method to the sentences of a simple cooking recipe and confirmed that the dual-arm robot can perform the cooking task based on the resulting task plan.
In the future, we will verify the effectiveness of the proposed method on cooking recipes with more complex sentences. In addition, when cutting food, the manipulator has to carefully adjust the force applied to the food through the knife; such closed-loop control is not considered in the current paper and is a future research topic. Furthermore, we have manually defined each functional unit; the automatic construction of functional units is another future research topic.

Author Contributions

Conceptualization, K.H. and I.G.R.-A.; methodology, K.T., T.K., W.W. and K.H.; software, K.T., W.W. and T.K.; validation, K.T.; formal analysis, K.T., N.Y., K.T. and I.G.R.-A.; investigation, K.H. and T.K.; resources, K.H.; data curation, K.T.; writing—original draft preparation, K.T.; writing—review and editing, K.H., N.Y., I.G.R.-A., W.W. and T.K.; visualization, K.T. and K.H.; supervision, K.H., T.K., N.Y. and I.G.R.-A.; project administration, K.H., T.K., N.Y. and I.G.R.-A.; funding acquisition, K.H. and N.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This paper is based on results obtained from a project, JPNP20006, commissioned by the New Energy and Industrial Technology Development Organization.

Data Availability Statement

For more figures, videos and explanation on this research project, visit https://www.roboticmanipulation.org/res/cook, accessed on 14 October 2022.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Wolfe, J.; Marthi, B.; Russell, S. Combined task and motion planning for mobile manipulation. In Proceedings of the 20th International Conference on Automated Planning and Scheduling, Toronto, ON, Canada, 12–16 May 2010.
2. Wally, B. Flexible production systems: Automated generation of operations plans based on ISA-95 and PDDL. IEEE Robot. Autom. Lett. 2019, 4, 4062.
3. Zhang, S.; Jiang, Y.; Sharon, G.; Stone, P. Multirobot symbolic planning under temporal uncertainty. In Proceedings of the 16th Conference on Autonomous Agents and Multi-Agent Systems, São Paulo, Brazil, 8–12 May 2017; pp. 501–510.
4. Ratliff, N.; Zucker, M.; Bagnell, J.; Srinivasa, S. CHOMP: Gradient optimization techniques for efficient motion planning. In Proceedings of the IEEE International Conference on Robotics and Automation, Kobe, Japan, 12–17 May 2009.
5. Schulman, J.; Duan, Y.; Ho, J.; Lee, A.; Awwal, I.; Bradlow, H.; Pan, J.; Patil, S.; Goldberg, K.; Abbeel, P.; et al. Motion planning with sequential convex optimization and convex collision checking. Int. J. Robot. Res. 2014, 33, 1251–1270.
6. Simeon, T.; Laumond, J.; Cortes, J.; Sahbani, A. Manipulation planning with probabilistic roadmaps. Int. J. Robot. Res. 2004, 23, 729–746.
7. Kaelbling, L.; Lozano-Pérez, T. Integrated task and motion planning in belief space. Int. J. Robot. Res. 2013, 32, 1194–1227.
8. Garrett, C.; Lozano-Pérez, T.; Kaelbling, L. Sampling-based methods for factored task and motion planning. Int. J. Robot. Res. 2018, 37, 1796–1825.
9. Woosley, B.; Dasgupta, P. Integrated real-time task and motion planning for multiple robots under path and communication uncertainties. Robotica 2018, 36, 353–373.
10. Wan, W.; Harada, K. Developing and comparing single-arm and dual-arm regrasp. IEEE Robot. Autom. Lett. 2016, 1, 243–250.
11. Wan, W.; Harada, K. Regrasp planning using 10,000 grasps. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Vancouver, BC, Canada, 24–28 September 2017; pp. 1929–1936.
12. Siciliano, B. (Ed.) Advanced Bimanual Manipulation: Results from the DEXMART Project; Springer Science & Business Media: New York, NY, USA, 2012; Volume 80.
13. Krüger, J.; Schreck, G.; Surdilovic, D. Dual arm robot for flexible and cooperative assembly. CIRP Ann. 2011, 60, 5–8.
14. Cohen, B.; Phillips, M.; Likhachev, M. Planning single-arm manipulations with n-arm robots. In Proceedings of the 8th Annual Symposium on Combinatorial Search, Ein Gedi, Israel, 11–13 June 2015.
15. Kurosu, J.; Yorozu, A.; Takahashi, M. Simultaneous dual-arm motion planning for minimizing operation time. Appl. Sci. 2017, 7, 2110.
16. Ramirez-Alpizar, I.; Harada, K.; Yoshida, E. Human-based framework for the assembly of elastic objects by a dual-arm robot. Robomech. J. 2017, 4, 20.
17. Stavridis, S.; Doulgeri, Z. Bimanual assembly of two parts with relative motion generation and task related optimization. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, Madrid, Spain, 1–5 October 2018; pp. 7131–7136.
18. Moriyama, R.; Wan, W.; Harada, K. Dual-arm assembly planning considering gravitational constraints. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Macau, China, 4–8 November 2019; pp. 5566–5572.
19. Yamazaki, K.; Watanabe, Y.; Nagahama, K.; Okada, K.; Inaba, M. Recognition and manipulation integration for a daily assistive robot working on kitchen environments. In Proceedings of the 2010 IEEE International Conference on Robotics and Biomimetics, Tianjin, China, 14–18 December 2010; pp. 196–201.
20. Mu, X.; Xue, Y.; Jia, Y.B. Robotic cutting: Mechanics and control of knife motion. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 3066–3072.
21. Yamaguchi, A.; Atkeson, C.G. Stereo vision of liquid and particle flow for robot pouring. In Proceedings of the 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), Cancun, Mexico, 15–17 November 2016; pp. 1173–1180.
22. Inagawa, M.; Takei, T.; Imanishi, E. Japanese recipe interpretation for motion process generation of cooking robot. In Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, Honolulu, HI, USA, 12–15 January 2020; pp. 1394–1399.
23. Beetz, M.; Klank, U.; Kresse, I.; Maldonado, A.; Mösenlechner, L.; Pangercic, D.; Rühr, T.; Tenorth, M. Robotic roommates making pancakes. In Proceedings of the 2011 11th IEEE-RAS International Conference on Humanoid Robots, Bled, Slovenia, 26–28 October 2011; pp. 529–536.
24. Lisca, G.; Nyga, D.; Bálint-Benczédi, F.; Langer, H.; Beetz, M. Towards robots conducting chemical experiments. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, Hamburg, Germany, 28 September–3 October 2015; pp. 5202–5208.
25. Kazhoyan, G.; Beetz, M. Programming robotic agents with action descriptions. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017.
26. Chen, H.; Tan, H.; Kuntz, A.; Bansal, M.; Alterovitz, R. Enabling robots to understand incomplete natural language instructions using commonsense reasoning. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 1963–1969.
27. Paulius, D.; Huang, Y.; Milton, R.; Buchanan, W.D.; Sam, J.; Sun, Y. Functional object-oriented network for manipulation learning. In Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, Daejeon, Korea, 9–14 October 2016.
28. Paulius, D.; Jelodar, A.B.; Sun, Y. Functional object-oriented network: Construction & expansion. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation, Brisbane, Australia, 21–25 May 2018.
29. Paulius, D.; Dong, K.S.P.; Sun, Y. Task planning with a weighted functional object-oriented network. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi'an, China, 30 May–5 June 2021.
30. Takata, K.; Kiyokawa, T.; Ramirez-Alpizar, I.; Yamanobe, N.; Wan, W.; Harada, K. Efficient task/motion planning for a dual-arm robot from language instructions and cooking images. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Kyoto, Japan, 23–27 October 2022.
31. CaboCha. Available online: https://taku910.github.io/cabocha/ (accessed on 14 October 2022).
32. UR Robot. Available online: https://www.universal-robots.com/ (accessed on 14 October 2022).
Figure 1. Overview of the proposed framework, where the graph structure is obtained from working-space recognition and lacking information is added by using the action library.
Figure 2. An example where the same objects are represented as different nodes.
Figure 3. An example of the functional unit for "Cut potato in half".
Figure 4. An example of task planning including tool use.
Figure 5. An example of a functional unit with placeholders stored in the main motion DB.
Figure 6. An example of a functional unit with placeholders stored in the sub-motion DB.
Figure 7. Flowchart of graph construction from the motion DBs.
Figure 8. Overview of graph construction by using the functional units.
Figure 9. Flowchart of the combine algorithm.
Figure 10. Connection of a functional unit when it has the same names and attributes in the input/output node lists.
Figure 11. Connection of a functional unit when it does not have the same names and attributes in the input/output node lists.
Figure 12. Motion planning taking into account simultaneous execution with multiple hands, where the tasks assigned to the R-Hand and L-Hand are marked in yellow and blue, respectively (left: single arm, right: dual arm).
Figure 13. Selection of the hand according to the area determined on the table.
Figure 14. Definition of grasping poses considering their priority.
Figure 15. Experimental conditions.
Figure 16. Camera attached to the hand to observe the ArUco markers on the tools.
Figure 17. Some examples of functional units used in the experiment.
Figure 18. Obtained graph structure for the instruction "Cut potatoes in half and pour them into a bowl".
Figure 19. Result of the experiment for the instruction "Cut potatoes in half and pour them into a bowl".
Figure 20. Obtained graph structure for the instruction "Heat oil in a frying pan over high heat and fry pork".
Figure 21. Result of the experiment for the instruction "Heat oil in a frying pan over high heat and fry pork".
Table 1. Examples of objects' names and attributes.

Type      | Name                   | Attribute 1 (Place)          | Attribute 2 (State)
Food      | Potato, Beef           | Storage space, Cutting board | Whole, Cut, Chopped
Seasoning | Sugar, Salt, Water     | Storage space, Cup, Bowl     | Whole, Mixed
Tool      | Knife, Spatula, Ladle  | Storage space, Hand          | Clean, Dirty
Container | Cutting board, Bowl    | Storage space, Work space    | Ingredient inside (Potato)
Table 2. Types of motion nodes.

Motion      | Name
Main motion | Cut (half), Pour, Boil
Sub-motion  | Pick & Place, Grasp, Release
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
