Monday, September 26, 2022
HomeRoboticsJob planning in robotics - Robohub

Job planning in robotics – Robohub

Suppose we’ve got a robotic in a easy world just like the one beneath. Let’s contemplate commanding our robotic to carry out a activity corresponding to “take the apple from the shelf and put it on the desk”.

Easy activity planning instance world: A robotic can transfer between a finite set of places, and may choose and place objects at these places.

I’d argue we people have fairly good instinct for a way a robotic may obtain this activity. We may describe what the robotic ought to do by breaking the answer down into particular person actions. For instance:

  • Transfer to the shelf
  • Choose up the apple
  • Transfer again to the desk
  • Place the apple

That is high-quality for our easy instance — and actually, can also be high-quality for a lot of real-world robotics purposes — however what if the duty will get extra difficult? Suppose we add extra objects and places to our easy world, and begin describing more and more complicated targets like “be sure that all of the apples are on the desk and every little thing else is within the rubbish bin”? Moreover, what if we need to obtain this in some optimum method like minimizing time, or variety of whole actions? Our reasoning rapidly hits its restrict as the issue grows, and maybe we need to contemplate letting a pc do the work.

That is what we regularly check with as activity planning: Autonomously reasoning concerning the state of the world utilizing an inner mannequin and developing a sequence of actions, or a plan, to realize a aim.

In my view, activity planning isn’t as common as different areas in robotics you may encounter as a result of… fairly frankly, we’re simply not that good at robotics but! What I imply is that robots are difficult, and lots of commercially accessible techniques are automating easy, repetitive duties that don’t require this stage of high-level planning — it could be overkill! There are lots of lower-level issues to resolve in robotics corresponding to standing up good {hardware}, sturdy sensing and actuation, movement planning, and preserving prices down. Whereas activity planning skews in direction of extra educational settings for that reason, I eagerly await the not-so-distant future the place robots are much more succesful and the whole trade is pondering extra about planning over longer horizons and more and more subtle duties.

On this submit, I’ll introduce the essential items behind activity planning for robotics. I’ll deal with Planning Area Definition Language (PDDL) and take you thru some fundamental examples each conceptually and utilizing the pyrobosim software.

Defining a planning area

Job planning requires a mannequin of the world and the way an autonomous agent can work together with the world. That is often described utilizing state and actions. Given a selected state of the world, our robotic can take some motion to transition to a different state.

Through the years, there have been a number of languages used to outline domains for activity planning. The primary activity planning language was the STanford Analysis Institute Drawback Solver (STRIPS) in 1971, made common by the Shakey challenge.

Since then, a number of associated languages for planning have come up, however one of the vital common in the present day is Planning Area Definition Language (PDDL). The primary PDDL paper was printed in 1998, although there have been a number of enhancements and variations tacked on over time. To briefly describe PDDL, it’s laborious to beat the unique paper.

“PDDL is meant to specific the “physics” of a website, that’s, what predicates there are, what actions are potential, what the construction of compound actions is, and what the results of actions are.”


The purpose of languages like PDDL is that they’ll describe a whole area of potential issues the place a robotic can take the identical actions, however in numerous environments and with completely different targets. As such, the basic items of activity planning with PDDL are a task-agnostic area and a task-specific drawback.

Utilizing our robotic instance, we are able to outline:

  • Area: The duty-agnostic half
    • Predicates: (Robotic ?r), (Object ?o), (Location ?loc), (At ?r ?loc), (Holding ?r ?o), and so forth.
    • Actions: transfer(?r ?loc1 ?loc2), choose(?r ?o ?loc), place(?r ?o ?loc)
  • Drawback: The duty-specific half
    • Objects: (Robotic robotic), (Location shelf), (Location desk), (Object apple)
    • Preliminary state: (HandEmpty robotic), (At robotic desk), (At apple shelf)
    • Objective specification: (At apple desk)

Since domains describe how the world works, predicates and actions have related parameters, that are denoted by ?, whereas a selected object doesn’t have any particular characters to explain it. Some examples to make this concrete:

  • (Location ?loc) is a unary predicate, that means it has one parameter. In our instance, shelf and desk are particular location situations, so we are saying that (Location desk) and (Location shelf) are a part of our preliminary state and don’t change over time.
  • (At ?r ?loc) is a binary predicate, that means it has two parameters. In our instance, the robotic could start on the desk, so we are saying that (At robotic desk) is a part of our preliminary state, although it might be negated because the robotic strikes to a different location.

PDDL is an motion language, that means that the majority of a website is outlined as actions (typically known as operators) that work together with our predicates above. Particularly, an motion comprises:

  • Parameters: Enable the identical kind of motion to be executed for various mixtures of objects that will exist in our world.
  • Preconditions: Predicates which should be true at a given state to permit taking the motion from that state.
  • Results: Adjustments in state that happen on account of executing the motion.

For our robotic, we may outline a transfer motion as follows:

(:motion transfer
:parameters (?r ?loc1 ?loc2)
:precondition (and (Robotic ?r)
(Location ?loc1)
(Location ?loc2)
(At ?r ?loc1))
:impact (and (At ?r ?loc2)
(not (At ?r ?loc1)))



In our description of transfer, we’ve got

  • Three parameters ?r ?loc1, and ?loc2.
  • Unary predicates within the preconditions that slim down the area of parameters that make an motion possible — ?r should be a robotic and ?loc1 and ?loc2 should be places, in any other case the motion doesn’t make sense.
  • One other precondition that’s state-dependent: (At ?r ?loc1). Which means that to carry out a transfer motion beginning at some location ?loc1, the robotic should already be in that location.

Notice that in some circumstances, PDDL descriptions enable for typing, which helps you to outline the area of actions inline with the parameters reasonably than explicitly together with them as unary predicates — for instance, :parameters ?r – Robotic ?loc1 – Location ?loc2 – Location … however that is simply syntactic sugar.

Equally, the results of the motion can add new predicates or negate present ones (in STRIPS these had been specified as separate addition and deletion lists). In our instance, after performing a transfer motion, the robotic is now not at its earlier location ?loc1 and as a substitute is at its supposed new location ?loc2.

The same idea may be utilized to different actions, for instance choose and place. If you happen to take a while to course of the PDDL snippets beneath, you’ll hopefully get the gist that our robotic can manipulate an object solely whether it is on the similar location as that object, and it’s at the moment not holding one thing else.

(:motion choose
:parameters (?r ?o ?loc)
:precondition (and (Robotic ?r)
(Obj ?o)
(Location ?loc)
(HandEmpty ?r)
(At ?r ?loc)
(At ?o ?loc))
:impact (and (Holding ?r ?o)
(not (HandEmpty ?r))
(not (At ?o ?loc)))

(:motion place
:parameters (?r ?o ?loc)
:precondition (and (Robotic ?r)
(Obj ?o)
(Location ?loc)
(At ?r ?loc)
(not (HandEmpty ?r))
(Holding ?r ?o))
:impact (and (HandEmpty ?r)
(At ?o ?loc)
(not (Holding ?r ?o)))

So given a PDDL area, we are able to now come up a plan, or sequence of actions, to resolve numerous kinds of issues inside that area … however how is that this accomplished in observe?

Job planning is search

There’s good purpose for all this overhead in defining a planning area and an excellent language to specific it in: On the finish of the day, activity planning is solved utilizing search algorithms, and far of the literature is about fixing complicated issues as effectively as potential. As activity planning issues scale up, computational prices enhance at an alarming price — you’ll typically see PSPACE-Full and NP-Full thrown round within the literature, which ought to make planning folks run for the hills.

State-space search

One option to implement activity planning is utilizing our mannequin to carry out state-space search. Given our drawback assertion, this might both be:

  • Ahead search: Begin with the preliminary state and increase a graph till a aim state is reached.
  • Backward search: Begin with the aim state(s) and work backwards till the preliminary state is reached.

Utilizing our instance, let’s see how ahead state-space search would work given the aim specification (At apple desk):

(1/6) In our easy world, the robotic begins on the desk. The one legitimate motion from right here is to maneuver to the shelf.

(2/6) After shifting to the shelf, the robotic may both choose the apple or transfer again to the desk. Since shifting again to the desk leads us to a state we’ve got already seen, we are able to ignore it.

(3/6) After choosing the apple, we may transfer again to the desk or place the apple again on the shelf. Once more, inserting the apple would lead us to an already visited state.

(4/6) As soon as the robotic is on the desk whereas holding the apple, we are able to place the apple on the desk or transfer again to the shelf.

(5/6) At this level, if we place the apple on the desk we attain a aim state! The sequence of actions resulting in the state is our ensuing plan.

(6/6) Because the variety of variables will increase, the search drawback (and time to discover a answer) grows exponentially.

Deciding which states to increase throughout search might be purely primarily based on a predetermined traversal technique, utilizing commonplace approaches like breadth-first search (BFS), depth-first search (DFS), and the like. Whether or not we resolve so as to add prices to actions or deal with all of them as unit value (that’s, an optimum plan merely minimizes the whole variety of actions), we may as a substitute resolve to make use of grasping or hill-climbing algorithms to increase the following state primarily based on minimal value. And eventually, no matter which algorithm we use, we most likely need to hold observe of states we’ve got already beforehand visited and prune our search graph to stop infinite cycles and increasing pointless actions.

In movement planning, we regularly use heuristics throughout search; one frequent instance is using A* with the straight-line distance to a aim as an admissible heuristic. However what is an excellent heuristic within the context of activity planning? How would you outline the space to a aim state with out a helpful metric like distance? Certainly, a fantastic portion of the literature focuses on simply this. Strategies like Heuristic Search Planning (HSP) and Quick-Ahead (FF) search to outline heuristics by fixing relaxed variations of the issue, which incorporates eradicating preconditions or destructive results of actions. The de facto commonplace for state-space search in the present day is a variant of those strategies named Quick Downward, whose heuristic depends on constructing a causal graph to decompose the issue hierarchically — successfully taking the computational hit up entrance to rework the issue right into a handful of approximate however smaller issues that proves itself worthwhile in observe.

Beneath is an instance utilizing pyrobosim and the PDDLStream planning framework, which specifies a extra complicated aim that makes use of the predicates we described above. In comparison with our instance on this submit, the video beneath has a number of delicate variations in that there’s a separate class of objects to characterize the rooms that the robotic can navigate, and places can have a number of placement areas (for instance, the counter has separate left and proper areas).

  • (At robotic bed room)
  • (At apple0 table0_tabletop)
  • (At banana0 counter0_left)
  • (Holding robotic water0)

Various search strategies

Whereas ahead state-space search is without doubt one of the commonest methods to plan, it isn’t the one one. There are lots of different search strategies that search to construct alternate knowledge constructions whose aim is to keep away from the computational complexity of increasing a full state-transition mannequin as above. Some frequent strategies and phrases you will note embrace:

Usually, these search strategies are inclined to outperform state-space search in duties the place there are a number of actions that may be carried out in any order relative to one another as a result of they depend upon, and have an effect on, mutually unique components of the world. One of many canonical easy examples is the “dinner date instance” which is used to reveal Graphplan. Whereas these slides describe the issue in additional element, the concept is that an individual is getting ready a dinner date the place the top aim is that dinner and a gift be prepared whereas the home is clear — or (dinner current ¬garb). By enthusiastic about the issue on this planning graph construction, we are able to achieve the next insights:

  1. Planning seeks to search out actions that may be executed independently as a result of they’re mutually unique (mutex). This the place the notion of partial ordering is available in, and why there are computational positive aspects in comparison with specific state-space search: As a substitute of individually looking out by alternate paths the place one motion happens earlier than the opposite, right here we merely say that the actions are mutex and we are able to determine which to execute first after planning is full.
  2. This drawback can’t be solved at one stage as a result of cooking requires clear arms (cleanH) and wrapping the current requires quiet, whereas the 2 strategies of taking out the rubbish (carry and dolly) negate these predicates, respectively. So within the answer beneath, we should increase two ranges of the planning graph after which backtrack to get a concrete plan. We first execute prepare dinner to take away the requirement on cleanH, after which carry out the opposite two actions in any order — it doesn’t matter.
  3. There’s another answer the place on the first stage we execute the wrap motion, and on the second stage we are able to execute prepare dinner and dolly (in both order) to realize the aim. You’ll be able to think about this answer could be favored if we moreover required our dinner date hero to have clear arms earlier than beginning their date — gross!

Planning graph for the dinner date drawback [Source].
Straight strains are preconditions and results, strains with bins are no-operation (no-op) hyperlinks, and curved strains are mutual exclusion (mutex) hyperlinks.
One answer to the issue is highlighted in blue by backtracking from the aim state.

To carry this all full-circle, you’ll typically discover state-space search implementations that use strategies like Graphplan on a relaxed model of the issue to estimate the price to aim as a heuristic. If you wish to dig into the main points, I’d suggest A Tutorial on Planning Graph-Primarily based Reachability Heuristics, by Daniel Bryce and Subbarao Kambhampati.

In the direction of extra complicated duties

Let’s get again to our cellular robotic instance. What if we need to describe extra complicated motion preconditions and/or aim specs? In any case, we established at first of this submit that straightforward duties most likely don’t want the large hammer that could be a activity planner. For instance, as a substitute of requesting {that a} particular object be positioned at a selected location, we may transfer in direction of one thing extra common like “all apples needs to be on the desk”.

New instance world containing objects, apple0 and apple1, which belong to the identical object kind (apple).

To discover this, let’s add predicates denoting objects belonging to classes. For instance, (Sort ?t) can outline a selected kind and (Is ?o ?t) to indicate that an object belongs to such a sort. Concretely, lets say that (Sort apple), (Is apple0 apple), and (Is apple1 apple).

We will then add a brand new derived predicate (Has ?loc ?entity) to finish this. Derived predicates are simply syntactic sugar — on this case, it helps us to succinctly outline a whole area of potential states from our library of present predicates. Nevertheless, it lets us categorical extra elaborate concepts in much less textual content. For instance:

  • (Has desk apple0) is true if the thing apple0 is on the location desk.
  • (Has desk apple) is true if not less than one object of kind apple is on the location desk.

If we select the aim state (Has desk apple) for our instance, an answer may contain inserting both apple0 or apple1 on the desk. One implementation of this derived predicate may look as follows, which makes use of the exists key phrase in PDDL.

(:derived (Has ?loc ?entity)
; CASE 1: Entity is a selected object occasion, e.g.,
; (Has desk apple0)
(and (Location ?loc)
(Obj ?entity)
(At ?entity ?loc)
; CASE 2: Entity is a selected object class, e.g.,
; (Has desk apple)
(exists (?o)
(and (Obj ?o)
(Location ?loc)
(Sort ?entity)
(Is ?o ?entity)
(At ?o ?loc)

One other instance could be a HasAll derived predicate, which is true if and provided that a selected location comprises each object of a selected class. In different phrases, should you deliberate for (HasAll desk apple), you would want each apple0 and apple1 to be relocated. This might look as follows, this time making use of the indicate key phrase in PDDL.

(:derived (HasAll ?loc ?objtype)
(and (Obj ?o)
(Sort ?objtype)
(Is ?o ?objtype))
(and (Location ?loc)
(At ?o ?loc))

You possibly can think about additionally utilizing derived predicates as preconditions for actions, to equally add abstraction in different components of the planning pipeline. Think about a robotic that may take out your trash and recycling. A derived predicate might be helpful to permit dumping a container into recycling provided that all of the objects contained in the container are manufactured from recyclable materials; if there’s any non-recyclable object, then the robotic can both select the violating object(s) or just toss the container within the trash.

Beneath is an instance from pyrobosim the place we command our robotic to realize the next aim, which now makes use of derived predicates to specific places and objects both as particular occasion, or by their sorts:

  • (Has desk0_desktop banana0)
  • (Has counter apple1)
  • (HasNone rest room banana)
  • (HasAll desk water)


This submit barely scratched the floor of activity planning, but in addition launched a number of assets the place you’ll find out extra. One central useful resource I’d suggest is the Automated Planning and Appearing textbook by Malik Ghallab, Dana Nau, and Paolo Traverso.

Some key takeaways are:

  1. Job planning requires an excellent mannequin and modeling language to allow us to describe a task-agnostic planning framework that may be utilized to a number of duties.
  2. Job planning is search, the place the output is a plan, or a sequence of actions.
  3. State-space search depends on intelligent heuristics to work in observe, however there are different partial-order strategies that purpose to make search tractable another way.
  4. PDDL is without doubt one of the commonest languages utilized in activity planning, the place a area is outlined because the software to resolve a number of issues comprising a set of objects, preliminary state, and aim specification.
  5. PDDL isn’t the one option to formalize activity planning. Listed below are a number of papers I loved:

After studying this submit, the next ought to now make a bit of extra sense: A number of the more moderen activity planning techniques geared in direction of robotics, such because the ROS2 Planning System (PlanSys2) and PDDLStream, leverage a few of these planners talked about above. Particularly, these use Quick Downward and POPF.

One key pitfall of activity planning — not less than in how we’ve seen it thus far — is that we’re working at a stage of abstraction that isn’t essentially excellent for real-world duties. Let’s contemplate these hypothetical eventualities:

  • If a plan includes navigating between two rooms, however the rooms aren’t linked or the robotic is just too massive to cross by, how we do know this earlier than we’re truly executing the plan?
  • If a plan includes inserting 3 objects on a desk, however the objects bodily is not going to match on that desk, how can we account for this? And what if there’s another mixture of objects that does match and does fulfill our aim?

To handle these eventualities, there presumably is a option to introduce extra knowledgeable data to our area as new predicates to make use of in motion preconditions. One other is to think about a richer area of motion parameters that causes about metric particulars (e.g., particular poses or paths to a aim) extra so than a purely symbolic area would. In each circumstances, we’re pondering extra about motion feasibility in activity planning. This leads us to the sector generally often known as activity and movement planning (TAMP), and would be the focus of the following submit — so keep tuned!

Within the meantime, if you wish to discover the cellular robotic examples on this submit, try my pyrobosim repository and check out the activity and movement planning documentation.

You’ll be able to learn the unique article at

Sebastian Castro
is a software program engineer within the Sturdy Robotics Group (RRG) on the MIT Laptop Science and Synthetic Intelligence Laboratory (CSAIL).

Sebastian Castro
is a software program engineer within the Sturdy Robotics Group (RRG) on the MIT Laptop Science and Synthetic Intelligence Laboratory (CSAIL).



Please enter your comment!
Please enter your name here

Most Popular