The Case for Human Supervisors for Robot Warehouse Pickers

The Long Tail of Machine Learning Becomes Longer in Supply Chains

In layman’s terms, the long tail of machine learning means that it is very hard to train a machine for every possible scenario it will encounter and have to make a decision on. If you have worked hands-on in a supply chain environment, you know that weird things happen there all the time and that Murphy’s Law is very much at home. So our dream of a fully autonomous, self-adjusting supply chain may be further away than anticipated. In this post, I will use an example from warehouse operations to explain why machines will need a human companion to keep things humming.

We will use the warehouse picking automation scenario. Let us assume that we have a warehouse where the picking process is primarily handled by bots. If you are familiar with AI algorithms, you know that the methods behind the algorithms driving this process include semantic segmentation, classification plus localization, object detection, and instance segmentation, among others.

However, beyond the hype around methods, the reality of automated warehouse operations is that the underlying algorithms must deal with a constantly changing range of products, especially in businesses where product design changes for promotional purposes. This will be one of the biggest bottlenecks on the path to a fully autonomous robotic picking system. The picking algorithm needs reliable input from the object detection system because (no-brainer) a failed pick can be expensive, causing delayed order completion or even unfulfilled customer orders.

So a process control needs to be in place that makes the automated picking process adaptable to changes in the environment. This control also needs to ensure that, in addition to correcting the mistake, the workflow does not stop or get hampered.

The Learning and Operation Symbiosis (LOS) Model

As a disclaimer, this is not a standard model; it was invented and named by... my imagination 😁 My LOS model consists of two segments, Learning and Operation, as shown in the figure below.



The first part is purely algorithmic, so no robotic equipment is needed: it aims to build an object detection model, which can be computed on external computing resources. To do that, images of the objects to be distinguished must be generated. Computing such a model takes a lot of pictures, so images of each object from different perspectives and rotation angles must be created. The different rotation angles and perspectives are needed because an object can appear in any orientation in a warehouse. The images are stored in a database for the subsequent computation of the detection model.
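As a rough illustration of what "different perspectives and rotation angles" could mean in practice, here is a minimal sketch that enumerates the poses to photograph for one object. The function name `capture_plan`, the 30-degree rotation step, and the three camera elevations are all my assumptions for illustration, not values from any real recording setup.

```python
from itertools import product


def capture_plan(object_id, rotation_step_deg=30, camera_elevations_deg=(0, 30, 60)):
    """Enumerate the (rotation, elevation) poses to photograph for one object.

    All defaults are hypothetical; a real setup would choose them per product.
    """
    if rotation_step_deg <= 0 or 360 % rotation_step_deg != 0:
        raise ValueError("rotation_step_deg must be a positive divisor of 360")
    rotations = range(0, 360, rotation_step_deg)
    return [
        {"object_id": object_id, "rotation_deg": r, "elevation_deg": e}
        for r, e in product(rotations, camera_elevations_deg)
    ]


plan = capture_plan("SKU-123")
print(len(plan))  # 12 rotations x 3 elevations = 36 images for this object
```

Even this coarse grid yields 36 images per object, which gives a feel for why the image database grows quickly once lighting and background variations are added.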

If the object detection model exceeds the defined performance indicators, it is used in a real picking environment, which is described by Operation in the process model. These performance indicators must be defined and evaluated during testing.
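The hand-over from Learning to Operation can be pictured as a simple gate. The sketch below is only that, a sketch: the indicator names (`precision`, `recall`) and the minimum values are hypothetical, and I treat "meets or exceeds the minimum" as passing.

```python
def ready_for_operation(metrics, thresholds):
    """Return (ok, failures); ok is True only if every indicator meets its minimum."""
    failures = {
        name: (metrics.get(name), minimum)
        for name, minimum in thresholds.items()
        if metrics.get(name) is None or metrics[name] < minimum
    }
    return (not failures), failures


# Hypothetical indicators and minimums, for illustration only.
thresholds = {"precision": 0.95, "recall": 0.90}
ok, failures = ready_for_operation({"precision": 0.97, "recall": 0.88}, thresholds)
print(ok, failures)  # ok is False because recall is below its minimum
```

Returning the failed indicators along with the verdict makes it clear which part of the model needs more work before it goes near a real shelf.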


There, the object detection model is applied in the robot control to find the objects the robot must pick. A camera mounted on the mobile robot records an image of the target shelf. The model locates the target object within the image, and grasping points are derived from the object’s orientation and from the possible grasping points stored in the master data database. If the robot succeeds, everything is fine. However, if a problem occurs, e.g. the target object isn’t detected on the shelf because its design has changed or because it is obscured by another object, the robot calls for a human picker.
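The decision between picking and calling for help might look something like the sketch below. Everything here is assumed for illustration: the `Detection` record, the `choose_action` function, and the 0.8 confidence cut-off are hypothetical and do not come from a real robot control system.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Detection:
    sku: str
    confidence: float
    box: Tuple[int, int, int, int]  # x, y, width, height in image pixels


def choose_action(target_sku, detections, min_confidence=0.8):
    """Return ("pick", detection) or ("call_human", reason) for one shelf image."""
    candidates = [d for d in detections if d.sku == target_sku]
    if not candidates:
        # Covers both a changed design and an object hidden behind another one.
        return "call_human", "target_not_detected"
    best = max(candidates, key=lambda d: d.confidence)
    if best.confidence < min_confidence:
        return "call_human", "low_confidence"
    return "pick", best


shelf = [Detection("SKU-1", 0.95, (0, 0, 10, 10)), Detection("SKU-2", 0.99, (5, 5, 10, 10))]
print(choose_action("SKU-1", shelf)[0])  # pick
print(choose_action("SKU-3", shelf))     # ('call_human', 'target_not_detected')
```

The point of the explicit reason string is that the human picker sees why the robot gave up, which matters for the feedback step.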


The human fulfills two important tasks:

  • If the object is on the shelf, the picker completes the order by picking the object.
  • The picker must also give feedback on why detection was not possible, based on the system’s error message, and, if the object is in the image, mark where it is located.
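To make the two tasks concrete, here is a minimal sketch of what a feedback record from the human picker could contain. The class names, the three failure reasons, and the field names are my own assumptions, not an actual system’s schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class FailureReason(Enum):
    DESIGN_CHANGED = "design_changed"
    OCCLUDED = "occluded"
    NOT_ON_SHELF = "not_on_shelf"


@dataclass
class PickerFeedback:
    sku: str
    image_id: str
    reason: FailureReason
    box: Optional[Tuple[int, int, int, int]] = None  # object location in the image, if visible
    picked_by_human: bool = False

    def __post_init__(self):
        # An object that is not on the shelf can be neither located nor picked.
        if self.reason is FailureReason.NOT_ON_SHELF and (
            self.box is not None or self.picked_by_human
        ):
            raise ValueError("object not on shelf cannot have a location or a pick")

    def is_training_example(self):
        """A new labelled image exists only if the human marked the object's location."""
        return self.box is not None


fb = PickerFeedback("SKU-123", "img-0001", FailureReason.DESIGN_CHANGED,
                    box=(40, 60, 120, 200), picked_by_human=True)
print(fb.is_training_example())  # True
```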


The system uses this information to improve the detection model for the next attempt by including the additional images, recorded at the shelf in cooperation with the human picker, in the model computation. But because computing such a model on a standard computer takes several days, re-computation cannot be done in real time on the robot, as was observed during testing. If an object detection model performs very poorly, the object is sent back to Learning: more images must be recorded with the Picture Recording Machine.
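Since retraining happens offline, the loop boils down to a periodic decision per object. The following sketch shows one way that decision could be written; the function name, the 50% failure-rate limit, and the 100-image minimum are purely hypothetical numbers picked for illustration.

```python
def next_step(failed_picks, total_picks, new_labelled_images,
              max_failure_rate=0.5, min_new_images=100):
    """Decide what to do with one object's detection model after a review period."""
    if total_picks <= 0:
        raise ValueError("total_picks must be positive")
    failure_rate = failed_picks / total_picks
    if failure_rate > max_failure_rate:
        # Model performs very poorly: record more images in the Learning segment.
        return "send_back_to_learning"
    if new_labelled_images >= min_new_images:
        # Enough feedback images collected to justify a multi-day offline retraining run.
        return "schedule_offline_retraining"
    return "keep_current_model"


print(next_step(8, 10, 5))    # send_back_to_learning
print(next_step(1, 10, 150))  # schedule_offline_retraining
print(next_step(1, 10, 5))    # keep_current_model
```

Meanwhile the robot keeps working with the current model and the human picker covers the failures, so the workflow never stops while the new model is being computed.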


