Dynamic robot routing optimization: State–space decomposition for operations research-informed reinforcement learning

Numerous powerful software packages are now available for most areas of machine learning (ML), and there is growing interest in applying artificial intelligence to operations research in industrial environments. While many classic operations research solvers guarantee optimal solutions, they often struggle with real-time dynamic objectives and environments, such as dynamic routing problems, which require periodic algorithmic recalibration. Deep reinforcement learning has shown great potential for dynamic environments because it acts as a self-learning and self-optimizing mechanism. However, real-world applications of reinforcement learning remain relatively limited due to lengthy training times and inefficiency in high-dimensional state spaces. In this study, we introduce two methods to enhance reinforcement learning for dynamic routing optimization. The first method transfers knowledge from classic operations research solvers to the reinforcement learning agent during training, which accelerates exploration and shortens training. The second method uses a state–space decomposer to transform the high-dimensional state space into a low-dimensional latent space, in which the reinforcement learning agent can learn efficiently. Lastly, we demonstrate the applicability of our approach in an industrial automated welding process, where it identifies the shortest welding pathway for an industrial robotic arm to weld a dynamically changing set of target nodes, poses, and sizes. The suggested method cuts computation time by 25% to 50% compared to classic routing algorithms.
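As a rough illustration of the first method, the sketch below shows one way solver knowledge could steer exploration while a route is being built. It is not the authors' implementation: the nearest-neighbour heuristic merely stands in for a classic operations research solver, and the names `solver_suggestion`, `guided_rollout`, `guide_prob`, and `q_values` are hypothetical. With probability `guide_prob` the agent follows the solver's suggestion; otherwise it picks the node with the highest learned value.

```python
import math
import random


def tour_length(points, order):
    """Length of the open path that visits `points` in the given order."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def solver_suggestion(points, current, unvisited):
    """Assumed stand-in for an OR solver: suggest the nearest unvisited node."""
    return min(sorted(unvisited), key=lambda j: math.dist(points[current], points[j]))


def guided_rollout(points, q_values, guide_prob, rng, start=0):
    """Build one route from `start`.

    With probability `guide_prob` the next node is the solver's suggestion;
    otherwise it is the agent's greedy choice under `q_values`, a dict that
    maps (current_node, next_node) to an estimated value (default 0.0).
    """
    order = [start]
    unvisited = set(range(len(points))) - {start}
    while unvisited:
        current = order[-1]
        if rng.random() < guide_prob:
            nxt = solver_suggestion(points, current, unvisited)
        else:
            # Ties are broken in favour of the lowest node index.
            nxt = max(sorted(unvisited), key=lambda j: q_values.get((current, j), 0.0))
        order.append(nxt)
        unvisited.remove(nxt)
    return order


# Usage: compare a fully guided rollout with an unguided, untrained one.
targets = [(0, 0), (5, 0), (1, 0), (2, 0)]
rng = random.Random(0)
guided = guided_rollout(targets, {}, guide_prob=1.0, rng=rng)
unguided = guided_rollout(targets, {}, guide_prob=0.0, rng=rng)
print(guided, tour_length(targets, guided))
print(unguided, tour_length(targets, unguided))
```

In a training loop, `guide_prob` would typically be decayed toward zero so that the agent relies on the solver early on and on its own value estimates later; that schedule is likewise an assumption rather than something stated in the abstract.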
