Learning Resource-aware Control



Dominik Baumann


Figure: A graph illustrating resource-aware control. Copyright: © Niklas Funk / MPI-IS

In modern engineering systems, such as fleets of autonomous vehicles or mobile robots, we often deal with high-dimensional systems with complex dynamics. At the same time, those systems need to interact with each other, e.g., for coordination. For this purpose, they are connected over (typically wireless) communication networks. While wireless networks offer unprecedented flexibility, they also add another layer of complexity: apart from the dynamics, we need to consider that a wireless network has limited bandwidth. If multiple agents share the same network, bandwidth becomes a scarce resource. Thus, conventional control methods that rely on periodic communication may not be feasible.

For such systems, we have two problems to solve: first, we need to develop a control strategy that can deal with the dynamics of the system; second, we need to respect the constraints of the wireless network. To make matters worse, designing an optimal control strategy and an optimal communication strategy separately will, in general, not yield the overall optimal strategy. That is, control and communication need to be jointly optimized. Since this problem is hard to solve for general nonlinear systems, we leverage deep reinforcement learning (DRL) techniques. In DRL, the agent learns an optimal behavior through interactions with the environment. That way, we can learn joint control and communication strategies from data, and the approach scales from low-dimensional linear to high-dimensional nonlinear systems. We demonstrate the effectiveness of our method in challenging simulation environments and in experiments on a real-world robotic system. Moreover, we present a first approach toward checking the stability of the learned resource-aware control policy. See our publications CDC 2018 and arXiv 2019.
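To make the joint objective concrete, the following is a minimal sketch and not the method from the publications above. It assumes a hypothetical scalar plant, a fixed feedback gain, and a hand-written threshold trigger standing in for a learned policy; the function name `rollout`, the plant parameters, the penalty weight, and the rule that the actuator holds the last received input are all illustrative assumptions. The point is only the structure of the reward: every step pays a state cost, and every transmission pays an additional communication cost, so control and communication are scored together.

```python
import random


def rollout(threshold, comm_penalty=0.1, steps=200, seed=0):
    """Simulate one episode and return (total_reward, number_of_transmissions).

    All numbers below are hypothetical; they are chosen only for illustration.
    """
    rng = random.Random(seed)
    a, b = 1.1, 1.0      # assumed open-loop unstable scalar plant
    gain = 0.9           # assumed stabilizing feedback gain
    x, u = 1.0, 0.0      # initial state and initially held control input
    total_reward = 0.0
    n_comm = 0

    for _ in range(steps):
        # Communication decision: transmit only if the state is large enough.
        # A learned policy would output this decision together with u.
        communicate = abs(x) >= threshold
        if communicate:
            u = -gain * x        # new input is sent over the network
            n_comm += 1
        # Otherwise the actuator keeps applying the last received input.

        # Joint reward: state cost plus a penalty for using the network.
        total_reward -= x * x + (comm_penalty if communicate else 0.0)

        # Plant update with small process noise.
        x = a * x + b * u + rng.gauss(0.0, 0.01)

    return total_reward, n_comm


if __name__ == "__main__":
    for thr in (0.0, 0.25, 0.5):
        reward, n_comm = rollout(thr)
        print(f"threshold={thr:.2f}  transmissions={n_comm}  reward={reward:.2f}")
```

With `threshold=0.0` the trigger fires at every step, which corresponds to periodic communication; larger thresholds trade control performance for fewer transmissions. In a DRL setting, the fixed gain and threshold would be replaced by a policy trained to maximize this kind of combined reward.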

More information, including videos and code, is available at:

Learning Event-triggered Control from Data through Joint Optimization