University of Zurich – Self-Driving Drones

Researchers from the University of Zurich have demonstrated a drone that can detect and avoid fast-moving objects.

The team is training drones to navigate city streets by having them study road users that already know how: cars and bicycles.


By reusing data already collected from self-driving car experiments, the method put forward in the paper has advantages over the GPS-based obstacle-avoidance systems found in drones currently on the market.

Leveraging computer vision, the system processes each camera frame and produces two outputs: a steering angle and a collision probability. The steering angle allows drones to adapt to tighter, more dynamic environments without suffering from stimulus overload, a concern with current systems.

The researchers’ algorithm uses an eight-layer residual convolutional neural network, which they have dubbed DroNet. The steering-angle and collision-probability outputs are then used to control the drone. The improved response time has as much to do with the way the controls are mapped to the outputs as with the use of machine learning. For example, speed is controlled by the collision probability, which allows the drone to react smoothly to obstacles at a distance rather than requiring unnecessarily large buffer zones to prevent crashes.
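The mapping from network outputs to flight commands can be sketched as below. This is an illustrative example, not the authors' code: the names `V_MAX` and `commands_from_outputs`, and the linear speed scaling, are assumptions chosen to show the idea that forward speed shrinks as collision probability grows.

```python
V_MAX = 3.0  # hypothetical top forward speed in m/s, an assumption for this sketch

def commands_from_outputs(steering_angle, collision_prob):
    """Turn DroNet's two outputs into flight commands.

    Forward speed is scaled down linearly as the collision probability
    rises, so the drone slows early near obstacles instead of relying
    on a large fixed buffer zone; the steering angle passes through
    as the yaw command.
    """
    forward_speed = V_MAX * (1.0 - collision_prob)
    return steering_angle, forward_speed
```

With this mapping a clear view (collision probability 0) yields full speed, while a certain collision (probability 1) brings the drone to a stop.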

Research Doc:

RAL18_Loquercio.pdf

Civilian drones are soon expected to be used in a wide variety of tasks, such as aerial surveillance, delivery, or monitoring of existing architectures. Nevertheless, their deployment in urban environments has so far been limited. Indeed, in unstructured and highly dynamic scenarios drones face numerous challenges to navigate autonomously in a feasible and safe way. In contrast to traditional map-localize-plan methods, this paper explores a data-driven approach to cope with the above challenges. To do this, we propose DroNet, a convolutional neural network that can safely drive a drone through the streets of a city. Designed as a fast eight-layer residual network, DroNet produces, for each single input image, two outputs: a steering angle, to keep the drone navigating while avoiding obstacles, and a collision probability, to let the UAV recognize dangerous situations and promptly react to them. But how to collect enough data in an unstructured outdoor environment, such as a city? Clearly, having an expert pilot provide training trajectories is not an option given the large amount of data required and, above all, the risk it involves for other vehicles or pedestrians moving in the streets. Therefore, we propose to train a UAV from data collected by cars and bicycles, which, already integrated into urban environments, expose other cars and pedestrians to no danger. Although trained on city streets, from the viewpoint of urban vehicles, the navigation policy learned by DroNet is highly generalizable. Indeed, it allows a UAV to successfully fly at relatively high altitudes, and even in indoor environments, such as parking lots and corridors.


In unstructured and highly dynamic scenarios, drones face numerous challenges to navigating autonomously in a feasible and safe way. Because of the danger that flying a drone poses in an urban environment, collecting training data directly with a drone is impossible. For that reason, DroNet learns how to fly by imitating the behavior of manned vehicles that are already integrated into such environments. It produces a steering angle and a collision probability for the current input image captured by a forward-looking camera. These high-level commands are then translated into control commands so that the drone keeps navigating while avoiding obstacles.
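Because the network emits a fresh prediction for every frame, the raw commands can jitter from one frame to the next. One common way to translate them into smooth control commands, consistent with the behavior described above but only a sketch here (the name `smooth` and the factor value are assumptions), is an exponential moving average:

```python
ALPHA = 0.7  # hypothetical smoothing factor; closer to 1.0 trusts the newest frame more

def smooth(previous_command, current_prediction, alpha=ALPHA):
    """Low-pass filter a per-frame prediction into a control command.

    Blends the previous command with the newest network output so that
    frame-to-frame noise is damped instead of being passed straight to
    the drone's controllers.
    """
    return (1.0 - alpha) * previous_command + alpha * current_prediction
```

Applying this filter independently to the steering angle and the speed derived from the collision probability keeps the flight trajectory smooth while still reacting to new obstacles within a few frames.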

DroNet is both versatile and efficient. First, it works in very different environments, both indoor and outdoor, without any initial knowledge about them. Indeed, with neither a map of the environment nor retraining or fine-tuning, the method generalizes to scenarios completely unseen at training time, including indoor corridors, parking lots, and high altitudes. Second, DroNet was designed to require far fewer computational resources than most existing deep convolutional networks. This allows real-time performance, even on a CPU.

UZH – Universität Zürich