#### **1. Introduction**

There is a clear trend toward using drones for aerial cinematography and media production in general. The reasons for the emergence of this technology are twofold. First, small drones can be equipped with high-quality cameras and yet remain affordable, which makes them appealing for both amateur and professional users. Second, due to their maneuverability, they broaden the aesthetic possibilities of media production, as they can create novel and unique shots. Moreover, media production applications covering outdoor events can benefit from the use of teams of drones, mainly to produce multi-view shots and to film multiple action points concurrently. From a logistics perspective, a multi-drone system can be deployed easily (e.g., instead of camera cranes) to operate in such large-scale, outdoor scenarios without requiring pre-existing complex infrastructure.

The main issue with a multi-drone system for media production is its complexity of operation. Currently, two operators per drone are usually required: one to pilot the drone and another to handle camera movement. The media *director* is the person in charge of the whole system from the production point of view. However, there are additional aspects that need to be accounted for: ensuring operational safety by avoiding collisions and no-fly zones, deciding which cameras to allocate to each shot, considering the battery levels of the drones, etc. Therefore, enhancing the system with autonomous capabilities to plan and execute shots is rather helpful to alleviate the director's burden and allow her/him to focus on the artistic part.

There exist methods for autonomous mission planning with teams of multiple drones. Indeed, general-purpose methods for multi-robot task allocation and scheduling could be adapted to tackle this problem. However, a media director is not necessarily familiar with these kinds of algorithms and their robotics language. Therefore, our objective is to close this gap between a media crew and these autonomous, intelligent systems, so that a director can use an autonomous fleet of drones for media production. In particular, we think that there is a need for a standard language and tools for cinematography mission description. With such tools, a director could talk to her/his autonomous cinematographers in a transparent manner, abstracting herself/himself from the underlying planning and execution procedures.

In this paper, we propose a set of tools for mission description in media production. First, a novel language for mission description is presented. This language is used by the media director to specify the desired shots when filming an event. The output is an XML-based file that contains details for all shots specified by the director at each action point. Then, this *shooting mission* is interpreted by the system and translated into a list of tasks that can be understood and executed by the drones. The proposed language acts as a scripting system for the director's storytelling, so that she/he can focus on the artistic part without specific knowledge of multi-robot autonomous planning. Last, we also propose a graphical tool, the so-called director's *Dashboard*, to create and manage XML-based shooting missions. This Dashboard allows the director to interact with the fleet of drones by sending/canceling missions and monitoring them during execution.

This paper continues our prior work [1], where we proposed a taxonomy of cinematography shots to implement and a preliminary version of our Dashboard. Here, we add the language to specify autonomous cinematography missions and a complete description of the final version of the Dashboard, enhanced with new functionalities. Besides, we showcase the use of our tools in example scenarios to film sport and outdoor events, such as rowing or cycling races. We integrated our system with a real team of drones within the framework of the EU-funded project MULTIDRONE (https://multidrone.eu/), which aims at building a team of several drones for autonomous media production. We describe the whole process to write, translate, and execute example shooting missions, as well as details of the actual aerial platforms developed for media production.

The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 gives an overview of the system, and Section 4 presents our language for shooting mission description. Section 5 describes our director's Dashboard. Section 6 presents experimental results showcasing the use of our tools, and Section 7 gives conclusions and future work.

#### **2. Related Work**

Media production is adopting drones as replacements for dollies (static cranes) and helicopters, as their deployment is easier and less costly. Thus, the use of drones in cinematography has increased recently, in parallel with the improvement of their technology. For instance, the New York City Drone Film Festival (https://www.nycdronefilmfestival.com) was the world's first film festival dedicated to drones in cinematography. Besides, drones are quite attractive for live coverage of outdoor events, as they can provide novel angles and visual effects. In October 2016, drones were relevant for news agencies in the coverage of the massive destruction caused by Hurricane Matthew in Haiti [2]. They have also been used in major international sport events, such as the Olympics [3,4].

In the current state-of-the-art solutions for media production with drones, the director usually specifies the target subjects or points of interest to be filmed in pre-production, together with a temporally ordered script, the camera motion types, etc. Then, the drone pilot and the cameraman must execute the plan manually in a coordinated fashion. There are also commercial drones, such as DJI [5], AirDog [6], 3DR SOLO [7], or Yuneec Typhoon [8], that can implement certain cinematographic functionalities autonomously. However, they are typically prepared to track a target visually or with a GNSS receiver and keep it in the image frame (*follow-me* mode), rather than focusing on high-level cinematography principles for shot performance.

Besides, there are some end-to-end solutions for semi-autonomous aerial cinematographers [9,10]. In these works, a director specifies high-level commands such as shot types and positions for single-drone shooting missions. Then, the drone is able to compute navigation and camera movement commands autonomously to execute the desired shots, but the focus is on static scenes. In [9], an outdoor application to film people is proposed, and different types of shots from the cinematography literature are introduced (e.g., close-up, external, over-the-shoulder, etc.). Timing for the shots is considered by means of an easing curve that drives the drone along the planned trajectory (i.e., this curve can modify its velocity profile). In [10], an iterative quadratic optimization problem is formulated to obtain smooth trajectories for the camera and the look-at point (i.e., the point at which the camera is aimed). No time constraints or moving targets are included. In general, these approaches use algorithms to compute smooth camera trajectories fulfilling aesthetic and cinematographic constraints, which can be formulated as an optimization problem [11–13].

In a multi-drone context, there is little work on autonomous systems for media production. In [14], the optimal number of drones to cover all available targets without occlusion is computed. However, cameras must always be facing targets and smooth transitions are not considered. A more advanced method is presented in [15], where they propose an online planning algorithm to compute optimal trajectories for several drones filming dynamic, indoor scenes. User-defined aesthetic objectives and cinematographic rules are combined with collision and field-of-view avoidance constraints, but they do not address interfacing with the media director.

Regarding user interfaces, there exist several drone *apps* to support photographers, video makers, photogrammetrists, and other professional profiles during their activities. As relevant features, they usually include drone mapping, geo-fencing, and flight logging. In particular, based on regulatory information from each country, some apps can help users by indicating no-fly zones such as airports, control zones (CTZ), etc. Local meteorological information (e.g., wind speed, direction, temperature, etc.) is also used in some apps to inform the pilot about flight safety. Moreover, these apps can support the creation of accurate, high-resolution maps, 3D models, and real-time 2D live maps. More specifically on media production, some apps offer autonomous tracking features for shooting video, see for example [16]. This app supports the execution of several shooting modes, such as *Orbit me* and *Follow me*, as well as planning specific flights and shots in advance using a *Mission Hub* on a computer. Thus, the user can pre-program paths that will later be followed by the drone [17,18]. This is done by specifying a set of temporally ordered, desired *key-frames* on a 3D reconstruction of the scene, as a rudimentary cinematography plan. Nevertheless, these apps are designed and implemented for single-drone flight and shooting. From a media perspective, the functionality of using more than one drone within the same shooting mission is not properly addressed.

#### **3. System Overview**

We present in this paper a set of tools so that a director can interface with an autonomous fleet of drone cinematographers and govern the system from an editorial point of view. The main contributions are a novel language for mission description in media production and a graphical tool to define those missions and interact with the autonomous system. The general architecture of the system is depicted in Figure 1.

**Figure 1.** General scheme of the system architecture. The director defines shooting missions with the Dashboard and sends them to the Mission Controller. This Mission Controller uses a Planner to compute plans that are then sent to the drones. During the execution of the plan, the Mission Controller sends events to the drones to trigger actions and reports periodically to the Dashboard about the execution/system status.

The director and the editorial team can specify a set of artistic shots and associate them with different events happening in time (e.g., a race start or racers reaching a relevant point). This is done through the *Dashboard*, which is a web-based graphical tool that allows the director to interact with the rest of the system. The whole set of director shots, together with information about the events with which they are associated, constitutes the so-called *shooting mission*, which is saved in a database during the editing phase. After finishing this editorial phase, the shooting mission is encoded in an XML-based language and sent to the *Mission Controller*, which is the central module managing autonomous planning.

The Mission Controller can understand the director's dialect and is in charge of interpreting the shooting mission as sequential or parallel *shooting actions* that will be assigned to the available drones in the team. A list of *tasks* to be executed by the drone fleet is compiled, where each task has a starting position and a duration, extracted from the shooting actions' descriptions. Then, the Mission Controller can use a *Planner* to obtain a feasible plan to accomplish that shooting mission.

The Planner should take into account spatial and temporal constraints of the tasks, as well as drone battery levels and starting positions, in order to assign tasks to drones efficiently. The plan consists of a list of *drone actions* for each drone, specifying where the drone should go at each instant and what to do. Basically, each drone gets assigned a sequence of shooting actions, so its list of actions is made up of single-drone shooting actions plus the required navigation actions in between. In general, this Planner module could be implemented by any standard planner for task allocation with temporal and resource constraints, and it is not the focus of this paper. Some preliminary ideas can be seen in [19].
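
As an illustration of the kind of problem the Planner solves, the following is a minimal greedy allocation sketch in Python; the `Task` and `Drone` containers, the constant cruise speed, and the simplified feasibility checks are assumptions of ours for illustration only, not the actual planner of [19].

```
# A minimal greedy allocation sketch (not the MULTIDRONE Planner itself).
# Simplified assumptions: each task has a start time, duration and start
# position; each drone has a starting position and a battery budget in seconds.
from dataclasses import dataclass, field
from math import dist

@dataclass
class Task:
    task_id: str
    start_time: float       # seconds from mission start
    duration: float         # seconds
    start_position: tuple   # (x, y, z) in a local frame

@dataclass
class Drone:
    drone_id: str
    position: tuple
    battery_left: float              # remaining flight time in seconds
    free_at: float = 0.0             # time at which the drone becomes idle
    plan: list = field(default_factory=list)

def greedy_assign(tasks, drones, speed=5.0):
    """Assign each task to the drone that can reach it in time with the
    least time spent, respecting a rough battery budget."""
    for task in sorted(tasks, key=lambda t: t.start_time):
        best, best_cost = None, float("inf")
        for d in drones:
            travel = dist(d.position, task.start_position) / speed
            arrival = d.free_at + travel
            cost = travel + task.duration
            if arrival <= task.start_time and d.battery_left >= cost and cost < best_cost:
                best, best_cost = d, cost
        if best is None:
            raise RuntimeError(f"No feasible drone for task {task.task_id}")
        best.plan.append(task.task_id)
        best.position = task.start_position   # approximate end pose with the shot's start
        best.free_at = task.start_time + task.duration
        best.battery_left -= best_cost
    return {d.drone_id: d.plan for d in drones}
```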

Once the plan is computed, it is sent to the drones, which are able to execute it autonomously. During execution, the Mission Controller provides some feedback to the director through the Dashboard, reporting on the status of each drone and the mission itself.

Section 4 presents the XML-based language used to describe shooting missions and the process to translate them into plans made up of drone actions. Section 5 describes the design and functionalities of the Dashboard.

#### **4. Language for Cinematography Mission Description**

There are professional drones for filming on the market, and many of them include autonomous capabilities to implement certain specific shots, for example, a panoramic view or tracking a moving target. However, there is no general framework to specify complete missions for autonomous cinematography. In this section, we present a novel language to describe aerial cinematography missions for multiple drones.

We use a vocabulary that the editorial team can understand and define an XML-based dialect to write shooting missions. In the following, we explain the structure of the XML files and how they can be converted into lists of tasks for autonomous drones. The complete XML schema is also publicly available (https://grvc.us.es/downloads/docs/multidrone_schema.xsd). Our XML schema to describe shooting missions is built around the following main information entities: <event>, <mission>, <shootingActionSequence>, <shootingAction>, and <shootingRole>.


Each mission can be associated with a specific *drone team*, made up of several drones. This choice has been made to foresee settings in which more than one multi-drone team is available, so the corresponding team for each mission needs to be specified. Moreover, a mission can also be associated with an *editorial team*, i.e., a group of Dashboard users, each with their own role, who will be granted permission to manage and modify the mission data.


In summary, the relations among the main components of the XML schema are as follows.

```
<event>                          Main event: e.g., a race to film
  <event>                        Leaf event 1: e.g., start line
    <mission>
      <shootingActionSequence>
        <shootingAction>
          <shootingRole>
  <event>                        Leaf event 2: e.g., finish line
    <mission>
      <shootingActionSequence>
        <shootingAction>
          <shootingRole>
```
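
As a rough illustration of how this nested structure can be traversed, the following Python sketch groups shooting actions by the leaf event that triggers them; it assumes the main <event> is the document root and uses a hypothetical `name` attribute for events, which is not taken from the published schema.

```
# A minimal traversal sketch; element nesting follows the summary above,
# but the "name" attribute is a hypothetical placeholder.
import xml.etree.ElementTree as ET

def collect_shooting_actions(xml_path):
    """Group <shootingAction> elements by the leaf <event> that triggers them."""
    root = ET.parse(xml_path).getroot()          # main <event>, e.g., the race
    actions_by_event = {}
    for leaf in root.findall("event"):           # leaf events, e.g., start/finish line
        key = leaf.get("name", "unnamed-event")  # hypothetical attribute
        actions = []
        for mission in leaf.findall("mission"):
            for sas in mission.findall("shootingActionSequence"):
                actions.extend(sas.findall("shootingAction"))
        actions_by_event[key] = actions
    return actions_by_event
```
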
In a pre-production stage, the director and the editorial team will manage a database with all the above entities through the Dashboard. Then, an XML file with all events and missions associated is generated, i.e., a shooting mission. The Mission Controller receives that file and computes plans for all possible combinations of mission roles and shooting action sequence roles. Before starting the execution of the mission, the director will specify the final selected role for the missions and for the shooting action sequences, which will determine the actual plan to be executed. Note that multiple plans for the different roles are presented to the director, who must choose unique roles for the missions and the SASs. Once the plan is selected, the Mission Controller sends their corresponding actions to the drones and waits for events. Anytime a leaf event occurs, this should be notified to the drones, so that they can trigger the associated shooting actions. The occurrence of these leaf events is either indicated manually by the director (e.g., start of a race) or detected automatically by the Mission Controller (e.g., target reaching a relevant position).
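
The enumeration of candidate plans over role combinations could look like the following small sketch; the role names, the `plan_for` callback standing in for a call to the Planner, and the selection step are illustrative assumptions only.

```
# A sketch of indexing candidate plans by (mission role, SAS role) choices.
from itertools import product

def enumerate_candidate_plans(mission_roles, sas_roles, plan_for):
    """Compute one candidate plan per role combination; the director picks one."""
    return {(m, s): plan_for(m, s) for m, s in product(mission_roles, sas_roles)}

# Example usage with dummy roles and a stub planner:
plans = enumerate_candidate_plans(
    mission_roles=["main", "backup"],
    sas_roles=["role-A", "role-B"],
    plan_for=lambda m, s: f"plan({m}, {s})")   # stub standing in for the Planner
selected_plan = plans[("main", "role-A")]      # the director's final choice
```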

To compute a plan, the Mission Controller extracts the relevant information from the XML file and creates a list of data objects of type SHOOTING ACTION, as indicated in Table 1. Most of the fields of the SHOOTING ACTION data structure come directly from the <shootingAction> XML element. However, some of them come from data in the <shootingActionSequence>, <mission>, and <event> elements, for instance, *Start event*, *Mission ID*, or *Action sequence ID*. Other fields are calculated from the received data, such as the *RT displacement*, which is the difference between the <originOfFormation and the <originOfRT>.
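
A hedged Python sketch of these containers is given below; only the fields explicitly named in the text are included, the remaining fields of Tables 1 and 2 are omitted, and all type choices are assumptions.

```
# Simplified SHOOTING ACTION / SHOOTING ROLE containers (illustrative only).
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ShootingRole:
    role_id: str                      # one SHOOTING ROLE = one task for the Planner

@dataclass
class ShootingAction:
    mission_id: str                   # taken from the enclosing <mission>
    action_sequence_id: str           # taken from the <shootingActionSequence>
    start_event: str                  # leaf <event> that triggers the action
    duration: float                   # seconds, assumed unit
    rt_displacement: Tuple[float, float, float]
    roles: List[ShootingRole] = field(default_factory=list)

def rt_displacement(origin_of_formation, origin_of_rt):
    """RT displacement = <originOfFormation> minus <originOfRT> (component-wise)."""
    return tuple(f - r for f, r in zip(origin_of_formation, origin_of_rt))
```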


**Table 1.** Structure for the data type SHOOTING ACTION.


**Table 2.** Structure for the data type SHOOTING ROLE.

The Planner receives a list of these SHOOTING ACTION objects, each including one or multiple SHOOTING ROLE objects. Then, each individual SHOOTING ROLE represents a task, and the Planner should solve the problem of allocating tasks to drones subject to constraints such as task start times and durations, the drones' remaining battery, etc. For instance, if a shooting action implements a shot with several drones involved, it is translated into several tasks with the same starting time. Once that assignment is done, the plan for each drone is produced. This plan consists of a list of DRONE ACTION objects, some of them implementing navigation actions and others actual shots, as specified by an *action type* field. If the action to perform is a shot, all the related information is included in a SHOOTING ACTION object with a single SHOOTING ROLE for that drone. Table 3 shows the data structure for the DRONE ACTION objects. Last, the Mission Controller sends the corresponding list of DRONE ACTION objects to each drone and the mission execution can start.
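
The expansion of a drone's assigned tasks into its DRONE ACTION list could be sketched as follows; the field names and the simple interleaving of one navigation action before every shot are assumptions drawn from the description above, not the full structure of Table 3.

```
# Illustrative DRONE ACTION container and per-drone plan construction.
from dataclasses import dataclass

@dataclass
class DroneAction:
    action_type: str                 # "navigation" or "shooting"
    shooting_action: object = None   # SHOOTING ACTION with a single role, if a shot

def build_drone_plan(assigned_shots):
    """Interleave a navigation action before every shot assigned to a drone."""
    plan = []
    for shot in assigned_shots:      # each shot: a SHOOTING ACTION with one role
        plan.append(DroneAction(action_type="navigation"))
        plan.append(DroneAction(action_type="shooting", shooting_action=shot))
    return plan
```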

**Table 3.** Structure for the data type DRONE ACTION.

