Article

Conceptualization and First Realization Steps for a Multi-Camera System to Capture Tree Streamlining in Wind

by Frederik O. Kammel 1,2,* and Alexander Reiterer 1,2
1 Department of Sustainable Systems Engineering INATECH, Albert-Ludwigs-University Freiburg, 79110 Freiburg, Germany
2 Fraunhofer Institute for Physical Measurement Techniques IPM, 79110 Freiburg, Germany
* Author to whom correspondence should be addressed.
Forests 2024, 15(11), 1846; https://doi.org/10.3390/f15111846
Submission received: 30 August 2024 / Revised: 16 October 2024 / Accepted: 17 October 2024 / Published: 22 October 2024
(This article belongs to the Section Natural Hazards and Risk Management)

Abstract

Forests and trees provide a variety of essential ecosystem services. Maintaining them is becoming increasingly important, as global and regional climate change is already leading to major changes in the structure and composition of forests. To minimize the risk of storm damage, the tree and stand characteristics on which this risk depends must be known. Previous work in this field has relied on tree-pulling tests and on targets attached to selected branches. These approaches fall short, however, because the mass of such targets is large compared to that of the branches, so the targets significantly influence the tree's response, and because pulling tests cannot reproduce dynamic wind loads. We, therefore, installed a multi-camera system consisting of nine cameras that are mounted on four masts surrounding a tree. With those cameras acquiring images at a rate of 10 Hz, we use photogrammetry and a semi-automatic feature-matching workflow to deduce a 3D model of the tree crown over time. Together with motion sensors mounted on the tree and tree-pulling tests, we intend to learn more about the wind-induced response of all dominant aerial tree parts, including the crown, under real wind conditions, as well as about damping processes in tree motion.

1. Introduction

Forests and trees are an essential part of the Earth system [1]. They are habitats for animal and plant species, serve as a resource for timber and non-timber products, conserve energy, store carbon and water, improve air quality, enhance human health, and enable outdoor recreation [1,2,3,4]. In times of rapidly advancing global change, the question arises as to how the positive ecosystem services of forests and trees can be maintained in the face of changes in the Earth system and social systems [2].
One natural hazard that endangers the maintenance of positive ecosystem services in managed forests and trees in Central Europe is high-impact storms [5]. The negative consequences of storm damage can be minimized by shaping the structure and composition of forests and by adapting the management of catastrophic storm damage [6].
To be able to take measures to minimize the negative effects of storm damage risk, the tree and stand characteristics on which the storm damage risk depends must be known [7]. Especially in the rapidly expanding urban areas, minimizing the negative effects of storm damage to forests and trees is becoming increasingly important. Due to increasing urbanization worldwide, trees are seen as multifunctional tools of urban planning [8,9]. Minimizing the vulnerability of urban trees to high-impact storms may be crucial for the risk management of municipal stakeholders [7].
To study a tree's response, its movements must be monitored. The first attempts at this were tree-pulling tests, in which a rope is attached to the stem of the tree at different heights and to a ground anchor nearby. Deflection meters measure the deflection of the tree along the stem, while the tensioned rope simulates a wind load [10,11]. Other studies have attempted to measure tree movement under real wind conditions, e.g., using electromagnetic targets, inclinometers, or accelerometers [12,13,14]. Another study used low-cost motion-processing units to capture tree motion [15]. These approaches all have in common that they require a physical target or sensor attached to the tree. Firstly, this is impractical when many sensors and targets are attached, as each sensor needs a power and signal cable. Most importantly, though, those targets and sensors also have a mass and/or frontal area. This, in turn, changes the aerodynamic properties and dynamic behavior of the smaller tree parts at the periphery of the crown to which the sensors are mounted, which are often light and flexible.
Earlier studies, therefore, attached optical markers (chessboard patterns) to trees and analyzed video recordings [16,17,18,19]. These approaches share a common flaw: they only allow the analysis of partial aspects of the tree response, e.g., branch or stem motion. None of them allow for the simultaneous recording of all dominant aerial tree parts, including the crown, under real wind conditions.
Three-dimensional reconstruction is nowadays a main topic in the phenotyping of plants, especially for precision farming [20]. Different stationary and mobile systems have already been developed [21,22,23]. The existing systems use different measurement principles to capture the 3D structure of plants: capturing in a stereo configuration (e.g., stereo-photogrammetry), measuring the time of flight, or capturing a projected laser line or pattern (e.g., laser triangulation systems). Most systems can be applied from stationary and mobile platforms. However, no existing system is currently able to capture moving plants such as full-scale trees (some 10 m) with a high spatial (some centimeters) and temporal (10 Hz) resolution.
Photogrammetry works on the principle of triangulation and derives the 3D geometry from images taken from several camera perspectives [24,25]. The state of the art is to use a pair of cameras as stereo-configuration. The challenge is to find identical points in multiple images reliably and with high precision—this becomes more difficult with variable lighting conditions or moving or deforming objects. Furthermore, the fact that trees in a forest tend to blend in with surrounding trees makes it even more difficult to find and track such identical points, as most image-matching algorithms rely on contrast [26]. Even modern feature-matching approaches, such as SuperGlue, struggle with lighting changes and occlusion [27]. Also, determining the 3D geometry is very computationally intensive and time-consuming. Photogrammetric systems are, therefore, only suitable for real-time applications to a very limited extent.
This paper demonstrates a concept for a photogrammetric measurement system that uses eight cameras to record the crown of a tree with 10 frames per second (FPS). The eight cameras are mounted on four masts, which are installed around the subject tree, allowing for a 360-degree recording of the tree crown. To capture the wind conditions, three sonic anemometers are attached to those masts as well. Since each mast holds two cameras, and the view from adjacent masts overlaps, photogrammetric methods can be used to calculate a 3D model of the crown hull. The crown hull, in turn, can be used to calculate the frontal area of the crown perpendicular to the wind. The movement of the inner tree parts is measured using inertial measurement units (IMUs).
We describe the concept of the hardware setup, i.e., which cameras are used, how they are mounted, and how they are triggered. Furthermore, the paper presents proof-of-concept measurements to demonstrate the capabilities of this setup. Using these proof-of-concept measurements, we also outline the processing pipeline, including how we extract and match features in visually homogeneous images of a tree and how we plan to create a 3D model of it.
The preparatory research and method development for this paper were performed from 2021 to 2023. The proof-of-concept measurements were obtained in 2022 and 2023 and evaluated in 2024.

2. Concept of a Multi-Camera System and Image-Processing Pipeline for the Automated Creation of a 3D Model of a Tree Hull

2.1. Theoretical Background

The interactions between wind and trees are diverse, and they include multiple spatio-temporal scales. To evaluate those interactions, it is important to understand the two major types of wind loads: chronic and acute wind loads [28].
Chronic wind loads are wind loads that act on trees over long periods, resulting in long-term effects. In general, chronic wind loads are non-destructive. Trees that are exposed to chronic wind loads undergo physiological, morphological, and mechanical acclimation, commonly referred to as thigmomorphogenesis [29]. On the other hand, acute wind loads are high peak wind loads, commonly caused by storm events. They have a destructive impact and commonly lead to wind throw [30,31,32,33,34,35,36].
The response of trees to wind is complex because of a variety of component processes, such as wind–tree interactions [15,37,38], tree–tree interactions [39,40], and tree–soil interactions [10]. Furthermore, the response depends on the wind direction [41], which, in turn, affects the growth of all tree parts. The growth of the tree is thus also affected by its surroundings, namely the built environment, orography, and human interventions [42].
To understand tree response to wind loads, it is important to find the frequency ranges in which kinetic energy transfer from the wind into tree motion is most efficient. The results from field studies indicate that the wind-induced motion of conifers is dominated by bending sways [37,38,43,44]. It is commonly accepted that sway in the range of the damped fundamental sway frequency substantially contributes to the total tree response under low wind conditions. The results from the application of Fourier transform-based mechanical transfer functions [43] suggest that conifers absorb considerable amounts of kinetic energy contained in the wind in the range of their damped fundamental sway frequency [37,44,45,46,47]. However, although sway in the fundamental mode is important for tree motion under low wind conditions, there is debate about the excitation frequencies of wind-induced tree response. Previous studies speculated that the absorption of kinetic energy available from the wind in the range of the damped fundamental frequency is of minor importance for tree motion [47,48]. Instead, wind-induced stem displacement is initially caused by wind components occurring at frequencies below the damped fundamental sway frequency of stems of the studied Scots pine trees [38].
Apart from the experiments mentioned before, wind tunnel studies were carried out in which canopy airflow properties were investigated [49,50]. However, due to concomitant, instantaneous changes in the frontal area of the crown, wind tunnel studies fail at the dynamic parametrization of the wind load equation [51,52]. It is, therefore, only possible to study parts such as branches and treetops of fully grown trees in wind tunnels.
The aim of our work is to create a multi-camera system that is capable of capturing the motion of the outer tree hull without the need to attach physical markers or devices to the tree itself.

2.2. Technical Requirements

With an estimated crown radius of 2 m and an approximate distance of 6 m between the cameras and the tree surface, a spatial resolution of approximately 0.10 m can be achieved. We estimate this to be enough to model the tree crown for our purposes. For optimal results in the photogrammetric calculations, a vertical overlap between two images of 60% or more is required. The temporal resolution that we are aiming for is 10 Hz. This resolution is necessary to detect fast movements with a short duration.
The spatial resolution of 0.10 m mentioned above will be the spatial resolution of the 3D triangulation. It is thus important to choose cameras that achieve a much better image resolution to account for errors in synchronization and errors in feature matching, which will both lead to decreased spatial accuracy.
The entire system is designed such that it can be dismantled and rebuilt at another site within a matter of 2–3 weeks.

2.3. Hardware Setup

2.3.1. Camera Arrangement and Mounting

We used eight cameras of the type Allied Vision Mako G-507C (Allied Vision, Stadtroda, Germany). The reasons for choosing this model were a combination of technical and practical requirements:
  • It can be controlled through an ethernet interface and powered through Power over Ethernet (PoE), thus simplifying the mounting process, as only one cable needs to be run to each camera.
  • Its spatial and temporal resolution (2464 by 2056 pixels, 23 FPS [53]) is high enough to fulfill the requirements mentioned above but not too high, such that the bandwidth used for each camera does not exceed 1 Gbit/s.
  • The cameras have a global shutter, thus eliminating distortions caused by the rolling shutter effect [53,54].
The lens we picked is a FUJINON HF6XA-5M (FUJIFILM Europe GmbH, Ratingen, Germany). It has a focal length of 6 mm [55], which is required to fulfill the requirement for 60% vertical overlap at a reasonable distance between the subject tree and the camera.
The cameras were mounted on four masts, which were equally distributed around the subject tree. Each mast held two cameras approximately 3 m apart, as well as three sonic anemometers (Model 81000VRE, R. M. Young Company, Traverse City, MI, USA). The minimum number of masts was dictated by the anemometer measurements, which require measurements from at least all four sides. For financial reasons, we were not able to install more than four masts and eight cameras. Figure 1 shows the placement of the masts, cameras, and anemometers. The cameras were mounted such that they record in portrait mode, i.e., with the longer side of the sensor being vertical. This avoided capturing too much empty space around the tree and allowed for a larger vertical overlap. Figure 2 shows a side view of the mounting. The cameras were mounted on the masts, rather than at ground level, as we were more interested in the tree crown than in the lower parts of the tree. By doing this, we ensured that the cameras were as close to the subject area as possible without sacrificing the overlap.
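The 60% overlap figure can be checked with a simple pinhole-camera estimate. The pixel pitch of 3.45 µm assumed below is not stated in this paper; it is taken as a typical value for sensors of this resolution, so the calculation should be read as an order-of-magnitude check only:
\[ h_{\text{sensor}} \approx 2464 \times 3.45\ \mu\text{m} \approx 8.5\ \text{mm} \]
\[ H \approx \frac{D \cdot h_{\text{sensor}}}{f} = \frac{6\ \text{m} \times 8.5\ \text{mm}}{6\ \text{mm}} \approx 8.5\ \text{m} \]
\[ \text{overlap} \approx 1 - \frac{\Delta z}{H} = 1 - \frac{3\ \text{m}}{8.5\ \text{m}} \approx 65\% \geq 60\% \]
Here, H is the vertical extent imaged at the tree and Δz is the 3 m vertical spacing between the two cameras on a mast.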
To capture the inner parts of the tree, we installed IMUs at the stem and branches, as described by [56]. Those IMUs primarily recorded the direction of the gravity vector and thus the tree deflection, but they also log acceleration, the angular rate, and the magnetic field strength.
The cameras were mounted inside weatherproof housings that could be directly attached to the mast. To measure the movement of the masts, an IMU was attached to each camera housing. As of now, we have not evaluated the recordings of those IMUs. In Section 4.5, we evaluate the vibration visually. We may, however, use the IMU recordings in the future as a comparison.

2.3.2. Temporal Synchronization and Camera Triggers

Synchronizing the exposure time for all cameras was an important task, as errors in synchronization would ultimately lead to decreased spatial accuracy. Photogrammetric analysis expects static subjects. As we expected high wind speeds in our application, leaves and branches were likely to move at high speed. Hence, if the cameras were triggered with a slight delay, the tree would have moved in the meantime, resulting in spatial uncertainties. Accurate synchronization is also required to correlate the data acquired with meteorological sensors (e.g., sonic anemometers). The validation experiments described below in Section 4 showed that one desynchronized camera may lead to spatial errors of up to 5 cm.
The chosen camera model has a so-called General Purpose Input/Output (GPIO) port over which it can receive voltage pulses [53]. These voltage pulses can be configured to trigger a frame recording on the camera [57]. It would, therefore, be possible to use a frequency generator to generate a 10 Hz signal, to distribute this signal to the GPIO connectors of the cameras, and to use that as the trigger source. This poses a practical problem, since the masts have a maximum height of 25 m. In addition, depending on the situation in the field, the frequency generator might be located up to 50 m away from the masts, resulting in a cable length of up to 75 m. A signal of up to 30 V cannot travel over such a long cable without serious attenuation [58]. To compensate for that, the signal would need to be transformed to a higher voltage and then transformed back before being fed into the camera. Furthermore, other practical limitations, such as signal reflection, must be considered when attempting such an approach.
Due to these problems, we used the Precision Time Protocol (PTP) to synchronize the cameras with each other through their ethernet interface [57]. For that, one device is defined as the master clock [59]. All the other cameras in the network synchronize their clocks to the master clock. In our case, we used the time server Meinberg LANTIME M500-IMS (Meinberg Funkuhren GmbH & Co KG, Bad Pyrmont, Germany) [60] as the master clock, which is described in detail below. Next, all cameras were informed about the exact time at which the capturing should start. Since all cameras are synchronized to the same master clock, they all start recording within, at most, 1 ms of each other. This method of synchronization had the added benefit that it does not require additional cables, as it uses the ethernet connection of the camera.
The Meinberg LANTIME M500-IMS time server uses a modular architecture with multiple input sources, like a global navigation satellite system (GNSS) or DCF77 [61]. For our application, we opted to use GNSS as the time source and Network Time Protocol (NTP) and PTP as outputs. Devices that need precise time synchronization (like cameras) could then consume the PTP signal. Other devices (such as sonic anemometers), which either did not support PTP or did not need such precise synchronization, could consume the NTP signal instead.

2.3.3. Data Storage and Camera Control

All cameras required 1 Gbit/s of bandwidth each [58]. Furthermore, all cameras were powered through PoE. The cameras were thus connected to a network switch that had at least nine ethernet ports with at least 1 Gbit/s of bandwidth and PoE support. The switch itself was connected through a redundant 10 Gbit/s connection to a Network Attached Storage (NAS) system equipped with 12 Solid State Drives (SSDs) with 8 TB of storage each. Half of the installed SSDs were available for storage, with the other half being allocated to data redundancy (a so-called RAID 10 configuration, in which RAID means “redundant array of inexpensive disks”). This setup ensured sufficient data storage for raw camera recordings, as well as enough data redundancy in case one or several drives fail, e.g., during transport from one site to another. The selected NAS model was also capable of running virtual machines and Docker containers, such that the control software for the cameras, sonic anemometers, and motion-control units could run directly on the NAS.
The software to control the camera network was packaged into a Docker image, which was run on the NAS. This step was necessary, as multiple independent applications would otherwise need to run on separate virtual machines. Running multiple virtual machines consumes considerably more Random Access Memory (RAM) on the NAS than running Docker containers, which are much more memory-efficient. Furthermore, the operating system of the NAS has a built-in dashboard for Docker containers, simplifying their management.
To control the cameras, we used the Vimba Software Development Kit (SDK) (version 5.0) provided by Allied Vision [62]. It conforms to the GenICam standard [62] and provides Application Programming Interfaces (APIs) to control the cameras in C, C++, Python, and all .NET languages (C# and F#). The control software was set up such that it automatically discovers all cameras in the network. It then configures the cameras with the correct frame rate, the correct output format, and other settings set to appropriate values. Finally, it runs the synchronization routine to synchronize all cameras to the PTP master clock and triggers the recording simultaneously on all cameras.
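The routine below is a minimal sketch of this control flow using the VimbaPython API of the Vimba SDK. The GenICam feature names used here ('TriggerMode', 'AcquisitionFrameRateAbs', 'PtpMode', 'PtpAcquisitionGateTime') are assumptions based on typical Allied Vision GigE naming and must be checked against the Mako G-507C feature reference; it is not the exact production code used in the field.

from vimba import Vimba, PixelFormat

FRAME_RATE_HZ = 10.0

# Sketch only: the feature names below are assumptions (typical Allied Vision
# GigE naming) and need to be verified against the camera's feature list.
with Vimba.get_instance() as vmb:
    for cam in vmb.get_all_cameras():          # automatic discovery of all cameras in the network
        with cam:
            cam.set_pixel_format(PixelFormat.BayerRG8)            # raw Bayer output
            cam.get_feature_by_name('TriggerMode').set('Off')     # free-running at a fixed rate
            cam.get_feature_by_name('AcquisitionFrameRateAbs').set(FRAME_RATE_HZ)
            cam.get_feature_by_name('PtpMode').set('Slave')       # synchronize to the PTP master clock
            # A scheduled acquisition start (all cameras begin at the same PTP
            # timestamp) could then be set via a gate-time feature, e.g.:
            # cam.get_feature_by_name('PtpAcquisitionGateTime').set(start_time_ns)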
The cameras then delivered the sensor readings in the so-called BayerRG8 format. This is the raw image format used by these cameras, which means that the data need to be debayered before the images can be visualized or processed. This can be done in real time, e.g., using the library OpenCV [63]. Since we did not need to process the data in real time in our use case, we decided to store the unprocessed BayerRG8 data and to convert the images later. To save storage space, we applied PNG compression to each frame (so-called intra-frame compression). PNG compression is lossless; i.e., the raw sensor readings can be reconstructed without any loss of information. We observed that PNG compression achieves a compression ratio of approx. 60%, meaning that the compressed data take up approx. 40% less storage space than the uncompressed data. According to our own tests, PNG compression is not fast enough to be performed in real time on our processing and control hardware; compressing one of our images takes approx. 500 ms. We, therefore, decided against performing the compression in real time. Instead, all data recorded during the day were compressed every night when no recording was running.
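The nightly compression step can be illustrated with a short sketch; the file layout, file extension, and the 8-bit portrait frame size are assumptions for illustration only.

import glob
import numpy as np
import cv2

HEIGHT, WIDTH = 2464, 2056   # portrait orientation (long sensor side vertical)

# Nightly batch job (sketch): losslessly compress raw BayerRG8 frames to PNG.
for raw_path in glob.glob('/data/raw/*.bayer'):                # hypothetical file layout
    frame = np.fromfile(raw_path, dtype=np.uint8).reshape(HEIGHT, WIDTH)
    cv2.imwrite(raw_path.replace('.bayer', '.png'), frame)     # PNG is lossless

# For later visualization, the stored Bayer mosaic can be demosaiced ("debayered"), e.g.:
# rgb = cv2.cvtColor(frame, cv2.COLOR_BayerRG2RGB)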

2.4. Data Analysis

2.4.1. Feature Extraction and Feature Matching

The central challenge of the development is the identification and matching of features within corresponding images, as well as in the time series. The initial results from the validation process (see Section 4) showed that automated tracking algorithms, such as SIFT [64], SURF [26], and ORB [65], return good results when tracking features within one camera’s video feed. They mostly fail, however, when attempting to match features between two different camera angles, especially if the two cameras are mounted onto different masts. Given that the cameras were installed in a circular geometry around the tree, there were always some cameras filming against the sun, resulting in poor lighting conditions. We assume that this and the very different appearance of the tree branches from different viewing angles were the main causes of the poor automatic matching results.
For those reasons, we intend to implement the following four-step process for data analysis:
  • An automated feature-matching algorithm detects feature points for every camera in one frame.
  • The algorithm attempts to match those features between different cameras. Those matches are then presented to an operator, who can either confirm or correct those suggestions. Lastly, all features that could not be matched automatically are also presented to the operator, who can then perform the matching manually.
  • The automated tracking algorithm is then used to track all features in every camera over time.
  • Assuming all extrinsic camera parameters are known, every feature is then triangulated using collinearity equations [66].
The proof-of-concept experiments shown in Section 4 track only a single feature, while the proposed algorithm will track a multitude of features at the same time. The implementation of the algorithm described above is still part of future work.
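As an illustration of steps 1 and 2, the following sketch detects SIFT key points in two camera frames and proposes tentative matches that an operator could then confirm or correct; the file names are hypothetical, and the proposal stage stands in for the interactive part of the planned workflow.

import cv2

def detect_features(gray):
    """Step 1: detect candidate feature points and descriptors in one frame (SIFT)."""
    sift = cv2.SIFT_create()
    return sift.detectAndCompute(gray, None)

def propose_matches(desc_a, desc_b, ratio=0.75):
    """Step 2 (automatic part): propose tentative matches between two cameras;
    the proposals are then confirmed or corrected by an operator."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    proposals = []
    for pair in matcher.knnMatch(desc_a, desc_b, k=2):
        # Lowe's ratio test to discard ambiguous matches
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            proposals.append(pair[0])
    return proposals

img_a = cv2.imread('cam_1_1.png', cv2.IMREAD_GRAYSCALE)   # hypothetical file names
img_b = cv2.imread('cam_2_1.png', cv2.IMREAD_GRAYSCALE)
kp_a, desc_a = detect_features(img_a)
kp_b, desc_b = detect_features(img_b)
matches = propose_matches(desc_a, desc_b)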
As mentioned in step 4, the collinearity equations used to triangulate the feature in 3D space require knowledge of the extrinsic and intrinsic parameters of each camera [67]. Any errors in the determination of those parameters will, therefore, also have an impact on the triangulation results. As described in Section 2.3.1, the cameras were mounted onto aluminum masts, which were secured using guy ropes. This stabilized the camera mounting as much as possible, but some vibration in the wind was still possible. Section 4.5, therefore, shows a vibration analysis to determine the possible impact of such errors. Section 4.2 also shows in detail how the calibration was performed.

2.4.2. Creation of the 3D Model

The result of step 4 of the aforementioned process will be a 4D point cloud (3D geometry with a time axis). We will use this point cloud to calculate the outer hull of the tree crown. Using the 3D wind data that we collected with the sonic anemometers, we will proceed to calculate the instantaneous frontal area of the crown perpendicular to the dominant wind direction. This will be necessary to calculate the wind load acting on the tree [68].
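A minimal sketch of this frontal-area computation is given below: the crown points of one time step are projected onto the plane perpendicular to the wind direction, and the area of their 2D convex hull serves as a simple proxy for the frontal area. The convex hull is only one possible hull definition; the final implementation may use a different (e.g., concave) hull.

import numpy as np
from scipy.spatial import ConvexHull

def frontal_area(points_xyz, wind_dir):
    """Project crown points onto the plane perpendicular to the wind and
    return the area of their 2D convex hull (simple frontal-area proxy)."""
    w = np.asarray(wind_dir, dtype=float)
    w /= np.linalg.norm(w)                                     # unit wind direction
    # Orthonormal basis (u, v) of the plane perpendicular to w
    helper = np.array([0.0, 0.0, 1.0]) if abs(w[2]) < 0.9 else np.array([1.0, 0.0, 0.0])
    u = np.cross(w, helper)
    u /= np.linalg.norm(u)
    v = np.cross(w, u)
    pts_2d = np.column_stack([points_xyz @ u, points_xyz @ v])
    return ConvexHull(pts_2d).volume                           # for 2D hulls, .volume is the area

# Example: 500 synthetic crown points and a horizontal wind from the +x direction
area = frontal_area(np.random.rand(500, 3) * 10.0, wind_dir=[1.0, 0.0, 0.0])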

3. Impressions of First Realization

The described system is already installed at our first field location in Hartheim, close to the German–French border. We chose this location mostly for practical reasons, as the Chair of Environmental Meteorology of the University of Freiburg already has a measurement station installed in Hartheim. Not only does this facilitate access to electricity and an internet connection, but it also increases public acceptance, as the local inhabitants are used to seeing measurement equipment installed on trees. The tree that we chose as the first sample is a common beech (Fagus sylvatica), which is approximately 15 m high. Figure 3 shows some first impressions from this site, as well as the first images recorded via the system.

4. Validation and Proof-of-Concept Measurements

4.1. Validation Setup

To validate the setup and the processing workflow, we performed multiple comparison experiments in which we attached one or two reflective prisms to the tree. In particular, we attached them close to IMUs, such that the acquired data can also be used to validate the IMU data in the future. Each prism was tracked by a total station capable of automatic target tracking; in our case, these were a Leica MS60 [69] and a Leica TS50 (Leica Geosystems AG, Heerbrugg, Switzerland) [70].
To estimate the extrinsic camera parameters, multiple chessboard patterns were positioned on the ground. The positions of those patterns were also measured using the total stations to ensure a consistent local coordinate system. Additionally, the chessboard patterns were also scanned using the scanning functionality of the Leica MS60. Lastly, the whole scene was scanned using a Leica RTC 360 Laser scanner (Leica Geosystems AG, Heerbrugg, Switzerland) [71]. We also used the laser scan to extract the approximate 3D locations of all cameras, which we then used as an initial estimate for the estimation of the extrinsic camera parameters.

4.2. Estimation of Intrinsic and Extrinsic Parameters

As the tracked features would later be triangulated in 3D using collinearity equations, the intrinsic and extrinsic parameters of the cameras had to be known. The intrinsic parameters include the following [72,73]:
  • The focal lengths (f_x and f_y);
  • The location of the camera's principal point (c_x and c_y);
  • The radial lens distortion coefficients (k_1, k_2, k_3, k_4, k_5, and k_6);
  • The tangential lens distortion coefficients (p_1 and p_2).
The extrinsic parameters include the following [72,73]:
  • The 3D location of the camera's projection center (X_c, Y_c, and Z_c) in local coordinates;
  • The rotation matrix, R, that defines the rotation of the camera sensor in local coordinates.
To calculate the intrinsic parameters, the camera must capture a known geometric body (in our case, a chessboard pattern). The image of this geometric body must cover the whole sensor. If this is not possible for practical reasons, as in our case, multiple pictures can be taken in which the body is moved between frames such that, eventually, the whole sensor is covered [73].
After mounting each camera but before raising it to its final height, we adjusted and fixed the focus and proceeded to take pictures of a chessboard pattern on a clipboard, as described above. To calculate the intrinsic camera parameters, we used OpenCV’s built-in chessboard detection and camera calibration functionality [72].
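A condensed sketch of this intrinsic calibration with OpenCV is shown below; the chessboard dimensions, square size, and file names are assumptions for illustration.

import glob
import numpy as np
import cv2

PATTERN = (9, 6)        # inner chessboard corners (assumption)
SQUARE_SIZE = 0.025     # square edge length in meters (assumption)

# 3D corner coordinates in the chessboard's own plane (z = 0)
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_SIZE

obj_points, img_points = [], []
for path in glob.glob('calib_cam1_*.png'):                     # hypothetical file names
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:
        criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3)
        corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
        obj_points.append(objp)
        img_points.append(corners)

# Returns the camera matrix (f_x, f_y, c_x, c_y) and the distortion coefficients (k_i, p_i)
rms, camera_matrix, dist_coeffs, _, _ = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)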
The extrinsic camera parameters mentioned above describe each camera's location and rotation in 3D space. The definition of the rotation matrix, R, can be especially ambiguous, as it only describes a rotation relative to an initial pose, and different software packages use different initial poses. Furthermore, not all software packages use rotation matrices in the first place. For instance, OpenCV uses a Rodrigues vector instead [72]. Agisoft Metashape (version 2.1.0.17532) prefers yaw, pitch, and roll [74], while other packages use Euler angles (with a sometimes ambiguous rotation order) or even quaternions. Maintaining the same coordinate reference frame, therefore, proved challenging.
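The sketch below illustrates how one and the same rotation is converted between the representations mentioned above; which Euler-angle convention actually corresponds to Metashape's yaw/pitch/roll is an assumption that would have to be verified against its documentation.

import numpy as np
import cv2
from scipy.spatial.transform import Rotation

rvec = np.array([0.1, -0.2, 0.05])       # OpenCV-style Rodrigues vector (example values)

R, _ = cv2.Rodrigues(rvec)               # 3x3 rotation matrix
rot = Rotation.from_matrix(R)
yaw, pitch, roll = rot.as_euler('ZYX', degrees=True)   # one possible yaw/pitch/roll convention
quat = rot.as_quat()                     # quaternion (x, y, z, w)

# Converting back must use exactly the same axis order and convention,
# otherwise the reconstructed pose is wrong.
R_back = Rotation.from_euler('ZYX', [yaw, pitch, roll], degrees=True).as_matrix()
assert np.allclose(R, R_back)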
To obtain the extrinsic parameters of all cameras, a set of known image points must be used. In our case, we decided to place multiple chessboard patterns on the ground next to the tree. As mentioned in Section 4.1, the positions of these patterns were measured using the MS60 multi-station. Figure 1 visualizes the positions of the chessboards. We then used the built-in chessboard detection functionality of OpenCV to detect those chessboard patterns in every camera [72].
The detected chessboards were then imported into Agisoft Metashape (version 2.1.0.17532) to perform the extrinsic camera calibration. As mentioned before, the entire scene was also captured with an RTC 360 laser scanner, and an approximate location for each camera was extracted from the laser scan. We used a computer-aided design (CAD) model of the camera housing, lens, and camera body to deduce the projection center from these laser scans. Those estimated camera locations were entered into Agisoft Metashape as reference camera positions with an estimated accuracy of 0.001 m. To determine this value, we started with an estimated accuracy of 0.003 m, which is the nominal positional accuracy of the laser scanner [71]. To see how the results were impacted, we then decreased the value to 0.001 m, which is the nominal range accuracy of the laser scanner [71], and observed that the results did not change within our expected margin of error. We then used the “Align photos” workflow in Metashape to calculate the extrinsic camera parameters. While the exact algorithms used by Agisoft Metashape are not publicly documented, we assume that it uses the collinearity equations in a bundle-block adjustment, with the reference positions estimated from the laser scan as an initial guess.

4.3. Feature Tracking and Triangulation

To track the location of the prisms in the camera recordings, we used the following semi-automated process:
  • An operator manually marks the location of the prism in one camera frame. While doing so, the operator also selects an area around the prism, henceforth called the feature rectangle, and an even larger area around the feature rectangle, henceforth called the search rectangle. The feature rectangle designates the area of the image that will be tracked by the algorithm, and the search rectangle designates the area of the image where the feature will be searched in the next frame. It is, therefore, paramount that the search rectangle is big enough and includes the position where the feature is expected to appear in the next frame. The feature rectangle should also include some surrounding area around the actual feature, as the algorithm learns over time to distinguish features and backgrounds (see below). Figure 4 depicts a schematic view of the user interface that we developed for this process.
    Figure 4. Schematic of the user interface used for the semi-automated feature tracking. The operator moves the feature rectangle in the image and aligns the feature center (depicted as a circle with a cross) with the desired target (in this case, a Leica 360°-prism). The operator then adjusts the size of the feature rectangle (depicted in red), such that it includes the entire feature, as well as some surroundings. Including the surroundings is important because it allows the algorithm to distinguish feature and background over time. Lastly, the operator adjusts the size of the search rectangle (depicted in black), such that it includes the area where the feature is expected to appear in the next frame.
  • The algorithm now uses an advanced template search algorithm based on [75] and further described below. It uses the previously specified search rectangle and crops the following camera frame to this area. The template search is now run on this cropped frame, using the previously defined feature rectangle as the tracking template.
  • Once the template search has found a maximum, the feature rectangle and search rectangle are moved to the new feature location.
  • The algorithm now repeats steps 2 and 3 until either the end of the recording is reached or the operator interrupts the tracking.
  • If the template search fails to find the correct solution, the operator may interrupt the tracking at any time. The operator can then move the feature rectangle manually to the correct location, allowing the algorithm to learn (see below) and then resume the tracking from step 2. As an additional tool, the operator can start a local gradient-ascent search after moving the feature center manually to an approximately correct position. When doing this, the last used tracking template is matched around the manually selected feature center, and the closest local maximum is found. The feature center is then moved to this local maximum. This may refine the coarse correction performed by the operator.
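A minimal sketch of one tracking step (steps 2 and 3 above) is given below; it crops the next frame to the search rectangle, runs a normalized template search, and returns the new feature center. The production tool additionally scores several templates, as described next.

import cv2

def track_step(frame_gray, template, search_rect):
    """One tracking step (sketch): template search inside the search rectangle.
    search_rect = (x, y, w, h) in full-frame pixel coordinates."""
    x, y, w, h = search_rect
    crop = frame_gray[y:y + h, x:x + w]                        # crop to the search rectangle
    result = cv2.matchTemplate(crop, template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)             # best correlation and its location
    th, tw = template.shape
    center = (x + max_loc[0] + tw // 2, y + max_loc[1] + th // 2)
    return center, max_val                                     # new feature center, correlation C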
The employed template search algorithm is based on [75]. This algorithm assumes, however, that only one tracking template exists. In our case, we had a video recording of the feature, and over time, as more frames were analyzed, more versions of the tracking template became available. Instead of using just one feature as the tracking template, the tracking algorithm, therefore, analyzes multiple feature templates around the current timestamp. The operator can specify the exact number of surrounding templates to be used. The algorithm performs a template search on the target frame for each possible feature template and then scores each tracking result according to Formula (1). Table 1 explains the variables used in Formula (1). After the scores are evaluated, the tracking result with the highest score is deemed the best and is, therefore, used as the overall tracking result. This approach offers the advantage that the algorithm can choose a different template if the appearance of the feature changes periodically, for example, because the prism is swaying in the wind.
$$\mathrm{Score} = w_{C} \cdot C + w_{d} \cdot \left(1.0 - \frac{d_{\mathrm{templateLocation}}}{d_{\max}}\right) + w_{\Delta\mathrm{averageFrame}} \cdot \frac{\max\left(\Delta_{\mathrm{averageFrame}}\right)}{255} \quad (1)$$
In addition to the templates scored this way, the algorithm also calculates the average of all the feature templates. It then runs a template search using this average template as the tracking template and scores it much like the fixed templates. This average frame has the added benefit that, over time, visual elements that are within the feature rectangle but are not physically attached to the prism move in a different pattern than the prism and thus become very blurry. Visual elements that are attached to the prism (like the branch that the prism is mounted to or the IMU close to the prism) will not become blurry, as they remain in approximately the same location in relation to the prism. This allows the algorithm to effectively learn the surroundings of the prism. Using this knowledge, the algorithm is able to continue tracking the prism, even if it is partially covered by other branches or leaves, as long as at least some of the constant elements are visible. Figure 5 shows how this average feature template is calculated and how it benefits the tracking algorithm under otherwise challenging conditions.
Lastly, the operator has the ability to calculate the average frame of all recorded frames. This is done by calculating the per-pixel sum of all video frames of the camera recording and dividing it by the number of frames. In this average frame, stationary elements remain sharp, while moving elements become blurry. As used in Formula (1), max(Δ_averageFrame) is the maximum difference between the frame at the tracking-result location and this average frame. The weight w_ΔaverageFrame can, therefore, be adjusted such that the algorithm favors tracking results in moving parts of the image over stationary image parts.
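Assuming the reconstruction of Formula (1) given above, the scoring of one candidate tracking result could look as follows; the default weights are placeholders, as the actual values are operator-tunable.

def score(correlation, dist_to_expected, d_max, max_avg_frame_diff,
          w_c=1.0, w_d=0.5, w_avg=0.5):
    """Sketch of Formula (1): combines the template-match correlation C, the
    distance of the match to the expected location, and the maximum difference
    to the long-term average frame (0..255)."""
    return (w_c * correlation
            + w_d * (1.0 - dist_to_expected / d_max)
            + w_avg * max_avg_frame_diff / 255.0)

# The candidate with the highest score is used as the overall tracking result, e.g.:
# best = max(candidates, key=lambda c: score(c.correlation, c.dist, d_max, c.avg_diff))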
The operator has the ability to tweak the parameters of the algorithm. As mentioned above, the tracking can be interrupted at any time, allowing the operator to make corrections to the feature rectangle, as well as changes to the algorithm’s parameters. In particular, the following parameters can be changed:
  • The template search method. The operator may choose between all matching methods available in [75]:
    TM_CCOEFF;
    TM_CCOEFF_NORMED;
    TM_CCORR;
    TM_CCORR_NORMED;
    TM_SQDIFF;
    TM_SQDIFF_NORMED [76].
  • The number of surrounding feature rectangles to consider;
  • Whether the fixed targets should be included in the score calculation;
  • Whether the average frame target should be used in the score calculation;
  • The weights for Equation (1).
After the feature has been tracked in all camera recordings, the collinearity equations are used to triangulate the 2D features into a 3D trajectory.
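The collinearity equations can be solved for an arbitrary number of cameras with a linear direct linear transformation (DLT) formulation; the sketch below shows this standard approach, not necessarily the exact solver used in our pipeline.

import numpy as np

def triangulate(projection_matrices, image_points):
    """Linear multi-view triangulation (DLT) of one feature.
    projection_matrices: list of 3x4 matrices P_i = K_i [R_i | t_i]
    image_points: list of matching undistorted pixel coordinates (u_i, v_i)."""
    rows = []
    for P, (u, v) in zip(projection_matrices, image_points):
        rows.append(u * P[2] - P[0])      # each view contributes two linear equations
        rows.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]                            # homogeneous solution = last right singular vector
    return X[:3] / X[3]                   # convert to Euclidean 3D coordinates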

4.4. Evaluation of the Influence of Synchronization Errors

As mentioned in Section 2.2, errors in the temporal synchronization of the cameras may lead to decreased spatial accuracy if the subject is moving. To evaluate the influence of this error, we artificially modified the timestamp data of one camera during postprocessing by up to 1 min. We then compared the resulting trajectory to the trajectory calculated without the artificial time offset.
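The comparison itself reduces to a mean point-wise distance between the reference trajectory and each offset trajectory, as sketched below; the helper that re-triangulates the trajectory with a given offset is hypothetical.

import numpy as np

def mean_offset_distance(reference_xyz, offset_xyz):
    """Mean point-wise distance between the reference trajectory and a
    trajectory triangulated with one artificially delayed camera.
    Both arrays have shape (n_frames, 3) and share the same time base."""
    return float(np.mean(np.linalg.norm(reference_xyz - offset_xyz, axis=1)))

# Sweep over artificial offsets of one camera, e.g., -60 s ... +60 s in 0.5 s steps:
# offsets = np.arange(-60.0, 60.5, 0.5)
# errors = [mean_offset_distance(ref, triangulate_with_offset(dt)) for dt in offsets]   # hypothetical helper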

4.5. Vibration Analysis of the Cameras

As mentioned in Section 2.4, the triangulation of the resulting trajectory using collinearity equations depends on accurate camera extrinsics [67]. To ensure that the cameras are not moving too much, we used one of the validation experiments to visualize the pixel movement of the chessboards placed on the ground over time. The amplitude of this movement can be interpreted as the vibration of the camera, given that the chessboards remained stationary.

4.6. Validation Results

In total, four validation experiments were run with the described setup. Table 2 shows their properties. In particular, Table 2 shows which prisms were used in which experiments and the recording rate in frames per second (FPS). Figure 6 shows the placement of the prisms for all validation experiments. Due to a configuration mistake, experiments V1 and V2 were recorded with a frame rate of 2 FPS. After this mistake was found, experiments V3 and V4 were run using the required 10 FPS.

4.6.1. Validation Results for V1

In V1, the 360° prism and circular prism were mounted to adjacent branches in the lower part of the tree crown. A rope was attached close to the 360° prism. At first, the branches were left still, and then the rope was pulled gently and released abruptly. This was repeated multiple times with an increasing amplitude. Figure 7 shows side views of the trajectory of the 360° prism as it was measured via the total station (blue line) and as it was triangulated through the camera system (orange crosses). Figure 8 shows the same for the circular prism. The beginning of the experiment, where the prisms were left still, is clearly visible in Figure 7 as a clump in the reference trajectory, as well as in the photogrammetric trajectory. The pulling on the branch is also clearly visible in the reference trajectory and in the photogrammetric trajectory in Figure 7. In the reference trajectory, Figure 8 shows that the circular prism remained mostly still. It also shows that this was successfully captured by the photogrammetric trajectory.
The plots in Figure 9 show the distance between each point on the photogrammetric trajectory and the corresponding point on the reference trajectory. Both the plot over time and the histogram show a systematic error of approximately 6 cm. The maximum distance between the photogrammetric trajectory and the reference trajectory was approximately 13 cm. Figure 10 shows similar results for the circular prism, except that the systematic error is a little smaller (approximately 5.5 cm) with a maximum distance of 8.1 cm.

4.6.2. Validation Results for V2

In V2, the prisms were mounted at the same points as in V1, but another rope was attached close to the circular prism as well. V2 can be split into four parts: At first, the ropes were tensioned slowly in a horizontal direction to excite the tree branches and then released abruptly to let the branches swing back into their resting position, much like in V1. The difference from V1, though, is that both ropes were tensioned at the same time. In the second part, the ropes were tensioned slowly and released slowly again, also in a horizontal direction. Then, the ropes were tensioned and released slowly in a vertical direction, followed by slow tensioning and an abrupt release in a vertical direction.
Figure 11 and Figure 12 compare this trajectory between the reference and the triangulation. In general, the trajectories match up, as in V1, and the movements are clearly visible. The distance plot between the reference and triangulated trajectory in Figure 13 shows a systematic error of approximately 6.5 cm with a maximum error of 24 cm. Of particular note is the trajectory between 12:32:44 and 12:34:36. The total station tracking the 360° prism lost its line of sight to the prism during that time, and no usable reference trajectory is, therefore, available for this period. Once we noticed this, we paused the experiment until the total station regained its line of sight and resumed the tracking at 12:34:36.
Figure 14 shows similar results for the circular prism with an approximate systematic error of 5.5 cm and a maximum error of 18.7 cm.

4.6.3. Validation Results for V3

V3 is the first validation experiment that was filmed in leaf-on conditions (2 May 2023). Because of the limited availability of our total stations, we only attached one prism to the tree for experiments V3 and V4. In V3, the prism was mounted at a similar location as in V1 and V2. A rope was attached close to the prism.
V3 is characterized by a long initial period during which the branches were left swaying in the wind. We did this to test the capability of the system to track the natural motion of the branches in the wind. Only after approximately 5 min did we start to excite the branch with the prism attached. At first, we excited the branch slowly in a horizontal direction and released it abruptly. Then, we repeated the same in a vertical direction, and lastly, we released the branch slowly, also in a vertical direction.
Figure 15 shows that the triangulated trajectory and the reference trajectory match up much better than in V1 and V2. The different movement phases (the clump when the branch was left swaying in the wind, and the horizontal and vertical excitations) are also clearly visible. The distance plot in Figure 16 also reflects this good match. The systematic error is only approximately 1.5 cm, and the maximum error is 16 cm.

4.6.4. Validation Results for V4

In V4, the prism was also mounted in the lower part of the tree crown, but on a branch on the side of the tree opposite to where the prism was mounted in V1 (see Figure 6). The rope was attached close to the prism. At first, the rope was tensioned slowly in a horizontal direction and released abruptly. Then, it was released slowly, also in a horizontal direction. Following that, it was tensioned in a vertical direction and released abruptly. Lastly, it was also released slowly in a vertical direction.
Figure 17 shows the comparison of the triangulated trajectory with the reference trajectory. Both parts, the excitation in the horizontal direction and that in the vertical direction, are clearly visible. The comparison also shows that there is next to no systematic error in V4.
According to the distance plots in Figure 18, the systematic error equals approximately 3 cm, and the maximum error equals 22.6 cm.
As mentioned in Table 2, we used V4 to evaluate the influence of synchronization errors on the resulting trajectory, as described in Section 4.4. Figure 17 and Figure 18 do not contain this offset so that they remain comparable with the other validation data.
The camera whose timestamp was modified was camera 3-2 (see Figure 1 for positional reference). The timestamp data of this camera were artificially offset in 0.5 s increments from −60.0 s to +60.0 s. We subsequently triangulated the trajectory using these offset data and compared it to the trajectory shown in Figure 17 (henceforth called the temporal reference) by calculating the distance between each point on the temporal reference trajectory and the corresponding point on the offset trajectory. To compare how this distance behaves over different offsets, we calculated the average of this distance over the whole trajectory and plotted it over the offset in Figure 19. When the offset is 0 s, the trajectories are obviously equal, resulting in a mean distance of 0 m. It should be noted that the mean distance seems to sway periodically. The dominant period of this sway is ≈8.6 s. This coincides with the period of the vertical movement with which the prism was excited, as can be seen in the 2D trajectory plot in Figure 19 and its frequency spectrum beneath. Such behavior is to be expected, as the trajectory is itself periodic (through repeated excitation), and if the offset has the same value as the period of the excitation, the trajectories realign themselves, at least to a certain extent. Overall, the results show that a desynchronized camera results in a spatial error that ranges from ≈3 cm to ≈5 cm and does not increase with an increasing time offset.
Figure 19 also shows how the trajectory changes in the worst case. The rightmost plots depict the trajectory of the 360° prism for an offset of 3.5 s, which generated the worst observed mean error of ≈5.3 cm. The geometry of the trajectory changes visibly.
Lastly, we used this experiment to visualize the movement of the chessboards over time. Figure 20 shows the areas in which the chessboard corners were detected over time for each camera. In all cases, the chessboards moved only by a few pixels, which we deem negligible, given that the output trajectory only needs to be accurate within 0.10 m (see Section 2.2).

4.7. Evaluation of Validation Results

The validation experiments show that, in some cases, there seems to be a systematic error. However, the systematic error does not appear to be constant across experiments. It is noteworthy that the systematic error improved significantly when the frame rate was raised to 10 FPS. Most of our recordings use at least 10 or even 20 FPS, at which the systematic error seems to be, at most, 2 cm and is thus negligible.
It is also important that the camera system was always able to reproduce the shape of the trajectory, and the systematic error mostly resulted from a translation error, not a rotation error. Since the synchronization experiments shown in Figure 19 clearly demonstrate that a desynchronized camera visibly changes the shape of the trajectory, we assume that the cameras were synchronized sufficiently well and that the systematic error was not caused by a synchronization error. Given that we are mostly interested in frequency analysis and relative motion analysis, and not absolute trajectory accuracy, the results are acceptable.

5. Limitations of the Presented Approach

As mentioned in Section 2.4.1, the main limitation of this approach is the feature-matching between different masts. While modern approaches like SuperGlue [27] exist, they still require training for the expected matching conditions. Another solution to the matching issue could also be to increase the horizontal overlap by adding more masts. This, however, would also drastically increase installation and equipment costs.
Secondly, this approach is limited to recording a single tree at a time. To expand our understanding of forest risks caused by wind load, more trees, tree species, ecosystems, and wind conditions should be considered. It is for this reason that the system can be dismantled and reinstalled at a different site within a matter of 2–3 weeks, as mentioned in Section 2.2. While we have more unprocessed data recorded at the tree used for the validation experiments, we currently do not have the financial backing to extend the study to other trees. Nevertheless, monitoring multiple trees or even larger forest areas while maintaining the same spatial resolution would drastically increase the number of required cameras and the mounting and recording infrastructure.

6. Future Work

Section 4 shows that the system is sufficiently accurate when tracking a single target. We achieved this using the semi-automated tracking approach described in Section 4.3. This algorithm proved to be very effective for one target but also very time-consuming, especially when the target was heavily obstructed by leaves and manual intervention was required from the operator. Given that the overall goal is to track the whole tree crown and not just singular targets, we must find a more automated way to perform this task. As a start, we tested the feature-matching algorithm SIFT on our data and found that SIFT seems to be very good at tracking features within one camera's video but fails when trying to match features between two or more cameras. We believe that this is due to the very difficult lighting conditions, as our recording geometry causes at least two cameras to look directly in the direction of the Sun, no matter the time of day. To solve this issue, we plan to detect key points and features using SIFT. We will then show those key points to an operator and let the operator perform the matching between multiple camera views. Once this is done, SIFT should be able to track all features over time, greatly reducing the amount of required operator time per recording length. We will record the number of manual interventions necessary and the workload required and publish the results in the future. A comparison of this method to other modern matching approaches like SuperGlue [27] might also be possible.
Lastly, Section 4 shows the performance of the system under calm wind conditions. We will investigate in the future how stronger winds affect the system performance, e.g., by pulling on one of the masts and measuring the impact on the triangulated trajectory. This will also help us understand how the data recorded via the IMUs attached to the cameras can be used to compensate for camera vibrations.
After the full tree crown is evaluated over time, other applications of this system could be considered, such as tracking foliage over the seasons or tracking the impact of snow load. It should be noted, though, that tracking bigger hazards, such as landslides, would not be possible with such a system, as it only detects movements relative to the cameras. Landslides would also cause the cameras to move, making such measurements impossible.

7. Conclusions

In this paper, we have introduced the necessity of measuring the frontal area of a tree crown through contactless methods. We demonstrated our attempt at this using photogrammetry, the resulting technical requirements, and our system design.
In the future, we want to focus on the recognition of natural features and capture the movement of the tree as completely as possible without the use of artificial markers. This will be very challenging because trees change greatly over time due to growth and, therefore, vary in their structure and geometry. There is also the question of whether it will be necessary to track a tree over its entire surface in the future, since this would require the rapid calculation of a 3D representation. From the current point of view, this can only be done with complex post-processing. Considering the amount of data we already have to deal with, it is evident that a reduction has to be achieved. A reduction in spatial resolution could help here. The most exciting question will, of course, concern the accuracy of the correlation between the IMU systems that directly detect movement on the tree and our optical detection. If the difference is negligible, thus proving that we do not need the IMUs on the tree, this would constitute a real breakthrough for research.

Author Contributions

F.O.K.: conceptualization, methodology, software, validation, data curation, investigation, writing—original draft preparation, and visualization; A.R.: writing—review and editing, supervision, project administration, and funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Deutsche Forschungsgemeinschaft (DFG) project no. 460531546.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest. The funders played no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
API    Application programming interface
CAD    Computer-aided design
FPS    Frames per second
GNSS   Global navigation satellite system
GPIO   General-purpose input/output
IMU    Inertial measurement unit
NAS    Network-attached storage
NTP    Network time protocol
PoE    Power over Ethernet
PTP    Precision time protocol
RAM    Random access memory
SDK    Software development kit
SSD    Solid-state drive

References

  1. Jaroszewicz, B.; Cholewińska, O.; Gutowski, J.M.; Samojlik, T.; Zimny, M.; Latałowa, M. Białowieża Forest—A Relic of the High Naturalness of European Forests. Forests 2019, 10, 849.
  2. Krieger, D.J. Economic Value of Forest Ecosystem Services: A Review; The Wilderness Society: Washington, DC, USA, 2001.
  3. Morales-Hidalgo, D.; Oswalt, S.N.; Somanathan, E. Status and trends in global primary forest, protected areas, and areas designated for conservation of biodiversity from the Global Forest Resources Assessment 2015. For. Ecol. Manag. 2015, 352, 68–77.
  4. Gamfeldt, L.; Snäll, T.; Bagchi, R.; Jonsson, M.; Gustafsson, L.; Kjellander, P.; Ruiz-Jaen, M.C.; Fröberg, M.; Stendahl, J.; Philipson, C.D.; et al. Higher levels of multiple ecosystem services are found in forests with more tree species. Nat. Commun. 2013, 4, 1340.
  5. Gardiner, B.; Blennow, K.; Carnus, J.M.; Fleischer, P.; Ingemarson, F.; Landmann, G.; Lindner, M.; Marzano, M.; Nicoll, B.; Orazio, C.; et al. Destructive Storms in European Forests: Past and Forthcoming Impacts. Final report to European Commission—DG Environment, European Forest Institute; DG Environment: Brussels, Belgium, 2010.
  6. Isbell, F.; Craven, D.; Connolly, J.; Loreau, M.; Schmid, B.; Beierkuhnlein, C.; Bezemer, T.M.; Bonin, C.; Bruelheide, H.; de Luca, E.; et al. Biodiversity increases the resistance of ecosystem productivity to climate extremes. Nature 2015, 526, 574–577.
  7. Gardiner, B.; Schuck, A.; Schelhaas, M.J.; Orazio, C.; Blennow, K.; Nicoll, B. (Eds.) Living with Storm Damage to Forests: What Science Can Tell Us; European Forest Institute: Joensuu, Finland, 2013; Volume 3.
  8. Salmond, J.A.; Tadaki, M.; Vardoulakis, S.; Arbuthnott, K.; Coutts, A.; Demuzere, M.; Dirks, K.N.; Heaviside, C.; Lim, S.; Macintyre, H.; et al. Health and climate related ecosystem services provided by street trees in the urban environment. Environ. Health Glob. Access Sci. Source 2016, 15 (Suppl. S1), 36.
  9. Millward, A.A.; Sabir, S. Benefits of a forested urban park: What is the value of Allan Gardens to the city of Toronto, Canada? Landsc. Urban Plan. 2011, 100, 177–188.
  10. Nicoll, B.C.; Gardiner, B.A.; Rayner, B.; Peace, A.J. Anchorage of coniferous trees in relation to species, soil type, and rooting depth. Can. J. For. Res. 2006, 36, 1871–1883.
  11. Rahardjo, H.; Harnas, F.R.; Indrawan, I.; Leong, E.C.; Tan, P.Y.; Fong, Y.K.; Ow, L.F. Understanding the stability of Samanea saman trees through tree pulling, analytical calculations and numerical models. Urban For. Urban Green. 2014, 13, 355–364.
  12. Rudnicki, M.; Burns, D. Branch sway period of four tree species using 3d motion tracking. In Proceedings of the 5th Plant Biomechanics Conference, Stockholm, Sweden, 28 August–1 September 2006; pp. 25–31.
  13. Peltola, H.; Kellomäki, S. A mechanistic model for calculating windthrow and stem breakage of Scots pines at stand age. Silva Fenn. 1993, 27.
  14. van Emmerik, T.; Steele-Dunne, S.; Hut, R.; Gentine, P.; Guerin, M.; Oliveira, R.S.; Wagner, J.; Selker, J.; van de Giesen, N. Measuring Tree Properties and Responses Using Low-Cost Accelerometers. Sensors 2017, 17, 1098.
  15. Schindler, D.; Kolbe, S. Assessment of the Response of a Scots Pine Tree to Effective Wind Loading. Forests 2020, 11, 145.
  16. Baker, C.J. Measurements of the natural frequencies of trees. J. Exp. Bot. 1997, 48, 1125–1132.
  17. Hassinen, A.; Lemettinen, M.; Peltola, H.; Kellomäki, S.; Gardiner, B. A prism-based system for monitoring the swaying of trees under wind loading. Agric. For. Meteorol. 1998, 90, 187–194.
  18. Doaré, O.; Moulia, B.; de Langre, E. Effect of plant interaction on wind-induced crop motion. J. Biomech. Eng. 2004, 126, 146–151.
  19. Rudnicki, M.; Mitchell, S.J.; Novak, M.D. Wind tunnel measurements of crown streamlining and drag relationships for three conifer species. Can. J. For. Res. 2004, 34, 666–676.
  20. Paulus, S. Measuring crops in 3D: Using geometry for plant phenotyping. Plant Methods 2019, 15, 103.
  21. Shafiekhani, A.; Kadam, S.; Fritschi, F.B.; DeSouza, G.N. Vinobot and Vinoculer: Two Robotic Platforms for High-Throughput Field Phenotyping. Sensors 2017, 17, 214.
  22. Jin, X.; Zarco-Tejada, P.J.; Schmidhalter, U.; Reynolds, M.P.; Hawkesford, M.J.; Varshney, R.K.; Yang, T.; Nie, C.; Li, Z.; Ming, B.; et al. High-Throughput Estimation of Crop Traits: A Review of Ground and Aerial Phenotyping Platforms. IEEE Geosci. Remote Sens. Mag. 2021, 9, 200–231.
  21. Shafiekhani, A.; Kadam, S.; Fritschi, F.B.; DeSouza, G.N. Vinobot and Vinoculer: Two Robotic Platforms for High-Throughput Field Phenotyping. Sensors 2017, 17, 214. [Google Scholar] [CrossRef]
  22. Jin, X.; Zarco-Tejada, P.J.; Schmidhalter, U.; Reynolds, M.P.; Hawkesford, M.J.; Varshney, R.K.; Yang, T.; Nie, C.; Li, Z.; Ming, B.; et al. High-Throughput Estimation of Crop Traits: A Review of Ground and Aerial Phenotyping Platforms. IEEE Geosci. Remote Sens. Mag. 2021, 9, 200–231. [Google Scholar] [CrossRef]
  23. Wu, S.; Wen, W.; Wang, Y.; Fan, J.; Wang, C.; Gou, W.; Guo, X. MVS-Pheno: A Portable and Low-Cost Phenotyping Platform for Maize Shoots Using Multiview Stereo 3D Reconstruction. Plant Phenomics 2020, 2020, 1848437. [Google Scholar] [CrossRef]
  24. Luhmann, T. Nahbereichsphotogrammetrie: Grundlagen, Methoden und Anwendungen; Wichmann: Heidelberg, Germany, 2000. [Google Scholar]
  25. Kraus, K. Photogrammetrie: Band 1: Photogrammetrie: Geometrische Informationen aus Photographien und Laserscanneraufnahmen; De Gruyter Lehrbuch; De Gruyter: Berlin, Germany, 2004. [Google Scholar]
  26. Bay, H.; Tuytelaars, T.; van Gool, L. SURF: Speeded Up Robust Features. In Computer Vision—ECCV 2006; Leonardis, A., Bischof, H., Pinz, A., Eds.; Lecture Notes in Computer Science; Springer: Berlin, Germany, 2006; Volume 3951, pp. 404–417. [Google Scholar] [CrossRef]
  27. Sarlin, P.E.; DeTone, D.; Malisiewicz, T.; Rabinovich, A. SuperGlue: Learning Feature Matching With Graph Neural Networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; Mortensen, E., Masson-Forsythe, M., Eds.; IEEE: Piscataway, NJ, USA, 2020; pp. 4937–4946. [Google Scholar] [CrossRef]
  28. Quine, C.; Gardiner, B. Understanding how the interaction of wind and trees results in windthrow, stem breakage, and canopy gap formation. In Plant Disturbance Ecology—The Process and the Response; Academic Press: Cambridge, MA, USA, 2007; pp. 103–155. [Google Scholar]
  29. Jaffe, M.J. Thigmomorphogenesis: The response of plant growth and development to mechanical stimulation: With special reference to Bryonia dioica. Planta 1973, 114, 143–157. [Google Scholar] [CrossRef]
  30. Lindner, M.; Maroschek, M.; Netherer, S.; Kremer, A.; Barbati, A.; Garcia-Gonzalo, J.; Seidl, R.; Delzon, S.; Corona, P.; Kolström, M.; et al. Climate change impacts, adaptive capacity, and vulnerability of European forest ecosystems. For. Ecol. Manag. 2010, 259, 698–709. [Google Scholar] [CrossRef]
  31. Hanewinkel, M.; Hummel, S.; Albrecht, A. Assessing natural hazards in forestry for risk management: A review. Eur. J. For. Res. 2011, 130, 329–351. [Google Scholar] [CrossRef]
  32. Schuck, A.; Schelhaas, M.J. Storm damage in Europe—An overview. In Living with Storm Damage to Forests: What Science Can Tell Us; Gardiner, B., Schuck, A., Schelhaas, M.J., Orazio, C., Blennow, K., Nicoll, B., Eds.; European Forest Institute: Joensuu, Finland, 2013; Volume 3, pp. 15–23. [Google Scholar]
  33. Seidl, R.; Fernandes, P.M.; Fonseca, T.F.; Gillet, F.; Jönsson, A.M.; Merganičová, K.; Netherer, S.; Arpaci, A.; Bontemps, J.D.; Bugmann, H.; et al. Modelling natural disturbances in forest ecosystems: A review. Ecol. Model. 2011, 222, 903–924. [Google Scholar] [CrossRef]
  34. Seidl, R.; Rammer, W.; Blennow, K. Simulating wind disturbance impacts on forest landscapes: Tree-level heterogeneity matters. Environ. Model. Softw. 2014, 51, 1–11. [Google Scholar] [CrossRef]
  35. Kamimura, K.; Gardiner, B.; Dupont, S.; Finnigan, J. Agent-based modelling of wind damage processes and patterns in forests. Agric. For. Meteorol. 2019, 268, 279–288. [Google Scholar] [CrossRef]
  36. Forzieri, G.; Pecchi, M.; Girardello, M.; Mauri, A.; Klaus, M.; Nikolov, C.; Rüetschi, M.; Gardiner, B.; Tomaštík, J.; Small, D.; et al. A spatially explicit database of wind disturbances in European forests over the period 2000–2018. Earth Syst. Sci. Data 2020, 12, 257–276. [Google Scholar] [CrossRef]
  37. Gardiner, B.A. Wind and wind forces in a plantation spruce forest. Boundary-Layer Meteorol. 1994, 67, 161–186. [Google Scholar] [CrossRef]
  38. Schindler, D.; Mohr, M. No resonant response of Scots pine trees to wind excitation. Agric. For. Meteorol. 2019, 265, 227–244. [Google Scholar] [CrossRef]
  39. Milne, R. Dynamics of swaying of Picea sitchensis. Tree Physiol. 1991, 9, 383–399. [Google Scholar] [CrossRef]
  40. Rudnicki, M.; Meyer, T.H.; Lieffers, V.J.; Silins, U.; Webb, V.A. The periodic motion of lodgepole pine trees as affected by collisions with neighbors. Trees 2008, 22, 475–482. [Google Scholar] [CrossRef]
  41. Niez, B.; Dlouha, J.; Moulia, B.; Badel, E. Water-stressed or not, the mechanical acclimation is a priority requirement for trees. Trees 2019, 33, 279–291. [Google Scholar] [CrossRef]
  42. Telewski, F.W. Is windswept tree growth negative thigmotropism? Plant Sci. 2012, 184, 20–28. [Google Scholar] [CrossRef] [PubMed]
  43. Mayer, H. Wind-induced tree sways. Trees 1987, 1, 195–206. [Google Scholar] [CrossRef]
  44. Sellier, D.; Brunet, Y.; Fourcaud, T. A numerical model of tree aerodynamic response to a turbulent airflow. Forestry 2008, 81, 279–297. [Google Scholar] [CrossRef]
  45. Gardiner, B. Mathematical modelling of the static and dynamic characteristics of plantation trees. In Mathematical Modelling of Forest Ecosystems; Franke, J., Ed.; Sauerländer: Frankfurt am Main, Germany, 1992; pp. 40–61. [Google Scholar]
  46. Schindler, D.; Fugmann, H.; Schönborn, J.; Mayer, H. Coherent response of a group of plantation-grown Scots pine trees to wind loading. Eur. J. For. Res. 2012, 131, 191–202. [Google Scholar] [CrossRef]
  47. Schindler, D.; Fugmann, H.; Mayer, H. Analysis and simulation of dynamic response behavior of Scots pine trees to wind loading. Int. J. Biometeorol. 2013, 57, 819–833. [Google Scholar] [CrossRef]
  48. Schindler, D.; Vogt, R.; Fugmann, H.; Rodriguez, M.; Schönborn, J.; Mayer, H. Vibration behavior of plantation-grown Scots pine trees in response to wind excitation. Agric. For. Meteorol. 2010, 150, 984–993. [Google Scholar] [CrossRef]
  49. Gardiner, B.A.; Stacey, G.R.; Belcher, R.E.; Wood, C.J. Field and wind tunnel assessments of the implications of respacing and thinning for tree stability. Forestry 1997, 70, 233–252. [Google Scholar] [CrossRef]
  50. Gardiner, B.; Marshall, B.; Achim, A.; Belcher, R.; Wood, C. The stability of different silvicultural systems: A wind-tunnel investigation. Forestry 2005, 78, 471–484. [Google Scholar] [CrossRef]
  51. Mayhead, G.J. Some drag coefficients for British forest trees derived from wind tunnel studies. Agric. Meteorol. 1973, 12, 123–130. [Google Scholar] [CrossRef]
  52. Leclercq, T.; Peake, N.; de Langre, E. Does flutter prevent drag reduction by reconfiguration? Proc. Math. Phys. Eng. Sci. 2018, 474, 20170678. [Google Scholar] [CrossRef] [PubMed]
  53. Allied Vision Technologies. Mako G-507 Data Sheet. Available online: https://cdn.alliedvision.com/fileadmin/pdf/en/Mako_G-507_DataSheet_en.pdf (accessed on 21 August 2024).
  54. Liang, C.K.; Chang, L.W.; Chen, H.H. Analysis and compensation of rolling shutter effect. IEEE Trans. Image Process. 2008, 17, 1323–1330. [Google Scholar] [CrossRef] [PubMed]
  55. Stemmer Imaging. FUJINON HF6XA-5M Data Sheet. Available online: https://www.fujifilm.com/de/de/business/optical-devices/mvlens/hfxa5m#HF01 (accessed on 21 August 2024).
  56. Kolbe, S.; Schindler, D. TreeMMoSys: A low cost sensor network to measure wind-induced tree response. HardwareX 2021, 9, e00180. [Google Scholar] [CrossRef]
  57. Allied Vision Technologies. Mako Technical Manual: GigE Vision Cameras. Available online: https://cdn.alliedvision.com/fileadmin/content/documents/products/cameras/Mako/techman/Mako_TechMan_en.pdf (accessed on 24 January 2022).
  58. Teledyne FLIR. FLIR BLACKFLY S Installation Guide. Available online: https://flir.app.boxcn.net/s/bfw7jzqcq3qfrgf3i9ri1zp4mam2d7l7 (accessed on 25 August 2021).
  59. 1588-2019; IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems. Institute of Electrical and Electronics Engineers: Piscataway, NJ, USA, 2020. [CrossRef]
  60. Meinberg. IMS—LANTIME M500: Time and Frequency Synchronization in Rail Mount Chassis. Available online: https://www.meinbergglobal.com/english/products/modular-railmount-ntp-server-ieee-1588-solution.htm (accessed on 22 September 2022).
  61. Meinberg. IMS Modules: Power Supplies, Input Signals and Output Signals for Meinberg IMS—Systems. Available online: https://www.meinbergglobal.com/english/products/ims-modules.htm (accessed on 25 August 2021).
  62. Allied Vision Technologies. Vimba SDK Product Page. Available online: https://www.alliedvision.com/en/products/vimba-sdk/ (accessed on 24 January 2022).
  63. Bradski, G. The OpenCV Library. Dr. Dobb’s J. Softw. Tools 2000, 25, 120–126. [Google Scholar]
  64. Lowe, D.G. Object recognition from local scale-invariant features. In Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece, 20–27 September 1999. [Google Scholar] [CrossRef]
  65. Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the 2011 International Conference on Computer Vision (ICCV 2011), Barcelona, Spain, 6–13 November 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 2564–2571. [Google Scholar] [CrossRef]
  66. Schenk, T. Introduction to Photogrammetry; Department of Civil and Environmental Engineering and Geodetic Science, The Ohio State University: Columbus, OH, USA, 2005. [Google Scholar]
  67. Wiggenhagen, M.; Steensen, T. Taschenbuch zur Photogrammetrie und Fernerkundung: Guide for Photogrammetry and Remote Sensing; Wichmann: Berlin/Offenbach, Germany, 2021. [Google Scholar]
  68. de Langre, E. Effects of Wind on Plants. Annu. Rev. Fluid Mech. 2008, 40, 141–168. [Google Scholar] [CrossRef]
  69. Leica Geosystems AG. Leica Nova MS60 Data Sheet. Available online: https://leica-geosystems.com/-/media/files/leicageosystems/products/datasheets/leica_nova_ms60_ds.ashx?la=en-gb&hash=DC24A3605EE8B0DED66F30240A8B63DC (accessed on 27 October 2022).
  70. Leica Geosystems AG. Leica Nova TS50 Data Sheet. Available online: https://downloads.leica-geosystems.com/files/archived-files/leica_nova_ts50_bro_de.pdf (accessed on 4 October 2024).
  71. Leica Geosystems AG. Leica RTC360 3D Reality Capture Solution Data Sheet: Fast. Agile. Precise. Available online: https://leica-geosystems.com/-/media/files/leicageosystems/products/datasheets/leica-rtc360-ds-872750-0422-en.ashx?sc_lang=en&hash=0C62B81F1DE6058C41E4ED99C6900326 (accessed on 21 August 2024).
  72. OpenCV. Camera Calibration and 3D Reconstruction. Available online: https://docs.opencv.org/4.9.0/d9/d0c/group__calib3d.html (accessed on 12 August 2024).
  73. Zhang, Z. A flexible new technique for camera calibration. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 1330–1334. [Google Scholar] [CrossRef]
  74. Agisoft LLC. Agisoft Metashape User Manual: Professional Edition, Version 2.1. Available online: https://www.agisoft.com/pdf/metashape-pro_2_1_en.pdf (accessed on 13 August 2024).
  75. OpenCV. matchTemplate Documentation. Available online: https://docs.opencv.org/4.6.0/df/dfb/group__imgproc__object.html#ga586ebfb0a7fb604b35a23d85391329be (accessed on 15 July 2024).
  76. OpenCV. TemplateMatchModes for matchTemplate. Available online: https://docs.opencv.org/4.6.0/df/dfb/group__imgproc__object.html#ga3a7850640f1fe1f58fe91a2d7583695d (accessed on 15 July 2024).
Figure 1. Three-dimensional visualization of the measurement site (to scale). The cameras were mounted on four masts, which were equally distributed around the tree, allowing us to capture the tree from all directions. In particular, the figure shows the placement of all four masts, eight cameras, and the chessboard patterns used for the extrinsic calibration.
Figure 2. Schematic side view to illustrate camera placement (not to scale). Two cameras were mounted on each mast, creating a vertical overlap of at least 60%.
Figure 3. (a) Image of the subject tree recorded through one of the cameras. (b) Mast with the cameras, anemometers, and a Wi-Fi antenna mounted. Next to the mast, the waterproof server rack can be seen, which is shown in more detail in (c). (c) The server rack with the door open. From top to bottom, the Meinberg time server, switch, and NAS can be seen. (d) One of the cameras mounted on a mast.
Figure 5. Visualization of a feature (Leica 360° prism) over time (a–c) and the average feature frame over time (d–f). (a) Frame 0 of the feature. (b) Frame 15 of the feature. (c) Frame 151 of the feature. (d) Average frame 0 of the feature. Since this is the first frame of the feature, the algorithm has no way of differentiating the foreground from the background; the entire feature therefore appears sharp. (e) Average frame 15 of the feature. Over these 15 frames, the prism has already moved and the background has changed; the background therefore starts to blur, while the prism remains relatively sharp. (f) Average frame 151 of the feature. The algorithm has now fully separated the foreground (the prism) from the background; the background appears entirely blurred, while the prism is still sharp. Note that the algorithm also identified the branch and the inertial measurement unit (IMU) as stationary. This allows the algorithm to keep track of the feature even if parts of it are obscured, e.g., by moving leaves, as (f) demonstrates: the feature is still tracked although it is partially covered by leaves.
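The average-frame idea described in Figure 5 amounts to a per-pixel mean over all frames of a recording. The Python/OpenCV sketch below is a minimal illustration under assumed inputs; the video file name is a placeholder, and the sketch is not the exact implementation used here.

```python
import cv2
import numpy as np

def average_frame(video_path: str) -> np.ndarray:
    """Per-pixel mean over all frames: stationary parts stay sharp, moving parts blur."""
    cap = cv2.VideoCapture(video_path)
    accumulator, count = None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame = frame.astype(np.float64)
        accumulator = frame if accumulator is None else accumulator + frame
        count += 1
    cap.release()
    return (accumulator / count).astype(np.uint8)

# Usage (placeholder file name); the result corresponds conceptually to panels (d)-(f) of Figure 5.
# avg = average_frame("camera_recording.mp4")
```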
Figure 6. Side view of the prism placement for all validation experiments, as described in Table 2.
Figure 7. Trajectory comparison for the 360° prism in V1.
Figure 8. Trajectory comparison for the circular prism in V1.
Figure 9. (a) Distance between the photogrammetric trajectory and the reference trajectory over time and (b) as a histogram for the 360° prism in V1.
Figure 10. (a) Distance between the photogrammetric trajectory and the reference trajectory over time and (b) as a histogram for the circular prism in V1.
Figure 11. Trajectory comparison for the 360° prism in V2.
Figure 12. Trajectory comparison for the circular prism in V2.
Figure 13. (a) Distance between the photogrammetric trajectory and the reference trajectory over time and (b) as a histogram for the 360° prism in V2.
Figure 14. (a) Distance between the photogrammetric trajectory and the reference trajectory over time and (b) as a histogram for the circular prism in V2.
Figure 15. Trajectory comparison for the 360° prism in V3.
Figure 16. (a) Distance between the photogrammetric trajectory and the reference trajectory over time and (b) as a histogram for the 360° prism in V3.
Figure 17. Trajectory comparison for the 360° prism in V4.
Figure 18. (a) Distance between the photogrammetric trajectory and the reference trajectory over time and (b) as a histogram for the 360° prism in V4.
Figure 19. Overview of how one desynchronized camera can affect the resulting trajectory. In the top left, the mean distance between the temporal reference trajectory and the offset trajectory for the 360° prism in V4 is depicted for offsets from −60 s to 60 s. When the offset is 0 s, the trajectories are obviously equal, resulting in a mean distance of 0 m. It should be noted that the mean distance seems to sway periodically. The dominant period of this sway is at ≈8.6 s. This coincides with the period of the vertical movement with which the prism was excited, as can be seen in the 2D trajectory plot and its frequency spectrum beneath. Lastly, the rightmost plots show the temporal reference trajectory and the offset trajectory for an offset of 3.5 s. This trajectory is interesting because it generates the largest mean error of ≈5.3 cm.
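The sensitivity analysis summarized in Figure 19 boils down to comparing the reference trajectory with trajectories reconstructed after artificially shifting one camera in time, averaging the point-wise 3D distance for each offset. The sketch below outlines this comparison in Python; reconstruct_with_offset is a hypothetical placeholder for the photogrammetric pipeline, not a function of this work.

```python
import numpy as np

def mean_trajectory_distance(reference: np.ndarray, candidate: np.ndarray) -> float:
    """Mean Euclidean distance between two (N, 3) trajectories sampled at the same timestamps."""
    return float(np.linalg.norm(reference - candidate, axis=1).mean())

def sweep_offsets(reference, reconstruct_with_offset, offsets_s):
    """Evaluate the mean distance for each artificial time offset applied to one camera.

    reconstruct_with_offset(dt) is a hypothetical stand-in that would rerun the
    reconstruction with one camera shifted by dt seconds.
    """
    return {dt: mean_trajectory_distance(reference, reconstruct_with_offset(dt))
            for dt in offsets_s}

# Example range matching Figure 19 (seconds):
# offsets = np.arange(-60.0, 60.5, 0.5)
# errors = sweep_offsets(reference_xyz, reconstruct_with_offset, offsets)
```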
Figure 20. Movement of stationary chessboards over time from every camera. The colored regions visualize the area within which each chessboard corner was detected in the camera video. The chessboards have internal dimensions of approximately 60 by 70 cm.
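Figure 20 is based on repeatedly detecting the corners of the stationary chessboards in every camera video and recording where each corner is found. A minimal Python/OpenCV sketch of such a corner-jitter measurement is given below; the pattern size and the video file name are assumptions, not values from this study.

```python
import cv2
import numpy as np

PATTERN = (7, 6)  # inner corner count of the chessboard (assumed, not stated in the caption)

def corner_tracks(video_path: str) -> np.ndarray:
    """Collect detected chessboard corner positions per frame to quantify detection jitter."""
    cap = cv2.VideoCapture(video_path)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
    tracks = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, PATTERN)
        if found:
            corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
            tracks.append(corners.reshape(-1, 2))
    cap.release()
    return np.stack(tracks)  # shape: (detected frames, corners, 2)

# The per-corner spread of corner_tracks("mast1_cam_upper.mp4") over time corresponds to
# the colored regions shown in Figure 20 (file name is a placeholder).
```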
Table 1. Explanation of variables for Equation (1).
Variable | Explanation
C | Tracking confidence. This is the value that the selected template search method has calculated at the resulting location.
d_(template location) | The Euclidean distance between the 2D location of the feature and the tracking-result location.
d_max | The maximum value of d_(template location) over all evaluated tracking templates.
Δ_averageFrame | The maximum difference between the new feature rectangle and the average frame. The average frame is determined by calculating the per-pixel sum of all video frames of the camera recording and dividing it by the number of frames. The result is an image in which stationary parts remain sharp and moving parts become blurred. Given that the prism is moving, this parameter favors tracking results in moving image areas.
w_C, w_d, w_(Δ_averageFrame) | Weights for each parameter. The operator can customize their values; w_C + w_d + w_(Δ_averageFrame) = 1 must always hold.
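To make the roles of the Table 1 variables concrete, the following Python sketch combines them as a weighted score. The exact combination is defined by Equation (1) in the main text; the sketch merely assumes, for illustration, that a higher confidence, a smaller normalized distance, and a larger average-frame difference increase the score, with the weights summing to 1.

```python
def candidate_score(c, d_template_location, d_max, delta_average_frame,
                    w_c=0.4, w_d=0.3, w_delta=0.3):
    """Illustrative weighted combination of the Table 1 terms (not Equation (1) itself).

    Assumptions for this sketch: a smaller distance to the expected feature location and a
    larger difference to the average frame increase the score, and the weights sum to 1.
    """
    assert abs(w_c + w_d + w_delta - 1.0) < 1e-9  # w_C + w_d + w_(Δ_averageFrame) = 1 (Table 1)
    distance_term = 1.0 - (d_template_location / d_max) if d_max > 0 else 1.0
    return w_c * c + w_d * distance_term + w_delta * delta_average_frame
```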
Table 2. List of validation experiments.
Experiment | Recording Date | Leica 360° Used? | Circular Prism Used? | Recording Rate [FPS] | Artificial Time Offset Applied?
V1 | 6 December 2022 | Yes | Yes | 2 | No
V2 | 6 December 2022 | Yes | Yes | 2 | No
V3 | 6 December 2022 | Yes | No | 10 | No
V4 | 2 May 2023 | Yes | No | 10 | Yes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
