*Article* **Logging Trail Segmentation via a Novel U-Net Convolutional Neural Network and High-Density Laser Scanning Data**

**Omid Abdi \*, Jori Uusitalo and Veli-Pekka Kivinen**

Department of Forest Sciences, University of Helsinki, Latokartanonkaari 7, 00014 Helsinki, Finland; jori.uusitalo@helsinki.fi (J.U.); veli.kivinen@helsinki.fi (V.-P.K.)

**\*** Correspondence: omid.abdi@helsinki.fi; Tel.: +358-294158466

**Abstract:** Logging trails are one of the main components of modern forestry. However, locating old logging trails accurately through common approaches is challenging and time-consuming. This study was established to develop an approach, using cutting-edge deep-learning convolutional neural networks and high-density laser scanning data, to detect logging trails in different stages of commercial thinning in Southern Finland. We constructed a U-Net architecture, consisting of encoder and decoder paths with several convolutional layers, pooling and non-linear operations. The canopy height model (CHM), digital surface model (DSM), and digital elevation models (DEMs) were derived from the laser scanning data and were used as image datasets for training the model. The labeled dataset for the logging trails was generated from different references as well. Three forest areas were selected to test the efficiency of the algorithm that was developed for detecting logging trails. We designed 21 routes, including 390 samples of the logging trails and non-logging trails, covering all logging trails inside the stands. The results indicated that the trained U-Net using DSM (*k* = 0.846 and *IoU* = 0.867) shows superior performance over the trained model using CHM (*k* = 0.734 and *IoU* = 0.782), DEMavg (*k* = 0.542 and *IoU* = 0.667), and DEMmin (*k* = 0.136 and *IoU* = 0.155) in distinguishing logging trails from non-logging trails. Although the efficiency of the developed approach in young and mature stands that had undergone commercial thinning is nearly perfect, it needs to be improved in old stands that have not received the second or third commercial thinning.

**Keywords:** U-Net; high-density laser scanning; logging trails; digital surface model; canopy height model; commercial thinning; semantic segmentation; convolutional neural networks

#### **1. Introduction**

In modern timber harvesting, logging trails are crucial for the accurate navigation of harvesters and forwarders into forest stands for silvicultural operations [1] on the path toward precision harvesting. However, locating old logging trails accurately is one of the major and most challenging tasks for forest owners and machine operators, particularly in stands that have not undergone commercial thinning for a long period of time. Little is known about holistic solutions for the detection of logging trails using remote-sensing data. However, cutting-edge deep-learning-based approaches using high-density laser scanning data may aid in solving this problem.

In Finland, rotation forest management (RFM) is the most common silvicultural method. It relies on three main phases: establishment, thinning, and final felling [2]. Normally, forest stands are thinned two to three times between the ages of 20 and 70 years [3,4]. Logging trails are determined with a width of 4–5 m and a spacing of 20–25 m in the first commercial thinning [1,3,5], which covers the entirety of a stand. However, some segments of a trail, an entire trail, or even a logging trail network may vanish on the ground over time, due to the regrowth of trees, the growth of seedlings, and the spreading of the crowns of trees on the trail surface. Therefore, spotting the initial locations of logging trails can be time-consuming and costly. Additionally, misinterpreting the original logging trail network in subsequent thinning operations may cause overcut of the growing stock.

**Citation:** Abdi, O.; Uusitalo, J.; Kivinen, V.-P. Logging Trail Segmentation via a Novel U-Net Convolutional Neural Network and High-Density Laser Scanning Data. *Remote Sens.* **2022**, *14*, 349. https://doi.org/10.3390/rs14020349

Academic Editors: Fahimeh Farahnakian, Jukka Heikkonen and Pouya Jafarzadeh

Received: 3 December 2021 Accepted: 10 January 2022 Published: 13 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In recent decades, airborne laser scanning (ALS) systems have become central to characterizing the 3D structure of forest canopies. These systems have enabled cutting-edge applications and research in forestry, particularly in the areas of forest inventory and ecology [6–8]. Few studies have addressed the detection of logging trails using laser scanning data [9,10], whereas the mapping of forest roads using either low-density or high-density laser scanning data is well documented [11–16]. The majority of these studies have used traditional methods based on edge detection, thresholding, or object-based segmentation to detect logging trails or forest roads under canopies via machine learning algorithms. Sherba et al. [10] presented a rule-based classification approach for detecting old logging roads using slope models derived from high-density LiDAR data in Marin County, California. They reported that some post-classification techniques, such as LiDAR-derived flow direction raster and curvature, increased the accuracy of detecting logging trails by dropping streams and gullies and adding ridge trails to the final classified layer. They emphasized that the high point density of LiDAR data has a significant influence on the accuracy of discriminating old logging trails from non-trail objects. Similarly, Buján et al. [16] proposed a pixel-based random forest approach to map paved and unpaved roads through numerous LiDAR-derived metrics in the forests of Spain. However, they concluded that the density of LiDAR points did not have a significant impact on the accuracy of the detection of roads using random forest. Lee et al. [9] extracted trails using the segmentation of canopy density derived from airborne laser swath mapping (ALSM) data. They labeled as trails the sharpened sightlines that result from the visibility vectors between the canopies.

These approaches may show promising results but rely on heavy pre-processing and post-processing tasks. Typically, they are developed for a specific type of trail or road in a particular forest. Furthermore, the detection of a logging trail is more difficult than the detection of a forest road using these approaches, due to a lower geometric consistency, a more complex background, and the occlusions of the canopy [17]. Therefore, the need to develop a versatile approach, such as deep learning methods with minimal processing and optimal efficiency for detecting logging trails from laser scanning data, is undeniable.

Recently, convolutional neural networks (CNNs), as one of the architectures of deep learning neural networks, have become central to image classification, semantic segmentation, pattern recognition, and object detection, in particular with the emergence of high-resolution remote sensing data [18,19]. The standard architecture of a CNN encompasses a set of convolutional layers, pooling and non-linear operations [20]. The primary characteristics of a CNN are the spatial connectivity between the adjacent layers, sharing of the weights, acquiring features from low-spatial scale to high-spatial scale, and integrating the modules of feature extraction and classification [21]. Various successful CNN architectures have been developed for main road classification, such as U-Net [22] and GANs [23], and for main road area or centerline extraction, such as U-Net [24–29], ResNet [30], GANs [31], Y-Net [32], SegNet [33], and CasNet [34], most of which used very high-resolution (VHR) satellite images or UAV images. Several studies have reported the superior performance of deep learning-based approaches in forest applications, such as individual tree detection [35–38], species classification [35,39–42], tree characteristics extraction [43,44], and forest disturbances [45–48], mostly using VHR, UAV, or high-density laser scanning data. At present, little is known about the efficiency of deep learning-based approaches for the extraction of logging trails or forest roads.

Tree occlusions and other noise have hampered accurate road detection with traditional road segmentation methods, even when using VHR images [17,49,50]. However, CNN-based approaches can partly alleviate the effects of complex backgrounds and the occlusion of trees [34,51]. Using high-density laser scanning data, which can penetrate the canopy and reach the ground surface, may help to solve these problems. Few studies have explored the feasibility of CNN-based architectures that use laser scanning-derived metrics for detecting road networks [52,53]. Caltagirone et al. [52] developed a fast fully convolutional neural network (FCN) for road detection through the metrics of average elevation and density layers derived from laser scanning data. They reported excellent performance of this approach in detecting roads, particularly for real-time applications. Similarly, Verschoof-van der Vaart et al. [53] demonstrated the efficiency of CarcassonNet using a digital terrain model (DTM) derived from laser scanning data for detecting and tracing archaeological objects such as historical roads in the Netherlands.

Although the performance of CNN methods for extracting roads and their components has been well documented using VHR and UAV images for public roads [51], this efficiency requires greater scrutiny in more complex backgrounds, such as for detecting commercial forest roads or logging trails in forests, and with different data such as laser scanning data. Therefore, this study seeks to test the performance of U-Net, as one of the most popular CNN architectures, in integration with high-density laser scanning data for detecting logging trails, as one of the most complex networks regarding geometry and visibility in the mechanized forests of Finland.

The main purpose of this research is to develop an end-to-end deep learning-based approach that uses the metrics of high-density laser scanning data to automate the detection of logging trails in forest stands that have undergone commercial thinning. Specifically, we aim to comparatively evaluate the performance of a trained U-Net algorithm by using different derivatives of laser scanning datasets (i.e., canopy height and elevation-based models) for the detection of logging trails. We also investigate the performance of this approach in detecting logging trails in young and mature stands with different development classes.

#### **2. Materials and Methods**

#### *2.1. Description of the Study Area*

We focused our research on the Kakkurinmaa, Länsi-Aure, and Karpanmaa forests in the municipalities of Parkano and Ikaalinen, Southern Finland. The Kakkurinmaa and Karpanmaa forests are owned by Finsilva Oyj, and the Länsi-Aure forest, as governmental public land, is managed by state-owned Metsähallitus. The forest areas are structured in spatially uniform forest stands, typically 3–10 hectares in size. The tree species are pine, spruce, and birch, with a predominance of pine in the three regions. The stands are managed as even-aged stands, and their ages range between 34 and 72 years. The height of trees ranges between 5 and 30 m. Forest stands are typically thinned 2–3 times during a rotation period, in which around 25–30% of the trees are removed [4,54]. We classified forest stands by age, height, and thinning operations into four development categories to facilitate the detection of logging trails (Figure 1): (1) young stands before the first commercial thinning, (2) young stands that had experienced the first commercial thinning, (3) mature stands before the second commercial thinning, and (4) mature stands that had undergone the second or third thinning operation. Logging trails may be visible within Categories 2 and 4 stands (Figure 1b,d); however, in some development classes, for example, within Category 3 stands, old logging trails are very challenging to find (Figure 1c).

#### *2.2. Data*

We obtained a license to access the high-quality laser scanning data for the study area in 2020, under the framework of the National Land Survey of Finland (NLS). These data are the latest and most accurate laser scanning data that have been collected by the NLS in Finland. The density of data is at least 5 points per square meter, as the average distance between points is circa 40 cm. The mean altimetric error of the data is less than 10 cm and the mean error of horizontal accuracy is less than 45 cm [55]. To detect logging trails, we extracted the canopy height and the elevation metrics after processing the high-quality laser scanning data. The characteristics of the forest stands (e.g., species composition, age, height, and thinning history) and their boundaries were collected from the databases of Finsilva Oyj and Metsähallitus. These data were used for the classification of the stands as described in Section 2.1. A further set of required data, such as topographic maps and the time-series of orthophotos, was also obtained from the open databases of the NLS [56].

**Figure 1.** Forest stands regarding commercial thinning: (**a**) young stands before the first commercial thinning; (**b**) young stands after the first commercial thinning; (**c**) mature stands before the second commercial thinning; and (**d**) mature stands after the second/third commercial thinning. The logging trails are visible in Categories (**b**) and (**d**), but they are difficult to spot in Category (**c**).

We used these data to create the labeled dataset of logging trails for training the U-Net algorithm. In addition to the extensive ground-truth samplings of the logging trails to test the algorithm efficiency (Section 2.5), we visited the logging trails and recorded some tracks in three regions before creating the dataset of labels.

#### *2.3. Training Datasets*

We selected 44 laser scanning tiles of 1 × 1 km to create image and labeled datasets for training the deep-learning algorithm. After decompressing the laser scanning datasets, we merged the tiles and derived the required data from the point clouds, such as the height metrics and elevation models. The canopy height model (CHM) was utilized [57] with a spatial resolution of 0.5 m to estimate the total height of trees. Binning interpolation was adopted to derive two digital elevation models (DEMs), one based on the minimum cell assignment type (DEMmin) (i.e., close to the terrain, using the points with minimum elevation) and one based on the average cell assignment type (DEMavg), as well as a digital surface model (DSM) based on the maximum cell assignment type [58]. For example, to form the DSM, each output cell was assigned the maximum value of the points that fall within its extent. The values of all the raster models were normalized between 0 and 255 using a min–max scaling method. Finally, we smoothed the raster layers by calculating their median value in a 3 × 3 neighborhood around each cell.
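The raster derivation described above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' processing chain: the function names, the toy point list, the row/column convention (row index grows with y), and the handling of empty or edge cells are all assumptions made for illustration.

```python
from statistics import mean, median

def bin_points(points, cell_size, n_rows, n_cols, reducer):
    """Binning: give each cell the reducer (min, mean or max) of the z values inside it."""
    cells = [[[] for _ in range(n_cols)] for _ in range(n_rows)]
    for x, y, z in points:
        row, col = int(y // cell_size), int(x // cell_size)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            cells[row][col].append(z)
    # Assumption: empty cells stay None; a real workflow would fill them.
    return [[reducer(zs) if zs else None for zs in row] for row in cells]

def min_max_scale(raster, lo=0.0, hi=255.0):
    """Rescale all valid cell values linearly to the range [lo, hi]."""
    values = [v for row in raster for v in row if v is not None]
    v_min, v_max = min(values), max(values)
    span = (v_max - v_min) or 1.0  # avoid division by zero on flat rasters
    return [[None if v is None else lo + (v - v_min) * (hi - lo) / span
             for v in row] for row in raster]

def median_smooth(raster):
    """3 x 3 median filter; edge cells use only the neighbours that exist."""
    n_rows, n_cols = len(raster), len(raster[0])
    out = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            window = [raster[i][j]
                      for i in range(max(0, r - 1), min(n_rows, r + 2))
                      for j in range(max(0, c - 1), min(n_cols, c + 2))
                      if raster[i][j] is not None]
            out_row.append(median(window) if window else None)
        out.append(out_row)
    return out

# Hypothetical (x, y, z) points in metres on a 1 m x 1 m area with 0.5 m cells.
points = [(0.1, 0.1, 10.0), (0.2, 0.3, 14.0), (0.7, 0.2, 12.0),
          (0.6, 0.4, 20.0), (0.2, 0.8, 11.0), (0.9, 0.9, 30.0)]
dem_min = bin_points(points, 0.5, 2, 2, min)   # minimum cell assignment
dem_avg = bin_points(points, 0.5, 2, 2, mean)  # average cell assignment
dsm = bin_points(points, 0.5, 2, 2, max)       # maximum cell assignment
dsm_scaled = min_max_scale(dsm)
dsm_smooth = median_smooth(dsm)
```

The three rasters differ only in the reducer passed to `bin_points`, which mirrors the distinction between DEMmin, DEMavg, and DSM in the text.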

The labels of logging trails were generated from a variety of resources such as orthophotos (Figure 2a), tree height (Figure 2b), and profiles extracted from the laser scanning points (Figures 2c and 3). The ground elevation model was used to discriminate ditches and forest roads from the logging trails (Figure 2d). We digitized a total of 336 km of logging trails and then defined a 2 m buffer, as the width of a segment, from the centerline. The logging trails were converted into a binary image containing the cells with the labels 0 (non-trail) and 1 (trail) (Figure 2e).
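As a rough illustration of how a digitized centerline could be turned into a binary label raster, the sketch below marks a cell as trail when its center lies within the buffer distance of the centerline. It assumes the 2 m buffer is applied on each side of the centerline (the text can be read either way), a 0.5 m cell size, and a hypothetical straight trail; `rasterize_trail` and `point_segment_distance` are illustrative names, not functions from any GIS library.

```python
import math

def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest distance from point P to the segment AB."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of P on AB, clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def rasterize_trail(centerline, n_rows, n_cols, cell_size=0.5, buffer_m=2.0):
    """Label a cell 1 (trail) if its center is within buffer_m of the centerline, else 0."""
    segments = list(zip(centerline[:-1], centerline[1:]))
    mask = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            cx, cy = (c + 0.5) * cell_size, (r + 0.5) * cell_size
            if any(point_segment_distance(cx, cy, ax, ay, bx, by) <= buffer_m
                   for (ax, ay), (bx, by) in segments):
                mask[r][c] = 1
    return mask

# Hypothetical straight trail crossing a 10 m x 10 m window at y = 5 m.
mask = rasterize_trail([(0.0, 5.0), (10.0, 5.0)], n_rows=20, n_cols=20)
trail_cells = sum(map(sum, mask))
```

With these assumptions the labeled strip is eight cells (4 m) wide, which is consistent with the 4–5 m trail width cited in the Introduction.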

The images and their corresponding labels were converted into the patches with a size of 256 × 256 cells (Figure 4a) before entering these into U-Net. In total, we selected 1888 image patches and their corresponding labels for training (75%) and validation (25%) of the U-Net. We excluded some image patches from training datasets that were in the areas selected for collecting test data, as described in Section 2.5.
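The patch extraction and the 75%/25% split can be sketched as follows. The text does not state whether patches overlap or how the split was drawn, so the non-overlapping tiling, the random shuffle, and the seed are assumptions; only the patch size, the patch count, and the split ratio come from the text.

```python
import random

def tile_origins(n_rows, n_cols, patch=256):
    """Top-left corners of non-overlapping patches that fit entirely inside the raster."""
    return [(r, c)
            for r in range(0, n_rows - patch + 1, patch)
            for c in range(0, n_cols - patch + 1, patch)]

def split_patches(patch_ids, train_fraction=0.75, seed=0):
    """Shuffle patch identifiers and split them into training and validation sets."""
    ids = list(patch_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# A 1 km x 1 km tile at 0.5 m resolution is 2000 x 2000 cells.
origins = tile_origins(2000, 2000)

# 1888 patches in total, as reported in the text.
train_ids, val_ids = split_patches(range(1888))
```

Under these assumptions, the 1888 patches divide into 1416 training and 472 validation patches.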

**Figure 2.** References comprising (**a**) near-infrared orthophotos and the derivatives of high-density laser scanning data such as (**b**) canopy height model, (**c**) tree profiles, and (**d**) the ground elevation model, used to produce the labeled datasets (**e**) from logging trails for training the U-Net convolutional neural network architecture. While the orthophoto, tree height, and tree profiles enhanced the visibility of logging trails, the digital terrain model highlighted the ditches and roads that might inadvertently be digitized as logging trails during creation of the labeled dataset.

**Figure 3.** The profile of the point cloud of a laser scanning dataset within a young stand that has undergone its first commercial thinning (**a**–**f**,**j**,**k**). The intervals between two logging trails and their footprint are shown on the layers of canopy height and tree profiles.

#### *2.4. U-Net Architecture*

The U-Net is one of the cutting-edge convolutional neural network architectures for image segmentation due to its simple structure, ability to work with little training data, and high performance [59,60]. The U-Net concatenates low-level information with high-level semantic information derived from the convolutional layers. This strategy enables it to produce accurate prediction maps, even with limited training data [59]. The U-shaped structure of U-Net consists of a contraction path (encoder) and an expansion path (decoder). The extraction of low-level features and the reduction of spatial dimensions are implemented in the contraction path, while the spatial dimensions of the features are increased through a series of upward convolutions and concatenations in the expansion path. In the architecture of our U-Net (Figure 4b), the contraction path consists of four steps, each comprising two 3 × 3 convolution layers. Each convolution layer uses same padding and is followed by a ReLU activation function and a batch normalization layer. The spatial dimensions of the features are reduced using a 2 × 2 max-pooling layer. The number of filters is doubled, while the spatial dimensions are halved, at each contraction step. In our U-Net, the first and last convolution layers of the contraction path have 16 and 128 filters, respectively. The expansion path consists of a sequence of feature upsampling operations, followed by transposed convolution layers with a stride of 2. The upsampling layers combine the high-level features with the corresponding features in the contraction path using intermediate concatenations. A bottleneck layer with 256 filters is located between the contraction and expansion blocks as well (Figure 4b). The output is a 1 × 1 convolutional layer with a single output channel, followed by a sigmoid activation function (Figure 4c).
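To make the dimensions of this architecture concrete, the following sketch traces the spatial size and filter count through each block. It only does bookkeeping and does not build a network. The encoder and bottleneck values follow the text; the assumption that the decoder mirrors the encoder filter counts, and the block names, are illustrative.

```python
def unet_shapes(input_size=256, first_filters=16, steps=4):
    """Trace (block name, spatial size, filters) through a U-Net-style network."""
    trace = []
    size, filters = input_size, first_filters
    for step in range(1, steps + 1):
        # Two 3 x 3 same-padded convolutions keep the spatial size unchanged.
        trace.append((f"encoder_{step}", size, filters))
        size //= 2    # 2 x 2 max pooling halves the spatial dimensions
        filters *= 2  # the number of filters doubles at each step
    trace.append(("bottleneck", size, filters))
    for step in range(steps, 0, -1):
        size *= 2      # stride-2 transposed convolution doubles the size
        filters //= 2  # assumption: the decoder mirrors the encoder filters
        trace.append((f"decoder_{step}", size, filters))
    trace.append(("output", size, 1))  # 1 x 1 convolution + sigmoid
    return trace

trace = unet_shapes()
for name, size, filters in trace:
    print(f"{name:<12} {size:>3} x {size:<3} filters={filters}")
```

For a 256 × 256 input, the bottleneck operates on 16 × 16 feature maps with 256 filters, and the output returns to 256 × 256 with one channel, matching the fixed input and output sizes reported for the model.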

**Figure 4.** Architecture of the constructed U-Net for detecting logging trails using high-density laser scanning data: (**a**) preparation of a laser scanning tile for use in the U-Net to detect logging trails; (**b**) architecture of the designed U-Net, which includes the contraction path and the expansion path; and (**c**) predicted logging trails.

The U-Net architecture was constructed and trained in Python using the Keras and TensorFlow libraries [61]. The model was trained on an NVIDIA Quadro RTX 4000 GPU with 8 GB of memory. We implemented the Hyperband algorithm in Keras Tuner to search for the optimal set of hyperparameters for our algorithm [62], such as the optimization algorithm, learning rate, dropout rate, batch size, and loss function [20]. The model builder was used to define the search algorithm and hypertuned model. The model was trained using the training data and evaluated using the test data. Table A1 shows a number of tuned optimal values for the hyperparameters in training the U-Net. The minimum number of epochs was set at 100, and an early-stopping rule was applied to halt training in case of overfitting. The cross-entropy loss function was used to monitor how poorly the U-Net was performing. The plots of accuracy and loss versus the epochs in the training of U-Net are provided in Figure A1.
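The two training controls mentioned above, a cross-entropy loss and an early-stopping rule, can be sketched in plain Python. This is not the Keras implementation used in the study: the binary form of the loss is assumed from the two-class sigmoid output, and monitoring the validation loss with a `patience` parameter is an assumption, as the text does not give the stopping criterion.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)

def early_stopping_epoch(val_losses, patience=10):
    """Return the 1-based epoch at which training stops.

    Training stops once the monitored loss has not improved for
    `patience` consecutive epochs (hypothetical criterion).
    """
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return len(val_losses)

# An uninformative prediction of 0.5 for every cell gives a loss of ln(2).
loss_uninformative = binary_cross_entropy([1, 0], [0.5, 0.5])

# Hypothetical validation losses: the best value occurs at epoch 3.
stop_epoch = early_stopping_epoch([0.9, 0.7, 0.6, 0.61, 0.62, 0.63], patience=3)
```

In the hypothetical loss curve, the loss stops improving after epoch 3, so training halts at epoch 6 with a patience of three epochs.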

Figure 5 shows an example of the predicted logging trails from DSM data, using the trained U-Net. The algorithm accepts an input layer (i.e., a DSM) with a fixed size (256, 256, 1). It produces different feature maps in the intermediate steps, such as convolution, batch normalization, dropout, and max-pooling layers. The convolutional layers generate several spatial features from small parts of the image, based on the defined number and size of the filters. The batch normalization layer normalizes the outputs of the previous layers in the network. The dropout layer reduces the complexity of the network. The batch normalization and dropout layers act as regularizers to avoid overfitting in the model. The max-pooling layer reduces the scale of the features in each step of the contraction path [63]. The output layer indicates the probability of existing logging trails and has the same fixed size as the input layer. A few low-level feature maps generated from 32 filters (3 × 3) in the second block of the contraction path, along with the high-level feature maps obtained during the expansion path with the same filters, are shown in Figures 5b and 5c, respectively.

**Figure 5.** Visualization of different layers of the U-Net: (**a**) the input layer (e.g., a DSM derived from high-density laser scanning data) with a fixed size (256, 256, 1); a few intermediate feature maps such as convolutional layer, batch normalization, dropout, and max pooling generated from 32 filters (**b**) in the contraction path and (**c**) in the expansion path; and (**d**) the output layer of logging trails with the same size as the input layer.

#### *2.5. Accuracy Assessment*

#### 2.5.1. Collecting Testing Data from Logging Trails

We selected some stands to collect testing data from logging trails in the Kakkurinmaa, Länsi-Aure, and Karpanmaa regions (Figure 6a). We designed 21 routes to collect the samples from segments to cover all of the logging trails within a stand (Figure 6b–d). Each route consisted of endpoints, trail segments, and edges (interval between two segment trails) (Figure 6f,g). The segments and edges indicated ground-truth trails and non-trails, respectively.

**Figure 6.** Collecting testing samples from logging and non-logging trails. (**a**) Selected forest stands for sampling from the logging trails in the Parkano and Ikaalinen areas in southern Finland; (**b**–**e**) designated routes for testing segments (logging trails) and edges (no logging trails) in the three selected sites; (**f**) an example of a designed route and (**g**) its components.

The length of a sample trail segment was approximately 30 m; it may be longer in some cases, however, due to certain conditions such as existing connections or looped trails at the edges. Each segment has a start point and an endpoint, both of which are called endpoints. The positions of endpoints were converted into the GPS Exchange Format (GPX) and imported into a Garmin Oregon 750t GNSS receiver. The routes were reconstructed based on their corresponding endpoints and then navigated point by point with a PDOP (position dilution of precision) of less than 3 m. After finding the approximate location of an endpoint, the surveyor moved to the center of the trail and recorded the segment between the two endpoints using a Trimble GeoXT GNSS receiver. The surveyor also checked for the existence of any possible trails between two adjacent trails in the connector edges. The attributes of each endpoint, segment, and edge (e.g., PDOP, dominant tree species, existence of a trail or other objects) were recorded. The data were transferred into GPS Pathfinder Office to correct errors based on the nearby GPS base stations to achieve an accuracy of less than 50 cm. The corrected data files were exported in shapefile format for use in assessing the accuracy of the trails predicted by the trained U-Net using the high-density laser scanning datasets.

#### 2.5.2. Accuracy Metrics

A confusion matrix was constructed to assess the accuracy of the trained U-Net through the testing data in predicting logging trails using the laser scanning-derived datasets. The confusion matrix consisted of the number of the ground-truth samples that were labeled as logging trails on the ground and predicted as logging trails through the U-Net (*TP*), the number of samples that were labeled as non-logging trails and predicted as non-logging trails (*TN*), the number of samples that were labeled as logging trails but predicted as non-logging trails (*FN*), and the number of samples that were labeled as non-logging trails but predicted as logging trails (*FP*). Cohen's kappa, overall accuracy, intersection over union (*IoU*), and recall metrics were then derived from the confusion matrix to quantify the U-Net's performance in detecting logging trails from the canopy height and elevation models.

Cohen's kappa indicates the ratio of agreement after removing chance agreement [64,65]. It was calculated as Equation (1) [20] with respect to the observed accuracy ($P_0$) and the randomly expected accuracy ($P_e$).

$$\text{Cohen's kappa} = \frac{P_0 - P_e}{1 - P_e} \tag{1}$$

$$P_0 = \frac{TP + TN}{N} \tag{1a}$$

$$P_e = \frac{(TP + FN) \times (TP + FP)}{N^2} + \frac{(TN + FP) \times (TN + FN)}{N^2} \tag{1b}$$

where *N* is the total number of ground-truth samples.

The overall accuracy indicates the ratio of correct predictions for both logging trail and non-logging trail classes (Equation (2)).

$$\text{Overall Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{2}$$

*IoU* expresses the similarity ratio between the predicted logging trails and the corresponding segments of ground truth samples (Equation (3)).

$$IoU = \frac{TP}{TP + FP + FN} \tag{3}$$

*Recall* expresses the completeness of the positive predictions. It is the proportion of real instances of the target class (i.e., logging trails) that are correctly detected by the model (Equation (4)).

$$Recall = \frac{TP}{(TP + FN)}\tag{4}$$
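Equations (1)–(4) can be computed directly from the four confusion-matrix counts, as in the short sketch below. The function name and the counts in the example are hypothetical and are not results from this study.

```python
def segmentation_metrics(tp, tn, fp, fn):
    """Cohen's kappa, overall accuracy, IoU and recall from confusion-matrix counts."""
    n = tp + tn + fp + fn
    p0 = (tp + tn) / n                                             # Equation (1a)
    pe = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2  # Equation (1b)
    return {
        "kappa": (p0 - pe) / (1 - pe),   # Equation (1)
        "overall_accuracy": p0,          # Equation (2)
        "iou": tp / (tp + fp + fn),      # Equation (3)
        "recall": tp / (tp + fn),        # Equation (4)
    }

# Hypothetical counts for 100 samples, for illustration only.
metrics = segmentation_metrics(tp=40, tn=45, fp=5, fn=10)
```

For these counts, $P_0 = 0.85$ and $P_e = 0.5$, which gives a kappa of 0.70, an *IoU* of about 0.727, and a recall of 0.80. Note that *IoU* ignores true negatives, so it can differ noticeably from the overall accuracy when the non-trail class dominates.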

#### **3. Results**

#### *3.1. Performance of Trained Models*

#### 3.1.1. Detection of Logging Trails in the Entire Forest

The results of the accuracy assessment of the trained U-Net using the CHM, DSM, and DEMs datasets in distinguishing logging trails from non-logging trails demonstrate the superior performance of the DSM (Table 1). The accuracy metrics show almost excellent performance of the U-Net using the DSM (*k* = 0.846 and *IoU* = 0.867), substantial performance using the CHM (*k* = 0.734 and *IoU* = 0.782), moderate performance using the DEMavg (*k* = 0.528 and *IoU* = 0.587), and a slight performance using the DEMmin (*k* = 0.136 and *IoU* = 0.155). The values of *Recall* show the excellent performance of trained U-Net using the DSM (0.959) and the CHM (0.908) in detecting the logging trail class.

**Table 1.** The accuracy of the trained U-Net using the derivatives of high-density laser scanning data, including the canopy height model (CHM), the digital surface model (DSM), and the digital elevation models based on the average (DEMavg) and minimum (DEMmin) values to distinguish the logging trails from the non-logging trails in three testing forests in southern Finland.


#### 3.1.2. Detection of Logging Trails in Different Stages of Commercial Thinning

The performance of the trained U-Net using the CHM, the DSM, and the DEMs also varies in distinguishing logging trails from non-logging trails across the four classes of stand development (Figure 7). Although the trained U-Net using CHM could clearly distinguish logging trails from non-logging trails in young stands after the first commercial thinning (*k* = 0.859 and *IoU* = 0.893) and in mature stands after the second/third commercial thinning (*k* = 0.834 and *IoU* = 0.876), it shows moderate performance in mature stands before the second commercial thinning (*k* = 0.438 and *IoU* = 0.505) (Figure 7a).

Similarly, the trained U-Net using DSM showed excellent performance to distinguish the logging trails from the non-logging trails in young stands (*k* = 0.953 and *IoU* = 0.963) and mature stands (*k* = 0.854 and *IoU* = 0.889) after receiving the commercial thinning operations. The efficiency of the trained model using DSM is higher than the trained model using CHM in mature stands before the second commercial thinning (*k* = 0.684 and *IoU* = 0.686) (Figure 7b).

The trained U-Net using DEMavg showed moderate performance in detecting logging trails within thinned stands, with slightly better performance in the mature stands after receiving the commercial thinning (*k* = 0.542 and *IoU* = 0.667) (Figure 7c). The trained U-Net using DEMmin demonstrated a slight performance in all four stand classes. The accuracy values in the mature stands with commercial thinning are slightly better than in the other stands (*k* = 0.179 and *IoU* = 0.218) (Figure 7d).

#### *3.2. Prediction of Logging Trails*

Figure 8 shows some examples of logging trails predicted by the trained U-Net using different datasets within different stages of commercial thinning. Logging trails were detected with high probabilities using both CHM (Figure 8b,c) and DSM (Figure 8f,g) datasets in young stands and mature stands that had undergone commercial thinning. The logging trail patterns detected by these two models were very similar. However, the trained model using DSM detected the trails under the canopy with a higher probability. In the old stands before the second commercial thinning, the trained U-Net, based on both the CHM (Figure 8d) and the DSM (Figure 8h), predicted some segments of a trail with a high probability and other segments with a low probability. Typically, most of these segments are located in complex backgrounds that are clogged by regenerated trees or seedlings. However, this detection, even with a low probability, can be used to restore the original network of old logging trails in this type of stand.

**Figure 7.** Comparison of the accuracy of the trained U-Net (**a**) using the canopy height model (CHM), (**b**) using the digital surface model (DSM), (**c**) using the average digital elevation model (DEMavg), and (**d**) using the minimum digital elevation model (DEMmin) in detecting logging trails from non-logging trails in different stages of commercial thinning operations.

The trained U-Net using the DEMavg dataset demonstrated a weak prediction of logging trails in the young stands that had received the first thinning (Figure 8j), a relatively high prediction in the mature stands that had received the second thinning (Figure 8k), and a moderate prediction in the old stands (Figure 8l). The trained U-Net using the DEMmin dataset only indicated a high prediction of logging trails in mature stands after a second or third commercial thinning (Figure 8o). As logging trails were not established in young stands before the first commercial thinning, the trained models did not predict any significant segments as part of a logging trail (Figure 8a,e,i,m).

**Figure 8.** Comparison of the probability of predicting logging trails using U-Net in different forest development classes based on (**a**–**d**) the canopy height model (CHM), (**e**–**h**) digital surface model (DSM), and (**i**–**p**) digital elevation models (DEMs), in a patch with a size of 256 by 256. Although the U-Net using DSM and CHM showed high probability in detecting logging trails, it showed weak and moderate probabilities using DEMmin and DEMavg throughout the forest stand classes, except for mature stands that received the final commercial thinning operations.

#### **4. Discussion**

#### *4.1. Distinguishing Logging Trails from Non-Logging Trails Using U-Net*

The developed U-Net algorithm can distinguish logging trails from non-logging trails with almost perfect accuracy in the studied forest stands. The algorithm could precisely classify wide-open, polygonal spaces within the stands, such as forest storage areas and landing areas, as non-logging trails (Figure 9b). Nevertheless, a few narrow corridors, mostly within the mature stands that had not been thinned for a long time, were predicted as logging trails (Figure 9f). Additionally, some linear features, such as drainage ditches with geometric characteristics similar to those of logging trails (e.g., ditch width/area cleared of trees), may be misidentified as logging trails in some stands (Figure 9g,h). We classified the testing samples of these objects as *FP* samples in the confusion matrix during the performance assessment. However, the pattern of the corridors in the network and their geometric characteristics, such as spacing and width, might cause the U-Net to recognize them as logging trails. Forest roads were detected as non-logging trails in all stands; the specific geometry of a forest road and its texture on the DSM or CHM enabled the U-Net to distinguish it from a logging trail (Figure 9c). Whereas previous studies reported the efficiency of U-Net in detecting road areas using VHR or UAV images [24–29], this study demonstrates its efficiency in detecting logging trails using high-density laser scanning data as well. On the basis of traditional machine learning, some studies have extracted numerous metrics from laser scanning data to achieve accurate segments of roads under the canopy [10,16]. However, logging trail segmentation using our trained U-Net does not require laborious feature extraction or post-processing to detect the final trail using laser scanning-derived metrics. The developed end-to-end convolutional neural network approach takes the image patches of the DSM or CHM, derived from laser scanning points, as inputs without extensive pre-processing and creates trail segments without requiring specific post-processing.
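The agreement measures used in the performance assessment (*k* and *IoU*) follow directly from the four confusion-matrix counts. The following is a minimal sketch, assuming the standard definitions of Cohen's kappa and of intersection over union for the logging-trail class; the function name and the counts in the example are hypothetical and are not the study's results.

```python
def kappa_and_iou(tp, fp, fn, tn):
    """Return (Cohen's kappa, IoU of the positive class) from confusion counts."""
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement between prediction and ground truth
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when only one class is present; 0.0 is an
        # arbitrary convention chosen for this sketch.
        kappa = 0.0
    else:
        kappa = (observed - expected) / (1.0 - expected)
    union = tp + fp + fn
    iou = tp / union if union else 0.0
    return kappa, iou


# Hypothetical counts, for illustration only
kappa, iou = kappa_and_iou(tp=40, fp=10, fn=5, tn=45)
print(round(kappa, 3), round(iou, 3))
```

In this formulation, a corridor or ditch predicted as a logging trail adds to `fp`, which lowers both measures.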

#### *4.2. Detection of Logging Trails in Different Stages of Commercial Thinning*

Using the CHM and DSM datasets, our algorithm perfectly detected logging trails in both young and mature stands that had undergone commercial thinning operations (Figure 7a,b). The misidentification of some drainage ditches as logging trails mainly occurred in these two types of stand; we recommend excluding these from the final network. Triangular irregular networks (TIN), which are derived from the laser scanning data, can clearly reveal drainage ditches (Figure 9h) and help solve this problem. Moreover, using the DSM, the U-Net was able to detect logging trails within mature stands that had not recently undergone a second commercial thinning. The logging trails in these stands do not form a continuous network, as opposed to those in stands that have undergone recent commercial thinning operations (Figure 8). Some segments of logging trails in the old stands are occluded by regenerated young trees (Figure 9i). The U-Net nevertheless detected some of these clogged trails with a lower probability, which may aid in reconstructing the original network of logging trails in these stands, similar to the approach proposed in [53] for restoring the network of historical roads through hollow roads detected by CarcassonNet and a laser scanning-derived DTM.
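The distinction between high- and low-probability detections can be made explicit when a predicted probability patch is post-processed. The following is a minimal sketch, assuming the U-Net outputs a per-pixel probability in [0, 1]; the thresholds (0.5 and 0.2) and the function name are hypothetical choices for illustration, not values used in this study.

```python
def classify_confidence(prob_patch, high=0.5, low=0.2):
    """Label each pixel: 2 = confident trail, 1 = low-probability candidate, 0 = background."""
    if not 0.0 <= low < high <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= low < high <= 1")
    return [
        [2 if p >= high else 1 if p >= low else 0 for p in row]
        for row in prob_patch
    ]


# Hypothetical 2 x 3 probability patch
patch = [[0.05, 0.30, 0.90],
         [0.10, 0.45, 0.85]]
labels = classify_confidence(patch)
print(labels)
```

Pixels labeled 1 would correspond to the occluded segments that are kept as candidates for restoring the original network rather than discarded.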

#### *4.3. Geometric Properties of the Predicted Logging Trails*

The trained U-Net reproduced the geometric properties of the logging-trail network as accurately as those in the labeled dataset used for its training. It recognized the pattern of a network within a stand (Figure 9) and attempted to keep the average spacing (i.e., 20–25 m) between the logging trails, while avoiding any overlap between them, particularly in the stands that were thinned (Figure 9a). The connection between the trails occurred at the endpoints or through intermediate trail connections that looped the trails (Figure 9d,e). The algorithm also detected those segments of a trail that were clogged by new trees, mostly in mature stands that had not been thinned over a long period of time (Figure 9i). However, it did retain the overall pattern of a network, making it possible to restore the missing segments and the original network. Similarly, earlier studies reported the efficiency of some CNN-based algorithms, such as CasNet [34] and DH-GAN [66], for extracting some characteristics of main roads using VHR images.

**Figure 9.** The ability of the developed U-Net to detect the characteristics of logging trails: (**a**) patterns and geometric properties of the detected logging trails, such as trail spacing; (**d**) intermediate trail connections; and (**e**) looped trails through the U-Net and the DSM dataset. The algorithm correctly distinguished some complex features such as (**b**) landing areas and (**c**) forest roads as non-logging trails in the vicinity of the logging trails; (**f**) a corridor that was wrongly identified as a logging trail; (**g**,**h**) a deep ditch that was detected as a non-logging trail and a shallow ditch that was detected as a logging trail; and (**i**) the occlusion of an old logging trail by regenerated trees, although the algorithm was able to predict it as a logging trail with a lower probability.

No ground data were available to measure the exact width of the logging trails. Therefore, we took the standard width of 4 m for a logging trail into account during the creation of the labeled dataset. We attempted to select trails that are visible in the set of our applied sources (e.g., orthophotos and tree profiles), particularly for the stands that had undergone commercial thinning. We randomly visited some of the logging trails within the selected sites to achieve the highest confidence in the created labeled dataset, before training the model.
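Labeling with a fixed standard width amounts to marking every raster cell whose center lies within half the trail width of a digitized centerline. The following is a minimal sketch under that assumption; the cell size, the coordinates, and the function names are hypothetical, and it does not reproduce the tools actually used to create the labeled dataset.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Euclidean distance from point P to the line segment AB."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection of P onto AB, clamped to the segment ends
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def rasterize_trail(centerline, rows, cols, cell_size=1.0, width=4.0):
    """Binary mask: 1 where the cell center is within width / 2 of the centerline."""
    half = width / 2.0
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            x = (c + 0.5) * cell_size
            y = (r + 0.5) * cell_size
            for (ax, ay), (bx, by) in zip(centerline, centerline[1:]):
                if point_segment_distance(x, y, ax, ay, bx, by) <= half:
                    mask[r][c] = 1
                    break
    return mask


# Hypothetical straight trail crossing an 8 m x 8 m tile at y = 4 m (1 m cells)
mask = rasterize_trail([(0.0, 4.0), (8.0, 4.0)], rows=8, cols=8)
for row in mask:
    print("".join(str(v) for v in row))
```

With 1 m cells, the 4 m standard width yields a band four cells wide around the centerline; a different cell size changes only the number of cells in the band.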

The U-Net perfectly detected features as logging trails when their width was close to the average value. For example, the U-Net used this geometry to classify forest roads as non-logging trails. However, we could not find reliable labels in some complex stands, such as the mature stands that had not been thinned for a long time. In modern harvesting, harvesters and forwarders are equipped with a computer system and a global navigation satellite system (GNSS) [67,68] that enable them to record the tracks of logging trails with acceptable accuracy during thinning operations. We recommend employing this large dataset to train deep learning-based algorithms to sharpen the detection of logging trails using high-quality laser scanning data, particularly in complex stands.

To explore how well the developed U-Net algorithm performed with the high-quality laser scanning datasets, we applied a novel sampling method, with an extensive field survey of the predicted logging trails and non-logging trails in three selected forest sites. For this purpose, we collected adequate ground-truth samples (390) from the segments of predicted logging trails (with a size of circa 30 m) and from the intervals between two logging trails to check for possible missing trails that might not have been detected by the algorithm (Figure 6). This surveying method enabled us to take samples from almost all logging trails inside a stand; because a logging trail is designed as a continuous looped line starting from one side of the stand and continuing to the other side, a segment of this line represents the existence or non-existence of the entire trail. It also enabled us to detect the non-logging trail objects either at the location of predicted logging trails (i.e., segments) or in the space between the trails (i.e., edges). Therefore, we maintained a balance between the samples of logging trails and non-logging trail objects, which is crucial in assessing the efficiency of machine learning- or deep learning-based approaches, to avoid unbalanced testing data and the resulting misevaluation of the algorithm [69].

#### *4.4. DEM Drawbacks in Detecting Logging Trails*

Our research confirms the efficiency of the laser scanning-derived metrics that sharpen the changes in the canopy structure of the trees, such as the DSM and CHM, in detecting logging trails (Figure 3). In contrast, metrics such as the DEMs, which merely represent the topographic characteristics of the ground surface, failed to recognize logging trails in the stands that had undergone commercial thinning. Earlier studies reported the efficiency of high-quality laser scanning data for detecting old logging trails or skid trails in harvested forests using DEM derivatives, such as morphological metrics [10,70]. Conversely, using the DEMs dataset (i.e., close to the ground surface), our research did not verify the efficiency of U-Net for detecting logging trails in forests where harvesters or forwarders are used in commercial thinning (Figure 7). The soil damage created by harvesters and forwarders is reported to be less than that of skidders during forest operations [71]. Forwarders carry a large volume of timber, whereas skidders drag the logs on the ground over several passes, which results in soil disturbance, compaction, and rutting [72]. Moreover, Finland's forest management regulations do not allow heavy soil disturbances, such as deep ruts (>10 cm), during commercial thinning. They recommend spreading logging residues on the logging trails, particularly on routes prone to rutting, to minimize soil damage [5]. The looped pattern of a logging trail network and the optimal spacing retained between the trails within a stand may also help reduce the number of passes on specific trails and, in turn, soil disturbance in forest operations. These practices lead to minimal alterations in the natural condition of the ground by logging trails. Therefore, logging trails are not expected to emerge in the DEM, in contrast to skid trails [70], abandoned logging trails [10], drainage ditches, or forest roads (Figure 9). 
Nevertheless, a few logging trails were detected using the DEMs dataset in the mature stands that had undergone commercial thinning (Figure 8). There is a good chance that an increased number of passes by the machinery [73], the use of multifunctional and heavier harvesters/forwarders [1], heavier timber loads, and the concentration of forest operations during wet seasons [73] have all resulted in soil compaction and, consequently, alteration of the natural ground (i.e., terrain).

#### *4.5. Applications*

Our findings and the procedure that we developed have several implications for precision harvesting and sustainable forest management during forest operations. A holistic network of old logging trails may lead to a better understanding of the patterns, geometric characteristics, efficiency, and drawbacks of the network. This understanding provides a new perspective on the design of an optimal logging-trail network in new stands, one that can minimize the costs of thinning operations and the damage to the soil and the remaining trees. This new perspective also makes it possible to modify routes of a network that pass over soils with low bearing capacity, whether due to a weakness in the design of the initial network or to the deformation of the ground surface over time.

With a network of old logging trails available, operators can import the routes into the computer system of the harvesters/forwarders for accurate navigation of the machines. Doing so decreases the costs of finding the old trails and prevents the overthinning of the stand, which may occur when trees are removed to establish new trails. This is a crucial step toward achieving the aims of precision harvesting by minimizing the operation costs and preserving the forest landscape in modern forestry.

#### *4.6. Outlook*

Despite difficulties in finding reliable logging trails, we could collect acceptable patches of labeled datasets for training the U-Net algorithm. However, the datasets are limited to the Parkano and Ikaalinen areas, in Southern Finland. We strongly recommend employing a large dataset of logging trails that covers similar forest stands, with regard to commercial thinning, at least in the Nordic region for training the deep learning-based algorithm to achieve a versatile algorithm for the detection of logging trails.

The developed model performed with reasonable accuracy in detecting old logging trails in the mature stands that had not received the second thinning. However, detecting entire segments of a logging trail is still challenging in this type of stand. As mentioned earlier, providing an appropriate labeled dataset to improve the training of the algorithm, or testing the performance of other deep learning-based algorithms, may aid in sharpening the detection of old logging trails in the mature stands.

In some stands, the drainage ditches hampered the efficiency of U-Net using the DSM or CHM to distinguish the logging trails through the semantic segmentation procedure, which relies on binary segmentation. We recommend testing high-level semantic segmentation or instance segmentation that discriminates different objects from each other [74]. However, this requires a larger labeled dataset, depending on the number of objects.
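The difference between the binary setting and a multi-class alternative can be illustrated at the level of the output labels: instead of a single trail probability per pixel, a multi-class model produces one score per class, and each pixel takes the class with the highest score. The following is a minimal sketch; the label set (background, logging trail, drainage ditch), the scores, and the function name are hypothetical and do not describe a model trained in this study.

```python
# Hypothetical label set for a multi-class variant
CLASSES = ("background", "logging_trail", "drainage_ditch")


def multiclass_labels(class_scores):
    """Per-pixel argmax over class scores, where class_scores[k][r][c] is the score of class k."""
    n_classes = len(class_scores)
    rows = len(class_scores[0])
    cols = len(class_scores[0][0])
    return [
        [max(range(n_classes), key=lambda k: class_scores[k][r][c])
         for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical scores for a 1 x 3 patch, one layer per class
scores = [
    [[0.7, 0.1, 0.2]],  # background
    [[0.2, 0.6, 0.3]],  # logging trail
    [[0.1, 0.3, 0.5]],  # drainage ditch
]
labels = multiclass_labels(scores)
print([[CLASSES[k] for k in row] for row in labels])
```

Under such a scheme, a ditch would receive its own label instead of competing with logging trails for the single positive class, at the cost of requiring labeled examples for every class.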

#### **5. Conclusions**

In this research, we presented an end-to-end U-Net convolutional neural network that uses high-density laser scanning-derived metrics for logging trail extraction. We carried out an extensive field survey to test the efficiency of the trained model based on three metrics (i.e., DSM, CHM, and DEMs) in forests at different stages of commercial thinning. The trained U-Net using DSM was able to distinguish logging trails from the background with a high probability and very high performance, particularly in young and mature stands that had undergone commercial thinning. However, it needs to be improved for the very old stands that have not received a second commercial thinning for a long time. The developed model can be used easily by end-users, without heavy pre-processing of the laser scanning data or heavy post-processing of the outputs. We recommend creating a large labeled dataset from logging trails recorded by harvesters during thinning operations and using it to train deep learning-based algorithms. This would help to develop a versatile model that can extract logging trails in different forest management systems and different thinning stages, at least across the Nordic region.

**Author Contributions:** Conceptualization, O.A., J.U. and V.-P.K.; methodology, O.A., J.U. and V.-P.K.; data provision, J.U.; data preparation, O.A.; software and programming, O.A.; field investigation and sampling, O.A., J.U. and V.-P.K.; visualization, O.A.; writing—original draft preparation, O.A.; writing—review and editing, J.U. and V.-P.K.; supervision, J.U. and V.-P.K.; project administration, J.U. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work has been funded by the public-private partnership grant established for the professorship of forest operation and logistics at the University of Helsinki, grant number 7820148 and by the proof-of-concept-grant by the Faculty of Agriculture and Forestry, University of Helsinki, grant number 78004041. The APC was funded by University of Helsinki.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** We would like to thank Mikko Leinonen and Juho Luotola for assisting in the field operations. We would also like to express our gratitude to Finsilva Oyj and Metsähallitus for providing access to their forest holdings and related forest inventory databases.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **Appendix A**

**Table A1.** U-Net tuned hyperparameters.


**Figure A1.** Accuracy and loss versus epochs during training of U-Net using (**a**) the DSM, (**b**) the CHM, (**c**) the DEMavg, and (**d**) the DEMmin derived from high-density laser scanning data.

#### **References**

