### *5.3. Ground and Seabed Elevation*

Like the IR data, elevations contributed greatly to improving the OA of the habitat classification, but they could not be separated from the spectral predictors and used alone to detect land and sea covers accurately. Indeed, Table 3 shows that elevations alone produced a classification with an OA of 55%, which is lower than what spectral predictors achieve. This was expected, since many different habitats coexist at similar altitudes and are mainly differentiated by their reflectance. For some habitats, however, mainly the salt marsh types, elevation is an intrinsic quality and the basis of their definition. This explains why these classes benefited the most, in terms of classification accuracy, from the addition of elevation to a set of spectral predictors. Respectively, the classification accuracy of low salt marsh, mid salt marsh, and high salt marsh rose from 71%, 76%, and 21% by adding elevations to the green waveform features in the set of predictors (see Figures A4 and A6).

Even though elevation combined with spectral information already provided high classification accuracy (90%, see Table 3), the values extracted with our approach were not always consistent with those provided by the original PCs in marine areas, as explained above. To remove artifacts due to water quality, a post-processing step could be implemented, and the neighboring elevations could be used to regularize the processed PC obtained.
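As an illustration of the post-processing idea above, the following minimal sketch replaces an elevation with the median of its neighbors when it deviates strongly from them. It is not the processing chain used in this study: the function name `regularize_elevations`, the search radius, the deviation threshold, and the toy seabed points are all assumptions chosen for the example.

```python
import math
from statistics import median

def regularize_elevations(points, radius=2.0, max_dev=0.5):
    """Replace outlying elevations with the median of neighboring elevations.

    points: list of (x, y, z) tuples in meters.
    radius, max_dev: illustrative thresholds (assumptions, not study values).
    """
    cleaned = []
    for i, (x, y, z) in enumerate(points):
        # Horizontal neighbors within `radius` (brute force; a spatial index
        # would be needed for point clouds of millions of points).
        neighbors = [
            pz for j, (px, py, pz) in enumerate(points)
            if j != i and math.hypot(px - x, py - y) <= radius
        ]
        if neighbors:
            local = median(neighbors)
            # Only overwrite elevations that deviate strongly from the local median.
            if abs(z - local) > max_dev:
                z = local
        cleaned.append((x, y, z))
    return cleaned

# Toy seabed patch (hypothetical values): one point is 3 m too deep.
seabed = [(0, 0, -3.0), (1, 0, -3.1), (0, 1, -2.9), (1, 1, -6.0), (2, 1, -3.0)]
smoothed = regularize_elevations(seabed)
```

In this toy patch only the fourth point is modified; points that agree with their neighborhood keep their original elevation, which limits the smoothing of genuine relief.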

### *5.4. Classification Approach*

Our results showed that topobathymetric lidar is well suited to the classification of coastal habitats. Although elevations, IR intensities, and green waveform features did not individually produce high-accuracy classifications of the land-water continuum, they were complementary and achieved high-precision results when combined. To the best of our knowledge, no similar papers proposing point-based mapping of land and water covers from bispectral lidar data have been published, so no direct comparison of results is possible. However, our observations corroborate those made in [35], which successfully used random forest algorithms to classify full-waveform lidar data over urban areas and obtained an overall accuracy of over 94% when identifying four types of land covers. That paper focuses only on terrestrial areas but confirms the high accuracy we observe when using waveform features without rasterization for mapping purposes. Class-wise, our results seem more homogeneous for the land covers we have in common, although this means that our approach performs less accurately than theirs on some urban classes. Indeed, ref. [35] presents a PA of 94.8% for buildings, which is higher than what we observe for our roof class (89%), but our vegetation classes (trees and shrubs) have an average PA of 82.6%, while theirs is 68.9%, and our natural ground classes (soil, lawn, salt marsh) reach an average PA of 91.6%, higher than the 32.7% presented in [35]. Our approach and the method introduced in [35] perform similarly on artificial ground (for us, tar and concrete), with PAs of 96% and 96.4%, respectively.

Although we found no other research performing point-based classification of subtidal, intertidal, and supra-tidal habitats, we can compare our findings to those in [34], where the authors also observed that the use of waveform data improves seabed maps and obtained an OA of 86% for their classification of seabed substrates and aquatic macrovegetation. Their approach provides a better mapping of low underwater vegetation on soft substrate (100% versus 85% of PA in our case) but a less accurate detection of hard seabed substrate (68% versus 90% of PA in our study). Again, although we have less accurate results for some classes, our method seems to provide more balanced and homogeneous performances among different classes.

Our results also confirm those from [27], where 19 land-water continuum habitats were classified with an OA of 90%, and in which the authors concluded that the best classification results were obtained when combining spectral information and elevation. However, in [27], the authors used digital models of waveform features that they obtained by rasterizing their data, and they relied on an ML classifier. Although our metrics are similar, our classification has the advantage of preserving the spatial density and distribution of the data.

Other studies such as [49–51] used 2D lidar-derived data and imagery along with machine learning classifiers to map coastal habitats similar to the ones we attempted to map. They obtained performance scores in the same range as ours, with OA between 84% and 92%. The authors did not use waveform data in these studies and observed low accuracy when classifying only digital elevation models obtained from lidar surveys, which made the additional processing of imagery necessary. Our approach has the advantage of requiring only one of the two data sources often used in the existing literature, which facilitates both acquisition and processing.

Overall, our results are in line with [27,35,52], which all state that bathymetric lidar waveforms are well suited for benthic habitat mapping and observe the same complementarity between spectral and elevation information for habitat mapping. Our method offers an OA similar to that of existing research on lidar data classification for habitat mapping, while extending the application to a wider range of habitats, both marine and terrestrial, and avoiding information loss through rasterization. Although the PAs obtained for some classes are lower than results previously presented in other studies [34,35], this method also has the advantage of offering homogeneous performance and small inter-class PA differences, unlike other existing research results [34,35].
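For readers less familiar with the metrics compared above, the sketch below computes the OA, the per-class PA, and the spread between the best and worst PA from a confusion matrix. The three-class matrix is a made-up toy example, not data from this study, and the convention that rows hold reference classes and columns hold predicted classes is an assumption of the example.

```python
def overall_accuracy(cm):
    """Share of all samples that lie on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def producers_accuracy(cm):
    """Per-class PA: correctly classified samples / reference samples of that class.

    Assumes rows are reference classes and columns are predicted classes.
    """
    return [row[i] / sum(row) for i, row in enumerate(cm)]

# Hypothetical confusion matrix for three classes (100 reference samples each).
cm = [
    [90, 5, 5],
    [10, 80, 10],
    [0, 30, 70],
]

oa = overall_accuracy(cm)        # 240 correct out of 300 samples
pa = producers_accuracy(cm)      # one value per reference class
pa_spread = max(pa) - min(pa)    # a small spread means homogeneous performance
```

Two classifiers can share the same OA while differing widely in `pa_spread`, which is why the comparison above considers both the overall score and the balance between classes.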

The trained random forest models showed low overfitting, as the extended application results illustrated. The classification of boats located outside the training and test data collection area, for example, illustrated that the classifiers obtained could be applied accurately to other datasets. Natural, semi-natural, and anthropic habitats were well distinguished, and vegetation was precisely isolated, which opens perspectives for ecological assessments of those coastal areas. The remaining errors often involved classes that were semantically close. For example, there was confusion between salt marsh and high natural grasses but low confusion between lawns and low marsh. A potential improvement could be to review the classes defined initially and distinguish vegetation by layer (herbaceous, shrub, tree) and by its natural, semi-natural, or anthropic nature.

Besides the quality of the training and test datasets established, remaining classification errors could partly be explained by the technical specifications of the sensor. The diameter of the HawkEye III green laser's footprint is 1.80 m, which means that the returned waveform condenses information from a 2.5 m<sup>2</sup> area. This parameter may have influenced the ability of a given array of features to describe pebble or sand, mostly at interfaces between habitats. It could also partially explain the confusion between deciduous and evergreen trees: in a mixed-species forest, two different trees can coexist in a 2.5 m<sup>2</sup> patch.
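The footprint area quoted above follows directly from the footprint diameter, assuming a circular footprint on a flat surface. The short check below reproduces the arithmetic; the function name is illustrative.

```python
import math

def footprint_area(diameter_m):
    """Area of a circular laser footprint, in square meters."""
    return math.pi * (diameter_m / 2) ** 2

# Green laser footprint diameter stated in the text: 1.80 m.
area_m2 = footprint_area(1.80)
print(f"{area_m2:.2f} m^2")
```

The result, about 2.54 m<sup>2</sup>, matches the rounded value of 2.5 m<sup>2</sup> used in the discussion.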

Although the main issues identified visually were reflected in the metrics computed on the test dataset, there was a gap between the estimated quality of the map and the statistics computed. For example, the classification of portions of sandy beaches as pebble was not as obvious in the confusion matrix as it is in the map. This shows the influence of the way the test dataset is built. Further work should try to incorporate validation maps in addition to test datasets to qualify the output over the complete study site. A finer tree species inventory could also be integrated to better assess the results obtained.

Nonetheless, our results highlighted a strong classification approach, leveraging the strengths of 3D bispectral data. Working at the PC scale and not in 2D opens perspectives for 3D classifications, identifying all layers of land and sea cover, mostly in vegetated areas, by using waveform segmentation instead of waveform PC segmentation, as experimented in [53]. It also shows possibilities for post-processing and neighbor-based result regularization, as well as the exploitation of spatial information through the addition of geometrical features such as roughness or local density. Lastly, the accurate classification of habitats through 3D data offers opportunities for structural ecology assessments and communication of these results to environmental managers through virtual reality or more relatable 3D visualizations, for the implementation of sustainable integrated management of coastal areas. Figure 14 provides a 3D view of the 3D habitat mapping achieved in the present study.

**Figure 14.** 3D map of the habitats obtained over the complete study area by the random forest classifier trained on green waveform features, infrared intensities, and elevations. S. = submerged, Ev. = evergreen, Dec. = deciduous.
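The neighbor-based regularization of results mentioned in the discussion above could, for instance, take the form of a majority vote among each point's nearest neighbors. The sketch below is one possible reading of that idea under stated assumptions, not a step of the present study: the function name `regularize_labels`, the value of `k`, and the toy points and labels are hypothetical.

```python
import math
from collections import Counter

def regularize_labels(points, labels, k=4):
    """Relabel each point with the most common label among itself
    and its k nearest neighbors (3D Euclidean distance, brute force)."""
    smoothed = []
    for i, p in enumerate(points):
        # Indices of the other points, sorted by distance to p.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        votes = Counter(labels[j] for j in order[:k])
        votes[labels[i]] += 1  # the point also votes for its own label
        smoothed.append(votes.most_common(1)[0][0])
    return smoothed

# Toy example: an isolated "pebble" prediction inside a patch of sand,
# and a separate, homogeneous pebble patch.
points = [
    (0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (0.5, 0.5, 0.1),
    (10, 10, 1), (11, 10, 1), (10, 11, 1), (11, 11, 1), (10.5, 10.5, 1),
]
labels = ["sand", "sand", "sand", "sand", "pebble",
          "pebble", "pebble", "pebble", "pebble", "pebble"]
smoothed = regularize_labels(points, labels)
```

With an even `k` and two competing classes, the point's own vote prevents ties. With more classes ties remain possible, and a distance-weighted vote would be one way to resolve them.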

#### **6. Conclusions**

In this article, we proposed an approach to map coastal habitats exclusively using topobathymetric lidar, including both full waveforms and reanalyzed echoes. We produced results in the form of PCs and extended the application of our best classifier, which obtained an OA of 90% on the test dataset, to a dataset of 24.5 million points covering a very diverse coastal area. A total of 21 classes of land and sea covers were defined and mapped in 3D. We found that green waveforms and IR intensities complement each other: while green data provided strong results in submerged areas, the IR wavelength improved the distinction of land covers. Elevations further increased the classification accuracy by refining the classification of planar classes and of classes, such as low, mid, and high salt marsh, that are principally differentiated by their elevation. It is of special interest to note that green waveforms alone produced better results than IR intensities or elevations alone. However, the combination of the three sources of information yielded the best result, highlighting that each brings a specific contribution. Our research showed how well suited topobathymetric lidar is to the classification of such numerous land-water interface habitats. We enhanced a waveform processing method to apply it to topobathymetric environments. The use of PCs instead of rasters and the addition of a second wavelength provided an original 3D map of 21 coastal habitats at very high spatial resolution. This provides encouraging perspectives for 3D mapping and ecological assessment of the land-water interface and paves the way to integrated management of coastal areas, bridging the gap between marine and terrestrial domains.

**Author Contributions:** Conceptualization, M.L., A.C., T.C. and D.L.; methodology, M.L.; software, M.L.; validation, M.L.; formal analysis, M.L.; investigation, M.L.; resources, Y.P., A.E., A.C. and M.L.; data curation, M.L. and Y.P.; writing—original draft preparation, M.L.; writing—review and editing, M.L., A.C., T.C., D.L., Y.P. and A.E.; visualization, M.L.; supervision, A.C., T.C. and D.L.; project administration, A.C., T.C. and D.L.; funding acquisition, M.L., A.C., T.C. and D.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by a Ph.D. grant from Région Bretagne, and by the Saur Group's patronage. The APC was funded by the Nantes-Rennes topobathymetric lidar platform of the University of Rennes 1.

**Data Availability Statement:** Data sharing is not applicable to this article.

**Acknowledgments:** The authors are grateful to Airborne Hydrography AB (AHAB), an affiliate company of Leica Geosystems, for their technical support, their assistance with software and development issues, and their help in understanding the specificities of the sensor and its data. M.L. would like to thank the lab members who contributed to ground truth data acquisition—Hélène Gloria, Dorothée James, Alyson Le Quilleuc, Antoine Mury, and Léna Véle—and external volunteers. M.L. is also grateful to the Rennes 2 University for the field work equipment support. The authors deeply acknowledge the input of the anonymous reviewers, which improved the quality of the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **Appendix A**

Appendix A contains the projected 3D maps of the habitats obtained for three random forest experiments: the classification of green waveform features only, the classification of green waveform features and IR intensities, and the classification of green waveform features and elevations.

**Figure A1.** Projected 3D map of the habitats obtained with the predictions of a random forest classifier on green spectral features; orthoimage of the study area. The orthoimage was captured in 2014, while lidar data are not contemporaneous as they date from 2019. S. = submerged, Ev. = evergreen, Dec. = deciduous.

**Figure A2.** Projected 3D map of the habitats obtained with the predictions of a random forest classifier on green spectral features and infrared intensities; orthoimage of the study area. The orthoimage was captured in 2014, while lidar data are not contemporaneous as they date from 2019. S. = submerged, Ev. = evergreen, Dec. = deciduous.

**Figure A3.** Projected 3D map of the habitats obtained with the predictions of a random forest classifier on green spectral features and elevation values; orthoimage of the study area. The orthoimage was captured in 2014, while lidar data are not contemporaneous as they date from 2019. S. = submerged, Ev. = evergreen, Dec. = deciduous.
