*3.2. Machine Learning-Based Raman Spectral Analysis Can Classify Renal Amyloidosis with Respect to Deposition Sites and Types*

To distinguish subtle intrinsic spectral differences between amyloid types that were not detected by visual inspection of the tissue spectra, we used t-SNE, a multivariate dimension reduction and data exploration technique. Figure 5 shows the t-SNE distribution of the processed Raman tissue spectra in the biological fingerprint region, between 800 and 1800 cm⁻¹. We subjected the collection of Raman spectra to nonlinear dimensionality reduction and projected them onto a two-dimensional space (t-SNE components 1 and 2). The t-SNE map reveals that spectra collected from each amyloid type are clearly separated, as are spectra from glomerular and non-glomerular regions (even those collected from the same tissue sections). Each cluster is relatively tight and does not overlap with the others, indicating that dimensionality reduction of Raman spectra using t-SNE can clearly discriminate between glomeruli containing amyloid fibrils and normal glomerular regions, and between AL and AA fibrils. We observed intra-group separation, especially in glomerular AA datapoints; however, the distance between the sub-groups is small relative to the inter-group distances. Because inter-group separation is considerably larger than intra-group separation, the t-SNE map indicates strong similarity among Raman spectra of the same type and region. We attribute such clear separation between clusters, not only among different types but also between glomerular and non-glomerular regions, to the function of the glomerulus in the kidney. The glomerulus, a ball-shaped structure identified in Figure 4a, is responsible for filtering waste products and excess fluids from the blood [46]. As the amyloidogenic proteins, serum amyloid A (AA) or immunoglobulin light chain (AL), form insoluble fibrils, they fail to pass through the filter, so the amyloid deposits accumulate predominantly in the glomeruli [34,36]. This concentration of amyloid deposits in the glomeruli of AA and AL tissues is reflected in the Raman fingerprinting of the tissue, leading to clear separation in the t-SNE map.

**Figure 5.** t-SNE map for the distribution of Raman spectra. Spectra are labeled by amyloid type (AA, AL, or NA) and location (inside or outside glomeruli). Each point represents a Raman spectrum positioned according to its similarity probability relative to the other spectra in the dataset. Each group is well separated from the other groups, indicating that the Raman spectra within a group are similar to one another and distinct from those of other groups.
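The comparison between intra-group and inter-group separation described above can be made concrete with a small calculation on the two-dimensional embedding. The following is a minimal sketch, not the analysis used in this study: the function names `centroid` and `separation_summary`, the toy coordinates, and the group labels are all hypothetical, and the choice of centroid-based distances is an assumption made for illustration. Distances in a t-SNE map are best read qualitatively, so such a ratio is only a rough indicator.

```python
from itertools import combinations
from math import dist
from statistics import mean


def centroid(points):
    """Mean position of a list of 2-D points."""
    xs, ys = zip(*points)
    return (mean(xs), mean(ys))


def separation_summary(embedding, labels):
    """Return (mean intra-group spread, smallest inter-group centroid distance)."""
    groups = {}
    for point, label in zip(embedding, labels):
        groups.setdefault(label, []).append(point)
    centroids = {label: centroid(pts) for label, pts in groups.items()}
    # Intra-group spread: average distance of each point to its own centroid.
    intra = mean(
        dist(p, centroids[label]) for label, pts in groups.items() for p in pts
    )
    # Inter-group separation: smallest distance between any two centroids.
    inter = min(dist(a, b) for a, b in combinations(centroids.values(), 2))
    return intra, inter


# Toy coordinates standing in for t-SNE components 1 and 2 (not real data).
embedding = [
    (0.0, 0.0), (0.2, 0.1),      # hypothetical group "AA-glom"
    (10.0, 10.0), (10.1, 9.8),   # hypothetical group "AL-glom"
    (-8.0, 5.0), (-8.2, 5.1),    # hypothetical group "NA"
]
labels = ["AA-glom", "AA-glom", "AL-glom", "AL-glom", "NA", "NA"]

intra, inter = separation_summary(embedding, labels)
print(f"intra-group spread: {intra:.2f}")
print(f"smallest inter-group distance: {inter:.2f}")
print(f"ratio: {inter / intra:.1f}")
```

A ratio well above 1 corresponds to the situation described for Figure 5, in which groups are tight relative to the gaps between them.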

Furthermore, DBSCAN clustering of the processed Raman tissue spectra between 800 and 1800 cm⁻¹ (Figure 6) shows distinct separation among the amyloid types and between glomerular and non-glomerular regions. With a neighborhood radius of 1.09 and a minimum of 2 neighbors, DBSCAN produced a total of 12 clusters, of which 5 major clusters account for 96.4% of the entire collection (8360 of 8672 spectra). The left panel of Figure 6 summarizes the arrangement of each cluster with respect to amyloid type and deposition site. Separate clusters capture 96.9% of glomerular AA (Cluster 3), 98.4% of non-glomerular AA (Cluster 6), 96% of glomerular AL (Cluster 1), and 97.2% of non-glomerular AL (Cluster 2) spectra. For the NA tissue, 95.6% of spectra are grouped as an individual cluster (Cluster 8). The remaining spectra are either unassigned or assigned to separate minor clusters. Notably, none of these minor clusters mixes spectra from different amyloid types or deposition sites, demonstrating the robustness of the clustering analysis. The average spectra of the five major clusters, with one standard deviation shaded, are presented in the right panel of Figure 6. These spectral profiles closely resemble the measured spectra in Figure 4b, indicating that machine learning-based classification indeed enables us to characterize the types of amyloid fibrils and their deposition sites within the tissue.

**Figure 6.** DBSCAN clustering results and representative Raman spectra of each cluster. (**Left**) Out of a total of 12 clusters, 5 dominant clusters were identified. AA glomerular and non-glomerular spectra are primarily grouped as Clusters 3 and 6, respectively. AL glomerular and non-glomerular spectra are primarily grouped as Clusters 1 and 2, respectively. NA tissue is primarily grouped as Cluster 8. The remaining seven minor clusters are grouped accordingly. Unassigned spectra are shown in gray. (**Right**) Average spectra of the 5 dominant clusters with 1 standard deviation shaded.
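To illustrate how density-based clustering assigns spectra to clusters or leaves them unassigned, the sketch below gives a minimal pure-Python version of the DBSCAN idea. It is not the implementation used in this study. The function name `dbscan`, the toy two-dimensional points, and the mapping of the reported settings onto a radius `eps = 1.09` and `min_samples = 2` (counting the point itself) are assumptions for illustration; in practice, a library implementation such as scikit-learn's `DBSCAN` would typically be used on the full spectra, and a suitable radius depends on how the spectra are scaled.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster ("unassigned")


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    # Each neighborhood includes the point itself (an assumption of this sketch).
    neighbors = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps] for p in points
    ]
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                frontier.extend(neighbors[j])  # grow through core points
    return labels


# Toy points: two dense groups and one isolated point (not real spectra).
points = [
    (0.0, 0.0), (0.5, 0.0), (0.0, 0.5),
    (5.0, 5.0), (5.5, 5.0), (5.0, 5.5),
    (20.0, 20.0),
]
cluster_labels = dbscan(points, eps=1.09, min_samples=2)
print(cluster_labels)

# Share of the collection covered by the five major clusters, as reported.
major_share = 100 * 8360 / 8672
print(f"{major_share:.1f}% of spectra fall in the major clusters")
```

Because `dist` accepts points of any dimension, the same function applies unchanged to full spectra represented as sequences of intensities, although the quadratic neighbor search in this sketch would be slow for thousands of spectra.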

In a previous study, we successfully utilized Raman spectroscopy to characterize crystal deposits in kidney biopsies [16], leading us to expand its application to the study of renal amyloid deposits. Spectroscopic techniques, including Raman spectroscopy, have demonstrated promise in detecting and identifying molecular changes in various kidney conditions [47,48]. With the aid of statistical and machine learning algorithms for analysis, these approaches can produce robust results [19,20,49]. Despite the limited sample size in this pilot study, Raman spectroscopy combined with appropriate analysis techniques was able to distinguish between different types of amyloids.
