1. Introduction and Motivation
The World Health Organization (WHO) projects that by 2050 nearly 2.5 billion people will have some degree of hearing loss, at an annual global cost of US$980 billion [1]. Daniela Bagozzi, a WHO Senior Information Officer, wrote an article calling on the private sector to provide affordable hearing aids in developing countries, where current prices range from US$200 to over US$500 [2]. In addition, the Healthline Organization reported that a set of hearing aids might cost $5000 [3].
The main components of a digital hearing aid are shown in Figure 1: a microphone, an analogue-to-digital (A/D) converter, filter banks, gain blocks, and a digital-to-analogue (D/A) converter. First, the analogue sound signal detected by the microphone is converted into digital form by the A/D converter. Next, the digital signal is applied to a filter bank; different digital signal processing techniques can be applied to divide the digitized sound spectrum into sub-bands with different bandwidths. Then, gain blocks amplify the outputs of the filter bank to the desired hearing level. In the last stage, the signal is converted back to analogue by the D/A converter [4,5]. It is preferable, however, to design digital filters that can match multiple audiograms of patients who suffer from hearing loss. This approach lowers the manufacturing cost of hearing aids, as they can be produced on a large scale, and has been pursued by many research techniques [4,6,7,8,9]. On the other hand, it increases the complexity of the hearing aid design, requiring high operating power and a large chip area, which leads to improperly fitted hearing aids [10].
This research was motivated by these considerations in hearing aid design, the impact of hearing loss on national economies, and patients' ability to afford hearing aids. The main idea is to use artificial intelligence and machine learning to facilitate the whole process for patients and hearing aid designers. The process of fitting hearing aids is tiring and time-consuming, as it depends on many trials that require the patient to be highly responsive; one study reported that only 50–60% of users are satisfied with their hearing aids [11]. These difficulties are compounded by a severe shortage of audiologists, who are especially rare and hard to find in rural areas [12,13]. All of this urges the use of new artificial intelligence technology to resolve these problems, especially since the fitting process depends on the skills and experience of audiologists [14].
In this work, the authors apply unsupervised learning to cluster audiograms using spectral clustering. These audiograms are taken from a database of 28,244 audiograms used by Bisgaard, Vlaming and Dahlquist [15] to produce a standard set of audiograms for the IEC (International Electrotechnical Commission) 60118-15 measurement procedure. The audiograms had been reduced by vector quantization analysis to a set of size 60. Here, the researchers excluded five audiograms from the quantized results, namely those representing normal hearing levels (0–20 dB) as defined by different health organizations [1,16,17]. These five audiograms were removed since the study aims to produce clusters representing the audiograms of patients who experience hearing loss. Different audiograms with the same shape but at different levels can be realized with the same set of filters by adjusting the gains to match the required audiogram shape. Another reason to classify audiograms according to shape is that it makes the hearing aid fitting process easier: a supervised machine learning model can be built on this work to classify a patient's audiogram and then program or adjust the hearing aid according to pre-set configurations. These configurations are linked to the clusters produced in this work; at the end of the fitting process, fine tuning might still be needed.
This introduction is kept simple so that it can be comprehended easily both by experts in hearing aid design, from an engineering perspective, and by audiologists, from a medical perspective. The technical parts can be found in the following sections of the paper.
This paper is organized as follows. Section 2 discusses recent audiogram classifiers and their limitations, and then highlights the main contribution of this work. The data clustering algorithm is explained in Section 3, where its description and implementation are discussed in detail; this is followed by a discussion of how the algorithm is evaluated and how the data sets are prepared. The results are demonstrated and discussed in Section 4 to find the optimum number of clusters; the clustering algorithm is evaluated on the audiogram population that produced the quantized data, and the generated clusters are mapped and compared to the standard sets selected by Bisgaard in the last subsection. Finally, a summary of the results, conclusions, and prospects for future work are presented in Section 5.
2. Related Work
In 2016, Rahne et al. built an Excel sheet as an audiogram classifier with pre-set inputs that can be defined according to the inclusion criteria of a clinical trial; the tool provides an inclusion decision based on the predefined audiological criteria [18]. In 2018, Sanchez et al. classified hearing test data in two stages. The first stage used unsupervised learning to define trends and spot patterns in data obtained from different hearing tests. In the second stage, a supervised learning algorithm was built in which the outcomes of the different hearing tests were explored; each subject was assigned a profile, and the data were analyzed again to find the best classification of the subjects into four auditory profiles. This classifier was based on analysis of audiograms, which reflect loss of sensitivity, together with other hearing tests that reflect the loss of clarity not captured by the audiogram [19]. In 2019, Belitz et al. also combined unsupervised and supervised machine learning methods to map audiograms to a small number of hearing aid configurations, to be used as starting points for hearing aid fitting. The method was applied in two steps. First, different unsupervised clustering algorithms were run to determine a limited number of pre-set hearing aid configurations, with the cluster centroids chosen as fitting targets that can serve as starting configurations for individual adjustments. Second, each audiogram was assigned a class based on the comfort-target clustering of the first stage, using various supervised machine learning techniques. The classifier accuracy of the second stage was low when a single configuration was selected and improved when two configurations per audiogram were allowed [20]. In 2018, a research team developed the first steps of a machine learning classifier by using unsupervised learning to cluster audiograms [21]. In that work, audiograms were clustered with the goal of making them maximally informative, and the clustered data were prepared as a good training set for supervised machine learning classifiers; the approach produces a set of non-redundant, unannotated audiograms with minimal loss of information from a very large data set. In 2020, the same group used this data preparation procedure to produce a machine learning classifier. They applied supervised ML to 270 audiograms annotated by three experts in the field, achieving good accuracy in annotating the audiograms concisely in terms of shape, severity, and symmetry [12]. The classifier can be integrated into a mobile application to help users describe an audiogram concisely so that it can be interpreted by non-experts. Its outputs can help non-experts decide whether a patient needs to be checked by a specialist, partially resolving the shortage of specialists, and it can be the first step towards more sophisticated algorithms to help experts in the audiology field.
Crowson et al. used a deep convolutional neural network architecture to classify audiograms into normal hearing, sensorineural, conductive, and mixed hearing loss. The audiograms were converted to JPEG image files, and image transformation techniques (rotation, warping, contrast, lighting, and zoom) were applied to the training set to increase the number of available training images. Their model achieved 97.5% accuracy in classifying hearing loss types based on features extracted from the audiograms [13]. However, that study aimed at classifying audiograms to detect the cause of hearing loss; it was not conducted to help configure hearing aids [13].
Musiba [22] classified audiograms based on the UKHSE (United Kingdom Health and Safety Executive) categorization scheme. The sum of the pure tone audiometry hearing levels at 1 kHz, 2 kHz, 3 kHz, 4 kHz, and 6 kHz was obtained, compared with the figures set by the UKHSE, and classified as one of the following: acceptable hearing ability, mild hearing impairment, poor hearing, or rapid hearing loss. The aim of this classification was to prompt proper actions to prevent noise-induced hearing loss. The annotation process was carried out by experts in the field who applied the UKHSE standards.
Cruickshanks and colleagues [23] made a longitudinal study of how the shape of audiograms changes over time. The follow-up was carried out in four stages: 1993–1995, 1998–2000, 2003–2005, and 2009–2010. The audiograms were classified into eight levels, and the change in hearing ability over time was recorded based on these classes. Neither Musiba nor Cruickshanks [22,23] implemented intelligent solutions; they relied on the experience of specialists in the field.
The classifier techniques found in the literature are summarized in Table 1, which shows the limitations and shortcomings of each technique.
To the best of our knowledge, classifiers built for the purpose of classifying audiograms are very few and are not suitable as a reference for specialists in the field, such as audiologists, hearing aid specialists, and hearing aid designers. This is the first study to classify audiograms according to similarity in shape, with the aim of reducing the complexity of the filter bank used to realize the audiogram shapes of patients. From a signal processing perspective, it is important to know the shape of the audiogram in order to apply different gains to the different filters that cover the entire hearing band (125 Hz–8 kHz). The classifier is built to capture different shapes of audiograms, not to classify hearing loss types as in existing works. Audiograms of similar shape at different levels can be realized by a group of filters by changing the gain coefficients of each filter or the overall gain of the cascaded filters. This classification will help hearing aid designers reduce the complexity of their filter designs and can be a good starting point for a future supervised learning algorithm that classifies audiograms according to these detected shapes. Applying sophisticated machine learning algorithms will facilitate the whole process for the experts and increase patients' satisfaction.
3. Data Clustering Algorithm
The study groups audiograms according to similarity in shape. For this purpose, spectral clustering was used to provide clusters that can be put to technical use by experts in the field. This section starts with a general description of the algorithm, showing the main steps of how it was implemented and evaluated. Then, the details of the implementation process are discussed and, finally, the evaluation criteria for different numbers of clusters and for the selected clusters are explained.
3.1. Algorithm Description
The spectral clustering algorithm is a graph-based technique for finding k clusters in data [24,25]. It calculates a similarity matrix of a similarity graph from the data and uses it to determine the Laplacian matrix. A similarity graph models the local neighborhood relationships between the data points; the matrix representation of this graph is the similarity matrix, which contains pairwise similarity values between connected nodes and can be represented by the Laplacian matrix. The algorithm starts by representing the data in a lower-dimensional space, in which the data are then clustered. The dimensionality reduction is based on the eigenvectors of the Laplacian matrix: the eigenvectors corresponding to the k smallest eigenvalues form a low-dimensional representation of the input data in a new space where the clusters are well separated [25]. The algorithm aims to partition the data into clusters such that points in the same cluster are similar and points in different clusters are dissimilar to each other [26]. The authors decided to use spectral clustering for their data, as it can produce accurate clustering results by solving an eigenvalue problem on the Laplacian matrix. The method can be used for any shape of data, with the advantage of handling non-convex data distributions [27]. Since the data used in this research are mostly convex and sometimes non-convex, spectral clustering is a suitable unsupervised learning method for detecting different audiogram shapes.
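To make this step concrete, it can be sketched in Python with an off-the-shelf spectral clustering routine (an illustrative stand-in, not the authors' MATLAB implementation; the two synthetic audiogram shapes below are hypothetical):

```python
# Illustrative sketch: spectral clustering of synthetic audiogram-like
# vectors (hearing level in dB HL at 8 test frequencies) with scikit-learn.
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)

# Two hypothetical audiogram shapes: flat loss and high-frequency sloping loss.
flat = 50 + rng.normal(0, 5, size=(30, 8))
sloping = np.linspace(20, 80, 8) + rng.normal(0, 5, size=(30, 8))
audiograms = np.vstack([flat, sloping])

model = SpectralClustering(
    n_clusters=2,
    affinity="nearest_neighbors",  # kNN similarity graph, as in the paper
    n_neighbors=10,
    assign_labels="kmeans",        # cluster the Laplacian eigenvectors
    random_state=0,
)
labels = model.fit_predict(audiograms)
print(labels)
```

The `fit_predict` call internally builds the similarity graph, computes the Laplacian eigenvectors, and clusters the embedded points, mirroring the pipeline described above.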
The authors started by clustering the data into seven clusters using spectral clustering. Two methods were used to construct the similarity matrix, namely the nearest neighbors and radius search methods. Then, the Laplacian matrix was generated and normalized with different methods, such as random-walk normalization and symmetric normalization. The resulting seven clusters were checked by inspecting the eigenvalues and then visually inspected with a scatter plot of the clusters. If all the eigenvalues were zero, or the plot did not indicate credible clusters, the method was not considered further. The selected methods were further assessed by checking the eigenvalues once more; if they indicated a gap, the Silhouette coefficient was calculated for these seven clusters. The process was repeated for eight clusters, and then other numbers of clusters (9, 10, and 11) were also tested. The authors decided to start with seven clusters in order to detect as many audiogram shapes as possible for a future supervised machine learning model with good accuracy: the lower the number of audiogram clusters, the lower the accuracy of the predictions. On the other hand, they decided to stop at 11 clusters, as the Silhouette coefficient dropped significantly. The algorithm steps are shown in Figure 2.
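The model-selection loop above can be sketched as follows (synthetic data and a smaller sweep range stand in for the paper's 55 audiograms and its 7–11 range):

```python
# Sketch of the selection loop: cluster for several candidate values of k
# and compare silhouette coefficients. Toy data, illustrative only.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(1)
# Three hypothetical audiogram shapes around different mean profiles.
shapes = [np.full(8, 40.0), np.linspace(20, 80, 8), np.linspace(70, 30, 8)]
data = np.vstack([s + rng.normal(0, 4, (20, 8)) for s in shapes])

scores = {}
for k in range(2, 6):  # the paper sweeps 7..11; fewer here for the toy data
    labels = SpectralClustering(n_clusters=k, affinity="nearest_neighbors",
                                n_neighbors=10, random_state=0).fit_predict(data)
    scores[k] = silhouette_score(data, labels)

best_k = max(scores, key=scores.get)  # highest average silhouette wins
print(best_k, {k: round(v, 3) for k, v in scores.items()})
```

In practice the eigenvalue-gap check described above would be applied first, with the silhouette sweep as the fallback criterion.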
The authors then picked the two numbers of clusters with the highest Silhouette coefficients for further evaluation. These two candidates were compared using the Silhouette coefficient, the Calinski-Harabasz criterion, and the Davies-Bouldin criterion. This was followed by generating an audiogram population, annotating it according to the produced clusters, and evaluating those clusters with the same three criteria, in order to test the clustering method on a large number of audiograms. Finally, the authors mapped the generated clusters to the Bisgaard selected standard audiograms to compare their clusters with the existing standards used in hearing aid measurements.
3.2. Clustering Implementation
Spectral clustering is a well-established algorithm, but it can be carried out with many different input arguments. The authors tried many of them and evaluated the trials statistically. First, the similarity graphs were generated in two ways: from a fixed number of nearest neighbors, and from a radius within which to search for nearest neighbors. The similarity graphs were then represented by the Laplacian matrix, and the clustering results were evaluated for different forms of this matrix: without normalization, with random-walk normalization, and with symmetric normalization. Finally, two clustering methods (k-means and k-medoids) were tested for clustering the eigenvectors of the Laplacian matrix. In each case, the eigenvalues were checked and the silhouette coefficients were calculated for performance evaluation [28].
MATLAB was the selected platform for spectral clustering. The similarity graph was generated using the nearest neighbors method, which connects two points x_i and x_j when either x_i is a nearest neighbor of x_j or x_j is a nearest neighbor of x_i. The distances are calculated using the Euclidean formula and then transformed with a scaled kernel whose scale value is selected by a heuristic procedure. The method used to cluster the eigenvectors of the Laplacian matrix is K-medoids. A medoid in the K-medoids algorithm is the most centrally located point, with minimum total distance to the other points, and it is not influenced by outliers or extremities [29,30]. Finally, the similarity graph is represented by the random-walk normalized Laplacian matrix.
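A minimal from-scratch Python sketch of this pipeline, under the stated choices (Euclidean distances, a Gaussian kernel with a heuristic scale, the "either-or" nearest-neighbor rule, random-walk normalization, and k-medoids), might look as follows; the toy data and kernel details are assumptions, not the authors' exact MATLAB settings:

```python
import numpy as np

def knn_similarity(X, k=5, sigma=None):
    """Connect x_i and x_j when either is among the other's k nearest
    neighbors; weight edges with a Gaussian kernel on Euclidean distance."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    if sigma is None:                       # heuristic scale: median k-NN distance
        sigma = np.median(np.sort(d, axis=1)[:, k])
    nn = np.argsort(d, axis=1)[:, 1:k + 1]  # skip self (distance 0)
    W = np.zeros_like(d)
    for i, neigh in enumerate(nn):
        W[i, neigh] = np.exp(-d[i, neigh] ** 2 / (2 * sigma ** 2))
    return np.maximum(W, W.T)               # "either ... or ...": symmetrize

def spectral_embedding(W, n_clusters):
    """Eigenvectors of the random-walk Laplacian L_rw = I - D^-1 W
    for the n_clusters smallest eigenvalues."""
    L_rw = np.eye(len(W)) - W / W.sum(axis=1, keepdims=True)
    vals, vecs = np.linalg.eig(L_rw)
    order = np.argsort(vals.real)[:n_clusters]
    return vals.real[order], vecs[:, order].real

def k_medoids(X, k):
    """Bare-bones k-medoids with deterministic farthest-point seeding."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    medoids = [0]
    while len(medoids) < k:                 # farthest-first initialization
        medoids.append(int(np.argmax(d[:, medoids].min(axis=1))))
    medoids = np.array(medoids)
    for _ in range(100):
        labels = np.argmin(d[:, medoids], axis=1)
        # New medoid of each cluster: member minimizing total within-cluster distance.
        new = np.array([members[np.argmin(d[np.ix_(members, members)].sum(axis=1))]
                        for c in range(k)
                        for members in [np.where(labels == c)[0]]])
        if np.array_equal(new, medoids):
            break
        medoids = new
    return labels

# Toy data: two hypothetical audiogram shapes (values in dB HL).
rng = np.random.default_rng(2)
X = np.vstack([np.full(8, 30.0) + rng.normal(0, 3, (15, 8)),
               np.linspace(20, 80, 8) + rng.normal(0, 3, (15, 8))])
vals, U = spectral_embedding(knn_similarity(X, k=5), n_clusters=2)
labels = k_medoids(U, 2)
print(vals.round(4), labels)
```

With two well-separated groups, the similarity graph has two connected components, so the two smallest eigenvalues are (numerically) zero and the embedded points separate cleanly.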
3.3. Clustering Performance Evaluation
Four criteria are calculated to find the best number of clusters and to evaluate the clustering method: eigenvalues, silhouette coefficients, the Calinski-Harabasz criterion, and the Davies-Bouldin criterion. For well-separated clusters, the smallest eigenvalues should ideally be zero or small. To determine the proper number of clusters, the number of clusters is increased gradually until a gap is observed in the eigenvalues [31]. If no such gap is reached, silhouette analysis is used to measure how well the data are clustered. This analysis yields a coefficient in the range [−1, 1]: coefficients close to +1 indicate that the sample is far away from the neighboring clusters, a value of 0 indicates that the sample is on or very close to the decision boundary between two neighboring clusters, and negative values indicate that the sample has been assigned to the wrong cluster. The silhouette index (SI) is the average of these coefficients; the closer it is to +1, the better the separation between clusters [32,33]. The Calinski-Harabasz index (CHI) is the ratio of the between-cluster dispersion (the sum of squared distances of the cluster centers to the overall data center) to the within-cluster dispersion (the sum of squared distances of individual data points to their cluster centers); the higher the value, the better the performance of the clustering model [34]. The Davies-Bouldin analysis calculates two values: the within-cluster scatter and the distance between the centroids of different clusters. For each cluster, the nearest neighboring cluster is identified, and the sum of the two clusters' within-cluster scatters is divided by the distance between their centroids. The Davies-Bouldin index (DBI) is the average of these values; it ranges from zero to infinity, and the smaller the value, the better the separation between clusters [35]. The last three criteria are suitable for evaluating the clusters here, as they give more accurate results for convex data [36]. The optimal number of clusters occurs at the highest Calinski-Harabasz and silhouette values and the lowest Davies-Bouldin value [37].
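All three indices are available in scikit-learn; the toy comparison below shows how they move between a well-separated and an overlapping labeling (illustrative 2-D data only, not the paper's audiograms):

```python
# The three internal validity indices described above, computed on toy data.
import numpy as np
from sklearn.metrics import (silhouette_score, calinski_harabasz_score,
                             davies_bouldin_score)

rng = np.random.default_rng(3)
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = np.repeat([0, 1], 40)

good = centers[labels] + rng.normal(0, 1, (80, 2))   # well-separated clusters
bad = centers[labels] + rng.normal(0, 6, (80, 2))    # heavily overlapping

si_good, si_bad = silhouette_score(good, labels), silhouette_score(bad, labels)
chi_good, chi_bad = (calinski_harabasz_score(good, labels),
                     calinski_harabasz_score(bad, labels))
dbi_good, dbi_bad = (davies_bouldin_score(good, labels),
                     davies_bouldin_score(bad, labels))

print(f"SI  (closer to +1 is better): {si_good:.2f} vs {si_bad:.2f}")
print(f"CHI (higher is better):       {chi_good:.1f} vs {chi_bad:.1f}")
print(f"DBI (lower is better):        {dbi_good:.2f} vs {dbi_bad:.2f}")
```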
3.4. Data Sets Preprocessing
The authors used two data sets. The first consists of 55 audiograms, and the second is generated using the Kutools Excel add-in.
3.4.1. First Data Set
To apply spectral clustering to a large data set, two steps are needed [38]. The first step is data reduction, usually carried out by k-means clustering of the given data set. From each cluster, some data points, normally those near the cluster center, are picked, so that each cluster is represented by one representative set [39,40]. Spectral clustering is then applied to construct the similarity matrix and classify the reduced data into the final classes.
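The two-step scheme can be sketched as follows (sizes and parameters are illustrative; in the paper the reduction step was the vector quantization performed by Bisgaard et al., for which k-means stands in here):

```python
# Sketch: reduce a large data set with k-means, spectrally cluster the
# representatives, then label each original point via its representative.
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering

rng = np.random.default_rng(4)
big = np.vstack([c + rng.normal(0, 1.5, (2000, 2))
                 for c in ([0, 0], [15, 0], [7.5, 13])])   # 6000 points

# Step 1: vector quantization to 60 representatives (as in Bisgaard et al.).
km = KMeans(n_clusters=60, n_init=4, random_state=0).fit(big)
reps = km.cluster_centers_

# Step 2: spectral clustering of the 60 representatives into final classes.
sc = SpectralClustering(n_clusters=3, affinity="nearest_neighbors",
                        n_neighbors=8, random_state=0)
rep_labels = sc.fit_predict(reps)

# Each original point inherits the class of its nearest representative.
final = rep_labels[km.labels_]
print(np.bincount(final))
```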
Bisgaard et al. [15] performed this data reduction on a database of 28,244 audiograms using vector quantization of size 60, and the authors of this paper applied spectral clustering to the resulting 60 audiograms; these data can be found in Table A.1 of Bisgaard's work [15]. The set was further reduced by eliminating five audiograms that represent individuals with normal hearing. These levels were removed since the model is built for patients who experience hearing loss, to assist in configuring or designing hearing aids. The audiograms were measured in standard audiometry booths; air conduction thresholds were measured at 250 Hz, 500 Hz, 1000 Hz, 1500 Hz, 2000 Hz, 3000 Hz, 4000 Hz, 6000 Hz, and 8000 Hz.
3.4.2. Second Data Set
The authors generated a data set of the original size (28,244 audiograms) by repeating the 60 audiograms according to the frequencies of occurrence indicated in Table A.1 of Bisgaard's work. These percentages represent the fraction of the population audiograms that fall within a specified range around each of the 60 audiograms; this range was determined by minimizing the calculated Euclidean distance from each measured audiogram to its corresponding "typical" code-vector audiogram. Based on this training technique, the authors believe that repeating these audiograms gives a good representation of, and carries enough information about, the original database. The tool used to generate this data set is the Kutools Excel add-in.
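The same population-generation step can be reproduced without the Kutools add-in by repeating each standard audiogram in proportion to its prevalence; the shapes and percentages below are made-up placeholders, not the values of Table A.1:

```python
# Rebuild a population of audiograms by repeating each standard audiogram
# in proportion to its prevalence (placeholder shapes and fractions).
import numpy as np

standard = np.array([[10, 15, 20, 30, 40, 50, 60, 65],   # hypothetical shapes
                     [40, 40, 45, 45, 50, 50, 55, 55],
                     [20, 25, 35, 50, 65, 75, 80, 85]], float)
prevalence = np.array([0.50, 0.30, 0.20])   # placeholder fractions of 28,244
total = 28244

counts = np.round(prevalence * total).astype(int)
population = np.repeat(standard, counts, axis=0)
print(population.shape)
```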
5. Results Summary and Conclusions
A comparison between the results for 8 clusters and 10 clusters is summarized in Table 7. As shown there, the criteria values are slightly better for 8 clusters than for 10: the Silhouette coefficients and Calinski-Harabasz values are higher for eight clusters, while the Davies-Bouldin values are lower. For the population audiograms, the Silhouette and Davies-Bouldin values are better for eight clusters, but the Calinski-Harabasz value is better for 10 clusters. This can be explained by the fact that Calinski-Harabasz is the criterion most sensitive to the number of observations used in its calculation [45]: the number of audiograms considered in stage 3 is 20,957 for 8 clusters versus 22,002 for 10 clusters (as shown in Table 7). The eigenvalues are small, but no gap indicates a clear choice between 8 and 10 clusters. Since the audiogram population covered in the last stage is larger for 10 clusters than for 8, the 10-cluster solution can be preferred, as it accounts for more patients' audiogram shapes. The Silhouette coefficients of the audiogram population are higher than 0.5 for both numbers of clusters, which suggests good clustering.
The trial by Belitz [20] to classify audiograms for hearing aid adjustment had low accuracy: 68% when one configuration was assigned to each audiogram. We believe their accuracy might have increased had they considered a larger number of clusters, given the highly overlapping nature of the data. This matches the results found in this research when clustering the data into 8 or 10 classes.
This work can be considered a first step towards changing the way hearing aid filter banks are designed. Existing filter bank designs use digital filters with different techniques to divide the entire frequency band (125 Hz–8 kHz) non-uniformly, and then apply gain controls to configure the hearing aid to match the patient's audiogram. Current practice aims to design digital filters that can match the audiograms of multiple patients, which leads to very complex designs, as in [6,7,8,9]. These designs lower the manufacturing cost of hearing aids, since devices that accommodate multiple users can be produced on a large scale. However, complex designs require high operating power and a large chip area, which leads to improperly fitted hearing aids. When the complexity is reduced, the hearing aid prototype can match a limited number of audiograms effectively; a low-complexity hearing aid design is also properly fitted, as its small number of filter coefficients does not require a large implementation area [4,8]. Normally, the hardware complexity of a filter bank structure is measured by the components needed to realize the filters (multipliers, adders, and shifters); in many studies, however, only multipliers are counted, as they are the most power-consuming elements in digital signal processing (DSP) hardware [46]. To summarize, current practice attempts to mask the hearing frequency band with a large number of filters using complex techniques, and a further consequence of these complex designs is that adjusting the hearing aid becomes difficult for both patients and audiologists. Instead of attempting to match different types of hearing loss with one design that satisfies the needs of many patients in order to lower manufacturing costs, designs can be implemented according to the categories produced by our intelligent solution: the filter bank can be designed to match the shapes of a number of these clusters rather than all of them. This will result in designs that are less complex, with low delay, a small chip area, and reduced cost. In addition, these clusters will facilitate the process of programming or adjusting the hearing aid to match the user's needs by assigning each patient's audiogram a configuration related to the produced clusters.
Consequently, configuring a hearing aid will be easier and less exhausting for patients and audiologists, as it will require less response from the patient. The power of intelligent solutions does not depend on the skills, experience, and knowledge of a limited number of skillful, experienced audiologists. Because the method requires less response from patients, it will be of great help to cohorts such as older people, individuals with dementia, and children who experience hearing loss. In addition, the method can be applied to any set of test frequencies, as it is not restricted to the set used in this study; the data can be pre-processed so that any missing frequency is interpolated. The needed input is the hearing levels tested at eight different frequencies, and those eight frequencies can vary according to the protocol used in the hearing test. Hence, what is considered in this study is the air conduction thresholds at eight test frequencies, with or without masking in the non-test ear; bone conduction thresholds are not considered.
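The interpolation pre-processing mentioned above can be sketched as follows (interpolating thresholds over log-frequency is an assumed, common choice, not a procedure specified in the study):

```python
# Fill in a missing audiogram threshold by interpolating over log-frequency.
import numpy as np

freqs = np.array([250, 500, 1000, 1500, 2000, 3000, 4000, 6000])  # Hz
thresholds = np.array([25.0, 30.0, np.nan, 45.0, 50.0, 60.0, 70.0, 75.0])

missing = np.isnan(thresholds)
thresholds[missing] = np.interp(np.log(freqs[missing]),
                                np.log(freqs[~missing]),
                                thresholds[~missing])
print(thresholds)
```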
To conclude, the authors do not rely solely on rigid statistical analysis; the results should be interpreted according to the solution that needs to be introduced. The authors prefer the 10-cluster solution, since more shapes of patients' audiograms are included. In addition, they predict that grouping the standard levels S1, N1, N2 and N4, N5 in the same cluster might be a source of confusion for any future supervised machine learning algorithm. Given the highly overlapping nature of the data, 10 clusters might therefore produce a higher-accuracy supervised learning model, as introducing more clusters might help resolve this problem.
For future work, the authors recommend using regression analysis to generate one polynomial representing each cluster. These polynomials would be fitted by least-squares regression, minimizing the difference between the clustered audiograms in each cluster and the predicted polynomial (as carried out in [47,48]).
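As a rough illustration of this future step, one least-squares polynomial can be fitted per cluster; the polynomial degree, the log-frequency axis, and the toy cluster below are assumptions:

```python
# Fit a single least-squares polynomial to all audiograms of one cluster.
import numpy as np

rng = np.random.default_rng(5)
freqs_khz = np.array([0.25, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
x_pts = np.log2(freqs_khz)

# One toy cluster: a sloping-loss shape plus noise (values in dB HL).
base = 40.0 + 10.0 * x_pts
cluster = base + rng.normal(0, 4, (25, 8))

# Stack all (frequency, threshold) pairs of the cluster and fit jointly,
# minimizing the squared error between the members and the polynomial.
x = np.tile(x_pts, len(cluster))
coeffs = np.polyfit(x, cluster.ravel(), deg=3)
representative = np.polyval(coeffs, x_pts)
print(representative.round(1))
```

The fitted curve then serves as a compact representative of the cluster's audiogram shape.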