3.1. Methods of Data Collection
The documentation of 222 patients of the Osteoporosis Treatment Clinic (OTC) at the Central Clinical Hospital in Łódź, Poland, was analyzed. The patients were women and men aged 28 to 95, of Caucasian race, residing in the Republic of Poland and registered in the Clinic. From the patient cards, the following data were copied into an Excel table: fractures, chronic diseases, surgeries, medications, lifestyle (diet, smoking, alcohol consumption, physical activity, and sun exposure), age at last menstruation (in post-menopausal women), and history of fractures of the proximal end of the femur in the immediate family.
Based on the T-score from the densitometry test and the presence of osteoporotic fractures, patients were assigned to one of four groups: (1) osteoporosis with pathological fractures (ICD-10 code M80); (2) osteoporosis without pathological fractures (M81); (3) osteopenia, coded as other disorders of bone mineralization and structure (M85); (4) other persons who did not qualify for any of the three groups above.
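The grouping rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the clinic's actual procedure: the cut-offs (T-score of −2.5 or below for osteoporosis, between −2.5 and −1.0 for osteopenia) are the standard WHO thresholds, and the function name `assign_group` and the handling of fractures in patients without densitometric osteoporosis are hypothetical.

```python
def assign_group(t_score: float, has_osteoporotic_fracture: bool) -> int:
    """Return the study group (1-4) for one patient.

    Assumed WHO cut-offs: T <= -2.5 is osteoporosis, -2.5 < T < -1.0 is osteopenia.
    """
    if t_score <= -2.5:
        # Groups 1 (M80) and 2 (M81) differ only by the presence of fractures.
        return 1 if has_osteoporotic_fracture else 2
    if t_score < -1.0:
        return 3  # osteopenia (M85)
    return 4      # not qualified for groups 1-3


print(assign_group(-3.1, True))   # osteoporosis with fractures
print(assign_group(-1.8, False))  # osteopenia
```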
The results of the densitometric examination of the L1–L4 vertebrae and the proximal femoral epiphysis, as well as laboratory tests of the blood levels of 25(OH) vitamin D3, calcium, and phosphates, were taken from the electronic documentation. These tests were performed in a hospital facility, and their results were also entered into the table.
The study was conducted according to the guidelines of the Declaration of Helsinki. The Bioethics Committee at the Medical University of Łódź approved this research. Patients’ informed consent was waived due to the retrospective nature of the study.
3.2. Description of Risk Factors
Below, we describe the risk factors that were taken into account. They were chosen on the basis of their appearance in the literature or because they were regarded as important by the OTC doctors. We provide brief explanations of why we consider these factors important.
Sex. It is known that women suffer from osteoporosis more often than men. After menopause, women experience significant bone loss: 1–2% per year. Estrogen deficiency causes an increase in bone remodeling, which makes the bones weaker. It is estimated that the risk of hip fractures for women aged over 70 is 16–18%. Men have a higher peak bone mass at the end of the third decade of life. Therefore, they reach BMD values indicating osteoporosis, and the associated risk of fractures, later in life. Although the risk of hip fracture for men over 70 is only about 5–6%, their mortality rate after osteoporotic fractures is twice as high as for women. Nevertheless, osteoporosis in men is often underestimated by physicians [25].
Age. A person reaches peak bone mass in the third decade of life. Then, after a period of stabilization, BMD slowly decreases (after the age of 50, 0.5–1% per year). Therefore, the risk of fractures due to osteoporosis increases with age. It is estimated that osteoporosis affects approximately 1/10 of women aged 60, 1/5 of women aged 70, 2/5 of women aged 80, and 2/3 of women aged 90 [26].
Body mass index (BMI). BMI is calculated from the formula: BMI = weight/height², with weight in kilograms and height in meters. BMI can be used in the FRAX calculator if BMD is not known. The estimated probability of fracture increases as BMI decreases [3].
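As a worked example of the BMI formula (body weight in kilograms divided by the square of height in meters), a minimal Python sketch follows; the function name `bmi` and the sample values are illustrative only.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# A hypothetical patient weighing 70 kg with a height of 1.75 m.
print(round(bmi(70.0, 1.75), 1))
```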
Last menstrual period (for women only, age in years). The research reported in [27] showed a high risk of osteoporotic fractures in non-obese women who have untreated premature menopause.
Alcohol. Alcohol can increase the excretion of calcium from the body. Excessive alcohol consumption may have a toxic effect on osteoblasts and the liver, which reduces the production of the active form of vitamin D3. Drinking more than three units of alcohol per day is a risk factor for fractures [28]. On the other hand, some studies have shown that drinking small amounts of alcohol, combined with a healthy lifestyle, can have a positive effect on bone density [29,30].
Smoking. Smoking is a widely recognized risk factor for osteoporosis [31]. It causes a decrease in bone strength, more often in men than in women. This negative effect depends on how many cigarettes a person has smoked during their lifetime.
Coffee. Recent studies have not found that moderate coffee drinking increases the risk of osteoporosis and fractures in healthy adults (see [32] and the references therein). Caffeine, like theine, flushes out calcium and increases its excretion in the urine. However, the more calcium is washed out, the greater its absorption from food. An adequate dietary intake of calcium is therefore emphasized. A recent review showed a non-linear relationship between the level of coffee consumption and the incidence of hip fractures; the lowest relative risk of hip fracture was found in those who consumed two to three cups of coffee per day [33].
Glucocorticoids. When taken chronically for more than 3 months, in a dose equivalent to 5 mg of prednisone daily, glucocorticoids may reduce bone formation, especially in the trabecular bone (vertebrae), increase urinary calcium excretion, and cause some hormonal disorders. Glucocorticoid-induced osteoporosis (GIO) is a common, iatrogenic, secondary osteoporosis that may be associated with a very high risk of fractures [2].
Physical activity. Various physical activities are effective in preventing and treating osteoporosis. The mechanical load resulting from physical activity increases muscle mass, creates stress on the skeleton, and increases osteoblast activity [22]. With age, skeletal muscle mass and function decrease, which, in the presence of osteoporosis, may increase the risk of falls and fractures. Strength training, which can be used without age restrictions, is recommended.
Sun exposure. At our latitude (in Poland), from May to September, the synthesis of vitamin D3 in the skin can be effective from 10 a.m. to 3 p.m., in sunny weather, when at least the forearms and lower legs are exposed to the sun for at least 15 min without using sunscreen. Older people are at a high risk of vitamin D deficiency [34].
Rheumatic diseases (including rheumatoid arthritis, spondyloarthritis, and other connective tissue diseases). In inflammatory rheumatic diseases, there is a significant increase in the risk of osteoporosis and fractures [35]. Pro-inflammatory cytokines cause periarticular osteoporosis and activate osteoclastogenesis. The use of glucocorticoids may increase bone resorption. Additionally, joint damage, atrophy, and muscle weakness increase the risk of falls.
Diabetes. It is known that diabetic patients have an increased risk of fractures. For patients with type 1 diabetes, BMD measurement should be performed 5 years after the diagnosis of the disease and repeated every 2–5 years. For patients with type 2 diabetes, the risk of fractures may not correspond to the BMD values [2]. Complications of diabetes such as myopathy, neuropathy, visual impairment, and obesity may increase the risk of falls.
Neoplasm (current or past). Cancer can have a negative influence on bone health in many ways. Some cancer cells have an affinity for bone tissue, which results in bone metastases, leading to possible fractures. In addition, many cancer treatments can have detrimental effects on bone health. In particular, hormone deprivation therapies for breast cancer and prostate cancer adversely affect bone turnover, resulting in decreases in BMD and bone quality, which can lead to fractures [36].
Hyperthyroidism. Hyperthyroidism (also hyperparathyroidism), especially if left untreated for a long time, may contribute to bone atrophy, reduced bone strength, and fractures [37]. Treatment of thyroid carcinoma with high doses of L-thyroxine in postmenopausal women increases the risk of fractures and osteoporosis.
Hypogonadism or premature menopause (<45 years). Postmenopausal women with estrogen deficiency have an increased rate of bone remodeling [27], particularly in the first 10 years after the last menstrual period. Low calcium absorption is observed, and the release of calcium from the bones is augmented. Secretion of parathyroid hormone is reduced, and vitamin D3 metabolism decreases. In elderly men, hypogonadism and low testosterone levels are risk factors for bone resorption and fractures [38]. Ablative treatment of prostate cancer increases the risk independently of age.
Gastrointestinal diseases. Patients with gastrointestinal diseases are at high risk of osteoporosis and fractures [39]. Resection of the stomach and intestines and treatment with proton pump inhibitors (PPIs) may decrease the absorption of calcium and other nutrients. Treatment with TNF-alpha inhibitors can reduce inflammation and bone resorption.
Chronic kidney disease. Biochemical abnormalities in the homeostasis of calcium and phosphorus lead to different types of osteodystrophy. In the early stage, there is gradual retention of phosphorus and impairment of 1,25(OH)2 vitamin D3 synthesis. This disease significantly increases the risk of fractures [40].
Strumectomy. It is known that thyroidectomy significantly increases the long-term risk of osteoporosis. Younger patients, women, patients with comorbidities, and patients receiving chronic thyroxine treatment should be monitored for changes in postoperative bone density [41].
Secondary osteoporosis. The following diseases are included: type 1 (insulin-dependent) diabetes, osteogenesis imperfecta in adults, untreated long-standing hyperthyroidism, hypogonadism or premature menopause (<45 years), chronic malnutrition or malabsorption, and chronic liver disease [3].
Meat. A meat diet is rich in saturated fats. The PPAR-gamma nuclear receptor can be activated by ligands, mainly products of oxidation of polyunsaturated fatty acids. This activation is responsible for an increase in adipogenesis. Lipoxygenase products bind to low-density lipoproteins (LDL) and increase osteoblast apoptosis. It has been shown that more bone fractures occur in countries with a high intake of animal protein. Studies have confirmed the bone-protective effect of a diet based on fruits and vegetables [42].
Salting. It is emphasized that reducing the amount of salt in the diet can help normalize blood pressure and reduce urinary calcium excretion. However, it was also observed that a low-salt diet can lead to a negative calcium and magnesium balance, which could result in osteoporosis [43].
Family history of hip fractures. It has been suggested that the degree of bone remodeling and geometry may depend on genetic factors [44]. The occurrence of hip fractures in the immediate family is included among the clinical factors in the FRAX form [3].
Dual-energy X-ray absorptiometry (DXA) neck T-score and dual-energy X-ray absorptiometry (DXA) spine T-score.
To date, there is no known method that could measure the mechanical strength of bones directly. Currently, the diagnosis of osteoporosis is based on the DXA densitometry test, which measures bone density. A T-score (either neck or spine) of −2.5 SDs or below in postmenopausal women and men over 50 is classified as osteoporosis.
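For reference, the T-score expresses a measured BMD in standard deviations relative to a young-adult reference population, and the WHO criterion classifies values of −2.5 or below as osteoporosis. The symbols below are our own notation and do not come from the study:

```latex
T = \frac{\mathrm{BMD}_{\text{patient}} - \mu_{\text{young}}}{\sigma_{\text{young}}},
\qquad \text{osteoporosis if } T \le -2.5
```

Here $\mu_{\text{young}}$ and $\sigma_{\text{young}}$ denote the mean and standard deviation of BMD in the young-adult reference group for the same measurement site.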
Phosphates (phosphate level in blood in mmol/L). Of the phosphorus in the human body, 85% is found in bones and teeth. It is available in many foods and drinks. Its bioavailability is in the range of 60–70%. Hyperphosphatemia reduces the absorption of calcium in the gastrointestinal tract, inhibits the synthesis of 1,25(OH)2 vitamin D3, and independently increases the secretion of parathyroid hormone (PTH). Therefore, the normal ratio of calcium to phosphorus is considered to be 1.5:1 or 1:1. Products that contain phosphorus include meat, eggs, fish, lentils, beans, wheat bran, nuts, seeds, and carbonated drinks [42].
Vitamin D3 (vitamin D3 level in blood in ng/mL). Vitamin D3 plays a special role: it enables the absorption of calcium from the gastrointestinal tract and reduces the secretion of parathyroid hormone. Therefore, it helps in the regulation of calcium and phosphate metabolism. The diet covers only about 20% of the demand; thus, skin synthesis is important. When sufficient sun exposure is not possible, especially in people over 65, appropriate supplementation of this vitamin should be used throughout the year [42].
Calcium (calcium level in blood in mmol/L). The main inorganic component of bones is calcium. Approximately 99% of the body's calcium is stored in the bones. Adequate supply during childhood and adolescence determines high peak bone mass. The body absorbs from 10 to 40% of calcium from the diet, and absorption from the intestines decreases with age. In adults, the demand for calcium is about 1000 mg/day. In the Polish population, the dietary supply covers 50–60% of this demand. Supplementation with calcium preparations alone does not reduce bone fractures, and the risk of heart attack and kidney stones may increase [45].
3.3. Data Preparation
First, the features were divided into input features for the model and output features. It was also necessary to decide how to take into account the results of repeated medical tests. One test set consists of five parameters: the results of the densitometric examination of the L1–L4 vertebrae and of the proximal femoral epiphysis, as well as laboratory tests of the blood levels of 25(OH) vitamin D3, calcium, and phosphate. For some patients, the test set was performed more than once. Initially, only the results of the first test set were considered in this study.
The dataset used for the study included a group of 222 patients. Some of them had an incomplete first test set. Therefore, the 177 patients with a complete first test set were selected.
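The selection of patients with a complete first test set can be sketched as follows. This is a minimal Python illustration, not the procedure actually used (the source data were kept in an Excel table); the field names in `FIRST_TEST_FIELDS` and the record layout are hypothetical.

```python
# Hypothetical field names for the five parameters of one test set.
FIRST_TEST_FIELDS = ("dxa_neck_t", "dxa_spine_t", "vitamin_d3", "calcium", "phosphates")


def has_complete_first_test(patient: dict) -> bool:
    """True if all five parameters of the first test set are present."""
    return all(patient.get(field) is not None for field in FIRST_TEST_FIELDS)


def select_complete(patients: list) -> list:
    """Keep only patients whose first test set has no missing value."""
    return [p for p in patients if has_complete_first_test(p)]


# Two made-up records: the second one lacks the calcium result.
records = [
    {"dxa_neck_t": -2.7, "dxa_spine_t": -3.0, "vitamin_d3": 21.0, "calcium": 2.4, "phosphates": 1.1},
    {"dxa_neck_t": -1.2, "dxa_spine_t": -1.9, "vitamin_d3": 30.5, "calcium": None, "phosphates": 1.0},
]
print(len(select_complete(records)))
```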
We used the following set of 27 features as input variables for the artificial neural networks (ANNs):
1. Sex (M = male, F = female),
2. Age (in years),
3. BMI (real value),
4. Last menstruation (age in years),
5. Alcohol (Y = yes, N = no, S = in small amounts),
6. Smoking (Y = yes, N = no, S = at most 1 cigarette daily),
7. Coffee (number of cups daily),
8. Treatment with glucocorticoids (Y = yes, N = no),
9. Physical activity (≥ 30 min daily) (Y = yes, N = no),
10. Sun exposure (≥ 15 min daily) (Y = yes, N = no),
11. Rheumatic diseases (Y = yes, N = no),
12. Diabetes (Y = yes, N = no),
13. Neoplasm (Y = yes, N = no),
14. Hyperthyroidism (Y = yes, N = no),
15. Hypogonadism or premature menopause (Y = yes, N = no),
16. Gastrointestinal diseases (Y = yes, N = no),
17. Chronic kidney disease (Y = yes, N = no),
18. Strumectomy (Y = yes, N = no),
19. Secondary osteoporosis (Y = yes, N = no),
20. Meat in the diet (Y = yes, N = no, S = in small amounts),
21. Salting (Y = yes, N = no, S = in small amounts),
22. Family history of hip fractures (Y = yes, N = no),
23. DXA neck T-score (real value),
24. DXA spine T-score (real value),
25. Phosphates serum level (real value),
26. Vitamin D3 serum level (real value),
27. Calcium serum level (real value).
The set defined in this way includes features that take numerical values, are specified with a letter, or are empty. The last case concerns a feature that is not possible to determine for a given patient (e.g., last menstruation in men) and its absence must be described in some way. The following assumptions were made. The numerical data remained unchanged. Data described with values Y—yes, N—no, S—in small amounts, were converted to numerical values as follows: , and . If only two values were allowed: Y—yes, N—no, we assumed , . For features that are not possible to determine, the empty spaces have been filled with zeros. We understand such a feature value as the “not applicable” case. The above operations on the data allowed us to create a set consisting of complete numerical vectors. It was assumed that only one parameter would correspond to the output value. The value of this parameter was if no fractures occurred and 1 if fractures occurred.
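The conversion of a raw feature vector into a numerical one can be sketched as below. This is a hedged Python illustration only: the numeric codes in `CODES` are placeholders chosen by us, since the values actually used in the study are not given here. The codes for "no" and for female sex are deliberately non-zero so that zero remains reserved for the "not applicable" case described above.

```python
# Placeholder codes (assumptions, not the study's values).
CODES = {"Y": 1.0, "N": -1.0, "S": 0.5, "M": 1.0, "F": -1.0}


def encode_feature(raw):
    """Convert one raw feature value into a number."""
    if raw is None or raw == "":
        return 0.0            # empty field: "not applicable"
    if isinstance(raw, str):
        return CODES[raw]     # letter-coded feature
    return float(raw)         # numerical data remain unchanged


def encode_patient(raw_vector):
    """Convert a whole raw feature vector into a numerical vector."""
    return [encode_feature(value) for value in raw_vector]


# Shortened, made-up example: sex, age, BMI, last menstruation, alcohol, smoking.
print(encode_patient(["M", 67, 24.3, None, "S", "N"]))
```

With other codes, only the `CODES` dictionary would need to change.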
For each feature, the range of values that this feature takes was defined.
In addition to the above 27 features, our dataset also contained the following data for each patient:
Number of low-energy fractures before treatment (FrBT),
Number of low-energy fractures during treatment (FrDT).
These fracture counts were used to train the neural network to correctly predict fractures in patients. The first analysis of the data was performed on the number of fractures that occurred before treatment. If the number of fractures was 0 (that is, FrBT = 0), the patient data received the no-fracture label (the so-called ground truth label). If fractures occurred before treatment (that is, FrBT > 0), the data were labeled 1. A correctly trained network should therefore output 1 for patients with fractures and the no-fracture value for patients without fractures. On this basis, a training set was created and a neural network was built.
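The labeling rule based on FrBT can be written as a short Python sketch. The value `NO_FRACTURE_LABEL = 0` is an assumption made for illustration; only the fracture label 1 is taken from the text.

```python
NO_FRACTURE_LABEL = 0  # assumed placeholder for the no-fracture case
FRACTURE_LABEL = 1     # label used when fractures occurred


def ground_truth_label(fr_bt: int) -> int:
    """Derive the ground truth label from the number of fractures before treatment."""
    if fr_bt < 0:
        raise ValueError("the number of fractures cannot be negative")
    return FRACTURE_LABEL if fr_bt > 0 else NO_FRACTURE_LABEL


# FrBT values of four hypothetical patients.
print([ground_truth_label(n) for n in (0, 1, 3, 0)])
```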
An example of the feature vector for one patient is of the form:
It has 27 components corresponding to subsequent items of the numbered list above. After converting the feature vector into a vector consisting of numerical data, we get the following input vector:
It is necessary to indicate the interval of acceptable values for each feature. Analyzing the data, we obtain 27 intervals, one for each feature, defining the ranges of acceptable values.
Remark 1. Defining ranges for each feature is especially important. It should be remembered that these should be the maximum ranges from which the given features can come, and not the ranges determined by the minimum and maximum values from the observation data. One should analyze each feature and determine the entire domain for that feature. An example of this is the “age” feature. Regardless of the data present in the studied dataset, if the set of adults is considered, the range from which data are acceptable can be defined as 18 to 97 years.
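One common use of such domain ranges is min-max scaling of each feature before it is passed to the network. The text does not state exactly how the ranges are used, so the Python sketch below is an assumption; the function name `scale_to_unit` is ours, and the age range of 18 to 97 years is the one given in Remark 1.

```python
def scale_to_unit(value: float, lower: float, upper: float) -> float:
    """Map a value from its full domain [lower, upper] onto [0, 1]."""
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    if not lower <= value <= upper:
        raise ValueError("value lies outside the declared domain")
    return (value - lower) / (upper - lower)


AGE_RANGE = (18.0, 97.0)  # full domain for adults, as in Remark 1

# The scaling depends on the declared domain, not on the ages seen in the data.
print(scale_to_unit(57.5, *AGE_RANGE))
```

Because the bounds come from the domain rather than from the observed minimum and maximum, a new patient whose age falls outside the observed data but inside the domain is still scaled consistently.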
The next element is to define the output vector. In our case, the output is a value from the set . So, the output from the network should be or 1.
3.4. Learning Neural Network
An ANN is a simplified model of the structure of the brain. In our study, we used a multilayer feed-forward neural network, whose structure is shown in Figure 1. It is composed of artificial neurons which are organized in layers. Matlab software was used to create the ANN and carry out the learning procedure. The effectiveness of the network was checked on a testing set. During the testing procedure, it was verified whether the output generated by the ANN for a patient from the test set was consistent with the ground truth label containing the information about real fractures in this patient.
There are a number of methods that are used to implement the learning procedure. In this study, the method of back propagation was used.
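To make the idea of back propagation concrete, the following Python sketch trains a small feed-forward network with one hidden layer on a toy problem. It is not the Matlab implementation used in the study: the tanh and sigmoid activations, the cross-entropy loss, the learning rate, the 0/1 labels, and the function name `train_network` are all assumptions made for illustration.

```python
import math
import random


def train_network(samples, labels, hidden=4, lr=0.5, epochs=500, seed=0):
    """Train a one-hidden-layer network with per-sample back propagation."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        # Hidden layer (tanh) followed by a single sigmoid output neuron.
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        z = sum(w * hj for w, hj in zip(w2, h)) + b2
        return h, 1.0 / (1.0 + math.exp(-z))

    def mean_loss():
        total = 0.0
        for x, y in zip(samples, labels):
            p = min(max(forward(x)[1], 1e-12), 1.0 - 1e-12)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        return total / len(samples)

    history = [mean_loss()]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            h, p = forward(x)
            # Output error, then the error propagated back to the hidden layer.
            delta_out = p - y
            delta_hidden = [delta_out * w2[j] * (1.0 - h[j] ** 2)
                            for j in range(hidden)]
            for j in range(hidden):
                w2[j] -= lr * delta_out * h[j]
                b1[j] -= lr * delta_hidden[j]
                for i in range(n_in):
                    w1[j][i] -= lr * delta_hidden[j] * x[i]
            b2 -= lr * delta_out
        history.append(mean_loss())

    def predict(x):
        return 1 if forward(x)[1] >= 0.5 else 0

    return predict, history


# Toy data (logical OR), unrelated to the patient dataset.
toy_x = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
toy_y = [0, 1, 1, 1]
predict, history = train_network(toy_x, toy_y)
print(history[0], history[-1])
```

The recorded loss history makes it easy to check that the training error falls, which mirrors the observation that fitting the training data is usually the easy part.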
Test data are necessary to check whether the previously trained network generates correct responses for data other than those used in the learning process. The training set and the test set are created by dividing the complete dataset. In this study, the set of 177 items is split into training and test data in an 80:20 ratio. This division should be repeated many times in the same proportion, and the training and testing procedure should be run on as large a group of different divisions of the dataset as possible. In other words, we should train many networks on many different training sets and validate them on the corresponding test sets. Each of these attempts should be repeated several times. The method can be considered effective if, for a given network structure, the average error across the different test sets is small, that is, below a preset threshold.
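The repeated 80:20 division can be sketched in Python as follows. The helper names `split_indices` and `mean_test_error`, and the callback `train_and_test` (which is expected to train a network on the training indices and return its error on the test indices), are hypothetical; the number of repetitions is a free parameter.

```python
import random


def split_indices(n_items, train_fraction=0.8, rng=None):
    """Randomly divide item indices into a training part and a test part."""
    if rng is None:
        rng = random.Random()
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_train = round(train_fraction * n_items)
    return indices[:n_train], indices[n_train:]


def mean_test_error(n_items, train_and_test, n_splits=100, seed=0):
    """Average the test error over many different random divisions."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_splits):
        train_idx, test_idx = split_indices(n_items, rng=rng)
        errors.append(train_and_test(train_idx, test_idx))
    return sum(errors) / len(errors)


# With 177 patients, an 80:20 division gives 142 training and 35 test items.
train_idx, test_idx = split_indices(177, rng=random.Random(1))
print(len(train_idx), len(test_idx))
```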
The training procedure is explained in detail in Figure 2.
The computational procedure showed that even single-layer networks learned the training data very well (with an error close to zero). Generalization was much worse: the error on test data exceeded 30%. Subsequent changes in the construction of the network did not lead to a significant improvement. Therefore, individual features were eliminated one at a time to check whether removing any of them would significantly affect the results; it did not significantly change the network error. In addition, the relationships between individual features were analyzed for feature pairs and feature triples. This procedure did not reveal any significant dependencies. Because neither the elimination of features nor the analysis of selected features changed the results significantly, the best feed-forward two-layer network was adopted as the optimal solution.
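The elimination of individual features can be sketched as a leave-one-feature-out loop. In the Python illustration below, `evaluate` is a hypothetical callback that trains and tests a network on the given subset of feature indices and returns its mean test error; the toy version used here only mimics such a procedure.

```python
def feature_elimination(n_features, evaluate):
    """Measure how the test error changes when each feature is removed in turn."""
    all_features = list(range(n_features))
    baseline = evaluate(all_features)
    changes = {}
    for removed in all_features:
        kept = [i for i in all_features if i != removed]
        # Positive change: the error grew, so the removed feature was useful.
        changes[removed] = evaluate(kept) - baseline
    return baseline, changes


def toy_evaluate(kept):
    """Made-up error model: only feature 2 carries information."""
    return 0.3 if 2 in kept else 0.4


baseline, changes = feature_elimination(5, toy_evaluate)
print(baseline, changes)
```

In the situation described above, all entries of `changes` would be close to zero, which is why no single feature could be singled out.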