#### *4.1. Statistical Model*

The current study used measures developed in previous studies. Class circumstances were captured by three demographic items (age of the family head, education level of the family head, and family size) to assess the impact of these demographics on households' choice of drinking water. Life choices were measured by asking respondents whether they use tap or plant water, with example items on why they make these choices. Frequency distributions were used to describe the demographic characteristics and the variables representing class circumstances. An independent-samples *t*-test was performed to test the difference in the mean occurrence of diseases between areas with and without local government filtration plants.
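As an illustration of the independent-samples *t*-test described above, the following minimal sketch computes the pooled-variance *t* statistic using only the Python standard library. The function name `pooled_t_statistic` and both samples are hypothetical and are not taken from the study. The equal-variance (pooled) form is an assumption, since the text does not state which variant was used.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    dof = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances
    # (assumes both groups share a common population variance).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / dof
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, dof


# Hypothetical disease episodes per household (illustrative, NOT study data).
with_plant = [0, 1, 0, 2, 1, 0, 1, 0]
without_plant = [2, 3, 1, 2, 4, 2, 1, 3]

t_stat, dof = pooled_t_statistic(with_plant, without_plant)
print(f"t = {t_stat:.3f}, df = {dof}")
```

The statistic would then be compared with a *t* distribution on `dof` degrees of freedom. The p-value lookup is omitted here because it needs a statistics package.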

Moreover, logistic regression was used to estimate the impact of the variables representing class circumstances, namely respondent's area (with or without plants), income, expenditure on drinking water, use of plant water, family size, education, and age of the household head, on waterborne diseases, which represent life chances. Waterborne disease was measured as a dichotomous variable: it takes the value 1 if the respondent suffered from a waterborne disease and 0 otherwise. Respondent's area and use of plant water are likewise measured as binary variables. The functional form of the logistic regression curve is

$$f(t) = \frac{e^t}{1 + e^t} \tag{1}$$

where *e* is Euler's number and *t* can be any linear combination of predictors, such as *β*<sub>0</sub> + *β*<sub>1</sub>*x*. Substituting this linear combination for *t* gives

$$f(t) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}} \tag{2}$$

We want to arrive at the "typical" formula of the logistic regression, something like:

$$L(f(t)) = \beta_0 + \beta_1 x + \dots \tag{3}$$

where *L* is the Logit, i.e., the logarithm of the odds,

$$L(f(t)) = \ln\left(\frac{f(t)}{1 - f(t)}\right) = \beta_0 + \beta_1 x \tag{4}$$
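To make the shape of the logistic curve concrete, the following sketch evaluates Eq. (2) for a single predictor. The coefficient values and the function names `logistic` and `predicted_probability` are hypothetical and serve only as an illustration.

```python
import math


def logistic(t):
    """Logistic curve of Eq. (1): maps any real t to a value in (0, 1)."""
    return math.exp(t) / (1 + math.exp(t))


def predicted_probability(x, beta0, beta1):
    """Eq. (2): the logistic curve evaluated at the linear predictor beta0 + beta1 * x."""
    return logistic(beta0 + beta1 * x)


# Hypothetical coefficients, chosen so that the curve crosses 0.5 at x = 3.
beta0, beta1 = -3.0, 1.0
curve = [predicted_probability(x, beta0, beta1) for x in range(7)]
for x, p in enumerate(curve):
    print(x, round(p, 3))
```

Whatever value the linear predictor takes, the result stays strictly between 0 and 1, which is why this functional form suits a dichotomous outcome.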

#### *4.2. Deriving the Formula*

In the first step, we take *p*(*Y* = 1) = *f*(*t*) and divide it by the probability of the complementary event. If the probability of event A is *p*, the probability of not-A is 1 − *p*. Thus,

$$\frac{f(t)}{1 - f(t)} = \frac{\frac{e^t}{1 + e^t}}{1 - \frac{e^t}{1 + e^t}}\tag{5}$$

So, we replaced *f*(*t*) by *e<sup>t</sup>*/(1 + *e<sup>t</sup>*) and thereby computed the odds. Next, we multiply the equation by (1 + *e<sup>t</sup>*)/(1 + *e<sup>t</sup>*) (which is the neutral element, 1), yielding

$$=\frac{e^t}{(e^t+1)\left(\frac{1+e^t}{1+e^t}-\frac{e^t}{e^t+1}\right)}\tag{6}$$

In other words, the denominator of the numerator "wandered" down to the denominator. Now we can simplify the denominator:

$$\frac{e^t}{(e^t+1)\left(\frac{1+e^t-e^t}{e^t+1}\right)}\tag{7}$$

Simplifying the denominator further gives

$$\frac{e^t}{(e^t+1)\left(\frac{1}{e^t+1}\right)}\tag{8}$$

The denominator then simplifies to 1, as can be seen here:

$$\frac{e^t}{\frac{e^t + 1}{e^t + 1}} = \frac{e^t}{1} = e^t \tag{9}$$

The above equation tells us that the odds simplify to *e<sup>t</sup>*. Now, let us take the logarithm of this expression.

$$\ln(e^t) = t \tag{10}$$

by the rules of logarithms. Recall that *t* is the linear combination of predictors,

$$t = \beta_0 + \beta_1 x \tag{11}$$

In sum,

$$\ln\left(\frac{f(t)}{1 - f(t)}\right) = \beta_0 + \beta_1 x \tag{12}$$
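The algebra of Eqs. (5) to (12) can be checked numerically. The sketch below, with arbitrarily chosen values of *t*, confirms that the odds of *f*(*t*) equal *e<sup>t</sup>* and that their logarithm returns *t*. The helper names `f` and `odds` are illustrative.

```python
import math


def f(t):
    """Logistic function of Eq. (1)."""
    return math.exp(t) / (1 + math.exp(t))


def odds(t):
    """Left-hand side of Eq. (5): f(t) divided by the complementary probability."""
    return f(t) / (1 - f(t))


for t in (-2.0, 0.0, 0.5, 3.0):  # arbitrary test values
    # Eq. (9): the odds reduce to e^t.
    assert math.isclose(odds(t), math.exp(t))
    # Eqs. (10)-(12): the log of the odds recovers t itself.
    assert math.isclose(math.log(odds(t)), t, abs_tol=1e-12)
    print(f"t = {t:+.1f}  f(t) = {f(t):.4f}  odds = {odds(t):.4f}")
```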

The left-hand side of Eq. (12) is called the Logit, that is, the logarithm of the odds of *f*(*t*) or, more precisely, the logarithm of the odds *p*/(1 − *p*). The logistic regression formula is obtained by setting the Logit equal to any linear combination of predictors. Because the Logit is linear in the predictors, we can use standard regression terminology: the Logit of the dependent variable changes by *β*<sub>1</sub> if *x* is increased by one unit. Apart from the Logit taking the place of the dependent variable, this is the usual regression statement. Note, however, that the logistic curve itself is not linear, so the change in the probability *p* associated with a one-unit increase in *x* is not the same for all values of *x*. The logistic regression equation for the current model can be expressed as follows:

$$\text{Logit}(p) = \ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 AH + \beta_2 EH + \beta_3 FS + \beta_4 LIPW + \beta_5 EDW + \beta_6 RD + \mu_i \tag{13}$$

To solve this equation for *p*, first exponentiate both sides to obtain the odds,

$$\frac{p}{1-p} = \exp(\beta_0 + \beta_1 AH + \beta_2 EH + \beta_3 FS + \beta_4 LIPW + \beta_5 EDW + \beta_6 RD + \mu_i)\tag{14}$$

This yields the formula for the probability *P*(WBD = 1) = *p*:

$$p = \frac{\exp\left(\beta_0 + \beta_1 AH + \beta_2 EH + \beta_3 FS + \beta_4 LIPW + \beta_5 EDW + \beta_6 RD + \mu_i\right)}{1 + \exp\left(\beta_0 + \beta_1 AH + \beta_2 EH + \beta_3 FS + \beta_4 LIPW + \beta_5 EDW + \beta_6 RD + \mu_i\right)}\tag{15}$$
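As a worked illustration of Eq. (15), the sketch below computes *P*(WBD = 1) for a single household. All coefficient values, the household values, and the function names are hypothetical. They are not estimates or data from the study, the units and coding of the predictors are assumed, and the error term is set to zero.

```python
import math

# Hypothetical coefficients for Eq. (15); illustration only, NOT study estimates.
BETA_0 = -0.5
BETA = {"AH": 0.01, "EH": -0.05, "FS": 0.10, "LIPW": -0.60, "EDW": -0.001, "RD": 0.30}


def linear_predictor(household):
    """beta_0 + beta_1*AH + ... + beta_6*RD, with the error term set to zero."""
    return BETA_0 + sum(BETA[name] * household[name] for name in BETA)


def wbd_probability(household):
    """Eq. (15): probability that the household reports a waterborne disease."""
    t = linear_predictor(household)
    return math.exp(t) / (1 + math.exp(t))


# Hypothetical household; the units and coding of each variable are assumed.
household = {"AH": 45, "EH": 10, "FS": 6, "LIPW": 1, "EDW": 500, "RD": 0}
p = wbd_probability(household)
print(f"P(WBD = 1) = {p:.3f}")
```

Taking the logarithm of `p / (1 - p)` returns the linear predictor, which is the relationship expressed by the Logit form of the model.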

Thus, with the criterion and predictor variables inserted, the equation becomes

$$\ln\left[p/(1-p)\right] = \beta_0 + \beta_1 AH + \beta_2 EH + \beta_3 FS + \beta_4 LIPW + \beta_5 EDW + \beta_6 RD + \mu_i \tag{16}$$

$$\text{WBD} = \beta_0 + \beta_1 AH + \beta_2 EH + \beta_3 FS + \beta_4 LIPW + \beta_5 EDW + \beta_6 RD + \mu_i \tag{17}$$
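The interpretation of the coefficients in the Logit form of the model can be sketched as follows. A one-unit increase in a predictor multiplies the odds by exp(*β*) whatever the values of the other predictors, whereas the resulting change in the probability depends on where the household sits on the curve. The coefficients and the two households are hypothetical, not study estimates or data, and the error term is set to zero.

```python
import math

# Hypothetical coefficients; illustration only, NOT estimates from the study.
BETA_0 = -0.5
BETA = {"AH": 0.01, "EH": -0.05, "FS": 0.10, "LIPW": -0.60, "EDW": -0.001, "RD": 0.30}


def probability(household):
    """P(WBD = 1) from the Logit model, with the error term set to zero."""
    t = BETA_0 + sum(BETA[name] * household[name] for name in BETA)
    return math.exp(t) / (1 + math.exp(t))


def fs_effect(household):
    """Change in probability when family size (FS) increases by one member."""
    return probability(dict(household, FS=household["FS"] + 1)) - probability(household)


# Odds ratio per one-unit increase in each predictor: exp(beta_j).
odds_ratios = {name: math.exp(beta) for name, beta in BETA.items()}

# Two hypothetical households at different points on the logistic curve.
household_a = {"AH": 30, "EH": 14, "FS": 3, "LIPW": 1, "EDW": 800, "RD": 0}
household_b = {"AH": 50, "EH": 5, "FS": 5, "LIPW": 0, "EDW": 250, "RD": 0}

print(f"odds ratio for FS: {odds_ratios['FS']:.3f}")
print(f"probability change, household A: {fs_effect(household_a):.4f}")
print(f"probability change, household B: {fs_effect(household_b):.4f}")
```

Under these assumed values the odds ratio for family size is identical for both households, while the change in probability is larger for the household whose predicted probability is closer to 0.5.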


#### **5. Results and Discussion**

Human survival depends on the availability of water. Yet water resources in Pakistan are steadily degrading because of contamination with chemicals and waste. Consuming this polluted water jeopardizes public health, and infants and children are the most likely to fall ill. Moreover, the affected households must bear the cost of medical treatment. The local government has set up filtration plants in various parts of Lahore for the benefit of the general population. This research investigated the effects of waterborne diseases on newborns, children, and other household members, as well as on their healthcare expenses. Furthermore, by comparing the probability of disease in areas with drinking water filtration plants installed by the local government against areas without this facility, using a healthy lifestyle model, it was determined that people benefit from drinking water filtration plants whether their local government installs them or not.
