*2.4. Computational Algorithms*

We developed an efficient Newton–Raphson type of algorithm to obtain the penalized estimate *β*ˆ. Starting with an initialized value, we can solve the penalized GEE iteratively. The estimated *β*ˆ(*d*+1) in the (*d* + 1)th iteration can be solved as:

$$\mathcal{B}^{(d+1)} = \mathcal{B}^{(d)} + [T^{(d)} + n\mathcal{W}^{(d)}]^{-1}[\![\![L^{(d)} - n\mathcal{W}^{(d)}]\!\!\!\beta^{(d)}] ],\tag{4}$$

where *U*(*d*) is the score function expressed in terms of *β*ˆ(*d*) at the *d*th iteration and *T*(*d*) is the corresponding first derivative function of *U*(*d*):

$$T^{(d)} = \sum\_{i=1}^{n} Z\_i^T V\_i^{-1} Z\_{i'} $$

which is also a function of *β*ˆ(*d*). The MCP penalty was imposed on both the individual level (main lipid effects) and group level (lipid–environment interactions). Therefore, *W*(*d*) is a diagonal matrix that contains the first derivative of the MCP penalty for the lipid factors and the first derivative of the group MCP penalty for the lipid–environment interactions. We define *W*(*d*) as:

$$\begin{split} W^{(d)} &= \text{diag}\{\underbrace{0,\dots,0}\_{1+\eta}, \frac{\rho'(|\boldsymbol{\beta}\_{21}^{(d)}|;\lambda\_{1},\gamma)}{\epsilon+|\boldsymbol{\beta}\_{21}^{(d)}|}, \dots, \frac{\rho'(|\boldsymbol{\beta}\_{2p}^{(d)}|;\lambda\_{1},\gamma)}{\epsilon+|\boldsymbol{\beta}\_{2p}^{(d)}|}\}, \\ &\frac{\rho'(||\boldsymbol{\beta}\_{31}^{(d)}||\_{\Sigma\_{1}};\sqrt{\eta}\lambda\_{2},\gamma)}{\epsilon+||\boldsymbol{\beta}\_{31}^{(d)}||\_{\Sigma\_{1}}}, \dots, \frac{\rho'(||\boldsymbol{\beta}\_{3p}^{(d)}||\_{\Sigma\_{p}};\sqrt{\eta}\lambda\_{2},\gamma)}{\epsilon+||\boldsymbol{\beta}\_{3p}^{(d)}||\_{\Sigma\_{p}}}\}, \end{split}$$

where is a small positive number set to 10−<sup>6</sup> to avoid the numerical instability when the denominator is zero. The first (1 + q) elements on the diagonal of *W* are zero, suggesting that there is no shrinkage imposed on the coefficients for the intercept and the environmental factors. We can use *nWβ*ˆ and *nW* to approximate the first derivative function of MCP in the penalized score equation and the second derivative function of the MCP penalty, respectively. Given a fixed tuning parameter, the regression parameter *β*ˆ(*d*+1) can be updated iteratively till convergence. The stopping criterion is that the L1 norm for the L1 difference between two consecutive iterations is less than 10<sup>−</sup>3, and convergence can usually be achieved within 10 iterations.

There are two tuning parameters *λ*<sup>1</sup> and *λ*<sup>2</sup> and a regularization parameter *γ*. *λ*<sup>1</sup> controls the sparsity of lipid factors, and *λ*<sup>2</sup> determines sparsity among lipid–environment interactions. We chose the optimal tuning parameters *λ*<sup>1</sup> and *λ*<sup>2</sup> using five-fold cross-validation in both the simulation study and real data analysis. The regularization parameter *γ* was obtained via a data driven approach. In our numerical study, we examined a sequence of values, such as 1.8, 3, 4.5, 6, and 10, suggested by published studies, and found that the results were not sensitive to the choice of the value of *γ*, and then set the value at 3. We split the dataset into five equally sized subsets and took four of them as the training dataset, leaving the last subset as the testing dataset. The penalized estimates were obtained from the training data, and then, prediction performance was evaluated on the testing data. A joint search over a two-dimensional grid of (*λ*1, *λ*2) was conducted to find the optimal pair of tuning parameters.

Given fixed tuning parameters, we implemented the algorithm as follows:


In our study, we considered the methods considering both lipid main effects and lipid–environment interactions with exchangeable working correlation (A1), AR(1) working correlation (A2), and independence working correlation (A3). For comparison with the methods that cannot accommodate the identification of lipid–environment interactions, we also included A4–A6, which incorporate the exchangeable, AR(1), and independence working correlation, respectively. The alternative methods A4–A6 do not ignore the interaction effects. Instead, they treat the interaction effects individually, so the group structure considered in A1–A3 does not exist. We computed the CPU running time for 100 replicates of simulated lipidomics data with *n* = 250, *ρ* = 0.8, *p* = 75 (with a total dimension of 304) and fixed tuning parameters on a regular laptop for A1–A6, which can be implemented using our developed package: (interep https://cran.r-project.org/package=interep) [21]. The CPU running time in seconds was 48.8 (A1), 40.2 (A2), 29.0 (A3), 49.3 (A4), 39.7 (A5), and 27.9 (A6), respectively.

## **3. Results**
