#### *2.2. Analysis of RUNX1 Methylation and mRNA Levels*

We previously profiled genome-wide DNA methylation and mRNA expression using the Infinium HumanMethylation450 BeadChip and the HumanHT-12 expression BeadChip (Illumina, San Diego, CA, USA), respectively, in 42 surgically resected tumor and matched normal tissues, 136 bronchial washings, 12 sputum samples, and 6 bronchial biopsy specimens obtained from a total of 118 NSCLC patients and 60 cancer-free patients [13]. We used these previously reported data to analyze the methylation and mRNA levels of *RUNX1*. Preprocessing, including background correction, batch-effect correction, probe filtering, and adjustment for the background signal difference between type I and type II probes, was conducted using the R package wateRmelon [14]. The methylation level (β-value), ranging from 0 (no methylation) to 1 (100% methylation), was estimated at each CpG locus as the ratio of the fluorescence signal intensity of the methylated allele to the sum of the signal intensities of the methylated and unmethylated alleles. mRNA expression levels from the HT-12 chips were normalized using the R lumi package (https://bioconductor.org/biocLite.R).
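
As an illustration of the β-value definition above, the following minimal Python sketch computes the ratio for a single CpG locus. The function name `beta_value`, the `offset` parameter, and the example intensities are hypothetical and are not part of the original analysis, which was performed in R. The offset defaults to 0 so that the function matches the definition given here; some Illumina-oriented tools add a small constant to the denominator to stabilize the ratio at low intensities, and whether one was applied to these data is not stated.

```python
def beta_value(methylated, unmethylated, offset=0.0):
    """Return the beta-value M / (M + U + offset) for one CpG locus.

    `methylated` (M) and `unmethylated` (U) are non-negative fluorescence
    signal intensities. `offset` is an optional stabilizing constant
    (assumption: 0 reproduces the plain ratio described in the text).
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("signal intensities must be non-negative")
    total = methylated + unmethylated + offset
    if total == 0:
        # No signal in either channel: the ratio is undefined.
        return float("nan")
    return methylated / total


# Hypothetical intensities for one probe (not taken from the study data).
print(beta_value(3000.0, 1000.0))  # 0.75
```

A value of 0 therefore corresponds to signal from the unmethylated allele only, and a value of 1 to signal from the methylated allele only.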

#### *2.3. Feature Selection for Prediction of Lung Cancer*

To select candidate CpGs for lung cancer prediction from among the differentially methylated CpGs and to build prediction models, we divided the normal and tumor tissues from the 42 patients into training and test datasets at a 7:3 ratio. Features were selected in the training dataset using supervised machine learning algorithms. Age-related CpGs and any CpGs that were significantly correlated in the normal or tumor tissues were removed during model building. The supervised machine learning algorithms for feature selection and model building were run in RapidMiner Studio version 8.2 (RapidMiner Inc., Boston, MA, USA).
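
The 7:3 partition can be sketched in Python as follows. This is only an illustration: the actual split was made in RapidMiner Studio, and the text does not state the random seed or whether the split was made per patient or per sample. The sketch assumes a per-patient split, so that a patient's tumor and matched normal tissue fall in the same partition; the function name `split_patients`, the seed, and the patient identifiers are hypothetical.

```python
import random


def split_patients(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle patient IDs and split them into training and test lists.

    Splitting by patient (an assumption) keeps each tumor/normal pair
    together in one partition.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers for the 42 patients with paired tissues.
patients = [f"P{i:02d}" for i in range(1, 43)]
train_ids, test_ids = split_patients(patients)
print(len(train_ids), len(test_ids))  # 29 13
```

With 42 patients, a 7:3 ratio cannot be met exactly; rounding gives 29 training and 13 test patients under these assumptions.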
