**1. Introduction**

It is now widely accepted that cancer develops through a series of stages [1]. It starts in a very limited area and is neither invasive nor metastatic at the early stage; it then spreads to distant sites in the body and becomes highly invasive and metastatic at the late stage. Patient survival times are significantly reduced at the late stages. For example, the 5-year relative survival rate for lung cancer is 54% at a localized stage but only 4% at the distant stage [2]. More than half of lung cancers are diagnosed at a distant stage, which suggests that early diagnosis is a key factor in improving patient survival. Markers for early detection and proper classification of tumors are therefore critical to improving life expectancy. Furthermore, identifying high-risk cancer patients at an early stage would allow them to receive standard chemotherapy in advance.

DNA methylation has been found to be a useful marker for disease diagnosis, including in cancer [3]. Significant progress has been made in using DNA methylation differences to capture substantial information about the molecular and gene-regulatory states of biological subtypes, such as tumor and normal tissues [4].

In addition, DNA methylation can be used as a marker to differentiate disease severity, such as early and late stages in breast cancer [5], ovarian cancer [6], and prostate cancer [7]. Most of the markers reported in these studies have potential functions in inducing or suppressing cancer metastasis. Moreover, DNA methylation is associated with tumor size in colorectal cancer [8]. Patients with higher methylation showed more frequent recurrence than the low-methylation group, as well as shorter cancer-related survival and recurrence-free survival [8].

These findings underscore the importance of a better understanding of cancer progression and metastasis, which could improve prediction of the clinical aggressiveness of cancer. Since DNA methylation is associated with disease severity, detecting differentially methylated regions (DMRs) can help in understanding cancer progression.

Most analyses are conducted by creating dichotomies based on biological subtypes, such as early and late cancer stages, and then detecting DMRs by comparing DNA methylation rates between the two groups [5–7]. However, when there are actually more than two groups, such approaches may lose information about the multiple disease statuses because clinically relevant subtypes are collapsed or ignored, resulting in suboptimal clinical conclusions and decisions.

To use multiple disease statuses, one can test the association between DNA methylation and the multiple-group response by repeatedly applying the methods for two groups. Although it is simple to run the analysis for all pairwise comparisons and combine the results, doing so is not trivial once the regional correlation of DMRs is taken into account, and it increases the multiple-testing burden.
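To make the multiple-testing burden concrete, the following minimal sketch counts the pairwise comparisons among disease groups and computes a Bonferroni-adjusted per-test threshold. The stage labels, the number of candidate regions, and the choice of a Bonferroni correction are illustrative assumptions, not part of any method described here.

```python
from itertools import combinations


def pairwise_comparisons(groups):
    """Return every unordered pair of groups that would need its own two-group test."""
    return list(combinations(groups, 2))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction (assumed for illustration)."""
    return alpha / n_tests


# Hypothetical inputs: four cancer stages and 10,000 candidate regions.
stages = ["I", "II", "III", "IV"]
n_regions = 10_000

pairs = pairwise_comparisons(stages)
n_tests = len(pairs) * n_regions
threshold = bonferroni_threshold(0.05, n_tests)

print(f"{len(pairs)} pairwise comparisons per region, {n_tests} tests in total")
print(f"Bonferroni per-test threshold: {threshold:.2e}")
```

With K groups the number of pairwise comparisons is K(K-1)/2, so the number of tests grows quadratically with the number of disease levels, in addition to growing with the number of regions.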

Another possible method is a generalized linear model that includes indicator variables for the different levels of disease status. This method has the advantage that it can adjust for covariates. However, analysts are often faced with noisy estimates of the category-specific regression coefficients, which can lead to unreasonable patterns in the coefficients across levels of disease status and can reduce power [9].

To improve the efficacy of an overall test, one can take advantage of the fact that cancer develops through a series of stages, or different levels of disease severity in general, and develop statistical methods that can incorporate the ordering of disease status. However, the widely used trend test is not an ideal method, because it requires scores or weights for different levels of disease status, which are generally unknown.
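The dependence of a trend test on the chosen scores can be illustrated with a small sketch. It assumes one simple form of trend test, the t-statistic for the slope when methylation rates are regressed on a numeric score assigned to each group; the data and both score sets are hypothetical.

```python
import numpy as np


def trend_statistic(rates, groups, scores):
    """t-statistic for the slope of methylation rate regressed on the group score.

    rates  : per-sample methylation rates
    groups : per-sample group index (0, 1, 2, ...)
    scores : numeric score assigned to each group index
    """
    x = np.asarray([scores[g] for g in groups], dtype=float)
    y = np.asarray(rates, dtype=float)
    r = np.corrcoef(x, y)[0, 1]  # Pearson correlation between score and rate
    n = len(y)
    return r * np.sqrt((n - 2) / (1.0 - r**2))


# Hypothetical data: three samples per stage, with a jump at the last stage.
groups = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
rates = [0.20, 0.22, 0.24, 0.25, 0.27, 0.29,
         0.30, 0.32, 0.34, 0.60, 0.62, 0.64]

t_equal = trend_statistic(rates, groups, scores=[1, 2, 3, 4])   # equally spaced scores
t_jump = trend_statistic(rates, groups, scores=[1, 2, 3, 10])   # scores that anticipate the jump

print(f"equally spaced scores: t = {t_equal:.2f}")
print(f"unequally spaced scores: t = {t_jump:.2f}")
```

The same data yield different test statistics under the two score choices, which is why a method that only requires the ordering of the groups, and not their spacing, is attractive.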

Here we propose a Bayesian approach and use the Bayes factor to test the association between methylation rates and disease severity. The proposed Bayes Factor Method (BFM) can incorporate monotonicity constraints and find DMRs in which methylation rates increase (or decrease) as the disease becomes more severe. Patients are classified into groups based on disease severity (e.g., stages of cancer), and DMRs are detected using moving windows along the genome. Within each window, the Bayes factor is calculated and used to test the hypothesis of constant methylation rates against a monotonic increase in methylation rates with disease severity.
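The moving-window scan can be sketched as follows. This is only an illustration of the windowing step: it assumes windows of fixed width in base pairs advanced by a fixed step, and the function name, the width, the step, and the CpG coordinates are all hypothetical. The per-window Bayes factor itself is not shown.

```python
from bisect import bisect_left


def moving_windows(positions, width, step):
    """Yield (start, end, cpgs) for each non-empty window along one chromosome.

    positions : sorted CpG coordinates (base pairs)
    width     : window width in base pairs (assumed window definition)
    step      : distance between consecutive window starts
    """
    if not positions:
        return
    start, last = positions[0], positions[-1]
    while start <= last:
        lo = bisect_left(positions, start)          # first CpG at or after the window start
        hi = bisect_left(positions, start + width)  # first CpG at or after the window end
        if hi > lo:
            yield start, start + width, positions[lo:hi]
        start += step


# Hypothetical CpG coordinates on one chromosome.
cpg_positions = [100, 150, 300, 320, 900]
windows = list(moving_windows(cpg_positions, width=200, step=100))

for start, end, cpgs in windows:
    # A per-window test statistic (here, the Bayes factor) would be computed from `cpgs`.
    print(start, end, cpgs)
```

Overlapping windows (step smaller than width) mean that each CpG site can contribute to more than one window, so neighboring windows are not independent.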

In addition, since DNA methylation rates at nearby CpG sites have been shown to be correlated, with a complicated correlation structure [10], a linear mixed-effect model is used to incorporate the correlation of methylation rates between and within CpG sites in the region.
