#### *2.1. Patients*

This study was a prospective, clinic-based, descriptive study and adhered to the guidelines of the Declaration of Helsinki (as amended in 2013). The study protocol was approved by the institutional review board of the Keio University School of Medicine (No. 20120210). All patients received a full explanation of the study procedures and written informed consent was obtained from each subject prior to enrolment. To ensure privacy, all records were identified by an anonymous subject identification number.

All GO patients were referred to our clinic from Olympia Eye Hospital, Tokyo, Japan. They were diagnosed with GO in accordance with previously described criteria [14]. Additionally, the patients underwent diagnostic imaging by MRI at the hospital. Patients with subjective dry eye symptoms, such as dry eye sensation, foreign body sensation, and photophobia, were referred consecutively to the Keio University Hospital MGD clinic. Normal participants without GO were also included in this study as controls. All patients underwent a general ophthalmic examination and slit-lamp biomicroscopy, followed by a more detailed examination of MGD and the ocular surface.

#### *2.2. Ocular Surface Examination*

Two investigators experienced in ocular surface examination (M.K. and S.I.) evaluated the tear breakup time (TBUT) and keratoconjunctival epithelial damage, the latter on the basis of fluorescein staining scores (0–9) of the cornea and conjunctiva, as described previously [15]. For these tests, 2 μL of preservative-free 0.5% fluorescein dye was instilled into each eye using a micropipette, to avoid altering the tear dynamics. Then, conjunctival rose bengal staining was performed to detect superior limbic keratoconjunctivitis (SLK) and lid wiper syndrome. Finally, Schirmer's test was performed without topical anesthesia.

#### *2.3. Criteria for Dry Eye Diagnosis*

A dry eye diagnosis was made according to the latest Japanese dry eye diagnostic criteria (2016) [16]. Briefly, both dry eye symptoms and a qualitative disturbance of the tear film (TBUT ≤ 5 s) had to be present for a diagnosis of dry eye [16].
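The two-part rule above can be written as a small predicate. This is a minimal sketch for illustration only; the function and parameter names are hypothetical and not part of the study protocol.

```python
def has_dry_eye(has_symptoms: bool, tbut_seconds: float) -> bool:
    """Return True when both criteria summarized above are met.

    has_symptoms  -- subjective dry eye symptoms are present
    tbut_seconds  -- tear breakup time in seconds (unstable tear film if <= 5 s)
    """
    return bool(has_symptoms) and tbut_seconds <= 5.0


if __name__ == "__main__":
    # Symptomatic eye with a short breakup time meets both criteria.
    print(has_dry_eye(True, 4.0))
    # Symptomatic eye with a stable tear film does not.
    print(has_dry_eye(True, 8.0))
```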

#### *2.4. Criteria for Diagnosis of Meibomian Gland Dysfunction*

Obstructive MGD was diagnosed when an eye tested positive for all 3 of the following criteria: (1) symptoms of MGD, such as dry eye sensation, burning sensation, or foreign body sensation; (2) abnormal findings around the orifices of the glands; and (3) findings indicating meibomian gland orifice obstruction. The presence of any two or more of these findings was defined as a lid margin abnormality in this study. An eye was judged positive for abnormal findings around the orifices when at least 1 of 3 findings, i.e., irregular lid margin, vascular engorgement, or anterior/posterior displacement of the mucocutaneous junction, was recognized. Vascular engorgement was defined as the presence of moderate or severe telangiectasia or redness. An eye was judged positive for orifice obstruction when both findings indicative of meibomian gland orifice obstruction were recognized, i.e., decreased meibomian secretion together with plugging, pouting, and ridging [17].

Meibomian secretion (meibum) was graded as follows: grade 0, clear meibum, easily expressed; grade 1, cloudy meibum, expressed with mild pressure; grade 2, cloudy meibum, expressed with more than moderate pressure; and grade 3, no meibum expression, even with hard pressure [18].

Morphological changes in the meibomian glands were observed using a non-contact mobile meibography system (Japan Focus Corporation, Tokyo, Japan) and graded accordingly [19]. Partial or complete meibomian gland loss was scored using the following grades for each eyelid, as previously described: grade 0, no meibomian gland loss; grade 1, area of meibomian gland loss < 1/3 of the total meibomian gland area; grade 2, area of meibomian gland loss between 1/3 and 2/3 of the total area; and grade 3, area of meibomian gland loss > 2/3 of the total area. The per-eyelid grades were summed to obtain a meiboscore from 0 through 6 for each eye [20]. The central region was also defined as the middle third of the lid width.
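The diagnostic logic and the meiboscore sum described above can be sketched as follows. This is an illustrative sketch, not the study's own software: the class, field, and function names are hypothetical, and the code assumes that the two eyelids summed per eye are the upper and lower lids, which the 0 to 6 range implies but the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class LidFindings:
    """Findings for one eye (all field names are hypothetical)."""
    has_mgd_symptoms: bool
    irregular_lid_margin: bool
    vascular_engorgement: bool
    mcj_displacement: bool            # mucocutaneous junction displacement
    decreased_meibum_secretion: bool
    plugging_pouting_ridging: bool


def abnormal_orifice_surroundings(f: LidFindings) -> bool:
    # Positive when at least 1 of the 3 listed findings is present.
    return f.irregular_lid_margin or f.vascular_engorgement or f.mcj_displacement


def orifice_obstruction(f: LidFindings) -> bool:
    # Positive only when both obstruction findings are present.
    return f.decreased_meibum_secretion and f.plugging_pouting_ridging


def has_obstructive_mgd(f: LidFindings) -> bool:
    # All 3 criteria must be positive.
    return (f.has_mgd_symptoms
            and abnormal_orifice_surroundings(f)
            and orifice_obstruction(f))


def eye_meiboscore(upper_lid_grade: int, lower_lid_grade: int) -> int:
    """Sum two per-eyelid gland-loss grades (each 0-3) into a 0-6 score.

    Assumption: the two eyelids are the upper and lower lids of one eye.
    """
    for grade in (upper_lid_grade, lower_lid_grade):
        if grade not in (0, 1, 2, 3):
            raise ValueError("each eyelid grade must be 0, 1, 2, or 3")
    return upper_lid_grade + lower_lid_grade
```

Keeping each criterion in its own function mirrors the wording of the criteria, so a change to one definition (for example, the list of findings around the orifices) stays local.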

#### *2.5. Examination for Graves' Ophthalmopathy*

Proptosis (protrusion > 15 mm) was measured using a Hertel exophthalmometer (Handaya, Tokyo, Japan). Lid retraction was defined as a palpebral fissure height > 7 mm [14] and exposure of the upper sclera. In the MRI investigation, levator muscle enlargement was determined in a sagittal section. Swelling of the lacrimal gland and extraocular muscle enlargement were determined in a coronal section. Muscles that were clearly thicker than the optic nerve were determined to be enlarged. GO activity was defined based on the clinical activity score (CAS) [21] and T2 signal intensity ratios (T2SIR) [22]. Each patient was assigned a CAS after examination by an ophthalmologist. This score is based on 4 well-known classic signs of inflammation, i.e., pain, redness, swelling, and impaired function, and consists of scores for 10 items. Each sign judged as present is scored 1 point, and each sign has the same weight. The active phase of GO was defined by CAS ≥ 4 points. The T2SIR was defined as the ratio of the signal strength of the extraocular muscles to that of the ipsilateral temporal muscle on T2-weighted images.
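The CAS arithmetic and the T2SIR ratio described above reduce to a few lines. The sketch below is illustrative only; the function names are hypothetical, the 10 items are passed as plain booleans without naming them, and no T2SIR cut-off is applied because none is given in this section.

```python
from typing import Sequence


def clinical_activity_score(items_present: Sequence[bool]) -> int:
    """Sum the 10 equally weighted CAS items, 1 point per item present."""
    if len(items_present) != 10:
        raise ValueError("the CAS described here consists of 10 items")
    return sum(1 for present in items_present if present)


def is_active_go(cas: int) -> bool:
    """Active phase of GO as defined above: CAS >= 4 points."""
    return cas >= 4


def t2_signal_intensity_ratio(muscle_signal: float,
                              temporal_muscle_signal: float) -> float:
    """Ratio of extraocular muscle signal to ipsilateral temporal muscle
    signal on T2-weighted images."""
    if temporal_muscle_signal <= 0:
        raise ValueError("reference signal must be positive")
    return muscle_signal / temporal_muscle_signal


if __name__ == "__main__":
    cas = clinical_activity_score([True, True, False, True, True,
                                   False, False, False, False, False])
    print(cas, is_active_go(cas))
```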

The severity of GO was classified using Olympia Eye Hospital diagnostic criteria. The classification of mild GO was as follows: palpebral fissure height was 7–10 mm; lid swelling was mild (eyelids mildly swollen with fluid); the conjunctiva showed chemosis, injection, or congestion; the range of proptosis was 15–18 mm; and, with respect to the extra-ocular muscles, diplopia was absent or intermittent. Additionally, no optic nerve, retina, or corneal findings were exhibited.

The classification of moderate GO was as follows: palpebral fissure height was 10–12 mm; lid swelling was moderate (eyelid skin showed obvious swelling but the tissue was not tense); the conjunctiva showed SLK; the range of proptosis was 18–21 mm; the extra-ocular muscle showed disorder, as exhibited by diplopia of the peripheral field-of-view; and the cornea showed infiltration due to lagophthalmos that affected the entire cornea. Additionally, no optic nerve or retina findings were exhibited.

The classification of severe GO was as follows: palpebral fissure height was 12 mm or more; lid swelling was severe (the upper eyelid skin-fold was ballooned out and filled with fluid, and the skin was taut); the conjunctiva showed upper scleral vessel engorgement; the range of proptosis was 21 mm or more; the extra-ocular muscle disorder extended to diplopia in the 1st eye position; and the cornea showed infiltration due to lagophthalmos, affecting the entire cornea. Additionally, no optic nerve or retina findings were present.

Furthermore, GO was classified as the most severe form when, in addition to the findings of severe GO, corneal perforation, ulcer, or optic neuropathy was present. The presence of even one of these findings was sufficient for this classification.

#### *2.6. Assessment of Subjective Symptoms, Self-Reported via a Questionnaire*

A validated dry eye symptom questionnaire, the Dry Eye-related Quality-of-Life Score (DEQS) questionnaire, was administered. The DEQS questionnaire was recently developed in Japan, and its internal consistency, test-retest reliability, discriminant validity, and responsiveness to change have all been validated previously [23]. It comprises 15 questions; 6 questions assess ocular symptoms and 9 assess the effect of dry eye disease (DED) on the quality of life. The 6 questions related to ocular symptoms query respondents on the presence and severity of foreign body sensations, dry eye sensations, pain or soreness, ocular fatigue, eyelid heaviness, and eye redness. The frequency of symptoms is scored from 0 (none) to 4 (highest frequency), and the severity of symptoms is scored from 1 (low) to 4 (high) [23]. The summary scale score ranges from 0 (best) to 100 (worst).
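This section does not state how the summary score is derived from the individual answers. The sketch below assumes the commonly published DEQS scoring rule, in which the severity scores of all answered questions are summed, multiplied by 25, and divided by the number of answered questions, with a severity of 0 recorded when a symptom never occurs. Both the rule and the function name are assumptions for illustration and should be checked against the questionnaire's own documentation [23].

```python
from typing import Optional, Sequence


def deqs_summary_score(severity_scores: Sequence[Optional[int]]) -> float:
    """Compute a 0-100 summary score from per-question severity scores.

    severity_scores -- one entry per question:
        None  = question left unanswered (excluded from the calculation)
        0     = symptom never occurs (assumed convention)
        1..4  = severity from low to high
    Assumed formula: sum(severity) * 25 / number of answered questions.
    """
    answered = [s for s in severity_scores if s is not None]
    if not answered:
        raise ValueError("at least one question must be answered")
    for score in answered:
        if not 0 <= score <= 4:
            raise ValueError("severity scores must lie between 0 and 4")
    return sum(answered) * 25 / len(answered)
```

Under this assumed rule, 15 answers of severity 4 give the maximum of 100 and 15 answers of 0 give the minimum of 0, which matches the range stated above.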

#### *2.7. Statistical Analysis*

Data were analyzed using the Statistical Package for the Social Sciences version 26.0 (IBM Corp., Armonk, NY, USA) and Excel® version 14.1.0 (Microsoft®, Redmond, WA, USA). The t-test was used for continuous variables, the Mann–Whitney U-test for ordinal variables, and the chi-squared test for nominal variables. Statistical significance was set at *p* < 0.05.
