Article

Textual Attributes of Corporate Sustainability Reports and ESG Ratings

School of Business Administration, Capital University of Economics and Business, 121 Zhangjialukou, Beijing 100070, China
* Author to whom correspondence should be addressed.
Sustainability 2024, 16(21), 9270; https://doi.org/10.3390/su16219270
Submission received: 25 September 2024 / Revised: 17 October 2024 / Accepted: 19 October 2024 / Published: 25 October 2024
(This article belongs to the Special Issue Sustainable Governance: ESG Practices in the Modern Corporation)

Abstract

While the textual attributes of corporate financial documents, such as annual reports, have been extensively analyzed in the academic literature, those of corporate sustainability reports, which serve as a critical channel for nonfinancial disclosure, are relatively under-explored. Given the increasing importance of Environmental, Social, and Governance (ESG) factors in corporate strategy and stakeholder evaluation, understanding the role of textual attributes in sustainability reporting is crucial. This study examines 10,021 hand-collected sustainability reports from Chinese firms between 2009 and 2021, focusing on six key textual attributes: length, readability, tone, boilerplate language, redundancy, and completeness. Using computational linguistics, we analyze how these attributes evolve over time and their impact on ESG ratings provided by both international (MSCI, FTSE) and domestic (SNSI) agencies. Our findings reveal that the length and completeness of sustainability reports significantly influence ESG scores across agencies, demonstrating a shared appreciation for detailed and transparent disclosures. However, international and domestic rating agencies exhibit differing responses to attributes like tone, boilerplate language, and redundancy. These differences highlight variations in evaluation standards, methodologies, and value orientations between global and local stakeholders. The results emphasize the need for firms to tailor their sustainability disclosures to meet diverse stakeholder expectations. This study contributes to the growing body of literature on nonfinancial reporting by providing empirical evidence on how specific textual characteristics of sustainability reports can shape ESG evaluations, offering insights for both corporate communicators and policymakers.

1. Introduction

The study of textual attributes in corporate communications, such as annual reports, letters to shareholders and other firm-issued documents, has garnered significant attention in the academic literature [1,2]. Researchers have examined various textual elements, including but not limited to readability, tone and redundancy [3,4,5]. It has been shown that the textual attributes can carry implications extending beyond mere stylistic considerations. They offer an alternative view of the quality and effectiveness of corporate disclosures and play a pivotal role in shaping stakeholder perceptions, influencing firm performance, and affecting regulatory evaluations [4,6].
However, existing research on textual attributes has predominantly focused on corporate annual reports and financial disclosures [2,7,8], paying insufficient attention to the increasingly important domain of nonfinancial sustainability reporting. Sustainability reporting is now on the path to becoming a cornerstone of corporate communication [9,10], fueled by regulatory pressures, investor demand, and consumer awareness, among other factors [11]. For example, the European Union’s Corporate Sustainability Reporting Directive (CSRD), which entered into force in 2023, requires mandatory reporting of sustainability information for companies meeting certain thresholds [12]. A recent global survey finds that investors are increasingly integrating sustainability factors into decision-making and place significant importance on the quality of sustainability reporting [13]. Sustainability reports are expected to convey a firm’s commitment to the sustainability agenda, detailing its goals, efforts, challenges, opportunities and performance in this regard [14]. Moreover, sustainability reports are a crucial source of information for environmental, social and governance (ESG) ratings, which are developed by rating agencies such as MSCI and FTSE to assist investors in making informed ESG investing decisions [15,16]. A broad spectrum of studies has analyzed the validity, consistency, and determinants of ESG ratings as well as their implications for other aspects of the companies [17,18,19]. Rating agencies use sustainability disclosure to gauge and benchmark the companies’ sustainability performance.
Financial markets and participants rely heavily on these reports as they provide critical insights into a firm’s long-term viability and risk management strategies. The financial implications of sustainability reporting are profound, as they can influence investment decisions [20], credit ratings [21], and overall market perception [22]. High-quality sustainability reports that are comprehensive and transparent can lead to better financial outcomes by attracting more investors [23] and enhancing the firm’s reputation [24]. On the other hand, poorly constructed reports with low readability or high redundancy might lead to negative financial consequences [25], such as reduced investor confidence [26] and increased capital costs [27].
This study aims to understand the progression of textual attributes of sustainability reports and their implications for ESG ratings, based on a sample of 10,021 sustainability reports issued by public Chinese firms from 2009 to 2021. Drawing on computational linguistics, we define and compute six textual attributes for Chinese sustainability reports: length, readability, tone, boilerplate words, redundancy, and completeness. Length and completeness reflect the depth and breadth of information provided, while readability and tone influence how easily the information can be understood and perceived by stakeholders. Boilerplate words and redundancy can indicate the level of customization and specificity of the reports. We analyze the association between the six textual attributes and ESG ratings by two international rating agencies, MSCI and FTSE, and one domestic rating agency, Sino-Securities Index (SNSI).
This paper presents three main findings. First, it traces the evolution of the six textual attributes over the study period, illustrating how Chinese firms have refined their sustainability reporting practices in response to changing regulatory environments and stakeholder expectations. Second, it identifies significant associations between three specific attributes (length, readability, and tone) and various ESG ratings (MSCI, FTSE, SNSI), highlighting the critical role these attributes play in enhancing the perceived quality of sustainability reports. Third, the study reveals that international ESG ratings and domestic ESG ratings respond differently to these textual attributes, underscoring the diverse standards and expectations that global and local stakeholders apply when evaluating ESG disclosures.
By focusing on the textual dynamics of sustainability reports, this research makes several important contributions. It provides a comprehensive temporal analysis of how reporting practices have evolved in a major emerging market, offering insights into the broader trends in corporate communication and stakeholder engagement. It also demonstrates the substantive impact of report quality on ESG ratings, providing empirical evidence that can inform both corporate reporting strategies and regulatory policies. Moreover, by highlighting the differential responses of international and domestic ESG ratings to textual attributes, the study underscores the need for firms to tailor their reporting practices to meet diverse stakeholder expectations effectively.
The remainder of the paper is structured as follows. Section 2 presents the literature review and our hypothesis. Section 3 defines the textual attributes and describes the method. Section 4 presents the results. Section 5 discusses the findings. Section 6 presents the conclusion.

2. Literature Survey and Hypothesis

2.1. Literature Survey

Textual analysis has become increasingly pivotal in accounting and finance research, as the explosion of unstructured data and advancements in natural language processing enable researchers to leverage textual analysis to gain insights that were previously inaccessible [28,29]. Existing studies have applied textual analysis to a wide array of corporate documents, such as annual reports [30], letters to shareholders [31], and earnings call transcripts [2,31,32]. It has been shown that textual analysis could unveil hidden information, improve forecast accuracy, enhance risk management, and decode managerial intentions, based on various textual attributes, such as readability, length, tone, redundancy, boilerplate words, and comprehensiveness. For example, the readability of corporate annual reports, as measured by the Fog Index, is related to earnings [4]. Additionally, a substantial amount of disclosure volume in annual reports can be explained by managerial discretion [33]. Tone analysis reveals that managers’ sentiment based on the use of optimistic or pessimistic language correlates with future financial outcomes, where optimistic tones often indicate better performance [34,35,36,37,38]. Redundancy in reports can dilute key information, leading to inefficient price discovery and higher processing costs for readers [39]. The use of boilerplate language undermines report credibility, emphasizing the need for specific, detailed information to maintain transparency [40]. Comprehensive disclosures reduce information asymmetry, enhancing market efficiency and investor confidence [41,42]. These findings underscore the importance of textual attributes in shaping market and stakeholder responses to corporate disclosures.
While existing studies predominantly focus on the textual attributes of financial documents, corporate sustainability reports, also labeled as corporate social responsibility (CSR) or ESG reports, have gradually become a focal point for textual analysis [43]. Sustainability reports, as an increasingly important medium for non-financial disclosure, are subject to limited regulatory oversight [44]. Sustainability information holds incremental value for investors, providing insights into future financial performance, cost of equity, labor productivity, and corporate misconduct propensity [27,28]. Studies have shown a positive association between CSR performance and the readability of CSR reports [29,30,45,46,47]. Additionally, tone analysis in CSR reports reveals that poor performers often exhibit a biased optimistic tone compared to better performers [48]. These findings emphasize the need for rigorous analysis of sustainability disclosures to ensure they accurately reflect corporate sustainability efforts.
Rating agencies like MSCI and Sustainalytics incorporate firms’ sustainability disclosures as a critical source of information in their assessment of ESG ratings [15]. These ratings serve as crucial compasses in the financial market for ESG investing [49,50,51]. Companies can potentially gain the most by publishing sustainability reports that offer detailed insights into their sustainability strategies and practices, thereby minimizing information asymmetry between the companies and market participants [33,34]. In financial reporting, existing studies have explored how the textual attributes of annual reports affect credit ratings [52]. For example, companies with less-readable 10-K filings tend to receive lower credit ratings [53]. However, to our knowledge, no studies have been conducted on the role of textual attributes of sustainability reports in the ESG rating market. We aim to fill this gap by examining how the textual attributes of sustainability disclosures impact ESG ratings.
Our study differs from the literature in three important aspects. First, to our knowledge, this is the first investigation of the relationship between textual attributes of sustainability disclosure and ESG ratings, while existing studies focus on the relationship between textual attributes and financial measures (e.g., earnings, stock returns). Second, this is the first study to analyze the textual attributes of an exhaustive sample of sustainability reports by Chinese public firms, as opposed to the predominantly English-language studies so far. Third, we explore whether international and domestic rating agencies process textual information differently.

2.2. Hypothesis

We conjecture that the textual attributes of sustainability reports can crucially impact how the ESG rating agencies evaluate corporate performance through three channels. First, from the lens of disclosure level, the clarity and complexity of sustainability reports can impact stakeholders’ understanding and assessment of a firm’s ESG performance. As noted in a prior study [54], the quality of sustainability reports can affect the ease with which stakeholders process and incorporate the disclosed information into their decision-making processes. When reports are less readable, market participants facing limited cognitive resources [55] may struggle to interpret the extensive qualitative data often included in sustainability disclosures. This can lead to increased time and effort spent on deciphering the information or seeking additional data from other sources, thus raising their information processing costs. Furthermore, sustainability disclosures are generally seen as a signal of a company’s stronger commitment to the ESG agenda and a better ability to integrate ESG factors into its overall corporate strategy [56]. Clear and comprehensive disclosures not only facilitate a more accurate evaluation of the firm’s ESG practices but also signal to stakeholders that the company is taking these issues seriously and is proactive in its strategic integration of ESG considerations [57]. Consequently, firms with thorough disclosures are likely to be viewed more favorably by rating agencies who appreciate the transparency and the apparent alignment of ESG practices with corporate strategy.
Second, from the lens of communication style, readability and tone reflect a corporation’s incentive to disclose information about its ESG performance. Sustainability reports often rely on qualitative, textual descriptions to convey information, particularly when discussing aspects that are difficult to quantify, such as employee well-being and environmental stewardship. The tone of sustainability disclosures can serve as a tool for managers to communicate nuanced and hard-to-quantify information. According to the truthful disclosure hypothesis, managers use both positive and negative language to reflect their genuine expectations about future performance, including ESG outcomes [20,38,39,58]. However, the opportunistic disclosure motive cannot be overlooked. Some studies indicate that managers might manipulate the tone of sustainability reports to present a more favorable image of the company’s performance, a practice often referred to as “greenwashing” [31,40,41,59]. For example, it is found that firms with poorer environmental records often employ more optimistic and less certain language in their sustainability reports to mislead stakeholders about their true performance [48]. In summary, the readability and tone of sustainability reports are crucial elements that reflect a corporation’s disclosure strategy. A positive and transparent tone generally signals a genuine commitment to ESG principles and a willingness to integrate these into the corporate strategy.
Third, from the lens of report specificity, boilerplate language and redundancy significantly impact how rating agencies interpret sustainability information. The interpretation largely relies on the methodology and ability of agencies. Local ESG rating agencies, leveraging their contextual knowledge and specific understanding of regional issues, can better assess ESG risks that are not captured by standardized global metrics. This advantage is particularly pronounced in evaluating social and governance issues, where local knowledge plays a crucial role [60,61]. In contrast, global rating agencies tend to use a uniform set of criteria, which may overlook nuanced local issues but ensure consistency across different regions. This standardized approach might not fully capture the complexities of local ESG risks, potentially leading to less accurate assessments in specific contexts [1]. Additionally, the predictive ability of local agencies regarding future ESG incidents, especially in social and governance areas, underscores their deeper insight into local conditions and regulatory environments.
In summary, we hypothesize that textual attributes of sustainability disclosures are associated with third-party ESG ratings but with different impacts on international and domestic rating agencies. The divergence between international and domestic rating agencies may stem from a domestic agency’s ability to navigate the local regulatory and cultural landscape. Hence, we present the following two hypotheses:
H1. There is a significant correlation between the textual attributes of a sustainability report and ESG ratings.
H2. The rating standards in text recognition of international agencies differ from those of domestic agencies.

3. Methodology

3.1. Sample and Data

Sample. We manually collected all sustainability reports (including both CSR and ESG reports) released by Chinese public companies between 2009 and 2021 from several channels: corporations’ official websites, the Shanghai Stock Exchange, the Shenzhen Stock Exchange, and Cninfo. Established in 1995, Cninfo is the earliest professional securities information website and the first comprehensive online platform to disclose announcements and market data for all public companies on the Shanghai and Shenzhen Stock Exchanges; the China Securities Regulatory Commission designates it as the official platform for disseminating information related to public companies in the country. Although the first sustainability report by a Chinese firm was published in 2004, ESG rating data before 2009 were significantly incomplete, and matching with reports would yield few data points. Therefore, this study focuses on the period from 2009 to 2021. The final sample comprises 10,021 reports.
The number of reports in the sample increases annually, rising gradually from 413 reports in 2009 to 1277 reports in 2021. Figure 1 illustrates the annual distribution of ESG report disclosures by industry, ranking the ten industries with the most disclosures from highest to lowest. The manufacturing industry ranks first with 5380 reports, representing 53.69% of the total. The financial industry ranks second with 744 reports. Moreover, although state-owned enterprises (SOEs) account for more than 55% of all reports, their share of annual disclosures declines over time: SOEs disclosed 67.55% of the reports in 2009, but only 45.34% in 2021.
Dependent Variables. The dependent variable in this study is the ESG rating scores from two international agencies, FTSE and MSCI, and one domestic agency, SNSI. All ratings were obtained from the WIND database, a premier database for Chinese public firms. Different agencies may update the firms’ ratings at different time points, and we need to match the ratings with sustainability reports for regression analysis. To capture the impact of text attributes on a firm’s ESG score, we used the ESG rating after the release of the sustainability reports for the t + 1 period as the dependent variable, with the rating for period t as a baseline.
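The matching step described above can be sketched as follows. This is a minimal illustration only: it assumes one rating per firm and calendar year, and it assumes that a report is dropped when either the period-t baseline or the period-(t + 1) rating is missing. The function name, dictionary layout, and toy numbers are hypothetical and are not taken from the study's data.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, int]  # (firm identifier, year)

# Hypothetical toy inputs; the values are illustrative, not from the paper.
ratings: Dict[Key, float] = {
    ("A", 2018): 40.0,
    ("A", 2019): 45.0,
    ("A", 2020): 50.0,
    ("B", 2019): 60.0,
}
report_attribute: Dict[Key, float] = {
    ("A", 2018): 9.1,
    ("A", 2019): 9.4,
    ("B", 2019): 8.7,
}


def build_lead_lag_rows(
    ratings: Dict[Key, float], attributes: Dict[Key, float]
) -> List[dict]:
    """Pair each report in year t with the rating in t + 1 and the baseline in t."""
    rows = []
    for (firm, year), attr in sorted(attributes.items()):
        baseline = ratings.get((firm, year))
        lead = ratings.get((firm, year + 1))
        if baseline is None or lead is None:
            continue  # assumption: keep only reports with both ratings
        rows.append(
            {
                "firm": firm,
                "year": year,
                "attribute": attr,
                "rating_t": baseline,
                "rating_t1": lead,
            }
        )
    return rows


rows = build_lead_lag_rows(ratings, report_attribute)
```

In this toy input, firm B's 2019 report has no 2020 rating and is therefore excluded, which mirrors how the matched samples can be smaller than the full set of reports.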
The FTSE ratings include an overall score for aggregate ESG performance and sub-ratings for performance in underlying pillars and themes. Firms were rated on a scale of 0–5, with 5 indicating the best performance. Matching the FTSE Russell ratings with the reports resulted in a sample of 1481 observations. MSCI assigns ESG scores annually from 0 to 10, based on the assessment of 35 key ESG issues, resulting in 1836 matched observations. The domestic SNSI rating uses a four-level indicator system with 3 Level-1 indicators, 16 Level-2 indicators, 44 Level-3 indicators, and nearly 80 Level-4 indicators. The SNSI ratings are given in letter format, from the highest “AAA” to the lowest “C”. We converted the letter ratings into numeric values, with “AAA” being 9 and “C” being 1, resulting in a final matched dataset of 9796 observations. To ensure comparability across these three rating systems, we converted all numeric values into a standardized 0–100 scale. This involves rescaling the FTSE ratings from 0 to 5, the MSCI ratings from 0 to 10, and the SNSI ratings from 1 to 9, such that the lowest possible score is 0 and the highest is 100.
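The rescaling described above amounts to a min-max transformation on each agency's nominal range. The sketch below is illustrative rather than the authors' code: the text fixes only the SNSI endpoints ("AAA" = 9, "C" = 1), so the seven intermediate letter grades and their order are an assumption, and all function and variable names are hypothetical.

```python
from typing import Union

# Assumption: a nine-grade ladder between the stated endpoints "C" (1) and "AAA" (9).
SNSI_GRADES = ["C", "CC", "CCC", "B", "BB", "BBB", "A", "AA", "AAA"]
SNSI_TO_NUMERIC = {grade: rank + 1 for rank, grade in enumerate(SNSI_GRADES)}

# Nominal (lowest, highest) score of each agency, as stated in the text.
SCALES = {"FTSE": (0.0, 5.0), "MSCI": (0.0, 10.0), "SNSI": (1.0, 9.0)}


def rescale_0_100(value: float, lo: float, hi: float) -> float:
    """Min-max rescaling so that lo maps to 0 and hi maps to 100."""
    if not lo <= value <= hi:
        raise ValueError(f"{value} is outside the range [{lo}, {hi}]")
    return 100.0 * (value - lo) / (hi - lo)


def standardize(agency: str, raw: Union[str, float]) -> float:
    """Convert a raw agency rating (letter or number) to the 0-100 scale."""
    if agency == "SNSI" and isinstance(raw, str):
        raw = SNSI_TO_NUMERIC[raw]
    lo, hi = SCALES[agency]
    return rescale_0_100(float(raw), lo, hi)
```

Under these assumptions, `standardize("FTSE", 2.5)` and `standardize("MSCI", 5)` both map to the midpoint of the common scale, and one SNSI grade step corresponds to 12.5 points.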
Independent Variables. The text attributes of sustainability reports issued by public companies in China are the independent variables for this study. We utilized six independent variables: LENGTH, READABILITY, TONE, BOILERPLATE, REDUNDANCY, and COMPLETENESS. Their definitions and measurement methods are summarized in Table 1 and Table 2. Li (2008) [4] proposed the number of words in a text as a measure of its LENGTH. READABILITY is the ease with which a reader can understand a text; note that a text with a high READABILITY value is more difficult to understand than one with a low value. TONE is evaluated by identifying positive and negative words, which are used to construct a sentiment lexicon for subsequent analysis. BOILERPLATE is the proportion of boilerplate words in corporate documents. REDUNDANCY is defined as the number of words in sentences that are repeated word-for-word in other parts of the text. COMPLETENESS is evaluated by examining the disclosure of information pertaining to the United Nations Sustainable Development Goals (SDGs).
Among the six textual attributes, LENGTH and COMPLETENESS reflect the depth and breadth of information provided. READABILITY and TONE influence how easily the information can be understood and perceived by stakeholders. BOILERPLATE and REDUNDANCY indicate the level of customization and specificity of the reports. Detailed measurement procedures are presented in Appendix A.
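To make four of these definitions concrete, the following sketch computes simplified versions of LENGTH, TONE, REDUNDANCY, and COMPLETENESS on a report that has already been split into sentences and tokens. It is not the procedure of Appendix A. Several choices are assumptions made only for illustration: tone is taken as (positive − negative) / (positive + negative), redundancy as the share of tokens that sit in sentences occurring more than once, and completeness as the count of SDGs with at least one keyword hit. The word lists and the SDG keyword dictionary are hypothetical, and Chinese word segmentation is assumed to have been done beforehand.

```python
from collections import Counter
from typing import Dict, List, Set

Sentence = List[str]  # one sentence as a list of tokens


def length(sentences: List[Sentence]) -> int:
    """Total number of tokens in the report."""
    return sum(len(s) for s in sentences)


def tone(sentences: List[Sentence], positive: Set[str], negative: Set[str]) -> float:
    """Net share of positive words among all sentiment words (assumed formula)."""
    pos = sum(tok in positive for s in sentences for tok in s)
    neg = sum(tok in negative for s in sentences for tok in s)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)


def redundancy(sentences: List[Sentence]) -> float:
    """Share of tokens located in sentences that appear verbatim more than once."""
    counts = Counter(tuple(s) for s in sentences)
    total = length(sentences)
    repeated = sum(len(s) for s in sentences if counts[tuple(s)] > 1)
    return 0.0 if total == 0 else repeated / total


def completeness(sentences: List[Sentence], sdg_keywords: Dict[int, Set[str]]) -> int:
    """Number of SDGs for which at least one keyword occurs in the report."""
    tokens = {tok for s in sentences for tok in s}
    return sum(1 for keywords in sdg_keywords.values() if tokens & keywords)


# Hypothetical toy report and dictionaries, in English for readability.
report = [
    ["we", "reduce", "emissions"],
    ["we", "support", "education"],
    ["we", "reduce", "emissions"],
    ["risk", "of", "pollution"],
]
positive_words = {"reduce", "support"}
negative_words = {"risk", "pollution"}
sdg_keywords = {4: {"education"}, 6: {"water"}, 13: {"emissions", "climate"}}
```

A ratio form of completeness follows by dividing the count by 17, the number of SDGs. READABILITY and BOILERPLATE are omitted because the text here does not give enough detail to sketch them without guessing.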
Control Variables. We include control variables based on established research [33,43]. In CSR contexts, larger companies face more public pressure for social responsibility, while those with higher litigation risk may improve CSR to avoid legal issues [62]. We therefore control for company size (SIZE, the natural logarithm of total assets) and litigation risk (LITG, a binary indicator of litigation penalties). We also control for the nature of the business (SOE) and its listing age (LISTAGE). Financially successful companies have more resources for CSR compliance [51,63], and competitive industries drive better CSR performance [51], so we control for profitability (ROA) and industry competition (HHI) using data from the National CSMAR Database. Global companies face higher CSR demands due to their international exposure (GLOBAL, a binary indicator of foreign income) [51]. Because firms have an incentive to improve stock liquidity for equity issuance and share sales, we measured liquidity (LIQUIDITY) as the reciprocal of an illiquidity indicator [64]. We accounted for growth opportunities (TOBINQ) and financial risk (LEV, the debt-to-assets ratio), since higher leverage typically demands greater disclosure [65]. CSR disclosure correlates with general financial transparency [51]. We also considered board characteristics: duality (DUAL, the same person serving as chairman and CEO), board size (BOARD, the number of board members), and the proportion of independent directors (INDEP). Finally, we controlled for the ESG rating score from the previous period to account for inertia in an agency's ratings of the same company, that is, its reliance on past ratings.

3.2. Model

In accordance with the model proposed in [61], we employed the fixed effect model to estimate the relationship between textual attributes and ESG ratings as follows:
$$\mathit{ESG}_{i,t+1} = \alpha + \beta_1 \mathit{TextAttribute}_{i,t} + \beta_2 \mathit{Control}_{i,t} + \gamma_t + \delta_i + \varepsilon_{i,t+1},$$
where $\mathit{ESG}_{i,t+1}$ is the ESG rating at time t + 1 for firm i. $\alpha$ is the constant term. $\mathit{TextAttribute}_{i,t}$ is the text attribute at time t for firm i. $\beta_1$ and $\beta_2$ are the regression coefficients corresponding to the text attribute and the vector of control variables, respectively. $\gamma_t$ represents the time-fixed effects, which control for common effects across all firms at specific time points. $\delta_i$ represents the industry-fixed effects, which control for unobserved individual effects specific to each industry. $\varepsilon_{i,t+1}$ is the error term, representing unexplained random disturbances. Robust standard errors clustered by company are employed to address issues related to heteroskedasticity and serial correlation, and industry and year-fixed effects are incorporated into all the regression models. To detect potential multicollinearity in the data, variance inflation factor (VIF) values are computed.
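For reference, the standard definition of the variance inflation factor is reproduced here. The symbol $R_j^2$ is introduced only for this illustration: it denotes the coefficient of determination from an auxiliary regression of the j-th regressor on all other regressors in the model.

$$\mathit{VIF}_j = \frac{1}{1 - R_j^2}$$

A regressor that is uncorrelated with the others has $R_j^2 = 0$ and hence $\mathit{VIF}_j = 1$; the value grows without bound as $R_j^2$ approaches 1. Values above roughly 10 are often read as a sign of problematic multicollinearity, although that cutoff is a convention and not a threshold stated in this study.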

4. Results

Table 3 presents summary statistics for the sample. The sustainability reports of public companies are generally difficult to read, with an average READABILITY of 16.175 and a standard deviation of 3.297. This score suggests that firms in the sample use complex language and long sentences, which could make it challenging for readers to interpret the reports easily. The average TONE is 0.729, reflecting a predominantly positive sentiment in the reports, although the tone varies across firms (SD = 0.075). The average REDUNDANCY level is 0.012, suggesting that firms tend to avoid repeating information excessively and maintain a concise narrative. However, the reports are poorly diversified, addressing an average of only 2.958 Sustainable Development Goals (SDGs) per report, less than a quarter of the 17 goals, which points to a limited breadth of sustainability topics addressed by firms.
Table 4 shows the correlation coefficients between the main variables. The correlations of LENGTH, READABILITY, and COMPLETENESS with ESG ratings (MSCI, FTSE, and SNSI) are all significantly positive, indicating that more detailed and comprehensive reports are associated with higher ESG ratings. This finding is consistent with the hypothesis that detailed disclosures positively influence how both international and domestic agencies perceive sustainability efforts. Interestingly, TONE shows a negative correlation with MSCI ratings, which suggests that a more optimistic tone might not always be perceived favorably by international rating agencies like MSCI. On the other hand, SNSI’s significant correlations with BOILERPLATE (negative) and REDUNDANCY (positive) highlight the differences in evaluation criteria between domestic and international agencies. These findings suggest that SNSI may appreciate a more detailed and exhaustive narrative, even if it includes redundant information, while international agencies might favor reports with more original content.
Figure 2 depicts the evolution of key textual attributes in sustainability reports from 2009 to 2021, with lines representing the first quartile, mean, median, and third quartile.
Over this period, the average report length has increased steadily, reflecting a growing trend toward more detailed disclosures. Interestingly, the READABILITY score (where higher values indicate text that is harder to read) peaked in 2018 before declining, indicating that as reports became more detailed up to that point, they also became harder to read, possibly due to the use of more technical or specialized language. The tone of the reports has remained predominantly positive throughout the period, with at least 75% of the reports exhibiting a positive sentiment. The boilerplate ratio shows fluctuations but has generally increased over time, which could suggest a growing reliance on standardized language in sustainability reports. Redundancy levels have remained consistently low, further indicating that reports tend to avoid unnecessary repetition. Finally, the completeness of the reports, as measured by the number of SDGs covered, has gradually increased, indicating more comprehensive sustainability disclosures over time.
Overall, the textual attributes of sustainability reports have changed significantly from 2009 to 2021. Reports have become longer and more comprehensive while remaining predominantly positive in tone. The readability score and boilerplate ratio have fluctuated, while redundancy remains low, indicating efficient information presentation.
Table 5 presents the main regression results, showing a significant positive correlation between report LENGTH and ESG ratings across all three agencies (MSCI, FTSE, and SNSI). This indicates that longer reports tend to receive higher ESG scores, likely because they provide more comprehensive and detailed information. Additionally, the COMPLETENESS of a report, measured by the number of SDGs covered, has a significant positive impact on ESG ratings from FTSE and SNSI, but not from MSCI. This finding suggests that FTSE and SNSI may place more weight on the breadth of sustainability disclosures than MSCI does. The results highlight the importance of detailed and comprehensive reporting.
Disclosure level: Length and Completeness. According to Table 5, there is a significant positive correlation between LENGTH and the ratings by all three agencies. For COMPLETENESS, which measures another dimension of disclosure level, our analysis reveals a significant positive correlation with the ratings by FTSE(t+1) and SNSI(t+1), but no such relationship with MSCI(t+1). Notably, the correlation between COMPLETENESS and the ratings by FTSE(t+1) and SNSI(t+1) is highly significant (reported p-value of 0.000, i.e., p < 0.001). This relationship holds for both the number of covered SDGs and the ratio of covered SDGs. The expected average change in SNSI(t+1) when the number of covered SDGs increases by one unit is 11.2295 (t-statistic: 8.2792).
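As a reading aid, the reported coefficient and t-statistic imply a standard error, since the t-statistic is the coefficient divided by its standard error. The short sketch below performs that arithmetic and also expresses the coefficient in SNSI grade steps. Two points are assumptions for illustration only: the interval uses a normal approximation with a critical value of 1.96, and the conversion to grade steps assumes the regression uses the 0–100 rescaled SNSI score, on which the 1–9 scale implies 12.5 points per step. The variable names are hypothetical.

```python
# Values reported in the text for COMPLETENESS on SNSI(t+1).
coef = 11.2295
t_stat = 8.2792

# t = coef / se, so the implied standard error is coef / t.
std_error = coef / t_stat

# Approximate 95% interval under a normal approximation (assumption).
z = 1.96
ci_low = coef - z * std_error
ci_high = coef + z * std_error

# Assumption: SNSI's 1-9 scale is mapped to 0-100, so one grade step is 12.5 points.
points_per_grade = 100.0 / (9 - 1)
grade_steps = coef / points_per_grade

print(f"implied SE: {std_error:.4f}")
print(f"approx. 95% interval: [{ci_low:.2f}, {ci_high:.2f}]")
print(f"coefficient in SNSI grade steps: {grade_steps:.2f}")
```

Under these assumptions the implied standard error is about 1.36, and the coefficient corresponds to roughly 0.9 of one SNSI grade step.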
The positive correlation between comprehensive sustainability reports and higher ESG ratings can be understood from several perspectives. First, detailed reports demonstrate a company’s commitment to transparency and ESG factors. Organizations providing extensive information on their sustainable practices tend to receive higher ESG ratings due to the perceived depth and quality of their initiatives. Second, comprehensive reports reflect the integration of ESG considerations into a company’s strategy and operations, signaling a strong commitment to sustainability and responsible business practices. Third, detailed reports cover a broad scope of ESG initiatives, enabling thorough evaluation by rating agencies and resulting in enhanced ratings. In summary, a sustainability report with a higher disclosure level, measured by higher levels of LENGTH and COMPLETENESS, signifies a deeper commitment to ESG issues, positively influencing ESG ratings.
Communication Style—Readability and Tone. According to Table 6, the coefficients of READABILITY on the ESG score are positive for all three ratings, which stands in stark contrast to the results reported by Nazari et al. (2017) [46]. Since a higher READABILITY value indicates lower readability, this reveals that sustainability reports with lower readability tend to have higher ESG ratings in the subsequent period. This can be attributed to two reasons. First, lower readability may result from the use of complex, professional language reflecting the depth of Chinese culture and thorough ESG consideration. Companies that create detailed, comprehensive reports show a strong commitment to sustainability, positively affecting their ESG ratings. Second, firms with higher ESG scores often use sophisticated language to enhance their sustainability image. Reports with lower readability typically contain intricate vocabulary, jargon, and detailed explanations, indicating a deeper understanding of ESG issues and potentially leading to higher scores.
Regarding the attribute of TONE, the three rating agencies exhibit completely different reactions. Our sample finds no significant relationship between TONE and MSCI ratings. However, both FTSE and SNSI ratings show significant relationships with TONE, but with opposite coefficients. Specifically, TONE is negatively correlated with FTSE ratings but positively correlated with SNSI ratings. This means that a more positive tone may lower FTSE’s future ESG ratings while increasing SNSI’s ratings. These results can be attributed to the different evaluation standards of the three agencies. MSCI focuses more on data-driven and quantitative analysis, resulting in no significant relationship with TONE. FTSE might prioritize risk signals conveyed in the tone, viewing positive tones as attempts to obscure potential issues, leading to a negative correlation. Conversely, SNSI is likely more sensitive to the commitment and confidence expressed in the tone, resulting in a positive correlation. Additionally, the evaluation of Chinese texts and companies introduces a cultural dimension. FTSE, as a foreign institution, may misinterpret positive tones in Chinese reports as overly optimistic or lacking in substance, hence giving lower scores. In contrast, SNSI, being a domestic agency, has a better understanding of the nuances of Chinese language and culture, allowing it to accurately interpret positive tones, thus awarding higher scores [61].
Specificity—Boilerplate and Redundancy. Our study includes two variables that measure the specificity or customization of text: BOILERPLATE and REDUNDANCY. These variables assess whether a particular text contains customized and specific information. According to Table 7, the analysis of these two variables reveals some interesting findings. First, for BOILERPLATE, we find a significant positive correlation only with FTSE ratings. This suggests that standardized writing templates or boilerplate language positively influence FTSE’s evaluation system, while it does not affect the ratings of other agencies.
Second, regarding REDUNDANCY, we discover a significant positive correlation only with the ratings of the domestic agency—SNSI, with no impact on the two international rating agencies. This discrepancy may be due to differences in how machines and humans read information. As Loughran and McDonald (2016) point out, using parsing programs to read text content has problems, such as the reliance on consistent text structure and markup language. The lack of structural anchors in documents, often correlated with firm size and time period, can lead to systematic mismeasurement rather than random noise [1]. This issue can result in the inaccurate extraction of valuable information from texts. Compared to international agencies, domestic agencies like SNSI are more likely to extract useful information from what software might consider redundant content, highlighting the advantage of local expertise [61].
Robustness Check. We undertake further robustness checks to explore potential bias in textual attribute measurement. In particular, we note that, unlike for English, there is no definitive approach for measuring the readability of Chinese text. We experiment with three alternative readability measures, as described in Equations (A2). All the results still hold. We also conduct a multicollinearity test and find that the variance inflation factors (VIFs) of the regressions are all less than 10, indicating that the regressions are not seriously affected by multicollinearity.
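To make the VIF check concrete, the following is a minimal standard-library sketch, not the authors' code. It relies on the identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix. The function names and the toy values are assumptions for illustration only.

```python
from statistics import mean


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long numeric columns."""
    k = len(columns)
    centered = [[x - mean(col) for x in col] for col in columns]
    norms = [sum(x * x for x in col) ** 0.5 for col in centered]
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(k)]
        for i in range(k)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (adequate for small matrices)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(columns):
    """VIF_j = j-th diagonal element of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Made-up values for three textual attributes across six reports.
length = [9.1, 9.4, 8.7, 10.2, 9.9, 8.9]
tone = [0.62, 0.58, 0.71, 0.66, 0.60, 0.69]
completeness = [8, 10, 6, 15, 13, 7]
print([round(v, 2) for v in vifs([length, tone, completeness])])
```

In practice a library routine such as `variance_inflation_factor` in `statsmodels.stats.outliers_influence` would normally be used; the hand-rolled version only shows what the statistic measures. A constant column has zero variance and would cause a division by zero here.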

5. Discussion

Main Findings. Our study reveals several key insights regarding the textual attributes of sustainability reports and their impact on ESG ratings. By examining over a decade of reports from Chinese public firms, we provide a historical perspective on how reporting practices have evolved in response to regulatory changes and stakeholder demands. We identify significant associations between the length and completeness of reports and ESG ratings across all three agencies (MSCI, FTSE, and SNSI). Specifically, longer and more comprehensive reports consistently receive higher ESG ratings, indicating a shared valuation standard among both international and domestic rating agencies for detailed and thorough disclosures.
These findings have important financial implications. Detailed and transparent sustainability reports can positively influence a company’s financial standing by enhancing investor confidence and reducing the perceived risk. This, in turn, can lead to lower capital costs and better stock performance. Conversely, reports with low readability or high redundancy can obscure important information, leading to increased uncertainty and potentially higher capital costs. This underscores the need for firms to prioritize high-quality disclosures to achieve better financial outcomes.
In contrast, we observe divergent reactions to other textual attributes such as tone, boilerplate language, and redundancy. For instance, FTSE ratings are positively influenced by the use of boilerplate language. This suggests that FTSE views standardized and consistent writing favorably, possibly interpreting it as a sign of professionalism and clarity. Conversely, SNSI ratings are positively influenced by redundancy. This indicates that the domestic agency may perceive repetitive information as a sign of thoroughness and reliability, reflecting a more meticulous approach to reporting.
These discrepancies in the impact of textual attributes can be attributed to the differences in evaluation criteria and cultural contexts between international and domestic agencies. International agencies like FTSE may prioritize clarity and consistency to facilitate comparability across global firms, while domestic agencies such as SNSI might value the detailed and repeated information that aligns with local expectations of comprehensive disclosure. This divergence highlights the importance for firms to understand and adapt to the specific preferences and standards of different rating agencies to effectively communicate their sustainability efforts and enhance their ESG ratings.
Additionally, our analysis of readability reveals an intriguing insight specific to Chinese texts. Lower readability is associated with higher ESG ratings. This result contrasts with studies focusing on English texts, where higher readability typically correlates with better ratings. The unique structure and complexity of the Chinese language, which often employs intricate and professional terminology, may convey a deeper commitment to ESG issues, thus positively impacting ratings.
Contributions. This research makes significant contributions to the field of corporate communication and ESG ratings. First, our study is groundbreaking in its comprehensive analysis of the evolution of textual attributes in sustainability reports across the entire Chinese market. By examining over a decade of reports, we provide a historical perspective that illuminates the dynamic interaction between corporate reporting practices and market expectations. This longitudinal approach offers deep insights into how Chinese firms have refined their communication strategies in response to evolving regulatory pressures and stakeholder demands.
Second, by focusing on the most widely recognized textual attributes and their relationship with ESG ratings, our research offers practical implications for both corporate disclosure strategies and policy development. The identified attributes—length, readability, tone, boilerplate language, redundancy, and completeness—serve as critical indicators of report quality. Our findings suggest that companies aiming to improve their ESG ratings should prioritize producing detailed, comprehensive, and well-structured reports. This insight is invaluable for firms seeking to enhance their sustainability communication and for policymakers aiming to establish effective reporting standards.
Third, our study highlights the divergent responses of international and domestic rating agencies to textual attributes, providing empirical evidence of the varying standards and evaluation criteria used by different agencies. This discovery underscores the complexity of the ESG rating landscape and the need for companies to tailor their reporting practices to meet diverse stakeholder expectations. Furthermore, our findings support Loughran and McDonald’s argument that machine-based textual analysis can lead to significant errors. The inconsistency in how different agencies interpret redundancy and boilerplate language, particularly between machine-based and human evaluations, highlights the limitations of relying solely on automated text analysis tools. This revelation calls for a more nuanced approach that combines machine efficiency with human judgment to accurately assess the quality of sustainability reports. Overall, our research not only advances the understanding of the role of textual attributes in sustainability reporting but also provides actionable guidance for improving corporate disclosure practices and developing more effective regulatory frameworks.
Limitations and Future Research Directions. While our study offers important insights, it also has limitations. First, the focus on Chinese firms means our findings may not be directly applicable to companies in other countries with different regulatory environments and cultural contexts. Firms operating in diverse regulatory systems, such as those in the European Union or North America, may face distinct pressures and expectations in their sustainability reporting. Future research could extend this study by analyzing firms from various regions to validate whether similar textual attributes yield comparable impacts on ESG ratings across different regulatory and cultural settings. Second, our reliance on computational linguistics to analyze textual attributes may not capture the full complexity and nuance of sustainability reports. Although automated text analysis provides efficiency, it may overlook important qualitative insights, such as narrative style, corporate strategy context, or the use of subtle language cues that can only be fully appreciated through manual analysis. Future research could integrate more sophisticated machine learning techniques, natural language processing advancements, and qualitative methods to enhance the depth of analysis and provide richer interpretations of sustainability disclosures. Third, the study’s cross-sectional design limits our ability to infer causality between textual attributes and ESG ratings. Longitudinal studies that track changes in both ESG ratings and corporate reporting practices over time could offer deeper insights into how shifts in textual attributes influence ESG performance. Additionally, examining the role of external factors, such as regulatory changes or investor demand, in shaping these attributes would be valuable for understanding their broader impact on corporate sustainability.

Implementable Policy Recommendations

The findings of this research offer several actionable recommendations for policymakers, regulators, and firms aiming to improve the quality of sustainability reporting and its impact on ESG ratings. First, regulatory bodies should consider establishing clear guidelines on the desired textual attributes in sustainability reports, such as completeness, clarity, and tone. Policymakers could mandate that companies provide a minimum level of detail, including the disclosure of key ESG metrics and the specific actions taken to address sustainability goals. Such guidelines would help harmonize reporting practices and ensure that firms provide comprehensive, transparent, and comparable disclosures. Second, given the divergent responses from international and domestic rating agencies to certain textual attributes (e.g., tone, boilerplate language), regulators could work toward promoting greater standardization in ESG evaluation criteria. This would reduce discrepancies in how agencies interpret corporate disclosures and make it easier for firms to align their reports with global standards, improving their overall ESG ratings. Third, while machine-based textual analysis is efficient, its limitations suggest the need for incorporating human judgment in assessing the quality of corporate reports. Policymakers and firms should consider hybrid evaluation models that combine the speed of automated tools with human expertise to ensure more accurate and comprehensive assessments of sustainability disclosures. Finally, governments and regulatory bodies could introduce incentives for firms that produce high-quality, detailed, and transparent sustainability reports. For example, companies that demonstrate a clear commitment to comprehensive reporting could receive favorable treatment in public procurement processes or have access to sustainable finance mechanisms.

6. Conclusions

Our study provides a comprehensive longitudinal analysis of the evolution of textual attributes in sustainability reports based on a full sample of data from Chinese firms, revealing significant historical trends in reporting practices. By examining various textual attributes and their association with ESG ratings, we underscore the critical impact these attributes have on ESG evaluations. The analysis clearly shows that the length and completeness of sustainability reports significantly influence ESG scores for both international and domestic agencies, reflecting a shared appreciation for detailed disclosures. However, the differing reactions to attributes such as tone, boilerplate language, and redundancy suggest variations in evaluation standards, methods, and value orientations between international and domestic rating agencies. These findings highlight the importance of firms tailoring their sustainability disclosures to align with diverse stakeholder expectations and evaluation criteria. This research enhances our understanding of the intricate relationship between corporate communication practices and ESG performance, offering valuable insights for improving sustainability reporting and achieving better ESG ratings.

Author Contributions

J.H.: conceptualization, writing—original draft preparation, supervision, writing—review and editing, funding acquisition. D.D.W.: writing—review and editing, supervision. Y.W.: software, formal analysis, writing—original draft preparation. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 72002140) and the Academic Innovation Team of Capital University of Economics and Business (Grant No. XSCXTD202404).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

  • Calculation of textual attributes
LENGTH. Li (2008) [4] suggested utilizing the number of words within a given text as a metric for measuring its LENGTH. This practice is widely adopted in the fields of linguistics and natural language processing. Given the linguistic characteristics of Chinese characters, we substitute characters for words. In the context of sustainability reporting, we propose to define a report’s LENGTH as follows:
$$\mathrm{LENGTH} = \ln(\text{the number of characters})$$
Assuming all else to be equal, longer documents are presumed to incur higher information-processing costs, making them appear more daunting and challenging to read. As the numerical value increases, so does the LENGTH of the text, exacerbating these barriers to comprehension.
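As a minimal sketch of the LENGTH measure, the function below takes the natural logarithm of a report's character count. Whether punctuation and whitespace are counted is not specified here, so counting every non-whitespace character is an assumption, and the function name is illustrative.

```python
import math


def report_length(text: str) -> float:
    """LENGTH = ln(number of characters), counting non-whitespace characters."""
    n_chars = sum(1 for ch in text if not ch.isspace())
    if n_chars == 0:
        raise ValueError("cannot take the logarithm of an empty report")
    return math.log(n_chars)


print(report_length("本公司持续推进可持续发展战略。"))
```

Because of the logarithm, doubling the number of characters raises LENGTH by ln 2 (about 0.693) regardless of how long the report already is.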
READABILITY. This paper employs a Chinese readability index similar to the Fog index for English as a measure of sustainability report readability. The readability index takes into account two factors: the number of characters per sentence (long sentence or short sentence) and the ratio of complex words (long words or long idioms). We have devised four computation equations for the READABILITY by taking into account the variations in sentence structures and lexical classifications. By doing so, we can more effectively assess the intricacy of written materials. Following the computation of the aforementioned four indices, it has come to our attention that the outcomes display negligible disparity.
Consequently, we adopt the first index as our criterion for assessing the readability of the sustainability reports. We first count the number of characters and the number of long sentences (sentences containing more than 15 characters). Next, we calculate the average number of characters per long sentence. Then, we calculate the proportion of long words (words with four or more characters) relative to the total number of characters, $N_{Characters}$. Finally, these values are combined to obtain READABILITY using the following formula:
$$\mathrm{READABILITY} = \left(\text{the number of characters per long sentence} + \frac{\text{long words}}{N_{Characters}}\right) \times 0.4$$
A lower READABILITY indicates easier readability, while a higher READABILITY suggests that the text is more difficult to read. By providing a quantifiable measure of complexity, it allows readers to better understand and engage with the text. It is a widely used formula that can be easily applied to assess the readability of any document.
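The READABILITY steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: sentences are split on a small set of end-of-sentence punctuation marks, the 0.4 factor is applied to the sum of both terms as in the Fog index, "long words" is taken as a count of words, and the caller supplies pre-segmented words (for Chinese text these could come from a segmenter such as `jieba.lcut`).

```python
import re

# Assumed sentence delimiters; the source does not list them.
_SENTENCE_END = re.compile(r"[。！？!?]+")


def readability(text, words, long_sentence_chars=15, long_word_chars=4):
    """Fog-style index: 0.4 * (avg chars per long sentence + long words / total chars)."""
    sentences = [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]
    n_chars = sum(len(s) for s in sentences)
    # A long sentence has MORE than `long_sentence_chars` characters.
    long_sentences = [s for s in sentences if len(s) > long_sentence_chars]
    if not long_sentences or n_chars == 0:
        return 0.0  # design choice for this sketch: no long sentences, no score
    avg_long = sum(len(s) for s in long_sentences) / len(long_sentences)
    # A long word has `long_word_chars` or more characters.
    n_long_words = sum(1 for w in words if len(w) >= long_word_chars)
    return 0.4 * (avg_long + n_long_words / n_chars)


sample = "本公司高度重视环境保护并持续完善公司治理结构。我们坚持绿色发展。"
sample_words = ["本公司", "高度重视", "环境保护", "并", "持续", "完善",
                "公司治理", "结构", "我们", "坚持", "绿色", "发展"]
print(readability(sample, sample_words))
```

With this form the sentence-length term dominates the index, because the long-word term is a ratio bounded well below one.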
TONE. We gauge the TONE of sustainability reports by identifying positive and negative words, which are used to construct a sentiment lexicon for analysis. We employ machine learning classifiers through the analysis feature of SnowNLP to determine the overall sentiment tendency of each report. SnowNLP is a natural language processing toolkit for Chinese text that offers a straightforward and efficient approach to sentiment analysis using a pre-defined sentiment lexicon [66]. The lexicon comprises sentiment words with their corresponding polarity values (i.e., positive or negative). To prepare for tone analysis, we gathered content pertaining to social responsibility and ESG and developed a customized lexicon. This paper outlines the calculation process for the TONE.
First, each sentence is segmented into individual words, after which the sentiment lexicon is used to match these words and compute the overall sentiment polarity score of the sentence. The Chinese word segmentation module “jieba” was employed to segment the entire text of the sustainable development reports. We compiled a list of words based on existing lexicons. Second, for each word, the corresponding sentiment value is looked up in the sentiment lexicon. If the word is found in the sentiment lexicon, its corresponding sentiment value is retrieved; otherwise, it is treated as a neutral word (with a sentiment value of 0). Third, the sentiment values of all words are accumulated to obtain the overall sentiment value of the text. Finally, based on the polarity (positive or negative) of the sentiment value, the sentiment orientation of the text is determined. A positive value indicates a positive sentiment, while a negative value indicates a negative sentiment. The sentiment score $S_t$ is given by the following formula:
$$S_t = \sum_{i=1}^{n} w_{d_i}\, p_{d_i}$$
Text score, on the other hand, is a metric commonly employed in sentiment analysis that measures the degree of positivity or negativity conveyed by a text. The text score ranges between −1 and 1, where a score closer to 1 reflects a more positive TONE, while a score closer to −1 suggests a more negative TONE.
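The lexicon lookup and accumulation steps can be illustrated with a short sketch. It is not the SnowNLP pipeline used in the study: the tiny lexicon is hypothetical, every word is given a weight of 1, and the normalization (dividing by the number of matched words so the score stays within -1 and 1) is an assumption, since the scaling is not spelled out here.

```python
def tone_score(words, lexicon):
    """Return (raw sum of polarities, score normalized to [-1, 1])."""
    # Words missing from the lexicon are neutral and contribute 0.
    raw = sum(lexicon.get(w, 0.0) for w in words)
    matched = sum(1 for w in words if w in lexicon)
    # Assumed normalization: average polarity over matched words.
    normalized = raw / matched if matched else 0.0
    return raw, normalized


# Hypothetical lexicon with polarities of +1 (positive) and -1 (negative).
toy_lexicon = {"提升": 1.0, "改善": 1.0, "违规": -1.0, "污染": -1.0}

# Pre-segmented words, e.g. the output of a Chinese word segmenter.
tokens = ["公司", "提升", "治理", "改善", "排放", "污染"]
raw, normalized = tone_score(tokens, toy_lexicon)
print(raw, normalized)
```

Keeping the raw sum alongside the normalized score makes it easy to see whether a strongly positive TONE comes from many sentiment words or from a few.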
BOILERPLATE. A search of the text in sustainable development reports was conducted using a trained list of boilerplate words to identify key template words. The resulting ratio of boilerplate words was subsequently calculated based on this analysis. The term “boilerplate words” refers to the number of words in a sentence that contain at least 75% of at least one four-character phrase shared by all companies within a given fiscal year.
To calculate the ratio of boilerplate words, the N-gram model used by Lang and Stice-Lawrence (2015) [67] can help. We initially count all four-character groups in each document within the sample to identify sample disclosures. We then sum these counts by country, considering only four-character groups that occur in at least 30% of documents in a country or at least five times per document on average. Furthermore, these groups are further analyzed in sample tests performed at the country level or for the entire sample.
Next, we mark sentences containing four-character groups that occur in at least 60% of the documents in the company’s home country. This is because these phrases are unusually common and potential templates. We exclude common and regulatory phrases, as well as phrases that are common in the documents. This is done to ensure that only relevant information is used for analysis.
Finally, we identify the sample by excluding sentences containing common four-character groups that appear in more than 80% of the sample documents in the country from which a document was extracted or in at least 75% of the documents in the entire corpus across all countries. The calculation formula for the ratio of boilerplate words is:
$$\mathrm{BOILERPLATE} = \frac{\text{boilerplate words}}{\text{total words}}$$
In conclusion, our methodology provides an effective way to measure the proportion of boilerplate words in corporate documents.
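The core of the N-gram procedure can be sketched as follows. This is a deliberately simplified illustration, not the full method: it flags sentences that contain a four-character group shared by at least a chosen share of documents and omits the exclusion of very common and regulatory phrases. The 60% default, the sentence delimiters, the use of characters as the counting unit, and all function names are assumptions.

```python
import re
from collections import Counter


def split_sentences(text):
    """Split on assumed end-of-sentence punctuation."""
    return [s.strip() for s in re.split(r"[。！？!?]+", text) if s.strip()]


def char_ngrams(sentence, n=4):
    """All distinct n-character groups in a sentence."""
    return {sentence[i:i + n] for i in range(len(sentence) - n + 1)}


def common_ngrams(documents, min_share=0.6, n=4):
    """n-character groups that occur in at least `min_share` of the documents."""
    doc_freq = Counter()
    for doc in documents:
        grams = set()
        for sentence in split_sentences(doc):
            grams |= char_ngrams(sentence, n)
        doc_freq.update(grams)  # each document counts a group at most once
    cutoff = min_share * len(documents)
    return {g for g, count in doc_freq.items() if count >= cutoff}


def boilerplate_ratio(doc, common, n=4):
    """Share of characters located in sentences that contain a common group."""
    sentences = split_sentences(doc)
    total = sum(len(s) for s in sentences)
    if total == 0:
        return 0.0
    flagged = sum(len(s) for s in sentences if char_ngrams(s, n) & common)
    return flagged / total


docs = ["abcdef。uvwxyz。", "abcdgh。klmnop。", "qrstuv。123456。"]
shared = common_ngrams(docs)
print(shared, [boilerplate_ratio(d, shared) for d in docs])
```

Document frequency is counted per document rather than per occurrence, so a phrase repeated many times within one report does not by itself become boilerplate.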
REDUNDANCY. Redundant words are defined as the number of words in a sentence that are repeated word-for-word in other parts of the text. In order to calculate the ratio of redundant words in a sustainable development report, an approach was used where each instance of verbatim repetition of a sentence was evaluated. Specifically, the number of words in the repeated sentence was multiplied by the number of times it was repeated and summed across all instances of verbatim repetition within the text. For example, if the sentence “The data in the chart is accurate” was repeated three times in the report, the word count of this sentence would be multiplied by three and added to the total for the analysis.
To perform this calculation, one possible method is to generate the corresponding string for each sentence in a document, sort these strings, and then traverse them to identify redundant words. The calculation formula for REDUNDANCY is:
$$\mathrm{REDUNDANCY} = \frac{\text{redundant words}}{\text{total words}}$$
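A minimal sketch of the REDUNDANCY ratio is given below. Instead of sorting sentence strings and traversing them, it counts identical sentences with a hash-based counter, which gives the same totals. Using characters as the counting unit (in line with the LENGTH measure) and the chosen sentence delimiters are assumptions.

```python
import re
from collections import Counter


def redundancy(text):
    """Share of text that sits in sentences repeated verbatim elsewhere."""
    sentences = [s.strip() for s in re.split(r"[。！？!?]+", text) if s.strip()]
    total = sum(len(s) for s in sentences)
    if total == 0:
        return 0.0
    counts = Counter(sentences)
    # Every occurrence of a repeated sentence counts: length x number of repetitions.
    redundant = sum(len(s) * c for s, c in counts.items() if c > 1)
    return redundant / total


report = "数据真实准确。公司治理持续完善。数据真实准确。数据真实准确。"
print(redundancy(report))
```

A sentence that appears three times contributes three times its length to the numerator, matching the worked example in the text.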
COMPLETENESS. In sustainability reporting, the completeness of text attributes is critical in ensuring that all the necessary information is captured without missing parts or empty values [68]. One approach to evaluating the completeness of sustainability reports is through examining the disclosure of information on the United Nations Sustainable Development Goals (SDGs). Two indicators—the number and coverage rate of SDGs mentioned in the text—can be used to assess the diversity of information included in the report.
To evaluate the completeness of information disclosure in sustainability reports, we examined the extent to which the SDGs were mentioned in the text. Specifically, we counted the number of SDGs mentioned and calculated their coverage rate. The number of SDGs in a sustainability report was counted using the keyword search method. The calculation formula for completeness is:
$$\mathrm{COMPLETENESS} = \text{the number of SDGs disclosed}$$
Overall, the completeness of text attributes is an essential component of sustainability reporting. By focusing on the SDGs, organizations can ensure that they are covering a diverse range of sustainability topics and disclosing all the necessary information to stakeholders. To improve completeness, organizations should strive to include all 17 SDGs and their corresponding targets in their sustainability reports.
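The keyword search behind COMPLETENESS can be sketched as below. The keyword dictionary is hypothetical and truncated to a few goals for brevity; a real list would cover all 17 SDGs with several terms each, and the actual keywords used in the study are not given here.

```python
TOTAL_SDGS = 17

# Hypothetical, truncated keyword list: SDG number -> search terms.
SDG_KEYWORDS = {
    1: ["无贫穷", "no poverty"],
    4: ["优质教育", "quality education"],
    5: ["性别平等", "gender equality"],
    13: ["气候行动", "climate action"],
}


def completeness(text, sdg_keywords=SDG_KEYWORDS):
    """Return (number of SDGs mentioned, coverage ratio out of 17)."""
    lowered = text.lower()
    covered = {
        goal
        for goal, keywords in sdg_keywords.items()
        if any(keyword.lower() in lowered for keyword in keywords)
    }
    return len(covered), len(covered) / TOTAL_SDGS


count, ratio = completeness("公司积极推进气候行动，并支持优质教育。")
print(count, ratio)
```

Each goal is counted once no matter how often its keywords appear, so the measure captures breadth of coverage rather than emphasis.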

References

  1. Loughran, T.; McDonald, B. Textual Analysis in Accounting and Finance: A Survey. J. Account. Res. 2016, 54, 1187–1230.
  2. Loughran, T.; McDonald, B. Textual Analysis in Finance. Annu. Rev. Financ. Econ. 2020, 12, 357–375.
  3. Dyer, T.; Lang, M.; Stice-Lawrence, L. The Evolution of 10-K Textual Disclosure: Evidence from Latent Dirichlet Allocation. J. Account. Econ. 2017, 64, 221–245.
  4. Li, F. Annual Report Readability, Current Earnings, and Earnings Persistence. J. Account. Econ. 2008, 45, 221–247.
  5. Patelli, L.; Pedrini, M. Is Tone at the Top Associated with Financial Reporting Aggressiveness? J. Bus. Ethics 2015, 126, 3–19.
  6. Ellerup Nielsen, A.; Thomsen, C. Reviewing Corporate Social Responsibility Communication: A Legitimacy Perspective. Corp. Commun. Int. J. 2018, 23, 492–511.
  7. Bae, J.; Yu Hung, C.; van Lent, L. Mobilizing Text as Data. Eur. Account. Rev. 2023, 32, 1085–1106.
  8. Soliman, M.; Ben-Amar, W. Corporate Social Responsibility Orientation and Textual Features of Financial Disclosures. Int. Rev. Financ. Anal. 2022, 84, 102400.
  9. Christensen, H.B.; Hail, L.; Leuz, C. Mandatory CSR and Sustainability Reporting: Economic Analysis and Literature Review. Rev. Account. Stud. 2021, 26, 1176–1248.
  10. Dinh, T.; Husmann, A.; Melloni, G. Corporate Sustainability Reporting in Europe: A Scoping Review. Account. Eur. 2023, 20, 1–29.
  11. Hahn, R.; Kühnen, M. Determinants of Sustainability Reporting: A Review of Results, Trends, Theory, and Opportunities in an Expanding Field of Research. J. Clean. Prod. 2013, 59, 5–21.
  12. European Commission Corporate Sustainability Reporting. Available online: https://finance.ec.europa.eu/capital-markets-union-and-financial-markets/company-reporting-and-auditing/company-reporting/corporate-sustainability-reporting_en (accessed on 10 October 2024).
  13. Bondar, M.; Steinmann, J.; Schena, K.; Kristen, D.; Bhaskar, S.; DP, C.; Patrick, S. How Can the Enterprise Earn Investor Trust Through Sustainability Disclosures? Available online: https://www.deloitte.com/global/en/issues/climate/earning-trust-with-investors-through-better-sustainability-data.html (accessed on 10 October 2024).
  14. Landrum, N.E.; Ohsowski, B. Identifying Worldviews on Corporate Sustainability: A Content Analysis of Corporate Sustainability Reports. Bus. Strateg. Environ. 2018, 27, 128–151.
  15. Tsang, A.; Wang, Y.; Xiang, Y.; Yu, L. The Rise of ESG Rating Agencies and Management of Corporate ESG Violations. J. Bank. Financ. 2024, 169, 107312.
  16. Dimson, E.; Marsh, P.; Staunton, M. Divergent ESG Ratings. J. Portf. Manag. 2020, 47, 75–87.
  17. Berg, F.; Kölbel, J.F.; Rigobon, R. Aggregate Confusion: The Divergence of ESG Ratings. Rev. Financ. 2022, 26, 1315–1344.
  18. Chatterji, A.K.; Durand, R.; Levine, D.I.; Touboul, S. Do Ratings of Firms Converge? Implications for Managers, Investors and Strategy Researchers. Strateg. Manag. J. 2016, 37, 1597–1614.
  19. Christensen, D.M.; Serafeim, G.; Sikochi, A. Why Is Corporate Virtue in the Eye of The Beholder? The Case of ESG Ratings. Account. Rev. 2022, 97, 147–175.
  20. Tanimura, J.K.; Okamoto, M.G. Reputational Penalties in Japan: Evidence from Corporate Scandals. Asian Econ. J. 2013, 27, 39–57.
  21. Weber, O.; Scholz, R.W.; Michalik, G. Incorporating Sustainability Criteria into Credit Risk Management. Bus. Strateg. Environ. 2010, 19, 39–50.
  22. Hillman, A.J.; Keim, G.D. Shareholder Value, Stakeholder Management, and Social Issues: What’s the Bottom Line? Strateg. Manag. J. 2001, 22, 125–139.
  23. Samet, M.; Jarboui, A. How Does Corporate Social Responsibility Contribute to Investment Efficiency? J. Multinatl. Financ. Manag. 2017, 40, 33–46.
  24. McWilliams, A.; Siegel, D. Corporate Social Responsibility: A Theory of the Firm Perspective. Acad. Manag. Rev. 2001, 26, 117–127.
  25. Rahman, D.; Kabir, M. Does Board Independence Influence Annual Report Readability? Eur. Account. Rev. 2023, 1–28.
  26. Miller, B.P. The Effects of Reporting Complexity on Small and Large Investor Trading. Account. Rev. 2010, 85, 2107–2143.
  27. Hwang, B.-H.; Kim, H.H. It Pays to Write Well. J. Financ. Econ. 2017, 124, 373–394.
  28. Gentzkow, M.; Kelly, B.; Taddy, M. Text as Data. J. Econ. Lit. 2019, 57, 535–574.
  29. Beretta, V.; Demartini, M.C.; Sotti, F. Board Composition and Textual Attributes of Non-Financial Disclosure in the Banking Sector: Evidence from the Italian Setting after Directive 2014/95/EU. J. Clean. Prod. 2023, 385, 135561.
  30. Bicudo de Castro, V.; Gul, F.A.; Muttakin, M.B.; Mihret, D.G. Optimistic Tone and Audit Fees: Some Australian Evidence. Int. J. Audit. 2019, 23, 352–364.
  31. Runesson, E.; Samani, N. Goodwill or “No-Will”: Hubris in the Tone at the Top. J. Contemp. Account. Econ. 2023, 19, 100331.
  32. Allee, K.D.; Do, C.; Sterin, M. Product Market Competition, Disclosure Framing, and Casting in Earnings Conference Calls. J. Account. Econ. 2021, 72, 101405.
  33. Cazier, R.A.; Pfeiffer, R.J. Why Are 10-K Filings So Long? Account. Horiz. 2016, 30, 1–21.
  34. Chen, Y. Predicting Financial Distress of Listed Companies Based on Information Disclosure Texts: A Study Using Chinese Annual Report Management Discussion and Analysis as Samples. Chin. J. Manag. Sci. 2019, 27, 23–34.
  35. Davis, A.K.; Ge, W.; Matsumoto, D.; Zhang, J.L. The Effect of Manager-Specific Optimism on the Tone of Earnings Conference Calls. Rev. Account. Stud. 2015, 20, 639–673.
  36. Jiang, F.; Lee, J.; Martin, X.; Zhou, G. Manager Sentiment and Stock Returns. J. Financ. Econ. 2019, 132, 126–149.
  37. Khan, W.; Ghazanfar, M.A.; Azam, M.A.; Karami, A.; Alyoubi, K.H.; Alfakeeh, A.S. Stock Market Prediction Using Machine Learning Classifiers and Social Media, News. J. Ambient Intell. Humaniz. Comput. 2022, 13, 3433–3456.
  38. Bolognesi, E.; Burchi, A. The Impact of the ESG Disclosure on Sell-Side Analysts’ Target Prices: The New Era Post Paris Agreements. Res. Int. Bus. Financ. 2023, 64, 101827.
  39. Cazier, R.A.; Pfeiffer, R.J. 10-K Disclosure Repetition and Managerial Reporting Incentives. J. Financ. Rep. 2017, 2, 107–131.
  40. He, L.; Gan, S.; Zhong, T. The Impact of Financial Redundancy on Corporate Social Responsibility Performance: Evidence from Chinese Listed Firms. Front. Psychol. 2022, 13, 882731.
  41. Botosan, C.A.; Plumlee, M.A. Assessing Alternative Proxies for the Expected Risk Premium. Account. Rev. 2005, 80, 21–53.
  42. Yan, H. Environmental Information Disclosure, Earnings Quality and the Readability and Emotional Tendencies of Management Discussion and Analysis. Financ. Res. Lett. 2024, 60, 104913.
  43. Herbohn, K.; Walker, J.; Loo, H.Y.M. Corporate Social Responsibility: The Link Between Sustainability Disclosure and Sustainability Performance. Abacus 2014, 50, 422–459.
  44. Fisch, J.E. Making Sustainability Disclosure Sustainable. Georgetown Law J. 2019, 107, 923–966.
  45. Orlitzky, M.; Schmidt, F.L.; Rynes, S.L. Corporate Social and Financial Performance: A Meta-Analysis. Organ. Stud. 2003, 24, 403–441.
  46. Nazari, J.A.; Hrazdil, K.; Mahmoudian, F. Assessing Social and Environmental Performance through Narrative Complexity in CSR Reports. J. Contemp. Account. Econ. 2017, 13, 166–178.
  47. Wang, Z.; Hsieh, T.-S.; Sarkis, J. CSR Performance and the Readability of CSR Reports: Too Good to Be True? Corp. Soc. Responsib. Environ. Manag. 2018, 25, 66–79.
  48. Cho, C.H.; Roberts, R.W.; Patten, D.M. The Language of US Corporate Environmental Disclosure. Account. Organ. Soc. 2010, 35, 431–443.
  49. Du, S.; Yu, K.; Bhattacharya, C.B.; Sen, S. The Business Case for Sustainability Reporting: Evidence from Stock Market Reactions. J. Public Policy Mark. 2017, 36, 313–330. [Google Scholar] [CrossRef]
  50. Dhaliwal, D.S.; Li, O.Z.; Tsang, A.; Yang, Y.G. Voluntary Nonfinancial Disclosure and the Cost of Equity Capital: The Initiation of Corporate Social Responsibility Reporting. Account. Rev. 2011, 86, 59–100. [Google Scholar] [CrossRef]
  51. Dhaliwal, D.S.; Radhakrishnan, S.; Tsang, A.; Yang, Y.G. Nonfinancial Disclosure and Analyst Forecast Accuracy: International Evidence on Corporate Social Responsibility Disclosure. Account. Rev. 2012, 87, 723–759. [Google Scholar] [CrossRef]
  52. Lebelle, M.; Lajili Jarjir, S.; Sassi, S. The Effect of Issuance Documentation Disclosure and Readability on Liquidity: Evidence from Green Bonds. Glob. Financ. J. 2022, 51, 100678. [Google Scholar] [CrossRef]
  53. Bonsall, S.B.; Miller, B.P. The Impact of Narrative Disclosure Readability on Bond Ratings and the Cost of Debt. Rev. Account. Stud. 2017, 22, 608–643. [Google Scholar] [CrossRef]
  54. Rennekamp, K. Processing Fluency and Investors’ Reactions to Disclosure Readability. J. Account. Res. 2012, 50, 1319–1354. [Google Scholar] [CrossRef]
  55. Hirshleifer, D.; Teoh, S.H. Limited Attention, Information Disclosure, and Financial Reporting. J. Account. Econ. 2003, 36, 337–386. [Google Scholar] [CrossRef]
  56. Tetlock, P.C.; Saar-tsechansky, M.; Macskassy, S. More Than Words: Quantifying Language to Measure Firms’ Fundamentals. J. Financ. 2008, 63, 1437–1467. [Google Scholar] [CrossRef]
  57. Davis, A.K.; Piger, J.M.; Sedor, L.M. Beyond the Numbers: Measuring the Information Content of Earnings Press Release Language*. Contemp. Account. Res. 2012, 29, 845–868. [Google Scholar] [CrossRef]
  58. Patten, D.M. Intra-Industry Environmental Disclosures in Response to the Alaskan Oil Spill: A Note on Legitimacy Theory. Account. Organ. Soc. 1992, 17, 471–475. [Google Scholar] [CrossRef]
  59. Mahoney, L.S.; Thorne, L.; Cecil, L.; LaGore, W. A Research Note on Standalone Corporate Social Responsibility Reports: Signaling or Greenwashing? Crit. Perspect. Account. 2013, 24, 350–359. [Google Scholar] [CrossRef]
  60. Chen, J.Z.; Li, Z.; Mao, T.; Yoon, A. Global vs. Local ESG Ratings: Evidence from China. SSRN Electron. J. 2022. [Google Scholar] [CrossRef]
  61. Servaes, H.; Tamayo, A. The Impact of Corporate Social Responsibility on Firm Value: The Role of Customer Awareness. Manag. Sci. 2013, 59, 1045–1061. [Google Scholar] [CrossRef]
  62. Skinner, D.J. Earnings Disclosures and Stockholder Lawsuits. J. Account. Econ. 1997, 23, 249–282. [Google Scholar] [CrossRef]
  63. Bose, S.; Ali, M.J.; Hossain, S.; Shamsuddin, A. Does CEO–Audit Committee/Board Interlocking Matter for Corporate Social Responsibility? J. Bus. Ethics 2022, 179, 819–847. [Google Scholar] [CrossRef]
  64. Amihud, Y.; Mendelson, H. Asset Pricing and the Bid-Ask Spread. J. Financ. Econ. 1986, 17, 223–249. [Google Scholar] [CrossRef]
  65. Leftwich, R.W.; Watts, R.L.; Zimmerman, J.L. Voluntary Corporate Disclosure: The Case of Interim Reporting. J. Account. Res. 1981, 19, 50. [Google Scholar] [CrossRef]
  66. Zhao, Y.Y.; Qin, B.; Liu, T. Sentiment Analysis. J. Softw. 2010, 21, 1834–1848. [Google Scholar] [CrossRef]
  67. Lang, M.; Stice-Lawrence, L. Textual Analysis and International Financial Reporting: Large Sample Evidence. J. Account. Econ. 2015, 60, 110–135. [Google Scholar] [CrossRef]
  68. Wang, G.P. Strengthening the Completeness of Information Disclosure of Listed Companies. China Secur. J. 2009, A09. [Google Scholar]
Figure 1. The distribution of Chinese public companies in the sample by industry, 2009–2021.
Figure 2. Evolution of textual attributes over 2009–2021.
Table 1. The definition of textual attributes.

| Textual Attributes | Definition |
|---|---|
| LENGTH | The natural logarithm of the number of characters in the report. |
| READABILITY | A combination of sentence length and the proportion of long words, similar to the Fog index for English. |
| TONE | The mood or emotional color of a text; it captures the attitude, emotion, intention, and expressive characteristics of a report. |
| BOILERPLATE | The percentage of standardized terms or phrases used across the entire text to convey specific types of information. |
| REDUNDANCY | The repeated information contained in a report. |
| COMPLETENESS | The extent to which the text contains all the necessary information, without missing parts or empty values. |
Table 2. The measures of textual attributes.

| Textual Attributes | Measures |
|---|---|
| LENGTH | $LENGTH = \log(\text{the number of characters})$ |
| READABILITY | $READABILITY = \left(\text{the number of characters per long sentence} + \dfrac{\text{long words}}{N\ \text{characters}}\right) \times 0.4$ |
| TONE | $S_t = \sum_{i=1}^{n} w_{d_i}\, p_{d_i}$ |
| BOILERPLATE | $BOILERPLATE\ ratio = \dfrac{\text{boilerplate words}}{\text{total words}}$ |
| REDUNDANCY | $REDUNDANCY = \dfrac{\text{redundant words}}{\text{total words}}$ |
| COMPLETENESS | $COMPLETENESS = \text{the number of SDGs disclosed in a report}$ |
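The measures in Table 2 can be illustrated with a minimal Python sketch. It covers five of the six attributes and rests on several assumptions that are not taken from the paper: the text is already segmented into tokens and sentences, `flagged` is a hypothetical boilerplate word list, a "redundant" word is one that sits in a sentence repeated verbatim, the tone lexicon maps each sentiment word to a hypothetical (weight, polarity) pair standing in for w and p, and SDG disclosure is detected by simple keyword matching.

```python
from __future__ import annotations

import math


def length(text: str) -> float:
    """LENGTH: natural log of the character count (text must be non-empty)."""
    return math.log(len(text))


def word_ratio(tokens: list[str], flagged: set[str]) -> float:
    """BOILERPLATE-style ratio: share of tokens found in `flagged`."""
    if not tokens:
        return 0.0
    return sum(tok in flagged for tok in tokens) / len(tokens)


def redundancy(sentences: list[list[str]]) -> float:
    """REDUNDANCY: words in verbatim-repeated sentences / total words.

    Counting only exact sentence repeats is an assumption of this sketch.
    """
    seen: set[tuple[str, ...]] = set()
    redundant = total = 0
    for sent in sentences:
        total += len(sent)
        key = tuple(sent)
        if key in seen:
            redundant += len(sent)  # every word of a repeated sentence counts
        else:
            seen.add(key)
    return redundant / total if total else 0.0


def tone(tokens: list[str], lexicon: dict[str, tuple[float, float]]) -> float:
    """TONE: sum of weight * polarity over sentiment words in the text."""
    return sum(lexicon[t][0] * lexicon[t][1] for t in tokens if t in lexicon)


def completeness(text: str, sdg_keywords: dict[int, list[str]]) -> int:
    """COMPLETENESS: number of SDGs with at least one keyword in the text."""
    return sum(any(kw in text for kw in kws) for kws in sdg_keywords.values())
```

For Chinese reports the tokens would come from a word segmenter, and the word lists would need to be built for the sustainability-report domain; neither step is shown here.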
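Table 3. Descriptive statistics.

| Variables | Obs. | Mean | SD | Min | Max | p1 | p99 | Skew. | Kurt. |
|---|---|---|---|---|---|---|---|---|---|
| MSCI | 1836 | 34.774 | 15.546 | 3.08 | 72.83 | 3.08 | 72.83 | 0.213 | 2.53 |
| FTSE | 1481 | 29.16 | 11.113 | 10 | 58 | 10 | 58 | 0.523 | 2.65 |
| SNSI | 9796 | 52.952 | 11.452 | 22.222 | 77.778 | 22.222 | 77.778 | −0.403 | 3.108 |
| READABILITY | 10,021 | 16.175 | 3.297 | 11.028 | 29.194 | 11.028 | 29.194 | 1.454 | 5.825 |
| TONE | 10,021 | 0.729 | 0.075 | 0.486 | 0.887 | 0.486 | 0.887 | −0.487 | 3.463 |
| LENGTH | 10,021 | 9.482 | 0.76 | 7.998 | 11.351 | 7.998 | 11.351 | 0.29 | 2.359 |
| BOILERPLATE | 10,021 | 0.159 | 0.085 | 0.027 | 0.486 | 0.027 | 0.486 | 1.26 | 5.166 |
| REDUNDANCY | 10,020 | 0.012 | 0.012 | 0 | 0.07 | 0 | 0.07 | 2.486 | 11.005 |
| COMPLETENESS | 10,021 | 0.174 | 0.087 | 0.059 | 0.529 | 0.059 | 0.529 | 1.166 | 5.266 |
| SIZE | 10,021 | 23.394 | 1.742 | 20.404 | 29.265 | 20.404 | 29.265 | 0.916 | 4.021 |
| LEV | 10,021 | 0.503 | 0.212 | 0.069 | 0.94 | 0.069 | 0.94 | −0.021 | 2.3 |
| ROA | 10,021 | 0.046 | 0.057 | −0.153 | 0.238 | −0.153 | 0.238 | 0.308 | 5.887 |
| BOARD | 10,021 | 2.203 | 0.226 | 1.609 | 2.833 | 1.609 | 2.833 | 0.169 | 3.748 |
| INDEP | 10,021 | 0.375 | 0.055 | 0.308 | 0.571 | 0.308 | 0.571 | 1.476 | 5.059 |
| DUDAL | 10,021 | 0.196 | 0.397 | 0 | 1 | 0 | 1 | 1.533 | 3.351 |
| TOBINQ | 10,021 | 1.83 | 1.238 | 0.82 | 8.108 | 0.82 | 8.108 | 2.771 | 12.105 |
| HHI | 10,021 | 0.091 | 0.098 | 0.011 | 0.584 | 0.011 | 0.584 | 3.018 | 13.676 |
| LIQUIDITY | 10,021 | 0.033 | 0.043 | 0.001 | 0.278 | 0.001 | 0.278 | 3.194 | 15.71 |
| LITG | 10,021 | 0.146 | 0.353 | 0 | 1 | 0 | 1 | 2.005 | 5.021 |
| GLOBAL | 10,021 | 0.038 | 0.118 | 0 | 0.693 | 0 | 0.693 | 3.87 | 18.226 |
| SOE | 10,021 | 0.556 | 0.497 | 0 | 1 | 0 | 1 | −0.226 | 1.051 |
| LISTAGE | 10,021 | 2.363 | 0.776 | 0 | 3.332 | 0 | 3.332 | −1.196 | 3.964 |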
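The Kurt. column in Table 3 appears to report raw rather than excess kurtosis (a normal variable would score 3). One way to check is with the 0/1 indicators: for any binary variable, raw kurtosis equals 1 plus squared skewness. The sketch below is illustrative only; it builds a synthetic 0/1 sample whose mean matches the SOE row (0.556), which is an assumption standing in for the actual data.

```python
def skew_and_kurtosis(xs):
    """Population skewness and raw (non-excess) kurtosis of a sample."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # variance
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Synthetic indicator with the same mean as SOE in Table 3 (assumed data).
soe_like = [1] * 556 + [0] * 444
skew, kurt = skew_and_kurtosis(soe_like)
print(round(skew, 3), round(kurt, 3))
```

The kurtosis of this synthetic sample rounds to 1.051, the value reported for SOE, and the skewness is close to the reported −0.226; the small gap is expected because the table's mean is itself rounded.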
Table 4. Matrix of correlation coefficients for the main variables.

| | MSCI | FTSE | SNSI | READABILITY | TONE | LENGTH | BOILERPLATE | REDUNDANCY | COMPLETENESS |
|---|---|---|---|---|---|---|---|---|---|
| MSCI | 1 | | | | | | | | |
| FTSE | 0.507 *** | 1 | | | | | | | |
| SNSI | 0.109 *** | 0.068 *** | 1 | | | | | | |
| READABILITY | 0.200 *** | 0.161 *** | 0.066 *** | 1 | | | | | |
| TONE | −0.099 *** | −0.022 | 0.089 *** | 0.110 *** | 1 | | | | |
| LENGTH | 0.517 *** | 0.461 *** | 0.187 *** | 0.411 *** | −0.016 | 1 | | | |
| BOILERPLATE | −0.037 | −0.028 | −0.041 *** | 0.179 *** | −0.078 *** | −0.030 *** | 1 | | |
| REDUNDANCY | 0.143 *** | 0.110 *** | 0.040 *** | −0.047 *** | 0.004 | 0.259 *** | −0.053 *** | 1 | |
| COMPLETENESS | 0.386 *** | 0.357 *** | 0.168 *** | 0.252 *** | −0.011 | 0.622 *** | −0.028 *** | 0.135 *** | 1 |

Note: * p < 0.1, ** p < 0.05, *** p < 0.01.
Table 5. Regression results of report disclosure level.

| | (1) MSCI(t+1) | (2) FTSE(t+1) | (3) SNSI(t+1) | (4) MSCI(t+1) | (5) FTSE(t+1) | (6) SNSI(t+1) |
|---|---|---|---|---|---|---|
| LENGTH | 1.0070 ** (2.9461) | 1.4385 *** (4.5676) | 1.5182 *** (8.9978) | | | |
| COMPLETENESS | | | | 2.2192 (1.0225) | 8.7332 *** (3.7397) | 11.2881 *** (8.3304) |
| MSCI(t) | 0.8245 *** (44.9122) | | | 0.8354 *** (47.2395) | | |
| FTSE(t) | | 0.8222 *** (29.1310) | | | 0.8312 *** (30.1200) | |
| SNSI(t) | | | 0.5179 *** (52.8760) | | | 0.5198 *** (52.9901) |
| INTERCEPT | −26.2510 *** (−3.4874) | −23.6615 *** (−3.7996) | 5.1250 (1.8355) | −22.0732 ** (−2.9151) | −13.0380 * (−2.2504) | 13.8183 *** (4.8765) |
| CONTROL | YES | YES | YES | YES | YES | YES |
| YEAR | YES | YES | YES | YES | YES | YES |
| INDUSTRY | YES | YES | YES | YES | YES | YES |
| NUMBER | 1349 | 957 | 8334 | 1349 | 957 | 8334 |
| Adj. R² | 0.8228 | 0.8184 | 0.4317 | 0.8217 | 0.8181 | 0.4306 |

Note: YEAR and INDUSTRY denote year and industry fixed effects. “YES” indicates that the variable is controlled. * p < 0.1, ** p < 0.05, *** p < 0.01.
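Read together, the column headers and rows of Tables 5 to 7 suggest a common specification: the next-year rating is regressed on one textual attribute, the current-year rating, a set of controls, and year and industry fixed effects. A sketch of that form follows; the symbols, and the assumption that the attribute and controls are measured in year t, are illustrative and not the authors' stated model.

$$
\text{Rating}_{i,t+1} = \beta_0 + \beta_1\,\text{Attribute}_{i,t} + \beta_2\,\text{Rating}_{i,t} + \gamma^{\top}\text{Controls}_{i,t} + \text{Year}_t + \text{Industry}_i + \varepsilon_{i,t+1}
$$

Here Rating stands for MSCI, FTSE, or SNSI, and Attribute for one of LENGTH, COMPLETENESS, READABILITY, TONE, BOILERPLATE, or REDUNDANCY. Under this reading, the coefficient on the lagged rating (roughly 0.82 to 0.86 for MSCI and FTSE, about 0.52 for SNSI) captures rating persistence, so β₁ reflects the association between the attribute and the part of next year's rating not explained by this year's.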
Table 6. Regression results of report communication style.

| | (1) MSCI(t+1) | (2) FTSE(t+1) | (3) SNSI(t+1) | (4) MSCI(t+1) | (5) FTSE(t+1) | (6) SNSI(t+1) |
|---|---|---|---|---|---|---|
| READABILITY | 0.1312 * (1.9882) | 0.0988 * (2.2135) | 0.1055 ** (3.0837) | | | |
| TONE | | | | −2.1507 (−0.8062) | −5.2704 * (−2.1625) | 4.9089 *** (3.4740) |
| MSCI(t) | 0.8352 *** (47.8215) | | | 0.8372 *** (47.8154) | | |
| FTSE(t) | | 0.8496 *** (33.0744) | | | 0.8532 *** (33.9422) | |
| SNSI(t) | | | 0.5275 *** (54.3936) | | | 0.5261 *** (54.0718) |
| INTERCEPT | −24.6719 *** (−3.3134) | −20.1288 *** (−3.3049) | 8.3913 ** (3.0205) | −21.6940 ** (−2.8497) | −14.8613 * (−2.4188) | 5.8855 * (2.0116) |
| CONTROL | YES | YES | YES | YES | YES | YES |
| YEAR | YES | YES | YES | YES | YES | YES |
| INDUSTRY | YES | YES | YES | YES | YES | YES |
| NUMBER | 1349 | 957 | 8334 | 1349 | 957 | 8334 |
| Adj. R² | 0.8221 | 0.8142 | 0.4265 | 0.8217 | 0.8181 | 0.4306 |

Note: YEAR and INDUSTRY denote year and industry fixed effects. “YES” indicates that the variable is controlled. * p < 0.1, ** p < 0.05, *** p < 0.01.
Table 7. Regression results of report specificity.

| | (1) MSCI(t+1) | (2) FTSE(t+1) | (3) SNSI(t+1) | (4) MSCI(t+1) | (5) FTSE(t+1) | (6) SNSI(t+1) |
|---|---|---|---|---|---|---|
| BOILERPLATE | −2.8189 (−1.1765) | 4.1325 * (2.0922) | −1.8776 (−1.4855) | | | |
| REDUNDANCY | | | | 3.0315 (0.1871) | 12.5737 (0.8419) | 27.9223 ** (3.2428) |
| MSCI(t) | 0.8382 *** (48.3312) | | | 0.8384 *** (48.2555) | | |
| FTSE(t) | | 0.8568 *** (34.3235) | | | 0.8567 *** (34.2080) | |
| SNSI(t) | | | 0.5277 *** (54.3941) | | | 0.5275 *** (54.2678) |
| INTERCEPT | −22.3310 ** (−2.9706) | −19.9159 ** (−3.2535) | 9.7917 *** (3.4888) | −23.0659 ** (−3.0761) | −18.4850 ** (−3.0702) | 9.5876 *** (3.4431) |
| CONTROL | YES | YES | YES | YES | YES | YES |
| YEAR | YES | YES | YES | YES | YES | YES |
| INDUSTRY | YES | YES | YES | YES | YES | YES |
| NUMBER | 1349 | 957 | 8334 | 1349 | 957 | 8334 |
| Adj. R² | 0.8217 | 0.8140 | 0.4259 | 0.8215 | 0.8133 | 0.4265 |

Note: YEAR and INDUSTRY denote year and industry fixed effects. “YES” indicates that the variable is controlled. * p < 0.1, ** p < 0.05, *** p < 0.01.
Huang, J.; Wang, D.D.; Wang, Y. Textual Attributes of Corporate Sustainability Reports and ESG Ratings. Sustainability 2024, 16, 9270. https://doi.org/10.3390/su16219270