*3.2. Data Quality Assessment Framework*

Researchers have defined a variety of frameworks and approaches for data quality assessment. For example, Madnick and Zhu [50], English [51], and Redman [52] explored strategies for increasing data quality; Batini et al. [53] provided a thorough comparative description of techniques for assessing and improving data quality; Gao et al. [54] proposed an attribute-fusing approach for improving the quality of uncertain data; and Madnick et al. [55] reviewed current practices and research in the field. In the research literature, descriptions of data quality range from simple lists of data quality dimensions to comprehensive frameworks (for example, [24,25,29,56]).

Hassenstein and Vanella [57] presented a data quality encyclopedia for the data life cycle, which describes the data quality dimensions, the data quality evaluation procedure, and the data quality context and practices in various fields. Gabr et al. [58] gave specific definitions for each traditional and big data quality dimension, together with its metrics and handling approaches. They examined the metrics and methodologies used to monitor and manage each dimension, as well as the most-used data quality dimensions for traditional and large data sets.

Svetlana [59] presented the findings of an expert survey on data quality concerns to demonstrate that it is not necessary to employ all of the numerous data quality dimensions proposed by researchers; instead, the essential data quality criteria may be combined for a particular application. The study equips data users and producers with the knowledge necessary to address application-specific data quality issues effectively. In addition to Svetlana's findings, Eliza et al. [60] provided a methodology that allows users to manage data quality and make decisions based on it. By considering the user's operational context to enhance a specific element of data quality, the methodology eliminates the requirement to fully integrate insufficient data.

The approaches found in the literature were summarised in order to review the well-known, established frameworks for assessing and improving data quality for different data types. Table 1 lists the fourteen data quality frameworks identified from the literature.


**Table 1.** Frameworks identified from the literature review.



According to the analysis of the frameworks listed in Table 1, the data quality dimensions considered by each framework vary considerably. Some dimensions are recognised by only one framework, whereas others appear frequently. For example, the HDQM and OODADQ frameworks considered only two dimensions for assessment, while the DQA and HIQM frameworks considered more than four. The dimensions also varied with the field and perspective of the application, such as the health care industry, information technology, and business management. Consider, for example, how the accuracy dimension is used in the HDQM and HIQM frameworks. The HDQM framework, applied in the IT industry to user interface development, defines accuracy as the proximity between a value "v" and another value "v′" of the domain D, where "v′" is regarded as the correct representation of the real-world phenomenon that "v" seeks to represent. The HIQM framework, in the business management sector, instead defines accuracy as the value difference between two databases containing the same value, taken as the correct representation of the real-world value. To understand which dimensions are most critical across the various fields, the frequency with which each data quality dimension is used was counted and is shown in Figure 1. Only dimensions used more than once are included in the figure. Figure 1 helps finalise, from the literature review perspective, the dimensions to be included in the semiotic data quality framework for assessing highway infrastructure data. The semiotic data quality framework is the most applicable of the 14 frameworks mentioned above for evaluating highway infrastructure data; the reason for this selection is explained in the semiotic framework section.
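The contrast between the two accuracy definitions discussed above can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the method of either framework: the function names `value_closeness` and `cross_database_agreement` are hypothetical, string similarity is only one possible stand-in for "proximity", and exact-match agreement is only one possible reading of the difference between two databases.

```python
from difflib import SequenceMatcher


def value_closeness(v: str, v_ref: str) -> float:
    """Closeness between a stored value v and a reference value v'.

    Assumption: string similarity (0.0 to 1.0) stands in for "proximity";
    the frameworks discussed do not prescribe a particular measure.
    """
    return SequenceMatcher(None, v, v_ref).ratio()


def cross_database_agreement(db_a: dict, db_b: dict) -> float:
    """Share of records held in both databases whose values match exactly.

    Assumption: each database is modelled as a mapping from record key to value.
    """
    shared = db_a.keys() & db_b.keys()
    if not shared:
        return 0.0  # nothing to compare
    matches = sum(db_a[key] == db_b[key] for key in shared)
    return matches / len(shared)
```

The first function compares a single value against what is taken to be its correct representation, while the second compares two data sources with each other, which is why the same dimension name can produce different measurements in different fields.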

**Figure 1.** Number of frameworks that used specific data quality dimensions.
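The tally behind a chart of this kind can be sketched in a few lines of Python. The mapping below is a hypothetical placeholder, not the content of Table 1, and the names `frameworks` and `dimension_frequency` are assumptions made for illustration. The sketch counts how many frameworks use each dimension and keeps only dimensions used more than once.

```python
from collections import Counter

# Hypothetical framework-to-dimension mapping (placeholder data, not Table 1).
frameworks = {
    "framework_a": ["accuracy", "completeness"],
    "framework_b": ["accuracy", "timeliness", "consistency"],
    "framework_c": ["completeness", "accuracy", "accessibility"],
    "framework_d": ["timeliness", "completeness"],
}


def dimension_frequency(frameworks: dict, min_count: int = 2) -> dict:
    """Count the frameworks using each dimension, keeping those used at least min_count times."""
    counts = Counter()
    for dimensions in frameworks.values():
        # A set ensures each dimension is counted once per framework.
        counts.update(set(dimensions))
    return {dim: n for dim, n in counts.most_common() if n >= min_count}


if __name__ == "__main__":
    for dim, n in dimension_frequency(frameworks).items():
        print(f"{dim}: {n}")
```

With the placeholder data, dimensions that appear in a single framework are dropped, mirroring the rule that only dimensions used more than once are shown.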
