5.2.4. Pragmatic Category

The pragmatic category focuses on how individuals use information. It concerns the relationship between data, information, and behaviour in a given context. To rank the dimensions within the pragmatic category, the percentage position of each rank was calculated using Equation (1), as shown in Table 7. The dimension value was given the highest priority, ahead of relevant and appropriateness. For highway stakeholders, value is crucial because it determines the extent to which the data can inform decision-making about highway infrastructure projects, budgeting, and maintenance. Although the dimension appropriateness is critical, it was ranked third in this context because it is a prerequisite for both relevance and value. As per the Garrett ranking technique, the dimensions were ranked as value, relevant, and appropriateness, respectively, as shown in Table 10.
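The percentage-position step can be illustrated with a minimal sketch. It assumes that Equation (1) is the standard Garrett percent-position formula, 100(R - 0.5)/N, where R is the rank given to a dimension and N is the number of dimensions ranked. The function names and the `score_for_rank` mapping are hypothetical: the conversion from percent position to Garrett score must come from Garrett's published table and is deliberately left to the caller.

```python
from typing import Dict, List, Mapping


def percent_position(rank: int, n_ranked: int) -> float:
    """Garrett percent position for a given rank among n_ranked items.

    Assumes the standard formula 100 * (R - 0.5) / N.
    """
    if not 1 <= rank <= n_ranked:
        raise ValueError("rank must lie between 1 and n_ranked")
    return 100.0 * (rank - 0.5) / n_ranked


def mean_scores(
    responses: List[Mapping[str, int]],
    score_for_rank: Mapping[int, float],
) -> Dict[str, float]:
    """Average the converted score of each dimension across respondents.

    `responses` holds one mapping per respondent (dimension -> rank given).
    `score_for_rank` maps each rank to its Garrett score; it is supplied by
    the caller (hypothetical input, looked up from Garrett's table).
    Assumes every respondent ranks every dimension.
    """
    totals: Dict[str, float] = {}
    for response in responses:
        for dimension, rank in response.items():
            totals[dimension] = totals.get(dimension, 0.0) + score_for_rank[rank]
    return {dim: total / len(responses) for dim, total in totals.items()}


def rank_dimensions(means: Mapping[str, float]) -> List[str]:
    """Order dimensions from highest to lowest mean score."""
    return sorted(means, key=lambda dim: means[dim], reverse=True)


# Percent positions for the three pragmatic dimensions (ranks 1 to 3).
positions = {rank: percent_position(rank, 3) for rank in (1, 2, 3)}
```

For three dimensions, the percent positions of ranks 1, 2, and 3 are approximately 16.67, 50.00, and 83.33. The dimension with the highest mean converted score is then placed first.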

**Table 10.** Ranking of dimensions within the pragmatic category of semiotic framework.


The dimensions were ranked to understand the decision-makers' data quality requirements at the individual decision-making levels [30]. As the level of decision-making in the organisation changes, the priority of data quality also changes. At the strategic level, decision-makers focus on policymaking, which could be implemented throughout the organisation. Hence, the data quality requirements at the strategic level differ from those at the network and project levels. It is important to note that this study utilised semiotic-based quality dimensions to assess data quality at different decision-making levels from the data users' perspective. This proactive assessment of the highway management decision-making hierarchy allows data collectors to determine the data quality requirements of highway infrastructure managers and potential decision-makers in a more integrated manner. It also allows highway agencies' data management teams to identify the causes behind minimal data usage and thereby improve the quality of the information generated and the decisions it supports.

#### **6. Conclusions**

This research was conducted in a multidisciplinary framework that included three primary fields: data quality, big data, and highway infrastructure project data. Even though data quality has been a well-studied topic for the past two decades, precise terminology for data quality aspects is still lacking. Digitalisation and data management in construction, particularly highway infrastructure, is a developing topic in India, with scant prior research focusing on data quality. Using data quality dimensions as part of data governance projects is crucial, as it helps data users and stakeholders derive the greatest benefit from data usage. The research discussed in this paper investigates a framework for identifying which data quality dimensions matter most within the context of highway infrastructure projects in the construction sector. The semiotic framework was adopted, following a literature review of various data quality frameworks, to establish data quality dimensions for highway infrastructure data. The systematic literature review, semiotic framework, and Garrett ranking were chosen as research methods because of the relative novelty of data quality research on large volumes of highway infrastructure data, as well as the impracticality of implementing other research methods due to geographical, legal, ethical, and organisational constraints.

Accuracy, accessibility, and consistency are well-discussed data quality dimensions that are supported by the results. Based on this research, the data quality dimensions of completeness and timeliness were added to these three to produce a list of the five most appropriate data quality dimensions for highway infrastructure data in the construction industry. Considering the results of the semiotic framework of the hierarchical data quality dimensions for the overall highway project data, the contextual category of data quality dimensions was considered the most crucial for evaluating data quality. This is explained by the breadth of the three domains involved (i.e., data quality, big data, and highway infrastructure data), in which thousands of unique data applications can draw on the highway infrastructure database. Thus, the probability that different applications select different data quality dimensions increases.

The current research study provides a ranking of the most critical data quality dimensions in the specific context of highway infrastructure projects, as shown in Table 3. This is one of the first studies within this field to use the semiotic framework to achieve this. The study also considered the level of importance at each decision-making level of the hierarchy, as shown in Table 4. Given the highly contextual nature of data quality, different contexts would be expected to produce a different list of the most critical data quality dimensions. Thus, the study also ranked the dimensions within the semiotic framework categories using the Garrett ranking technique to understand the priorities of the stakeholders.

One of the most significant limitations of this study is the comparatively small body of literature and, more significantly, of publications written from the perspective of highway infrastructure data. Additional research methods are planned that could be applied to the same corpus of literature, with the primary objective of reducing the author bias introduced when evaluating the significance of the other data quality frameworks.

This study serves as a foundation for further research by the authors in highway infrastructure to assess overall data usage in terms of data quality, using data quality dimensions as features for assessing current data quality satisfaction at each decision-making level from the data users' perspective. Agencies and data management teams need to assess the root cause of minimal data usage in order to improve the quality of the information generated and the decisions it supports. They also need to show the interdependency of various decisions in the final output of a project and address the requirements of potential data users. In ongoing research, the semiotic framework provides a theoretical foundation for developing an instrument, i.e., data quality dimensions, to assess the subjective quality of highway project data. The development of quantitative indices for each data quality dimension would eventually help to develop the decision-making competency of decision-makers. This would help the organisation execute projects effectively, without delays, and avoid losses due to wrong decisions. By using data quality dimensions as features for machine learning algorithms, further work will distinguish quality data from non-quality data in very large streams of highway datasets. Finally, the ten main data quality dimensions identified serve as a foundation for determining which machine learning algorithms might identify data usage more effectively. Following this, a computationally efficient method for optimum data usage will be designed.
