Data Quality: A Negotiator between Paper-Based and Digital Records in Pakistan’s TB Control Program

Ali, Syed Mustafa; Naureen, Farah; Noor, Arif; Kamel Boulos, Maged N.; Aamir, Javariya; Ishaq, Muhammad; Anjum, Naveed; Ainsworth, John; Rashid, Aamna; Majidulla, Arman; Fatima, Irum

doi:10.3390/data3030027

by Syed Mustafa Ali ¹,*, Farah Naureen ¹, Arif Noor ¹, Maged N. Kamel Boulos ², Javariya Aamir ¹, Muhammad Ishaq ¹, Naveed Anjum ¹, John Ainsworth ³, Aamna Rashid ¹, Arman Majidulla ⁴ and Irum Fatima ¹

¹ Mercy Corps, Islamabad 45550, Pakistan
² Moray College, University of the Highlands and Islands, Elgin IV30 1JJ, Scotland, UK
³ Division of Informatics, Imaging & Data Sciences, University of Manchester, Manchester M13 9PT, UK
⁴ Interactive Research and Development, Karachi 75300, Pakistan
* Author to whom correspondence should be addressed.

Data 2018, 3(3), 27; https://doi.org/10.3390/data3030027

Submission received: 23 May 2018 / Revised: 16 July 2018 / Accepted: 18 July 2018 / Published: 19 July 2018

(This article belongs to the Special Issue Data Management Strategy, Policy and Standard)

Abstract

Background: The cornerstone of the public health function is to identify healthcare needs, to influence policy development, and to inform change in practice. Current data management practices with paper-based recording systems are prone to data quality defects. Increasingly, healthcare organizations are using technology for the efficient management of data. The aim of this study was to compare the data quality of digital records with the quality of the corresponding paper-based records using a data quality assessment framework. Methodology: We conducted a desk review of paper-based and digital records over the study duration from April 2016 to July 2016 at six enrolled tuberculosis (TB) clinics. We input all data fields of the patient treatment (TB01) card into a spreadsheet-based template to undertake a field-to-field comparison of the shared fields between TB01 and digital data. Findings: A total of 117 TB01 cards were prepared at six enrolled sites, but only about 50% of the records (n = 59; 59 out of 117 TB01 cards) were digitized. There were 1239 comparable data fields, out of which 65% (n = 803) were correctly matched between paper-based and digital records. However, 35% of the data fields (n = 436) had anomalies, either in paper-based records or in digital records. The calculated number of data quality issues per digital patient record was 1.9, whereas it was 2.1 issues per record for paper-based records. Based on the analysis of valid data quality issues, it was found that there were more data quality issues in paper-based records (n = 123) than in digital records (n = 110). Conclusion: There were fewer data quality issues in digital records as compared with the corresponding paper-based records of tuberculosis patients. Greater use of mobile data capture and continued data quality assessment can deliver more meaningful information for decision making.

Keywords: mHealth; mobile data collection; data quality; data quality assessment framework; Tuberculosis Control; developing countries

1. Introduction

With an increased adoption of performance indicators for monitoring the healthcare delivery systems, the need for high-quality data generation has also increased [1]. Health information management systems are intended to provide the right information to their users through feedback and data sharing, and are designed for facilitating data-driven decisions, policy making, and health planning [2].

Improving the quality of healthcare data is beneficial in many ways, such as in making informed decisions about service delivery, ensuring patient safety, conducting research, informing patients regarding their illness and care, and measuring effectiveness of the clinical pathways. Sharing data within and across departments or organizations can provide much needed evidence about healthcare community needs [3], offering a reliable summary of the true health status of patients and the community, and guiding policy makers in making healthcare system adjustments as necessary [4]. Similarly, the cornerstone of the public health function is to identify healthcare needs, to influence policy development, and to ensure that healthcare services are equitably provided [5].

Organizations rely heavily on various data resources for the effective and efficient management of their operational processes. However, the volume and complexity of some data resources can make them susceptible to defects that can reduce data quality [6] and result in higher operational costs [7]. Data quality (DQ) management aims at objectively measuring quality, with particular emphasis on various data quality aspects [8,9]; therefore, many DQ management approaches exist that utilize different perspectives and have been adopted by organizations [10].

In medical and public health communities, documentation is a critical aspect of DQ and quality of care. Complete documentation records the history of the clinical pathway and its outcomes or effectiveness in providing decision support to healthcare providers. Documentation is commonly maintained in paper-based format in low resource settings [11,12]. Previous research has shown that paper-based information systems tend to produce low-quality data and result in limited or less than optimal data use [13]. The quality of care and quality improvement planning are adversely affected in the case of paper-based information systems; for example, illegibility, incompleteness, and poor organization of records are problems often plaguing the paper format [14].

On the other hand, the benefits of maintaining digital records in healthcare, such as rapid data sharing, reduced paperwork, lower incidence of medical errors, and cost savings, have been commonly discussed in the literature [15,16,17]. Furthermore, with proper digital data security and handling provisions implemented, the degree of patient data confidentiality and privacy protections obtained with digital records can exceed that afforded by any paper-based system [18].

Many organizations have started using technology for efficient data management because of the huge quantities of data that are involved in their operational processes [9]. Among these technologies, mobile health (mHealth) technologies have gained particular attention for digital data capturing in the public health domain [19]. However, in the absence of an adequate DQ improvement strategy, it becomes challenging to translate data into meaningful information and later into programmatic and strategic decisions [9]. Moreover, a data quality assessment framework (DQAF) is a vital constituent of an effective DQ improvement strategy [20,21].

Despite efforts to improve the data quality of paper-based records, the overall data quality remains low, especially in developing countries. In most developing countries, data quality defects arise from the information system’s inability to detect and prevent errors. In addition, these countries do not routinely adopt context-specific data quality measurement. Because Mercy Corps Pakistan is digitizing data collection and setting up a computerized management information system for its Tuberculosis Control Program, the objective of this study is to compare data quality in digital records and their corresponding paper-based records.

2. Methods

2.1. Sample Description

Supported by the Global Fund, Mercy Corps Pakistan undertook an mHealth initiative in the public–private mix (PPM) model of the TB control program in Pakistan. In the PPM model, all registered care providers provide free treatment and diagnostic services for TB patients. The sample for this study included six clinics (that is, six healthcare providers), which qualified for inclusion solely because paper-based and digital recording systems were managed simultaneously at each of them. The initiative was confined to the intervention districts of Narowal and Chiniot, with three clinics in each district.

Mobile Application for Physician–Patient–Lab Efficiency (MAPPLE) was developed using the CommCare platform (https://www.commcarehq.org/), an open-source platform that works well with Java-enabled phones. CommCare is an extension of the JavaROSA codebase (code.javarosa.org), which supports a range of mobile data collection applications in low-income countries. MAPPLE is an mHealth application loaded with TB-related forms that allows users to enter data in the application and share them with a remote cloud server (Figure 1).

University graduates from the United States supported Mercy Corps in developing MAPPLE for the TB Control Program in Pakistan. The application design and development phase could not follow a participatory approach because prospective application users and developers were not co-located. However, MAPPLE was tested and re-designed (based on feedback) before its actual use.

Before the enrollment of healthcare providers, it was agreed that completing both paper and digital records would be their responsibility during the pilot phase. At each clinic, this responsibility was given to paramedic staff, and no incentive was offered to the application user. Each paramedic staff member was given a smartphone with MAPPLE deployed on it in March 2016.

2.2. Data Collection

Paper-based patient treatment cards (TB01 cards) prepared during the four-month study period (April 2016–July 2016) were requested from the six enrolled clinics of the Narowal and Chiniot districts. These enrolled clinics are operated by private, primary healthcare providers, where a single clinician conducts the clinical assessment, while support staff manage the medical stock inventory and patient recording registers. Generally, these healthcare providers are not regulated by health authorities, and clinical documentation is not mandatory.

During the study period, support staff collected both digital and handwritten data. The copies of TB01 cards were compared with the corresponding digital records retrieved from the server. The TB01 card contains data fields that represent the patient’s profile, and clinical and diagnostic details. The TB01 card captures a multi-visit report of a patient’s treatment extending over a period of either six or eight months, depending upon the category of TB patient (CAT I or CAT II). Data fields representing each data category are summarized in Table 1.

2.3. Data Quality Assessment Method

Prior to analysis, an approach for logical and comprehensive review was developed, and a desk review of the collected paper-based and corresponding digital records of the same service delivery points was conducted. All data fields of the TB01 card were input into a spreadsheet-based template to undertake field-to-field comparisons of the shared data fields between the TB01 card (paper-based data) and MAPPLE (digital) data (Table 2). Upon completion of the review, non-matching data fields were sorted into classifiable and non-classifiable issues. Classifiable issues were categorized according to the context-specific data quality dimensions, namely completeness, accuracy, consistency, understandability, and timeliness, the details of which are reported elsewhere [22]. Non-classifiable issues were those differences for which correctness or completeness could not be determined without contacting the patient. For example, a difference in reported age between the two formats (digital and paper-based) can only be resolved if the patient is contacted for this purpose.
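The field-to-field comparison described above can be illustrated with a minimal Python sketch. It is not the spreadsheet template used in the study: the field names, the `normalize` rule, and the labels are hypothetical, and treating two empty values as a match is an assumption. Conflicting non-empty values are only flagged here, since the study resolved them by manual review (as classifiable issues of some dimension, or as non-classifiable issues).

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def normalize(value: Optional[str]) -> str:
    """Collapse whitespace and case so trivial formatting differences do not count."""
    if value is None:
        return ""
    return " ".join(str(value).split()).lower()


def compare_records(paper: Record, digital: Record, shared_fields: List[str]) -> Dict[str, str]:
    """Label each shared field as a match or as one kind of mismatch."""
    labels = {}
    for field in shared_fields:
        p = normalize(paper.get(field))
        d = normalize(digital.get(field))
        if p == d:
            labels[field] = "match"  # assumption: two empty values also count as a match
        elif not p:
            labels[field] = "paper-based completeness issue"
        elif not d:
            labels[field] = "digital completeness issue"
        else:
            labels[field] = "conflicting values: manual review"
    return labels


# Hypothetical pair of records for one patient (field names are illustrative).
paper_card = {"patient_name": "Patient  A", "age": "34", "address": None}
digital_record = {"patient_name": "patient a", "age": "42", "address": "Street 1"}
print(compare_records(paper_card, digital_record, ["patient_name", "age", "address"]))
```

In this sketch, the age conflict (34 versus 42) is left for manual review, which mirrors the non-classifiable case described in the text.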

2.4. Data Analysis

The operational definitions of the identified data quality dimensions were applied to the data variances for classification purposes. Each non-matching field between paper-based and digital records was regarded as a data quality issue. An issue that could be attributed to either the paper-based record or the digital record was called a classifiable issue. Some issues arose from application design modifications; because they reflected technology shortcomings or an application workflow that was not aligned with the clinical workflow, they were excluded from the main dataset. The data quality issues in both paper-based and digital records were recorded against each data quality dimension in an Excel sheet. In addition to basic descriptive statistical analyses, a test of proportion was conducted to test the significance of the results.
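The text does not state which test of proportion was applied. As one common possibility, a pooled two-proportion z-test can be sketched in Python as follows; the function name and the example counts are hypothetical and are not taken from the study data.

```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("pooled proportion is 0 or 1; the test is undefined")
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative counts only: 60 of 100 fields with issues versus 40 of 100.
z, p = two_proportion_z_test(60, 100, 40, 100)
print(round(z, 3), round(p, 4))
```

The normal approximation behind this test is only reasonable when the counts in each group are not very small, so a monthly breakdown with few records might call for an exact test instead.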

2.5. Ethical Considerations

In Pakistan, ethical approval is only required for experimental research involving humans and this study is exempt as it does not qualify as experimental research. However, the study followed all of Mercy Corps’ established confidentiality guidelines (https://www.mercycorps.org/research-resources) and was carefully checked by the Monitoring, Evaluation, and Learning Unit and Data Controller of Mercy Corps Pakistan.

2.6. Study Findings

2.6.1. Comparison of the Paper-Based and Digital Records

During the study period, April 2016–July 2016, a total of 117 TB01 cards were prepared at six enrolled sites, including 68 TB01 cards from three sites in Chiniot district and 49 TB01 cards from three sites in Narowal district. Only about 50% of records (n = 59; 59 out of 117 TB01 cards) were digitized by paramedics and sent to the server, which indicates rather low use of the mHealth application (MAPPLE) for data collection. The TB01 card and MAPPLE had 21 data fields in common, giving a total of 1239 (59 × 21) comparable data fields available for analysis (Figure 2).

Out of the 1239 data fields, 65% (n = 803) were found to be correctly matched across paper-based and digital records. However, 35% of data fields (n = 436) had anomalies either in paper-based records or in digital records. Among the data anomalies, 67% were classifiable (292 out of 436) and 33% were non-classifiable issues (144 out of 436). Non-classifiable issues were the differences in data fields that could not be clearly attributed as an issue either in the paper-based record or in the corresponding digital record. Discrepancies in comparable data fields, such as different paper versus digital values for patient’s contact number, national identification number, age, weight, and lab serial number, could not be settled without feedback from the provider or patient (which was not possible in this study, as researchers had no access to the patients in question). These mismatches were therefore categorized as non-classifiable issues. For example, if a patient’s age in the paper-based record is 34 and the age of the same patient in the digital record is 42, then this difference was categorized as a non-classifiable issue.
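The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names and the `pct` helper are illustrative.

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


cards_prepared = 117
cards_digitized = 59
shared_fields = 21

comparable_fields = cards_digitized * shared_fields  # 59 x 21 = 1239
matched = 803
anomalies = comparable_fields - matched              # 436
classifiable = 292
non_classifiable = anomalies - classifiable          # 144

print("digitized:", pct(cards_digitized, cards_prepared), "%")
print("matched:", pct(matched, comparable_fields), "%")
print("anomalies:", pct(anomalies, comparable_fields), "%")
print("classifiable:", pct(classifiable, anomalies), "%")
print("non-classifiable:", pct(non_classifiable, anomalies), "%")
```

The digitization rate works out to about 50.4%, which is the figure quoted later in the discussion, and the remaining shares round to the 65%, 35%, 67%, and 33% reported here.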

Similarly, classifiable issues were those differences in data fields that could be attributed as an issue either in the paper-based record or in the digital record. For example, if the age of the patient is given in the digital record, while in the corresponding paper-based record, this field was left empty, then this is considered a paper-based record completeness issue.

In an effort to integrate data collection and care delivery processes, various design modifications of the data entry forms took place within the study period (e.g., making fields ‘required’, re-organizing questions, adding new forms or questions to capture missing information), in response to feedback received from application users. Among the classifiable issues, a sub-set of data (n = 59) was excluded from the analysis because it had been affected by these design modification activities (Table 3). Therefore, only valid issues (n = 110) of the digital records (DR) were compared with issues recorded in the paper-based records (PBR). The distribution of excluded issues in the digital records that occurred as a result of changes in the application design is shown in Table 4.

Overall, 1.9 DQ issues were calculated per digital patient record, whereas the corresponding figure was 2.1 issues per paper-based record. At the beginning of the study, the number of issues per digital and paper-based record was 1.5 and 2.2, respectively; by the end of the study period, these figures had dropped to 0.7 and 1.4 issues per record, respectively. Based on the analysis of valid data quality issues, it was found that there were more DQ issues in the paper-based records (n = 123) than in the digital records (n = 110). A month-by-month comparison showed that only April had a significant difference in entry errors between DR and PBR, with errors in the paper-based records significantly exceeding those in the digital records. In all other months under consideration, the difference was not significant. Across months, the digital records showed a significant improvement (p-value = 0.0328), while no significant improvement was observed in the paper-based records over time (p-value = 0.0629).
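The per-record figures can be reproduced from the issue counts, assuming that both rates use the 59 paired records as the denominator (an assumption; the text does not state the denominator explicitly). The sketch also checks that the valid and excluded issues add up to the 292 classifiable issues. Variable names are illustrative.

```python
paired_records = 59            # records available in both formats
paper_issues = 123             # valid issues in paper-based records
digital_valid_issues = 110     # valid issues in digital records
digital_excluded_issues = 59   # excluded because of design modifications

classifiable_total = paper_issues + digital_valid_issues + digital_excluded_issues

issues_per_digital_record = round(digital_valid_issues / paired_records, 1)
issues_per_paper_record = round(paper_issues / paired_records, 1)

print(classifiable_total)         # 292
print(issues_per_digital_record)  # 1.9
print(issues_per_paper_record)    # 2.1
```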

2.6.2. Analysis of Non-Classifiable Issues

Table 5 lists all 13 data fields where differences were recorded but could not be settled because of patient confidentiality concerns (the researchers had no access to patients). Patient’s age was the data field in which most differences were observed (n = 47). However, differences in patient’s weight (n = 22), among others, were critically important in relation to effective case management, because of the clinical significance of body weight, its use in monitoring the patient’s condition, and its potential to affect certain treatment decisions. Issues with the patient identifier code (n = 20) were also of considerable significance.

2.6.3. Analysis of Classifiable Issues

All valid classifiable issues (excluding those issues that occurred because of the aforementioned design modifications) were further categorized according to data quality dimensions as shown in Figure 3. Overall, there were more completeness issues (n = 148; 63.5%), followed by timeliness (n = 44; 19%), accuracy (n = 30; 13%), understandability (n = 10; 4%), and consistency issues (n = 1; 0.5%) in the set of valid classifiable issues.
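As a quick check of the distribution above, the share of each dimension can be recomputed from the counts. This is a sketch based only on the numbers given in the text; the dictionary and variable names are illustrative.

```python
dimension_counts = {
    "completeness": 148,
    "timeliness": 44,
    "accuracy": 30,
    "understandability": 10,
    "consistency": 1,
}

total_valid_issues = sum(dimension_counts.values())  # 233 = 123 paper + 110 digital

shares = {
    name: round(100 * count / total_valid_issues, 1)
    for name, count in dimension_counts.items()
}

for name, share in shares.items():
    print(f"{name}: {dimension_counts[name]} issues, {share}%")
```

The recomputed shares (63.5%, 18.9%, 12.9%, 4.3%, and 0.4%) are in line with the rounded percentages reported in the text, and the total of 233 equals the sum of valid paper-based and digital issues.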

The detailed findings of the data quality assessment exercise are presented below, categorized by data quality dimension.

Classifier 1: Completeness

An operational definition of completeness is “information having all required parts of an entity’s description” [23].

Data completeness issues were found in both datasets; however, there were more such issues in the paper-based medical records, which accounted for 58% of all observed data completeness issues (Figure 4). Upon further analysis, it was found that patient name (n = 9) and address (n = 14), and treatment supporter name (n = 11) and address (n = 12), were the digital data fields that showed the most completeness issues. In the paper-based records, the top completeness issues were in patient’s address (n = 10), type of referral (n = 11), and laboratory examination date (n = 7). Therefore, it can be said that most of the encountered data completeness issues were in the patient profile data types that allowed free text input, and hence were more prone to errors. However, there were relatively fewer observed completeness issues in clinical and diagnostics data types, except for laboratory examination date.

Classifier 2: Accuracy

Applying understanding of the field of practice (tuberculosis treatment) and work settings, accuracy can be defined as “the degree to which data correctly describe the ‘real world’ object or event being described” [24].

Accuracy is one of the key data quality dimensions that helps the data user in building trust in data representativeness. Data-field-level analysis showed that most of such issues were found in the digital records (Figure 5). Out of a total of 30 observed accuracy issues, 77% (n = 23) were found in the digital records, and most of these issues were in patient identifier code (n = 12) and national identity card number (n = 6).

Classifier 3: Consistency

By consistency, we mean that the “representation of data values remains the same in multiple data items in multiple locations” [25].

Consistency was the least reported issue type in our set, with only one issue found in the paper-based records (Figure 6).

Classifier 4: Understandability

Utilizing the “fitness-for-use” perspective, understandability can be defined as “the statement or the term that has clear or specific meaning” [1].

Under understandability, the findings of this data quality assessment exercise can be mainly linked to one of the most common issues associated with paper-based records, namely illegibility of handwriting. There were a total of 10 understandability issues in our set, and most of them were spotted in the paper-based records (n = 8; 80%), as shown in Figure 7.

Classifier 5: Timeliness

Under the MAPPLE mHealth initiative, timeliness means that “shared data should be as near to real-time as possible. Thus, data should be timely, in that it relates to the present” [26].

Though the general principles of informatics encourage the integration of application and clinical workflows, technology use also ensures the timeliness of data recording and reporting. However, in our studied set, there were slightly more timeliness issues observed in digital records than in paper records, as shown in Figure 8 (DR = 23; PBR = 21; difference = 2), which is a clear indication of the weak integration between workflow processes. Besides the importance of integrating workflows, treatment start date and lab exam date are also of critical importance for monitoring the treatment timeline and achieving the desired health outcomes. It was observed that all of these issues (n = 44), either in paper-based records or in digital records, were in those data fields storing treatment start date and follow-up evaluation dates.
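The per-dimension counts reported in this section can be reconciled with the overall totals of 123 paper-based and 110 digital issues. In the sketch below, the completeness split is not taken from the text but derived as the remainder, which is an inference; the other counts come from the figures quoted above. Variable names are illustrative.

```python
# Stated counts per record type (accuracy: 23 of 30 digital; understandability: 8 of 10 paper).
paper = {"accuracy": 30 - 23, "consistency": 1, "understandability": 8, "timeliness": 21}
digital = {"accuracy": 23, "consistency": 0, "understandability": 10 - 8, "timeliness": 23}

# Derived (not stated in the text): completeness issues as the remainder of each total.
paper_completeness = 123 - sum(paper.values())
digital_completeness = 110 - sum(digital.values())

paper_share = round(100 * paper_completeness / (paper_completeness + digital_completeness), 1)

print(paper_completeness, digital_completeness)  # 86 62
print(paper_share)                               # 58.1
```

The derived split sums to the 148 completeness issues reported overall, and the paper-based share of about 58% agrees with the percentage given under Classifier 1.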

3. Discussion

Global evidence identifies high data quality as a necessary condition for the delivery of quality healthcare [27]. In developing countries, health information systems are needed to tackle the growing public health concerns, as current paper-based documentation systems are becoming increasingly inadequate [28]. Therefore, mHealth technology is being implemented in the public health settings of developing countries.

This study looked at the paper-based records and their corresponding digital records at the six points or locales of TB care that have been using a mobile data collection application (MAPPLE) since March 2016. As a theoretical framework is helpful in addressing data variability issues [29], we used a data quality assessment framework to assess data quality. According to the study’s findings, digital records have generated better data quality in the first quarter of their implementation. On the other hand, despite years of staff practice in maintaining the paper-based patient record, our assessment results showed relatively poor data quality associated with handwritten paper forms.

Moreover, the relatively low (50.4%) use of MAPPLE in data collection can be explained by overburdening of the data collection workflows, which resulted in frustration among the involved staff. Additionally, factors such as unregulated and non-standardized practices in developing countries, and non-incentivized data collection in private healthcare settings, are possible reasons for low mHealth adoption.

Currently, in the public–private mix model of TB care delivery, there are multiple stakeholders representing different levels of the management within an organization and across different organizations. The complexity in the management structure demands a high level of collaborative relationship between different management units [29]. However, the problem of management complexity can be addressed if different organizations have a similar level of direct control over the data they generate during their normal care and management procedures [30]. Hence, all stakeholders get an equal opportunity for the data quality review. Therefore, organizations will start producing high quality data by strategizing the use of a data quality assessment framework.

Data quality issues were found in all three data types: patient profile, and clinical and diagnostic data. Issues in the clinical variables are of critical importance [18]. As part of data quality improvement strategy, there should be a mechanism to flag disparities in the clinically important data fields [31]. Errors in clinical practice are sometimes attributed to medical documentation errors in paper-based records [32], but digital records, when not properly designed and implemented, can equally suffer from data inaccuracies leading to medical errors [18]. Furthermore, it is critically important to receive complete and correct patient information, which is achievable if mHealth technology is fully exploited beyond the mere basic functions of digital data collection, storage, retrieval, and sharing [31,33].

3.1. User Adoption and Acceptance Issues of Digital Data Collection

Though there was some improvement in the data quality of digital records over the study period of four months, there was also a gradual decrease in the use of MAPPLE (mobile application). This might be because of frequent application design modifications and non-incentivized data collection. The current use of MAPPLE, primarily for data collection at the six study sites, is inconsistent and lacks supportive supervision or an active management role in ensuring regular use of the application. Additionally, no reward mechanism was introduced to encourage application use for the purpose of data collection.

It has been observed that data collection, digitization, and aggregation are increasingly difficult tasks in developing countries [34] because of the lack of incentive programs [35]. Additionally, application design considerations should include making all required functions available on the user’s device in a highly usable and intuitive fashion [36]. Applications should be designed with full user involvement from the early design stages and throughout the application’s lifecycle, including its regular maintenance and updates. Applications should seamlessly integrate with existing clinical workflows, improving rather than overburdening them, and taking into consideration the already high work and cognitive loads of most healthcare professionals today [37]. Free text input should be kept to a minimum in digital forms (also to avoid errors), and clear and comprehensive choices should be offered instead for users to select from them. Integrity and validation checks should be built into digital forms. Other strategies for minimizing user input, reducing errors, and improving acceptance include cross-linking relevant databases to ‘autocomplete’ certain fields where applicable, based on values entered in other fields.
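The recommendation to build integrity and validation checks into digital forms, and to prefer choice lists over free text, can be illustrated with a minimal Python sketch. It is not MAPPLE's or CommCare's actual form logic: the field names, the plausibility ranges, and the `validate_form` function are assumptions made for illustration, while the two treatment categories (CAT I and CAT II) come from the description of the TB01 card.

```python
from datetime import date
from typing import Any, Dict, List

# Hypothetical field names; the real TB01/MAPPLE field list may differ.
REQUIRED_FIELDS = ("patient_name", "age", "weight_kg", "category", "treatment_start")
ALLOWED_CATEGORIES = {"CAT I", "CAT II"}  # choice list instead of free text


def validate_form(form: Dict[str, Any], today: date) -> List[str]:
    """Return a list of validation errors; an empty list means the form passes."""
    errors = []
    for name in REQUIRED_FIELDS:
        value = form.get(name)
        if value is None or (isinstance(value, str) and not value.strip()):
            errors.append(f"{name}: required")
    age = form.get("age")
    if isinstance(age, int) and not 0 <= age <= 120:  # assumed plausibility range
        errors.append("age: outside 0-120")
    weight = form.get("weight_kg")
    if isinstance(weight, (int, float)) and not 1 <= weight <= 300:  # assumed range
        errors.append("weight_kg: outside 1-300")
    category = form.get("category")
    if category is not None and category not in ALLOWED_CATEGORIES:
        errors.append("category: not in choice list")
    start = form.get("treatment_start")
    if isinstance(start, date) and start > today:
        errors.append("treatment_start: in the future")
    return errors


entry = {"patient_name": "Patient A", "age": 34, "weight_kg": 52.0,
         "category": "CAT I", "treatment_start": date(2016, 4, 10)}
print(validate_form(entry, today=date(2016, 7, 31)))  # []
```

Checks of this kind would target the completeness and accuracy issues observed in the study at the point of entry, although they cannot by themselves detect a plausible but wrong value such as a mistyped age.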

Improving data quality is task-dependent and includes aligning data collection processes, operationalizing quality improvement strategy, and building capacity for those responsible for data entry and review. Therefore, with an application like MAPPLE, there is a wide range of organizational and system-specific factors that may affect the adoption of healthcare information technology [38].

3.2. Novel Contribution, Replicability and Generalizability of the Work beyond the Six Study Locales

The novel contribution made by this study concerns our model of using an assessment framework that is inclusive of the management perspective and is more relevant to local work settings and field of practice. We believe that a meaningful assessment would not have been possible had we opted to use existing frameworks (generic or developed for other contexts), as only the local data users can conceptualize and contextualize data quality [21,29]. Long ago, it was identified that the definition of data quality varies between users, locales, and contexts, which makes the data quality concept multi-dimensional and complicated [39,40]. Considering this, a similar approach was also used elsewhere [1,41].

Though the study included six participating primary healthcare clinics, it is observed that across the country, the characteristics of clinics and their clinical and data management practices remain nearly the same [42]. The private healthcare system remains largely un-regulated because of a lack of interest of public health authorities [43]. This provides non-governmental organizations (NGOs) with an opportunity to bridge the gap between service need and service provision [43]. As the public–private mix model is working in 65 districts of Pakistan, our approach can be replicated in other districts of the country (and other countries sharing our settings) when digitization and data quality improvement plans are rolled out in those places.

3.3. Follow-Up on Current Work

The current work was completed as part of the mHealth initiative within the Tuberculosis Control Program of Mercy Corps, Pakistan. A context-specific data quality assessment framework was developed [22] to report the field-to-field review and comparison of digital records with the corresponding paper-based records. Mercy Corps also conducted operational research to examine external and organizational factors that have affected the adoption level of the mHealth application. As also discussed in this study, unregulated private healthcare practice is the biggest challenge; as a result, data collection is not given importance in routine clinical practice. Because of the work burden of healthcare providers, stakeholders from outside clinical practice have been identified and involved in the mHealth initiative. Additionally, results of the current work are being used iteratively to refine the mHealth initiative (MAPPLE), and an expansion plan has already been developed. The mHealth initiative does not only emphasize data collection; it also includes elements in the application design that will fulfil the information needs of the users in their routine work.

3.4. Research Implications

This study included a review of patient records in paper and digital formats, and concluded that, in the studied set of records, digital data were of moderately better quality compared with data from the corresponding paper-based records. For significant and sustained improvement in data quality, the study emphasizes improved technology adoption supported by an incentive program. The present study also identifies the need for iterative revisions so that a successful transition from paper-based to digital records is achieved. Even when users are engaged in the design and development phases, sufficient time should be allowed for application development and for iterations informed by detailed user feedback [44].

4. Strengths and Limitations of the Study

The scope of the study included a comprehensive review and comparison of paper-based and digital data to identify quality issues and to categorize the identified issues into classifiable and non-classifiable ones.

The strength of the present work is its usefulness in developing a case for implementation agencies for expanding their digital health initiatives, particularly for data collection.

As a result of patients’ information confidentiality concerns and provisions (researchers had no access to or contact with the patients), the researchers were unable to categorize non-classifiable issues (those data that would have required contacting the patient to verify them), which can be considered as a limitation of the current study. Nonetheless, we demonstrated the need for putting in place an adequate data quality improvement strategy so that reliability and sanity of healthcare data can be fully achieved.

With the limited human and other resources in the enrolled clinics, running two systems (paper-based and digital) in parallel during the study period might have caused frustration among clinic staff. Overburdening the data collection workflows of the involved staff might have also been a reason for the relatively low (50.4%) overall use of MAPPLE in data collection. With sufficient incentives in place and a complete switch to a digital format (following any necessary tweaking and optimization of MAPPLE), digital data collection rates can greatly improve in the future.

5. Conclusions

The overall quality of digital records is moderately better than the quality of paper-based records. Therefore, in addition to putting a data quality improvement strategy in place, data quality assessment should also be introduced as routine practice. Likewise, considering the inherent ability of the technology to improve data quality, design modifications and workflow optimization and integration should also be considered essential for the adoption of mHealth technology. Efforts towards improving adoption levels should be concentrated on system-level initiatives, such as regulation of private practice, incentivizing data collection, and making data collection an essential part of private clinical practice. Consequently, strengthening of the information management system would help organizations in building trust in data, and in making evidence-based and informed decisions about health policy and practice.

Supplementary Materials

The following are available online at https://www.mdpi.com/2306-5729/3/3/27/s1.

Author Contributions

S.M.A. wrote the first draft of this paper. The idea was conceived by N.A., S.M.A., F.N., A.N. and A.R. S.M.A., N.A. and J.A. discussed and formulated an analysis plan. J.A., M.I. and I.F. conducted the data review and analysis, and F.N. and A.R. later checked the data review process for consistency. M.N.K.B. provided critical expert input to the manuscript throughout, and helped with the discussion, interpretation, final presentation, and ‘putting in perspective’ of the study findings. A.M. conducted the statistical analysis. M.N.K.B., J.A., A.N. and N.A. proofread the final manuscript, and changes were made based on their feedback. All the authors read the final manuscript before submission.

Funding

This research received no external funding.

Acknowledgments

The authors would like to acknowledge the efforts of District Field Supervisors (Hafiz Yawer Bashir and Sohail Akhtar) and Regional Coordinators of the respective districts (Zaheer Sattar and Maliha Batool) in making data available for the analysis.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. St-Maurice, J.; Burns, C.M. A method for developing data quality measures and metrics for primary health care. In Proceedings of the International Symposium on Human Factors and Ergonomics in Health Care: Improving the Outcomes; Sage Publications: New Delhi, India, 2016.
2. Ndabarora, E.; Chipps, J.A.; Uys, L. Systematic review of health data quality management and best practices at community and district levels in LMIC. Inf. Dev. 2013, 30, 1–18.
3. Kerr, K.A.; Norris, T.; Stockdale, R. The strategic management of data quality in healthcare. Health Inform. J. 2008, 14, 259–266.
4. Haux, R. Health information systems past, present, future. Int. J. Med. Inform. 2006, 75, 268–281.
5. Walker, R. Health information and public health. Health Inf. Manag. J. 2008, 37, 4–5.
6. Even, A.; Shankaranarayanan, G. Dual assessment of data quality in customer databases. ACM J. Data Inf. Qual. 2009, 1, 1–29.
7. Redman, T.C. Data: An unfolding quality disaster. DM Rev. 2004, 14, 21–23.
8. Cappiello, C.; Francalanci, C.; Pernici, B. Data quality assessment from the user’s perspective. In Proceedings of the 2004 International Workshop on Information Quality in Information Systems, Paris, France, 18 June 2004; pp. 68–73.
9. Madnick, S.E.; Wang, R.Y.; Lee, Y.W.; Zhu, H. Overview and framework for data and information quality research. ACM J. Data Inf. Qual. 2009, 1, 1–22.
10. Glowalla, P.; Sunyaev, A. Process-driven data quality management: A critical review on the application of process modeling languages. ACM J. Data Inf. Qual. 2014, 5, 1–30.
11. Agyeman-Duah, J.N.A.; Theurer, A.; Munthali, C.; Alide, N.; Neuhann, F. Understanding the barriers to setting up a healthcare quality improvement process in resource-limited settings: A situational analysis at the medical department of Kamuzu Central Hospital in Lilongwe, Malawi. BMC Health Serv. Res. 2014, 14, 1.
12. Ogunsola, O.O.; Aburogbola, F.; Olajide, O.; Ladi-Akinyemi, B. Clinical documentation and doctor: Is it a challenge in HIV care? Experience of four new comprehensive HIV sites in Oyo State, Nigeria. Adv. Trop. Med. Public Health Int. 2015, 5, 77–89.
13. Lium, J.T.; Tjora, A.; Faxvaag, A. No paper, but the same routines: A qualitative exploration of experiences in two Norwegian hospitals deprived of the paper based medical records. BMC Med. Inform. Decis. Mak. 2008, 8, 2.
14. Pourasghar, F.; Malekafzali, H.; Koch, S.; Fors, U. Factors influencing the quality of medical documentation when a paper-based medical records system is replaced with an electronic medical records system: An Iranian case study. Int. J. Technol. Assess. Health Care 2008, 24, 445–451.
15. Wager, K.A.; Lee, F.W.; White, A.W.; Ward, D.M.; Ornstein, S.M. Impact of an electronic medical record system on community-based primary care practices. J. Am. Board Fam. Pract. 2000, 13, 338–348.
16. Makoul, G.; Curry, R.H.; Tang, P.C. The use of electronic medical records: Communication patterns in outpatient encounters. J. Am. Med. Inf. Assoc. 2001, 8, 610–615.
17. Ammenwerth, E.; Eichstadter, R.; Haux, R.; Pohl, U.; Rebel, S.; Ziegler, S. A randomized evaluation of a computer-based nursing documentation system. Method Inf. Med. 2001, 40, 61–68.
18. Ozair, F.F.; Jamshed, N.; Sharma, A.; Aggarwal, P. Ethical issues in electronic health records: A general overview. Perspect. Clin. Res. 2015, 6, 73–76.
19. Van Velthoven, M.H.; Car, J.; Zhang, Y.; Marušić, A. mHealth series: New ideas for mHealth data collection implementation in low– and middle–income countries. J. Glob. Health 2013, 3, 1–3.
20. Sebastian-Coleman, L. Measuring Data Quality for Ongoing Improvement: A Data Quality Assessment Framework; Morgan Kaufmann Elsevier: Waltham, MA, USA, 2013.
21. Pringle, M.; Wilson, T.; Grol, R. Measuring ‘goodness’ in individuals and healthcare systems. BMJ 2002, 325, 704–707.
22. Ali, S.M.; Anjum, N.; Boulos, M.N.K.; Ishaq, M.; Aamir, J.; Haider, G.R. Measuring management’s perspective of data quality in Pakistan’s tuberculosis control programme: A test-based approach to identify data quality dimensions. BMC Res. Notes 2018, 11, 40.
23. Bovee, M.; Srivastava, R.; Mak, B. A conceptual framework and belief-function approach to assessing overall information quality. Int. J. Intell. Syst. 2001, 18, 51–74.
24. DAMA. The Six Primary Dimensions for Data Quality Assessment: Defining Data Quality Dimensions. DAMA UK Working Group on Data Quality Dimensions. Available online: https://www.whitepapers.em360tech.com/wp-content/files_mf/1407250286DAMAUKDQ DimensionsWhitePaperR37.pdf (accessed on 5 May 2016).
25. Almutiry, O.; Wills, G.; Alwabel, A.; Crowder, R.; Walters, R. Toward a framework for data quality in cloud-based health information system. In Proceedings of the International Conference on Information Society (i-Society 2013), Toronto, ON, Canada, 24–26 June 2013; Available online: http://ieeexplore.ieee.org/document/6636362/ (accessed on 3 July 2016).
26. Orfanidis, L.; Bamidis, P.D.; Eaglestone, B. Data quality issues in Electronic Health Records: An adaptation framework for the Greek health system. Health Inform. J. 2004, 10, 23–36.
27. Chaudhry, B.; Wang, J.; Wu, S.; Maglione, M.; Mojica, W.; Roth, E.; Morton, S.C.; Shekelle, P.G. Systematic review: Impact of health information technology on quality, efficacy, and costs of medical care. Ann. Intern. Med. 2006, 144, 742–752.
28. Kalogriopoulos, N.A.; Baran, J.; Nimunkar, A.J.; Webster, J.G. Electronic medical record systems for developing countries: Review. In Proceedings of the Annual International Conference of IEEE in Engineering in Medicine and Biology Society, Minneapolis, MN, USA, 3–6 September 2009; pp. 1730–1733.
29. Kahn, M.G.; Raebel, M.A.; Glanz, J.M.; Riedlinger, K.; Steiner, J.F. A pragmatic framework for single-site and multisite data quality assessment in electronic health record-based clinical research. Med. Care 2012, 50, S21–S29.
30. Divorski, S.; Scheirer, M.A. Improving data quality for performance measures: Results from a GAO study of verification and validation. Eval. Program Plan. 2001, 24, 83–94.
31. Chen, H.; Hailey, D.; Wang, N.; Yu, P. A review of data quality assessment methods for public health information systems. Int. J. Environ. Res. Public Health 2014, 11, 5170–5207.
32. Tsai, J.; Bond, G. A comparison of electronic records to paper records in mental health centers. Int. J. Qual. Health Care 2008, 20, 136–143.
33. Ward, M.; Brandsema, P.; van Straten, E.; Bosman, A. Electronic reporting improves timeliness and completeness of infectious disease notification, The Netherlands, 2003. Euro Surveill. 2005, 10, 27–30.
34. Shovlin, A.; Ghen, M.; Simpson, P.; Mehta, K. Challenges facing medical data digitization in low-resource contexts. In Proceedings of the IEEE 2013 Global Humanitarian Technology Conference, San Jose, CA, USA, 2 July 2013; pp. 365–371.
35. Bram, J.T.; Warwick-Clark, B.; Obeysekare, E.; Mehta, K. Utilization and monetization of healthcare data in developing countries. Big Data 2015, 3, 59–66.
36. Duhm, J.; Fleischmann, R.; Schmidt, S.; Hupperts, H.; Brandt, S.A. Mobile electronic medical records promote workflow: Physicians’ perspective from a survey. JMIR MHealth UHealth 2016, 4, e70.
37. Despont-Gros, C.; Rutschmann, O.; Geissbuhler, A.; Lovis, C. Acceptance and cognitive load in a clinical setting of a novel device allowing natural real-time data acquisition. Int. J. Med. Inform. 2007, 76, 850–855.
38. Bassi, J.; Lau, F.; Lesperance, M. Perceived impact of electronic medical records in physician office practices: A review of survey-based research. Interact. J. Med. Res. 2012, 1, e3.
39. Wand, Y.; Wang, R.Y. Anchoring data quality dimensions in ontological foundations. Commun. ACM 1996, 39, 86–89.
40. Wang, R.Y.; Strong, D.M. Beyond accuracy: What data quality means to data consumers. J. Manag. Inf. Syst. 1996, 12, 5–33.
41. National Health Service. Executive Summary of the First National Data Quality Review. Quality Information Committee of the National Health Service. Available online: https://www.england.nhs.uk/wp-content/uploads/2013/04/1ndqr-exec-sum.pdf (accessed on 25 February 2017).
42. Nishtar, S. The Gateway Paper on Health System in Pakistan—A Way Forward. Pakistan’s Health Policy Forum and Heartfile, Islamabad, Pakistan. Available online: http://www.heartfile.org/pdf/phpf-GWP.pdf (accessed on 25 February 2017).
43. Shaikh, B.T. Private sector in health care delivery: A reality and a challenge in Pakistan. J. Ayub Med. Coll. Abbottabad 2015, 27, 496–498.
44. Tweya, H.; Feldacker, C.; Gadabu, O.J.; Ng’ambi, W.; Mumba, S.L.; Phiri, D.; Kamvazine, L.; Mwakilama, S.; Kanyerere, H.; Keiser, O.; et al. Developing a point-of-care electronic medical record system for TB/HIV co-infected patients: Experiences from Lighthouse Trust, Lilongwe, Malawi. BMC Res. Notes 2016, 9, 146.

Figure 1. Multiple Screenshots taken from Application Workflow. MAPPLE—Mobile Application for Physician–Patient–Lab Efficiency; TB—tuberculosis; GP—general practitioner.

Figure 2. Overview of data quality assessment result.

Figure 3. Data quality classification tree.

Figure 4. Trends in data completeness issues. DR—digital records; PBR—paper-based records.

Figure 5. Trends in data accuracy issues.

Figure 6. Trends in data consistency issues.

Figure 7. Trends in data understandability issues.

Figure 8. Trends in data timeliness issues.

Table 1. Type and number of data fields on patient treatment card (TB01 card).

Category of Data	No. of Data Fields	Type of Data: Number	Type of Data: Text	Type of Data: Alpha-Numeric	Type of Data: Selection
Profile	17	5	9	1	2
Clinical	7	3	1	−	3
Diagnostic	6	4	1	−	1

Table 2. Comparable data fields of the patient treatment card (TB01).

Category of Data	Comparable Data Fields
Profile	12
Clinical	6
Diagnostic	3
Total	21
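
The headline counts reported in the abstract follow directly from Table 2 and the number of digitized records. The short Python sketch below re-derives them as an arithmetic check. The variable and function names are illustrative only and are not taken from the study's own analysis files; the input numbers are the ones reported in this article.

```python
# Arithmetic check of the counts reported in the abstract and Table 2.
# All names here are hypothetical; only the numbers come from the article.

COMPARABLE_FIELDS = {"Profile": 12, "Clinical": 6, "Diagnostic": 3}  # Table 2
TB01_CARDS = 117         # paper cards prepared at the six sites
DIGITIZED_RECORDS = 59   # cards that also had a digital record
MATCHED_FIELDS = 803     # fields that matched between paper and digital
ANOMALOUS_FIELDS = 436   # fields with an anomaly in either format


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# 21 comparable fields per record, compared across 59 record pairs.
fields_per_record = sum(COMPARABLE_FIELDS.values())
total_comparable = fields_per_record * DIGITIZED_RECORDS

if __name__ == "__main__":
    print("comparable fields per record:", fields_per_record)
    print("total comparable fields:", total_comparable)
    print(f"matched: {share(MATCHED_FIELDS, total_comparable):.1f}%")
    print(f"anomalous: {share(ANOMALOUS_FIELDS, total_comparable):.1f}%")
    print(f"digitized: {share(DIGITIZED_RECORDS, TB01_CARDS):.1f}%")
```

The matched and anomalous counts add up to the total number of comparable fields, which is consistent with every compared field being placed in exactly one of the two groups.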

Table 3. Quantification and type of issues that occurred as a result of design change.

Data Category	Data Field	# of Issues	Total (n = 59)	Data Quality Dimension
Patient Profile	Father/Spouse Name	25	37	Completeness (missing responses)
Patient Profile	Type of Referral	12	37	
Clinical	Treatment Start Date	5	22	
Clinical	Disease Category	3	22	
Clinical	Type of Patient	12	22	
Clinical	Disease Site	2	22	

Table 4. Monthly chart of classifiable issues in paper-based records (PBRs) and digital records (DRs).

Month	Total Comparable Records	No. of Issues in DR	No. of Issues in DR (Design Concern)	No. of Valid Issues in DR	No. of Issues Per DR	No. of Issues in PBR	No. of Issues Per PBR	Test
	n	x	x¹	xᵛ = x − x¹	d = xᵛ/n	y	p = y/n	p-Value
April	31	85	40	45	1.5	67	2.2	0.0287
May	10	50	8	42	4.2	34	3.4	0.3124
June	10	24	7	17	1.7	11	1.1	0.2411
July	8	10	4	6	0.7	11	1.4	0.2138
Total	59	169	59	110	1.9	123	2.1	0.3711
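
The derived columns of Table 4 (valid issues in digital records, and issues per record for each format) can be recomputed from the raw monthly counts. The following Python sketch does so under the column definitions given in the table header: valid digital issues are all digital issues minus those attributed to design concerns, and each rate divides by the number of comparable records. The names `TABLE4`, `issues_per_record` and `totals` are hypothetical, and the p-values are not reproduced because the statistical test is not specified in this part of the article.

```python
# Recompute the derived columns of Table 4 from its raw counts.
# Names are hypothetical; the counts are transcribed from Table 4.

# month: (n comparable records, issues in DR,
#         DR issues due to design concern, issues in PBR)
TABLE4 = {
    "April": (31, 85, 40, 67),
    "May": (10, 50, 8, 34),
    "June": (10, 24, 7, 11),
    "July": (8, 10, 4, 11),
}


def issues_per_record(n, x, x1, y):
    """Return (valid DR issues, DR issues per record, PBR issues per record)."""
    valid = x - x1  # drop issues caused by the design change
    return valid, valid / n, y / n


def totals(table):
    """Sum each raw column over all months."""
    rows = list(table.values())
    return tuple(sum(row[i] for row in rows) for i in range(4))


if __name__ == "__main__":
    for month, row in TABLE4.items():
        valid, d, p = issues_per_record(*row)
        print(f"{month}: valid DR issues = {valid}, d = {d:.2f}, p = {p:.2f}")
    valid, d, p = issues_per_record(*totals(TABLE4))
    print(f"Total: valid DR issues = {valid}, d = {d:.2f}, p = {p:.2f}")
```

Under these definitions the totals come out at 110 valid digital issues over 59 records (about 1.9 per record) against 123 paper-based issues (about 2.1 per record), in line with the table. The July digital rate works out to 6/8 = 0.75, which the table reports as 0.7.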

Table 5. Data field-wise distribution of the non-classifiable issues.

Name of Data Fields	April	May	June	July	Total
Name	0	0	2	0	2
Age	25	12	8	2	47
Weight	12	2	6	2	22
Patient Identifier Code	3	9	4	4	20
National Identity Number	6	3	1	1	11
Address	1	0	0	1	2
Phone Number	6	1	1	1	9
Father/Spouse Name	0	2	0	0	2
Supporter Name	1	0	0	0	1
Type of Referral	6	11	3	2	22
Lab Number	2	0	1	0	3
Disease Site	0	2	0	0	2
Lab Result	1	0	0	0	1
Total	63	42	26	13	144

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
