Good Practice Data Linkage (GPD): A Translation of the German Version
Abstract
1. Introduction
1.1. Objectives and Target Group of GPD
- Data owners;
- Scientists;
- The reviewers of scientific projects and publications;
- Supervisory authorities, ethics commissions, and data protection officers.
1.2. Areas of Application
2. Methodology
- The research objectives, research questions, data sources, and resources;
- The data infrastructure and data flow;
- Data protection;
- Ethics;
- The key variables and linkage methods;
- Data validation/quality assurance;
- The long-term use of data for questions still to be determined.
2.1. Guideline 1: Research Objectives, Research Questions, Data Sources, and Resources
2.1.1. Recommendation 1.1: Research Objectives, Questions, and, Where Applicable, Hypotheses Must Be Formulated as Precisely as Possible in Order to Create a Detailed Profile of the Requirements for the Data to Be Used
2.1.2. Recommendation 1.2: The Availability of the Information Needed to Pursue the Research Objectives and Answer the Research Questions Must Be Verified as Early as Possible. In Addition, the Research Design, Relevant Observation Periods, and Study Population Have to Be Specified
2.1.3. Recommendation 1.3: From a Data Provenance Perspective, Data Owners Must Disclose and Communicate Developments in the Data Sources. In Particular, Historical/Systematic Changes to the Data Sources, Such as the Introduction of New Coding Systems or Key Variables, Must Be Communicated
2.1.4. Recommendation 1.4: The Data Sources to Potentially Be Linked Must Be Described in Terms of Their Origin, Their Original Intended Use, Their Data Owner, and Their Advantages and Disadvantages
- Primary data that are prepared and analyzed in the context of their original intended purpose, e.g., survey data.
- Secondary data that are used beyond their original, primary intended use. These include, for example, a large number of data from social insurance institutions (e.g., statutory health and pension insurance), as well as other health care data (e.g., from medical office software or hospital information systems) or data from (clinical) research projects that are subsequently used to answer further questions. A more detailed description and examples of different types of data linkage can be found in Jacobs et al. [17], March et al. [10], and Swart et al. [18].
2.1.5. Recommendation 1.5: The Use of Data Linkage Must Be Justified by the Question
2.1.6. Recommendation 1.6: Due to the Complexity of a Research Project with Data Linkage, Sufficient Time, Financial, and Human Resources Must Already Be Provided for When Planning and Elaborating the Design
2.2. Guideline 2: Data Infrastructure and Data Flow
- The data to be exchanged and further information (e.g., data dictionary);
- Necessary mutual obligations where applicable (e.g., timely data deletion);
- The technical procedures used.
2.2.1. Recommendation 2.1: The Data Flow and Responsibilities Must Be Clearly Defined
- Data-collecting agencies responsible for data acquisition;
- Various data owners (depending on the type of data supplied);
- A trust center, which pseudonymizes personal data, in particular;
- Bodies that anonymize personal data;
- Bodies that perform the data linkage themselves;
- Bodies that conduct linkage quality control;
- Bodies that perform the data evaluation.
2.2.2. Recommendation 2.2: The General Technical and Organizational Requirements for Data Transfer (See Guideline 6.1 of the GPS) Must Be Observed, and the Special Features of Projects with Data Linkage Have to Be Taken into Account
2.2.3. Recommendation 2.3: Software That Is Suitable for the Selected Record Linkage Method Must Be Used
2.2.4. Recommendation 2.4: A Suitable Process Must Be Defined for the Deletion of Data and for Contradiction Management
2.3. Guideline 3: Data Protection
2.3.1. Recommendation 3.1: Data Protection Regulations Must Be Taken into Account during the Initial Planning Stage Right through to the Completion of the Project. The De-Anonymization/Re-Identification of Individual Persons through the Linkage Must Thereby Be Prevented
- Data protection and data security;
- Ethical and legal regulations concerning data access and use, where applicable;
- Structure and maintenance of the database(s);
- Data transfer and data deletion.
2.3.2. Recommendation 3.2: It Must Be Checked Whether a Declaration of Consent Is Necessary
2.3.3. Recommendation 3.3: A Data Protection Concept Must Be Developed
- A description of the project (background, objective, database, and methodology);
- Responsibilities (which public and non-public bodies are involved);
- The identification of the persons who have access to the data (trust center and researchers);
- The names of the persons involved, their data used, and/or the data categories (in particular, the key variables);
- The legal basis;
- Data-related processes, the resulting risks or protection requirements, and confidentiality;
- Organizational and technical measures or procedures;
- Time limits/deadlines;
- A concrete procedure for data deletion, including clarification of when data can/shall no longer be deleted, e.g., due to the anonymity of the data (see also Recommendation 2.4);
- Cancellation management: defining a procedure for deleting individual data records when requested to do so by a participant (see also Recommendation 2.4).
2.3.4. Recommendation 3.4: If Linkage in a Research Project Is Only Planned Retrospectively, a Careful Examination of the Data Protection Regulations That May Have to Be Adhered to Needs to Take Place
2.4. Guideline 4: Ethics
Recommendation 4.1: Possible Effects of the Data Linkage on the Benefit–Harm Potential of the Research Project Must Be Examined
- The minimization of incorrect linkage and any resulting false outcomes;
- Minimizing the risk of the re-identification of natural persons (see Guideline 3).
2.5. Guideline 5: Key Variables and Linkage Methods
2.5.1. Recommendation 5.1: Before Defining and Using Key Variables, the Existing Framework Conditions Regarding Their Use for Linkage Must Be Clarified
- The legal (data protection) requirements to be observed must be clarified. Of particular importance is the question as to whether the linkage ensues on the basis of informed consent, as this provides the framework for the available key variables. For the purposes of risk assessment, it should be examined whether linkage increases the risk of the re-identifiability of individuals (see Guideline 3).
- The type of data source influences the quality of the key variables. Unlike retrospectively collected data, prospective data allow additional variables to be collected, where applicable, that enable or simplify subsequent linkage.
- It has to be clarified at what points in time the record linkage is to take place: automatically at the time of data collection, at regular time intervals (e.g., per quarter), or after the final acquisition of all the data sources for the research question. Particularly in the case of long-term projects such as registries or the establishment of research databases, the timing of the data collection and data consolidation should be described. The variables to be collected, including the key variables, must also be defined.
2.5.2. Recommendation 5.2: All Key Variables Must Be Precisely Defined and Checked with Regard to Their Completeness and Susceptibility to Errors
- Automatically recorded variables that can be used as keys (e.g., specific insurance numbers) are to be given priority where possible. Using and comparing check digits reduces linkage errors that arise when key variables are collected or transmitted incorrectly.
- It must be checked to what extent direct identifiers (e.g., names or insurance numbers) are used as key variables in plain text or are to be masked by suitable procedures (pseudonymization, a hash function, or a Bloom filter). It should be taken into account to what extent the selected masking method can be used for each data source. Depending on the existing framework conditions and data protection requirements, this masking can be carried out in the same way by any data owner whose data are to be linked (possibly using a pseudonymization service such as Mainzelliste [10,47]) or with the involvement of a trust center/data trustee [41]. Further information on this topic can be found in the Status Quo Data Linkage publication [10].
- Appropriate procedures should be used to minimize false negative (synonym errors) or false positive classifications (homonym errors). In this way, different spellings of the key variables can be harmonized (e.g., phonetic coding methods, substrings, and Bloom filters [48,49,50,51]) and assignment errors reduced by incorporating further features. If key variables can change over time (e.g., married names), appropriate precautions must be taken for further assignment (e.g., a translation table for translating old to new IDs or the inclusion of the maiden name—see also Recommendation 6.2).
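The masking and harmonization of key variables described above can be illustrated with a small sketch. It assumes a Bloom filter encoding of character bigrams with a keyed hash, in the spirit of the fault-tolerant, privacy-preserving methods cited above [48]. The function names, the filter size of 256 bits, the four hash functions, and the shared secret are illustrative assumptions and not a configuration prescribed by this guideline.

```python
from __future__ import annotations

import hashlib
import hmac


def bigrams(value: str) -> set[str]:
    """Split a normalized, padded string into overlapping 2-character substrings."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value: str, secret: bytes, size: int = 256, hashes: int = 4) -> set[int]:
    """Return the set of bit positions that `value` sets in a Bloom filter."""
    bits = set()
    for gram in bigrams(value):
        for k in range(hashes):
            # Keyed hash (HMAC-SHA256); prefixing k gives a different hash per round.
            digest = hmac.new(secret, f"{k}:{gram}".encode("utf-8"), hashlib.sha256).digest()
            bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits


def dice(a: set[int], b: set[int]) -> float:
    """Dice coefficient of two bit-position sets (1.0 means identical filters)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Assumption: every data owner uses the same secret and the same parameters.
SECRET = b"hypothetical-shared-key"
meyer = bloom_encode("Meyer", SECRET)
meier = bloom_encode("Meier", SECRET)
schmidt = bloom_encode("Schmidt", SECRET)

print("Meyer vs. Meier:  ", round(dice(meyer, meier), 2))
print("Meyer vs. Schmidt:", round(dice(meyer, schmidt), 2))
```

Similar spellings share most of their bigrams, so their filters overlap strongly, whereas unrelated names overlap only through chance collisions. Any similarity threshold used to accept a match would have to be calibrated for the data sources of the specific project.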
2.5.3. Recommendation 5.3: A Suitable Technical Procedure Must Be Chosen for the Data Linkage
- A distinction is made between direct and indirect linkage and between probabilistic and deterministic linkage. In addition, there are exact and fault-tolerant methods, which are, in turn, subdivided into rule-based and distance-based fault-tolerant methods. Furthermore, blocking methods [53], in which only records with the same values for specific features are compared, can also play an important role because they are able to improve the performance of the linkage process; however, they can also adversely affect the quality of the linkage.
- All procedures have advantages and disadvantages that must be considered before they are used. They can be combined under certain conditions. It is advisable to involve IT officers and experts in the choice and implementation of the linkage process at an early stage.
- If the records to be linked contain plain-text identifiers or pseudonymized plain-text identifiers as key variables, direct linkage methods can be used. However, the use of pseudonymized identifiers is only possible if the data records to be linked contain key variables that have been pseudonymized according to the same procedure.
- If the result of the direct linkage is not satisfactory, e.g., because data records could not be linked due to incorrect key variables, then an additional indirect procedure can be used (see Guideline 6). Deterministic record linkage with indirect identifiers is considered an indirect procedure for data linkage. If direct patient identifiers such as names are not available, data may be linked through a combination of indirect identifiers, e.g., age, sex, and point in time [54].
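The combination of a direct procedure with an additional indirect one can be sketched as a two-stage process. The following is a minimal illustration under stated assumptions: the field names (`pid`, `birth_year`, `sex`, `admission_date`), the example records, and the rule of accepting only unambiguous candidates are hypothetical choices, not requirements of this guideline. The second stage groups records by their indirect identifiers, which also acts as a simple form of blocking.

```python
from collections import defaultdict


def indirect_key(record):
    """Combination of indirect identifiers used in the fallback stage."""
    return (record["birth_year"], record["sex"], record["admission_date"])


def link_records(registry, claims):
    """Return (links, unlinked); links holds (registry_id, claims_id, method)."""
    links, used = [], set()

    # Stage 1: direct linkage on an identifier pseudonymized by the same procedure.
    by_pid = {c["pid"]: c for c in claims if c.get("pid")}
    remaining = []
    for r in registry:
        match = by_pid.get(r.get("pid"))
        if match is not None and match["id"] not in used:
            used.add(match["id"])
            links.append((r["id"], match["id"], "direct"))
        else:
            remaining.append(r)

    # Stage 2: deterministic linkage on indirect identifiers. Only records
    # that agree on all three values are compared with each other.
    claim_blocks = defaultdict(list)
    for c in claims:
        if c["id"] not in used:
            claim_blocks[indirect_key(c)].append(c)
    registry_blocks = defaultdict(list)
    for r in remaining:
        registry_blocks[indirect_key(r)].append(r)

    unlinked = []
    for key, regs in registry_blocks.items():
        candidates = claim_blocks.get(key, [])
        # Accept a pair only if it is unambiguous on both sides.
        if len(regs) == 1 and len(candidates) == 1:
            links.append((regs[0]["id"], candidates[0]["id"], "indirect"))
        else:
            unlinked.extend(r["id"] for r in regs)
    return links, unlinked


registry = [
    {"id": "R1", "pid": "a1f3", "birth_year": 1950, "sex": "f", "admission_date": "2015-03-01"},
    {"id": "R2", "pid": None, "birth_year": 1962, "sex": "m", "admission_date": "2015-04-12"},
    {"id": "R3", "pid": None, "birth_year": 1970, "sex": "f", "admission_date": "2015-05-20"},
]
claims = [
    {"id": "C1", "pid": "a1f3", "birth_year": 1950, "sex": "f", "admission_date": "2015-03-01"},
    {"id": "C2", "pid": None, "birth_year": 1962, "sex": "m", "admission_date": "2015-04-12"},
    {"id": "C3", "pid": None, "birth_year": 1970, "sex": "f", "admission_date": "2015-05-20"},
    {"id": "C4", "pid": None, "birth_year": 1970, "sex": "f", "admission_date": "2015-05-20"},
]
links, unlinked = link_records(registry, claims)
print(links)
print(unlinked)
```

In this example, R3 remains unlinked because two claims records share its indirect identifiers. Rejecting ambiguous candidates reduces false positive links at the cost of more non-linked records, a trade-off that should be reported together with the linkage result.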
2.6. Guideline 6: Data Validation/Quality Assurance
2.6.1. Recommendation 6.1: A Description of the Quality of the Key Variables Must Be Included in the Project Report
2.6.2. Recommendation 6.2: It Must Be Examined Whether an Iterative Procedure Leads to Better Linkage Quality
2.6.3. Recommendation 6.3: In the Context of Quality Assurance, It Must Be Possible for the Body Conducting the Evaluation or the Body Performing the Data Linkage to Make Inquiries with the Data Owner. Implausibilities Must Be Clarified with the Data Owner to Avoid Inconsistent Data or a False Interpretation of the Data
2.6.4. Recommendation 6.4: After Each Data Linkage, the Number of Merged and Non-Mergeable Records Must Be Checked on the Basis of the Source Files
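The check of linked and non-linked records against the source files can be sketched as follows. This is a minimal illustration; the function name, the argument layout, and the example identifiers are assumptions made for the sketch.

```python
def linkage_counts(source_a_ids, source_b_ids, links):
    """Summarize linked and non-linked records per source file.

    `links` is a collection of (id_a, id_b) pairs produced by the linkage.
    """
    links = list(links)
    ids_a, ids_b = set(source_a_ids), set(source_b_ids)
    linked_a = {a for a, _ in links}
    linked_b = {b for _, b in links}

    # A link that points at a record missing from the source files
    # indicates a processing error and should stop the evaluation.
    if not linked_a <= ids_a or not linked_b <= ids_b:
        raise ValueError("link refers to a record that is not in the source files")

    return {
        "a_total": len(ids_a),
        "a_linked": len(linked_a),
        "a_unlinked": len(ids_a - linked_a),
        "b_total": len(ids_b),
        "b_linked": len(linked_b),
        "b_unlinked": len(ids_b - linked_b),
        "link_rate_a": len(linked_a) / len(ids_a) if ids_a else 0.0,
        "link_rate_b": len(linked_b) / len(ids_b) if ids_b else 0.0,
    }


counts = linkage_counts(
    ["R1", "R2", "R3"],
    ["C1", "C2", "C3", "C4"],
    [("R1", "C1"), ("R2", "C2")],
)
print(counts)
```

Reporting the counts separately for each source file makes it visible when one source contributes most of the non-linked records.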
2.6.5. Recommendation 6.5: After Each Data Linkage, a Comparison Must Be Made between the Transferred and the Merged Data
2.6.6. Recommendation 6.6: The Actual Error Rate Must Be Measured and Included in the Result Report. If the Linkage Is Repeated Several Times, the Error Rate Must Be Continuously Checked and Compared with Previous Results
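One way to measure the error rate is to compare the links produced with a validated subset of true links. The sketch below assumes that such a gold standard is available, for example from a manually reviewed sample; the function name, the chosen measures, and the example pairs are illustrative assumptions.

```python
def linkage_error_rates(links, true_links):
    """Compare produced links with a gold-standard set of true links."""
    links, true_links = set(links), set(true_links)
    true_pos = len(links & true_links)
    false_pos = len(links - true_links)   # homonym errors: pairs linked wrongly
    false_neg = len(true_links - links)   # synonym errors: true pairs that were missed
    return {
        "false_match_rate": false_pos / len(links) if links else 0.0,
        "missed_match_rate": false_neg / len(true_links) if true_links else 0.0,
        "precision": true_pos / len(links) if links else 0.0,
        "recall": true_pos / len(true_links) if true_links else 0.0,
    }


produced = {("R1", "C1"), ("R2", "C2"), ("R3", "C4")}
gold = {("R1", "C1"), ("R2", "C2"), ("R3", "C3"), ("R5", "C5")}
rates = linkage_error_rates(produced, gold)
print(rates)
```

If the linkage is repeated, the same function can be applied to each run so that the rates can be compared with previous results.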
2.6.7. Recommendation 6.7: After Each Data Linkage, the Description of the Properties of the Resulting Research Data Set Must Be Made with Reference to the Original Data
2.7. Guideline 7: Long-Term Use of Data for Questions Still to Be Determined
2.7.1. Recommendation 7.1: If the Data Owner Intends to Make Further Use of the Merged Data beyond the Primary Issue, or Should This Possibility Exist in Principle, the Appropriate Regulations Must Then Be Taken into Account as Early as at the Design Stage of a Research Project
2.7.2. Recommendation 7.2: If the Merged Data Are to Be Made Accessible for Scientific Use by Third Parties within the Framework of a Research Database, This Use Must Be Regulated by a Standardized Access Procedure
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Gilbert, R.; Lafferty, R.; Hagger-Johnson, G.; Harron, K.; Zhang, L.-C.; Smith, P.; Dibben, C.; Goldstein, H. GUILD: GUidance for Information about Linking Data sets. J. Public Health 2018, 40, 191–198.
- Benchimol, E.I.; Smeeth, L.; Guttmann, A.; Harron, K.; Moher, D.; Petersen, I.; Sørensen, H.T.; von Elm, E.; Langan, S.M. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) Statement. PLoS Med. 2015, 12, e1001885.
- Von Elm, E.; Altman, D.G.; Egger, M.; Pocock, S.J.; Gøtzsche, P.C.; Vandenbroucke, J.P. Das Strengthening the Reporting of Observational Studies in Epidemiology (STROBE-) Statement. Internist 2008, 49, 688–693.
- Vandenbroucke, J.P.; Von Elm, E.; Altman, D.G.; Gøtzsche, P.C.; Mulrow, C.D.; Pocock, S.J.; Poole, C.; Schlesselman, J.J.; Egger, M. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE). Explanation and elaboration. PLoS Med. 2007, 4, e297.
- March, S.; Andrich, S.; Drepper, J.; Horenkamp-Sonntag, D.; Icks, A.; Ihle, P.; Kieschke, J.; Kollhorst, B.; Maier, B.; Meyer, I.; et al. Good Practice Data Linkage [Gute Praxis Datenlinkage (GPD)]. Gesundheitswesen 2019, 81, 636–650.
- Datenschutz-Grundverordnung. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the Protection of Natural Persons with Regard to the Processing of Personal Data and on the Free Movement of Such Data, and Repealing Directive 95/46/EC [Verordnung (EU) 2016/679 des Europäischen Parlaments und des Rates Vom 27 April 2016 zum Schutz natürlicher Personen bei der Verarbeitung Personenbezogener Daten, zum freien Datenverkehr und zur Aufhebung der Richtlinie 95/46/EG] (4 May 2016). Available online: https://publications.europa.eu/de/publication-detail/-/publication/3e485e15-11bd-11e6-ba9a-01aa75ed71a1/language-de (accessed on 8 January 2019).
- Swart, E.; Gothe, H.; Geyer, S.; Jaunzeme, J.; Maier, B.; Grobe, T.G.; Ihle, P. Good practice of secondary data analysis (GPS) [Gute Praxis Sekundärdatenanalyse (GPS)]. Guidelines and recommendations [Leitlinien und Empfehlungen]. Gesundheitswesen 2015, 77, 120–126.
- Hoffmann, W.; Latza, U.; Baumeister, S.E.; Brünger, M.; Buttmann-Schweiger, N.; Hardt, J.; Hoffmann, V.; Karch, A.; Richter, A.; Schmidt, C.O.; et al. Guidelines and recommendations for ensuring Good Epidemiological Practice (GEP). A guideline developed by the German Society for Epidemiology. Eur. J. Epidemiol. 2019, 34, 301–317.
- Keller, S.; Korkmaz, G.; Orr, M.; Schroeder, A.; Shipp, S. The Evolution of Data Quality: Understanding the Transdisciplinary Origins of Data Quality Concepts and Approaches. Annu. Rev. Stat. Appl. 2017, 4, 85–108.
- March, S.; Antoni, M.; Kieschke, J.; Kollhorst, B.; Maier, B.; Müller, G.; Sariyar, M.; Schulz, M.; Swart, E.; Zeidler, J.; et al. Quo vadis data linkage in Germany? An initial inventory [Quo vadis Datenlinkage in Deutschland? Eine erste Bestandsaufnahme]. Gesundheitswesen 2018, 80, e20–e31.
- Wichmann, H.-E.; Kaaks, R.; Hoffmann, W.; Jöckel, K.-H.; Greiser, K.H.; Linseisen, J. The German National Cohort [Die Nationale Kohorte]. Bundesgesundheitsbl 2012, 55, 781–787.
- Ahrens, W.; Jöckel, K.-H. The benefit of large-scale cohort studies for health research: The example of the German National Cohort [Der Nutzen großer Kohortenstudien für die Gesundheitsforschung am Beispiel der Nationalen Kohorte]. Bundesgesundheitsbl 2015, 58, 813–821.
- German National Cohort. The German National Cohort. Aims, study design and organization. Eur. J. Epidemiol. 2014, 29, 371–382.
- Swart, E.; Bitzer, E.M.; Gothe, H.; Harling, M.; Hoffmann, F.; Horenkamp-Sonntag, D.; Maier, B.; March, S.; Petzold, T.; Röhrig, R.; et al. A consensus German reporting standard for secondary data analyses, version 2 [STROSA-STandardisierte BerichtsROutine für SekundärdatenAnalysen]. Gesundheitswesen 2016, 78, e145–e160.
- Buneman, P.; Chapman, A.; Cheney, J.; Vansummeren, S. A Provenance Model for Manually Curated Data. In Provenance and Annotation of Data; 3–5 May 2006, Revised Selected Papers; Moreau, L., Foster, I., Eds.; International Provenance and Annotation Workshop, IPAW 2006: Chicago, IL, USA, 2006; pp. 162–170.
- Bohensky, M.; Jolley, D.; Sundararajan, V.; Evans, S.; Ibrahim, J.; Brand, C. Development and validation of reporting guidelines for studies involving data linkage. Aust. N. Z. J. Public Health 2011, 35, 486–489.
- Jacobs, S.; Stallmann, C.; Pigeot, I. Linkage of large secondary and registry data sources with data of cohort studies [Verknüpfung großer Sekundär- und Registerdatenquellen mit Daten aus Kohortenstudien]. Usage of a dual potential [Doppeltes Potenzial nutzen]. Bundesgesundheitsbl 2015, 58, 822–828.
- Swart, E.; Stallmann, C.; Powietzka, J.; March, S. Data linkage of primary and secondary data [Datenlinkage von Primär- und Sekundärdaten]. A gain for small-area health-care analysis? [Ein Zugewinn auch für die kleinräumige Versorgungsforschung in Deutschland?]. Bundesgesundheitsbl 2014, 57, 180–187.
- Antoni, M.; Jacobebbinghaus, P.; Seth, S. ALWA-Befragungsdaten Verknüpft Mit Administrativen Daten des IAB (ALWA-ADIAB) 1975–2009; Aktualisierte Version vom 25.05.2012; FDZ Datenreport 05/2011; Bundesagentur für Arbeit: Nürnberg, Germany, 2011.
- Antoni, M.; Seth, S. ALWA-ADIAB–Linked Individual Survey and Administrative Data for Substantive and Methodological Research. Schmollers Jahrb. 2012, 132, 141–146.
- Czaplicki, C.; Korbmacher, J. SHARE-RV: Verknüpfung von Befragungsdaten des Survey of Health, Ageing and Retirement in Europe mit administrativen Daten der Rentenversicherung. Gesundh. Migr. Einkomm. 2010, 55, 28–37.
- Kajüter, H.; Geier, A.S.; Wellmann, J.; Krieg, V.; Fricke, R.; Heidinger, O.; Hense, H.-W. Cohort study of cancer incidence in patients with type 2 diabetes [Kohortenstudie zur Krebsinzidenz bei Patienten mit Diabetes mellitus Typ 2]. Record linkage of encrypted data from an external cohort with data from the epidemiological cancer registry of North Rhine-Westphalia [Record Linkage von kryptografierten Daten einer externen Kohorte mit Daten des Epidemiologischen Krebsregisters Nordrhein-Westfalen]. Bundesgesundheitsbl 2014, 57, 52–59.
- Korbmacher, J.M.; Czaplicki, C. Linking SHARE survey data with administrative records: First experiences from SHARE-Germany. In SHARE Wave 4. Innovations & Methodology; Malter, F., Börsch-Supan, A., Eds.; MEA, Max Planck Institute for Social Law and Social Policy: München, Germany, 2013; pp. 47–52.
- Maier, B.; Wagner, K.; Behrens, S.; Bruch, L.; Busse, R.; Schmidt, D.; Schühlen, H.; Thieme, R.; Theres, H. Deterministic record linkage with indirect identifiers [Deterministisches Record Linkage mit indirekten Identifikatoren]. Data of the Berlin myocardial infarction registry and the AOK nordost for patients with myocardial infarction [Daten des Berliner Herzinfarktregisters und der AOK Nordost zum Herzinfarkt]. Gesundheitswesen 2015, 77, e15–e19.
- March, S.; Rauch, A.; Thomas, D.; Bender, S.; Swart, E. Procedures according to data protection laws for coupling primary and secondary data in a cohort study [Datenschutzrechtliche Vorgehensweise bei der Verknüpfung von Primär- und Sekundärdaten in einer Kohortenstudie]. The lidA study [Die lidA-Studie]. Gesundheitswesen 2012, 74, e122–e129.
- March, S. Individual Data Linkage of Survey Data with Claims Data in Germany-An Overview Based on a Cohort Study. Int. J. Environ. Res. Public Health 2017, 14, 1543.
- Ohlmeier, C.; Hoffmann, F.; Giersiepen, K.; Rothgang, H.; Mikolajczyk, R.; Appelrath, H.-J.; Elsässer, A.; Garbe, E. Linkage of statutory health insurance data with those of a hospital information system [Verknüpfung von Routinedaten der Gesetzlichen Krankenversicherung mit Daten eines Krankenhausinformationssystems]. Feasible, but also “useful”? [Machbar, aber auch “nützlich”?]. Gesundheitswesen 2015, 77, e8–e14.
- Ohlmeier, C.; Langner, I.; Garbe, E.; Riedel, O. Validating mortality in the German Pharmacoepidemiological Research Database (GePaRD) against a mortality registry. Pharmacoepidemiol. Drug Saf. 2016, 25, 778–784.
- Ohmann, C.; Smektala, R.; Pientka, L.; Paech, S.; Neuhaus, E.; Rieger, M.; Schwabe, W.; Debold, P.; Jonas, M.; Hupe, K.; et al. A new model of comprehensive data linkage – Evaluation of its application in femoral neck fracture. Z. Evid. Fortbild. Qual. Gesundhwes 2005, 99, 547–554.
- Swart, E.; Ihle, P.; Gothe, H.; Matusiewicz, D. (Eds.) Routinedaten im Gesundheitswesen. In Handbuch Sekundärdatenanalyse: Grundlagen, Methoden und Perspektiven; 2. Aufl.; Huber: Bern, Switzerland, 2014.
- Stallmann, C.; Ahrens, W.; Kaaks, R.; Pigeot, I.; Swart, E.; Jacobs, S. Individual linkage of primary data with secondary and registry data within large cohort studies [Individuelle Datenverknüpfung von Primärdaten mit Sekundär- und Registerdaten in Kohortenstudien]. Capabilities and procedural proposals [Potenziale und Verfahrensvorschläge]. Gesundheitswesen 2015, 77, e37–e42.
- Stang, A.; Jöckel, K.-H. Avoidance of representativeness in presence of effect modification. Int. J. Epidemiol. 2014, 43, 630–631.
- Weiskopf, N.G.; Weng, C. Methods and dimensions of electronic health record data quality assessment. Enabling reuse for clinical research. J. Am. Med. Inform. Assoc. JAMIA 2013, 20, 144–151.
- Watts, S.; Shankaranarayanan, G.; Even, A. Data quality assessment in context. A cognitive perspective. Decis. Support. Syst. 2009, 48, 202–211.
- March, S.; Swart, E.; Robra, B.-P. Can statutory health insurance claims data complete primary data without bias? [Können Krankenkassendaten Primärdaten verzerrungsfrei ergänzen?]. Selectivity analyses in the context of the lidA study [Selektivitätsanalysen im Rahmen der lidA-Studie]. Gesundh. Qual. 2017, 22, 104–115.
- German National Cohort Study. Available online: https://nako.de (accessed on 28 March 2019).
- Swart, E.; Thomas, D.; March, S.; Salomon, T.; von dem Knesebeck, O. Experience with the linkage of primary and secondary claims data in an intervention trial [Erfahrungen mit der Datenverknüpfung von Primär- und Sekundärdaten in einer Interventionsstudie]. Gesundheitswesen 2011, 73, e126–e132.
- Brown, J.S.; Kahn, M.; Toh, S. Data quality assessment for comparative effectiveness research in distributed data networks. Med. Care 2013, 51, S22–S29.
- Purchase, H.C.; Welland, R.; McGill, M.; Colpoys, L. Comprehension of diagram syntax. An empirical study of entity relationship notations. Int. J. Hum. Comput. Stud. 2004, 61, 187–203.
- Hassenpflug, J.; Liebs, T.R. Registries as a tool for optimizing safety of endoprostheses. Experiences from other countries and the setup of the German arthroplasty register [Register als Werkzeug für mehr Endoprothesensicherheit. Erfahrungen aus anderen Ländern und dem Aufbau des Endoprothesenregisters Deutschland]. Bundesgesundheitsbl 2014, 57, 1376–1383.
- Pommerening, K.; Drepper, J.; Helbing, K.; Ganslandt, T. Leitfaden zum Datenschutz in medizinischen Forschungsprojekten. Generische Lösungen der TMF 2.0; 1. Aufl.; MWV Medizinisch Wissenschaftliche Verlagsgesellschaft: Berlin, Germany, 2014.
- March, S.; Rauch, A.; Bender, S.; Ihle, P. Data Protection Aspects Concerning the Use of Social or Routine Data; FDZ-Methodenreport 12/2015; Bundesagentur für Arbeit: Nürnberg, Germany, 2015.
- Ihle, P. Data protection and methodological aspects in compiling a routine database from statutory health insurance data for research purposes [Datenschutzrechtliche und methodische Aspekte beim Aufbau einer Routinedatenbasis aus der Gesetzlichen Krankenversicherung zu Forschungszwecken]. Bundesgesundheitsbl 2008, 51, 1127–1134.
- Swart, E.; Stallmann, C.; Schimmelpfennig, M.; Feißel, A.; March, S. Expertise on the Use of Secondary Data for Research on Work and Health [Gutachten zum Einsatz von Sekundärdaten für die Forschung zu Arbeit und Gesundheit]; 1. Aufl.; Bundesanstalt für Arbeitsschutz und Arbeitsmedizin (BAuA): Dortmund, Germany, 2018.
- Deutsche Forschungsgemeinschaft. Proposals for Safeguarding Good Scientific Practice [Denkschrift zur Sicherung guter wissenschaftlicher Praxis]; Wiley-VCH: Weinheim, Germany, 2013.
- Bialke, M.; Bahls, T.; Havemann, C.; Piegsa, J.; Weitmann, K.; Wegner, T.; Hoffmann, W. MOSAIC – A Modular Approach to Data Management in Epidemiological Studies. Methods Inf. Med. 2015, 54, 364–371.
- Lablans, M.; Borg, A.; Ückert, F. A RESTful interface to pseudonymization services in modern web applications. BMC Med. Inform. Decis. Mak. 2015, 15, 2.
- Schnell, R.; Bachteler, T.; Reiher, J. Development of a New Method for Privacy-Preserving Record Linkage Allowing for Errors in Identifiers [Entwicklung einer neuen fehlertoleranten Methode bei der Verknüpfung von personenbezogenen Datenbanken unter Gewährleistung des Datenschutzes]. Methoden Daten Anal. 2009, 3, 203–217.
- Boyd, J.; Randall, S.; Ferrante, A.M. Application of Privacy-Preserving Techniques in Operational Record Linkage Centres. In Medical Data Privacy Handbook; Gkoulalas-Divanis, A., Loukides, G., Eds.; Springer: Berlin/Heidelberg, Germany, 2015; pp. 267–287.
- Randall, S.M.; Ferrante, A.M.; Boyd, J.H.; Bauer, J.K.; Semmens, J.B. Privacy-preserving record linkage on large real world datasets. J. Biomed. Inf. 2014, 50, 205–212.
- Vatsalan, D.; Christen, P. Privacy-preserving matching of similar patients. J. Biomed. Inf. 2016, 59, 285–298.
- Vatsalan, D.; Christen, P.; Verykios, V.S. A taxonomy of privacy-preserving record linkage techniques. Inf. Syst. 2013, 38, 946–969.
- Steorts, R.C.; Ventura, S.L.; Sadinle, M.; Fienberg, S.E. A Comparison of Blocking Methods for Record Linkage; Springer: Cham, Switzerland, 2014.
- Lawson, E.H.; Ko, C.Y.; Louie, R.; Han, L.; Rapp, M.; Zingmond, D.S. Linkage of a clinical surgical registry with Medicare inpatient claims data using indirect identifiers. Surgery 2013, 153, 423–430.
- Nonnemacher, M.; Nasseh, D.; Stausberg, J. Datenqualität in der Medizinischen Forschung. Leitlinie zum Adaptiven Management von Datenqualität in Kohortenstudien und Registern; 2. Aufl.; MWV Medizinisch Wiss. Ver: Berlin, Germany, 2014.
- Sakshaug, J.; Antoni, M. Errors in Linking Survey and Administrative Data. In Total Survey Error in Practice; Biemer, P.P., Leeuw, E.D.D., Eckman, S., Edwards, B., Kreuter, F., Lyberg, L., Tucker, C., West, B.T., Eds.; John Wiley & Sons: Hoboken, NJ, USA, 2017.
- Baldi, I.; Ponti, A.; Zanetti, R.; Ciccone, G.; Merletti, F.; Gregori, D. The impact of record-linkage bias in the Cox model. J. Eval. Clin. Pract. 2010, 16, 92–96.
- Krawczak, M.; Weichert, T. Proposal of a Modern Data Infrastructure for Medical Research in Germany [Vorschlag Einer Modernen Dateninfrastruktur für die Medizinische Forschung in Deutschland]; Christian-Albrechts-Universität: Kiel, Germany, 2017.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
March, S.; Andrich, S.; Drepper, J.; Horenkamp-Sonntag, D.; Icks, A.; Ihle, P.; Kieschke, J.; Kollhorst, B.; Maier, B.; Meyer, I.; et al. Good Practice Data Linkage (GPD): A Translation of the German Version. Int. J. Environ. Res. Public Health 2020, 17, 7852. https://doi.org/10.3390/ijerph17217852