Article

Improving the Quality of Survey Data Documentation: A Total Survey Error Perspective

by Alexander Jedinger 1,*, Oliver Watteler 1 and André Förster 2
1 GESIS—Leibniz Institute for the Social Sciences, Unter Sachsenhausen 6–8, 50667 Cologne, Germany
2 Service Center eSciences, Trier University, 54286 Trier, Germany
* Author to whom correspondence should be addressed.
Submission received: 2 October 2018 / Revised: 23 October 2018 / Accepted: 25 October 2018 / Published: 29 October 2018

Abstract

Surveys are a common method in the social and behavioral sciences to collect data on attitudes, personality and social behavior. Methodological reports should provide researchers with a complete and comprehensive overview of the design, collection and statistical processing of the survey data that are to be analyzed. As an important aspect of open science practices, they should enable secondary users to assess the quality and the analytical potential of the data. In the present article, we propose guidelines for the documentation of survey data that are based on the total survey error approach. Considering these guidelines, we conclude that both scientists and data-holding institutions should become more sensitive to the quality of survey data documentation.

1. Introduction

Surveys are a common method in the social and behavioral sciences to collect data on attitudes, personality and social behavior. In recent years, however, the demands on the availability and documentation of survey data have increased continuously [1]. Against the backdrop of the reproducibility crisis [2] and in accordance with open science practices [3], the scientific community, as well as most funding organizations, expect the gathered data to be archived in a data-holding institution (e.g., an institutional repository or data archive) after the completion of research projects [4,5,6]. This ensures that the data are available for replication and secondary analyses, and thus facilitates the transparency and integrity of social research [7]. In addition to datasets, codebooks, and survey instruments, a methodological report constitutes the centerpiece of high-quality survey data documentation. As an integral part of open research practices, methodological reports provide additional information beyond the regular method section of an article. Current research, however, focuses on a narrow concept of survey data quality that involves errors induced by sampling, measurement and nonresponse (e.g., [8,9,10,11,12]), but almost completely ignores the complementary role of data documentation quality in the context of open science practices.
Methodological reports should provide researchers with a complete and comprehensive overview of the design, collection and statistical processing of the survey data that are to be analyzed. During the process of data sharing, data depositors often ask themselves what concrete information should be included in a good methodological report. However, there are few specific guidelines for the preparation of methodological reports in the social and behavioral sciences. For instance, the excellent recommendations of the APA Publications and Communications Board Working Group [13], the American Association for Public Opinion Research [14] or the STROBE Statement [15] focus primarily on the publication of survey results in scientific journals or the mass media. Other guidelines, such as the Quality Reports proposed by Eurostat [16], go beyond basic methodological requirements and can hardly be implemented by research projects with smaller budgets.
In the present article, we attempt to fill this gap by developing systematic guidelines for the documentation of survey data and propose disclosure requirements that are guided by the total survey error (TSE) approach. Our recommendations are mainly targeted at individual researchers and members of smaller research projects who are involved in the creation and curation of survey datasets and who intend to make their data available to the scientific community.

2. Survey Documentation and Survey Quality

Transparent documentation of survey methodology is a prerequisite for assessing the quality and the analytical potential of the data collected. The issue of survey data quality is related to possible “errors” that can occur during the entire research process [8,9]. In recent years, the TSE approach has been established as a systematic framework for understanding the various sources of error that are associated with each of these steps [8,10,17]. In the context of survey research, the notion of an error is not to be understood as a mistake. The ultimate aim of surveys is to estimate certain parameters of a population (e.g., means or percentages), and the term survey error refers to the deviation of an estimator from the true value in that population [17]. According to Weisberg [10], these potential errors can be divided into the three categories of respondent selection (e.g., coverage error), response accuracy (e.g., item nonresponse error), and survey administration (e.g., mode effects). Furthermore, an issue that has been widely ignored by the TSE approach is the ethics of surveys and respect for legal provisions regarding data protection. Although not a statistical problem, violations of respondents’ rights can create severe legal obstacles to the archiving and dissemination of survey data.
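Formally, the total survey error of an estimate is commonly summarized as its mean squared error, which combines random variance with squared systematic bias; this is a standard formalization in the survey methodology literature, and the notation below is ours rather than the article's:

MSE(\hat{\theta}) = Var(\hat{\theta}) + Bias(\hat{\theta})^2

Here the bias term collects systematic errors such as coverage, nonresponse and measurement error, while the variance term reflects random fluctuation due to sampling.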
The first category of the TSE concept refers to errors that relate to the selection of respondents. A coverage error occurs when a sampling frame does not cover all the elements of the target population. For example, when certain segments of the population are systematically excluded in a list of addresses that serves as a sampling frame, they cannot be part of the sample (e.g., hospitalized individuals). Because not all of the elements of a target population are present in a sample, natural fluctuations arise that are called random sampling errors. These fluctuations in survey estimates can be mathematically determined in probability samples, whereas in nonprobability sampling, systematic bias in estimates can occur that is unknown. Even in carefully conducted studies, not all respondents participate in the survey. Nonresponse error at the individual level, the so-called unit level, occurs if certain segments of the population systematically refuse to participate in a survey.
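To illustrate how such fluctuations can be quantified in probability samples (this example is ours and assumes simple random sampling for simplicity), the standard error of an estimated proportion \hat{p} from a sample of size n is

SE(\hat{p}) = \sqrt{\hat{p}(1 - \hat{p}) / n}

so that a proportion of 0.50 estimated from 1000 respondents has a standard error of about 0.016, corresponding to a 95% margin of error of roughly plus or minus 3 percentage points.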
The second category of errors refers to the accuracy of the measured responses. Item nonresponse error arises when respondents selectively avoid answering particular questions, for example, if the questions concern sensitive issues such as sexual or illegal behavior. If the respondents’ answers do not accurately reflect what should be measured with a question, this is called measurement error due to respondents, for example, if the question is misunderstood or respondents adjust their answers to socially accepted standards. Interviewer-related measurement error occurs when the characteristics or behaviors of an interviewer systematically bias the measurement.
The third category of Weisberg’s concept includes errors that are associated with the administration of a survey. The selection of a particular mode of data collection can influence the obtained results (mode effects) as well as the different practices of survey organizations (house effects). After data have been collected, they are usually extensively edited. Editing refers to the correction of errors in the data, often called “cleaning”, as well as the addition of other information like weighting factors. The errors that occur when handling data are called processing errors and should not be underestimated. This also applies to errors that are caused by incorrect or inadequate weighting of survey data (adjustment error).
Not all aspects of the TSE approach need to necessarily occur, and not all errors can be completely avoided. In general, one can think of the TSE as a quality continuum, and survey researchers attempt to minimize errors or keep them at an acceptable level within their budget constraints [10]. This inevitability of error makes it even more important for researchers to document the relevant technical information for the secondary users of their data. In the following paragraphs, we describe the main sections of a methodological report and recommend what technical information should be included to maximize transparency. In doing so, we build upon the components of the TSE, previous recommendations [14,18,19,20,21], the current best practices of major survey programs (e.g., European Social Survey) and our own experience. In this context, we distinguish between basic requirements that apply to all kinds of survey modes and requirements specific to interviewer-administered surveys. The complete list with the recommended items is provided in Appendix A.

3. Assessing the Quality of Survey Documentation

3.1. Basic Requirements

A methodological report is based on the flow of survey research. The starting point is a description of the objectives of the survey project. In this section, potential users should be concisely informed concerning the background of the study and the research problems that the study addresses. This section is usually also the right place to provide information regarding the source of funding for the study. The remainder of the report addresses key methodological aspects of the survey. This includes the identification of the target population, the choice of a sampling frame and the exact sampling method. Simultaneously, researchers determine the mode of data collection (e.g., in person, by phone, by mail) and design the questionnaire. Before this questionnaire goes into the field, it is often pretested and revised. Next, fieldwork occurs that is either conducted by a researcher or delegated to a professional survey organization. In the final step, the data are cleaned, edited and weighted for analysis. In each of these steps, important methodological decisions are made that should be fully documented and made transparent to assess the quality of the collected data. The key questions in the preparation of such a report are summarized in Table 1.

3.1.1. Target Population and Sampling

The population that the survey is intended to represent should be clearly defined in this section. This includes the exact eligibility criteria that are typically based on age, gender, citizenship, residence or the type of housing. The explanation of the sample design should generally start with a description of the sampling frame and its completeness, e.g., lists from registration offices. The sampling unit and the method by which the sampling units were chosen from the sampling frame should be explained in detail, particularly whether a probability or nonprobability sampling method was used. With multistage sampling, the respective sampling units and selection methods should be described for each stage of the sampling. If some type of (proportionate or disproportionate) stratification is applied at a particular sampling stage, the characteristics on which the strata are built should be specified. The targeted sample size and the modes of data collection that were applied should also be specified [22].
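The allocation logic of proportionate stratification can be sketched in a few lines of code; the following Python fragment is purely illustrative, and the file and variable names (sampling_frame.csv, region) are hypothetical assumptions rather than material from the article.

import pandas as pd

# Hypothetical sampling frame with a stratification variable "region".
frame = pd.read_csv("sampling_frame.csv")
n_target = 1000  # targeted sample size

# Proportionate stratified sampling: each stratum's sample size mirrors
# its share of the sampling frame (rounding may shift the realized total
# slightly away from the target).
shares = frame["region"].value_counts(normalize=True)
sample = pd.concat(
    frame[frame["region"] == stratum].sample(
        n=round(share * n_target), random_state=42
    )
    for stratum, share in shares.items()
)

# For the methodological report, document the frame, the strata, the
# allocation and the selection method used at each stage.
print(sample["region"].value_counts())

A disproportionate design would simply replace the proportional allocation with deliberately chosen stratum sizes, which should then be documented together with the resulting design weights.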
Web surveys, which are constantly growing in number, differ greatly in their target populations, sampling frames and specific sampling methods for the recruitment and selection of respondents [23,24,25]. In general, a distinction can be made between web surveys that use probability sampling methods to select respondents (e.g., probability-based Internet panels) and web surveys based on self-selected samples (e.g., online access panels or unrestricted web surveys advertised on social media). Nonprobability web surveys typically lack a clear definition of the population and a sampling frame. However, if (probability or nonprobability) Internet panels are used, the procedures to recruit panel members and the methods to ensure the quality of the panel should be described [26]. This includes information on how many active participants the panel has, how up to date the profile data are, what action has been taken against panelists who give fraudulent or contradictory answers, and how often panelists participate in surveys in order to avoid the “over-surveying” and “professionalization” of respondents [27]. For specific surveys, the documentation should describe how the sample was drawn from the Internet panel.

3.1.2. Mode of Data Collection

Each method of data collection has its specific advantages, disadvantages and implications for sampling frames, respondent selection and measurement, which are relevant to the evaluation of the study [28]. In practice, different survey modes are increasingly combined within a single study [29,30]. These combinations may be implemented sequentially or simultaneously. For example, in the contact phase, different modes can be used for the initial screening and recruitment of respondents. In the main phase, respondents can be interviewed with different modes in different waves of a longitudinal survey. Within a survey, different modes can be used for different parts of the questionnaire or particular segments of the population. If mixed-mode designs have been employed, the phases or segments of the target population in which they were used should be described.

3.1.3. Survey Instrument

The next section of the methodology report entails a description of the content of the questionnaire. For this purpose, it is advisable to group related sets of questions together to form overarching topics. This grouping may not necessarily coincide with the order of the questions in the questionnaire. The development of the questionnaire should be documented, and if special scales or indices have been adopted in their original or in a modified form, their construction and quality should be discussed. If available, the report should provide psychometric information on the dimensionality, reliability and validity of these scales or indices [31]. Special instruments, such as aptitude tests or questioning techniques (e.g., randomized response techniques and implicit attitude measurements), should be clearly explained so that they are understandable without knowledge of the literature. Usually, survey instruments are subjected to a pretest [32,33]. Important methodological decisions are often made during the pretest phase, such as changes in question wording and order, the exchange of interviewers, or reductions in the length of the survey. If a pretest was conducted, the following information should be documented: the fieldwork dates, mode of data collection, sampling method, number of interviews and interviewers, and outcomes, e.g., changes in the questionnaire.
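When reliability is reported, many studies rely on an internal-consistency coefficient such as Cronbach's alpha; this is a common convention rather than a requirement stated in this article. For a scale consisting of k items it is defined as

\alpha = \frac{k}{k - 1} \left( 1 - \frac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_X^2} \right)

where \sigma_i^2 is the variance of item i and \sigma_X^2 the variance of the sum score. Reporting the coefficient together with the number of items and the sample on which it was computed makes the information directly reusable for secondary analysts.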
Ideally, the original programming code of the questionnaires in computer-assisted interviews would be part of the public survey instrument documentation. As this is often not possible to achieve, it might be reasonable to ask survey organizations about which internal quality control procedures were applied in programming and testing these questionnaires, or in the subsequent data processing. The latter would also apply to other agents processing the data after collection, including data archives.

3.1.4. Fieldwork

The least standardized section of methodological reports is the documentation of the fieldwork that depends highly on the data collection method that is used. However, information on the fieldwork is crucial to assess the quality of the obtained survey data [34,35]. The report should mention which survey organization was responsible for the data collection. Any existing subcontractors should also be listed. Regardless of the survey mode, the report should contain at least the dates of the fieldwork period, the total number of respondents over the course of the fieldwork (absolute/cumulative), and the descriptive statistics on the duration of the survey.

3.1.5. Response Rates

The reporting of the response rates should be based on the detailed recommendations of the American Association for Public Opinion Research [23]. Basically, the response rate is the ratio of the realized interviews, also called the net sample, to the adjusted gross sample. There are several variants of response rates that differ in whether partial interviews count as respondents and in how cases of unknown eligibility are treated. The extent to which high response rates are a necessary or sufficient condition for high-quality survey data is controversial [36]. However, to assess the extent of nonresponse bias, the marginal distributions of the demographic characteristics of the sample should be compared with the known characteristics of the population (e.g., census information).¹ If measures to increase the response rate were taken (e.g., incentives, homepages, informational material and reminders), they should also be reported. In longitudinal surveys, panel attrition contributes to the challenge of proper survey data documentation as a special case of nonresponse. Thus, response rates (and, if available, additional information about possible determinants of panel attrition and countermeasures) should be reported for each separate wave of a longitudinal survey and cumulatively [37].
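Because these variants are easy to confuse, it helps to document the exact disposition counts and the formula used. The following Python sketch illustrates the basic logic in the spirit of the AAPOR Standard Definitions [23]; the variable names and the treatment of unknown-eligibility cases are our own simplifications, not a substitute for the official definitions.

def response_rates(I, P, R, NC, O, UH, UO, e=1.0):
    """Illustrative response-rate variants.
    I = complete interviews, P = partial interviews, R = refusals,
    NC = non-contacts, O = other non-interviews, UH/UO = cases of
    unknown eligibility, e = estimated share of unknown-eligibility
    cases that are in fact eligible."""
    denominator = I + P + R + NC + O + e * (UH + UO)  # adjusted gross sample
    rate_completes_only = I / denominator             # partials not counted
    rate_with_partials = (I + P) / denominator        # partials counted
    return rate_completes_only, rate_with_partials

# Example: 600 completes, 50 partials, 250 refusals, 150 non-contacts,
# 30 other non-interviews, 120 unknown-eligibility cases assumed eligible.
print(response_rates(600, 50, 250, 150, 30, 120, 0, e=1.0))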
In web surveys using probability-based Internet panels, nonresponse may occur during the recruitment stage, the profiling stage, and the specific study stage. Callegaro and DiSogra [38] have developed specific response metrics for each of these stages that should be reported (see also [23]). As already mentioned, nonprobability web surveys lack a proper sampling frame, and the problem of nonresponse is therefore difficult to grasp in these surveys. Although the so-called participation rate of nonprobability Internet panels cannot be used to interpret nonresponse bias, it can still be used to evaluate the panel’s efficiency and should therefore be included in the survey documentation [23].
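In our reading of [38], the stage-specific rates combine multiplicatively into a cumulative response rate for a given study, for example

cumulative response rate = recruitment rate × profile rate × completion rate

so that a panel with a 30% recruitment rate, a 60% profile rate and an 80% study-specific completion rate yields a cumulative response rate of roughly 14%.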

3.1.6. Data Processing

A frequently underestimated set of problems concerns the errors that occur after the actual survey is out of the field and the data are prepared for analysis or publication [28]. Data preparation includes data entry (in the case of non-computer-assisted surveys) as well as the cleaning and editing of the data (e.g., assigning variable labels, value labels and missing values). There are hardly any binding standards for preparing survey data, but a methodological report should address how the data were cleaned and edited and which quality assurance measures were performed [18,39]. Furthermore, any problems or inconsistencies that arose during data preparation (e.g., implausible values or errors in routing) should be reported, together with how they were corrected, for example, whether respondents were removed from the dataset because of implausible values.
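As a minimal sketch of such editing steps, the following Python fragment shows how missing-value codes, value labels and plausibility flags might be handled; the file name, variable names and codes are hypothetical and serve only to illustrate the kind of decisions a methodological report should record.

import numpy as np
import pandas as pd

raw = pd.read_csv("survey_raw.csv")  # hypothetical raw data file

# Recode user-defined missing codes (e.g., -9 = refused, -8 = don't know).
raw = raw.replace([-9, -8], np.nan)

# Flag implausible values instead of silently overwriting them.
raw["age_implausible"] = raw["age"].notna() & ~raw["age"].between(18, 99)

# Attach value labels to a categorical item.
raw["gender"] = raw["gender"].map({1: "male", 2: "female", 3: "diverse"})

# Keep a record of every edit for the methodological report.
edit_log = {
    "missing_cells": int(raw.isna().sum().sum()),
    "implausible_age_values": int(raw["age_implausible"].sum()),
}
print(edit_log)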
During the process of data preparation, weighting variables are also prepared by the survey organization or the survey research team itself. There are two main types of weights [40]. Sample or design weights correct for differential selection probabilities in sampling, for example, due to different household sizes or a deliberate over- or underrepresentation of particular subgroups. Adjustment weights or post-stratification weights correct for differential response rates in socio-demographic subgroups by adjusting to a known population distribution. A classic example is the selective participation of women in polls, which results in an unrepresentative share of women in the sample. The methodology report should describe which weighting variable is appropriate for which type of analysis and how the weights were constructed. In the case of adjustment weights, the report should describe the characteristics that are used to construct the weights, the origins of the weighting targets and the weighting method (e.g., iterative proportional fitting). It is also advisable to include the descriptive statistics for each weight and a comparison of the weighted and unweighted results with respect to the weighting criteria.
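The core of iterative proportional fitting (raking) can be sketched as follows; this is a generic illustration under the assumption that known population margins are available for each weighting characteristic, and it is not the procedure of any particular survey organization.

import numpy as np
import pandas as pd

def rake_weights(data, margins, max_iter=100, tol=1e-6):
    """Adjust weights so that weighted sample margins match known
    population margins. `margins` maps a column name to a dictionary
    of {category: population share}; every category is assumed to be
    present in the sample and every set of shares to sum to one."""
    w = np.ones(len(data))
    for _ in range(max_iter):
        max_shift = 0.0
        for var, targets in margins.items():
            total = w.sum()
            new_w = w.copy()
            for category, share in targets.items():
                mask = (data[var] == category).to_numpy()
                factor = (share * total) / w[mask].sum()
                new_w[mask] = w[mask] * factor
                max_shift = max(max_shift, abs(factor - 1.0))
            w = new_w
        if max_shift < tol:
            break
    return w * len(data) / w.sum()  # normalize to a mean weight of one

# Hypothetical example: adjust a small sample to known gender and age margins.
sample = pd.DataFrame({
    "gender": ["f", "f", "m", "f", "m", "m"],
    "age": ["18-34", "35+", "35+", "35+", "18-34", "35+"],
})
margins = {"gender": {"f": 0.51, "m": 0.49}, "age": {"18-34": 0.30, "35+": 0.70}}
sample["adj_weight"] = rake_weights(sample, margins)
print(sample)

The resulting weights should then be reported together with their descriptive statistics (e.g., minimum, maximum and variance), as recommended above.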

3.1.7. Data Protection

Survey research is mainly based on the voluntary participation of individuals, which is gained through informed consent. This differs from official statistics, such as censuses, where participation is obligatory (for an introduction, see Wirth [41]). Informed consent means that individuals should have received sufficient information on the voluntary nature, the aims and the possible risks of participation in the research project, and that they are capable of making this decision before participation [42]. How informed consent was obtained from the respondents by the researchers or a fieldwork company should be an essential part of the survey documentation. Ideally, a copy of the consent form (or the text used to gain informed consent) should be included. The procedure for obtaining consent should also be described, e.g., whether it was given orally or in written form.
Furthermore, the researcher should describe how the individual-level data were processed and whether the data were altered in any way to protect individuals’ privacy (e.g., anonymization procedures; see [17]). Documentation of privacy protection is even more important when sensitive issues, such as sexual behavior or delinquent activities [43], or vulnerable individuals, such as children or patients [44], are surveyed.

3.2. Requirements for Interviewer-Administered Surveys

For in-person and telephone surveys, interviewers have many important tasks (such as identifying sample members, motivating them to participate, and administering complex questionnaires) that can have lasting effects on the quality of the survey data that are collected [45]. For interviewer-administered surveys, researchers should make sure to include the number of active interviewers, the number of contact attempts, and the contact times (weekday and time).
Interviewer effects are generally understood as systematic differences between the obtained results that arise from the characteristics or behavior of the interviewer [46]. Therefore, special attention should be paid in the methodological report to the documentation of the training, management and supervision of the interviewers. This documentation includes the contents of the training, the experience and the socio-demographic characteristics of the employed interviewers. In addition, the number of realized interviews per interviewer provides important information on cluster effects. Furthermore, a reputable survey organization takes a series of measures to ensure the quality of the interviews and detect fraudulent interviewer behavior. The tests that are conducted by professional organizations vary and include the monitoring of the interviewers over the fieldwork period and ex-post controls of interviews. These quality assurance measures and their results should be part of the report. For example, it should be documented how many follow-up contacts with respondents were conducted and whether any suspicious interviews were excluded due to interviewer fraud or cases of complete fabrication.
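One common way to gauge such cluster effects, although not discussed further in this article, is the design effect implied by an average workload of m interviews per interviewer and an intra-interviewer correlation \rho:

deff \approx 1 + (m - 1)\rho

For example, an average of 21 interviews per interviewer combined with \rho = 0.02 inflates the variance of estimates by a factor of about 1.4, which is one reason why the number of realized interviews per interviewer belongs in the report.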

4. Summary

Methodological reports allow secondary data users to assess the analytical potential and the quality of a dataset for their own research. In this article, we proposed basic requirements for the documentation of survey data that are informed by the TSE approach. We contributed to the ongoing discussion of open science practices by introducing survey documentation as an important aspect of survey data quality and provided clear guidelines on what methodological information should be disclosed.
Our recommendations have many practical implications for survey researchers in the social and behavioral sciences as well as for data-holding institutions. Researchers should recognize survey documentation as an essential part of the quality of survey data. In our experience, parts of methodological reports are often adapted almost verbatim from the field reports of survey organizations. This suggests that many of the disclosure items we have proposed are, in principle, available from survey organizations. However, researchers must become more sensitive to this information and accustomed to requesting it on a regular basis. To ensure the quality of survey documentation, data-holding institutions must also urge data depositors to report methodological information in a more structured way. Checklists of desirable disclosure items and a clearer demand by repositories for this information may feed back into projects’ documentation work.
Although we believe that all of our proposed disclosure items should find their way into a basic methodological report, we are aware that the reasons for including them might differ between a methodological and a user-oriented point of view. However, exploring these different reasons would go beyond the scope of this article. Similarly, we do not provide any recommendations on whether the ingredients of a methodological report can be ranked according to general degrees of priority, as this would make some parts mandatory and others optional. This question might be something that researchers, field organizations and data-holding institutions need to negotiate in the future, along with several other empirical questions (e.g., the appropriate amount of information in a methodological report, who produces what in a report, etc.).
Finally, weak documentation does not necessarily mean that the quality of the available survey data is poor, but rather that their quality is difficult to assess. Thus, it remains an open question whether poorly elaborated survey reports actually indicate shortcomings in data quality in a way that is useful for researchers who conduct a secondary analysis. Future research should therefore aim at empirically testing whether researchers adhere to the basic requirements for the documentation of survey data. For example, analyzing a sample of survey reports provided by major data archives would offer additional insights into the current state of survey data documentation. Nevertheless, we see our guidelines as a starting point for improving the actual practice of survey documentation. We believe that it is most important to base good research on good data, and good data are distinguished by meaningful methodological documentation that adheres to designated standards.

Author Contributions

Conceptualization, A.J.; writing—original draft preparation, A.J., O.W. and A.F.; writing—review and editing, A.J., O.W. and A.F.

Funding

This research received no external funding.

Acknowledgments

An earlier version of this paper was presented at the 2017 Conference of the European Survey Research Association (ESRA) in Lisbon, Portugal. We are grateful to Reiner Mauer, Boris Heizmann and the two anonymous reviewers for their helpful comments and suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Proposed methodology report checklist.

Objective and Design
- Explain the background of the survey, and state the specific research goals.
- Provide information on the funding of the study.

Target Population and Sampling
- Define the target population and eligibility criteria.
- Describe the sampling frame and how sampling units were selected at each sampling stage (type of probability or nonprobability sampling), including the oversampling of segments of the target population.
- If the sample involves clustering and/or stratification, describe the clusters and the stratification criteria.
- In the case of (probability or nonprobability) Internet panels, describe how the panel members were recruited and what measures are taken to ensure panel quality.

Mode of Data Collection
- Describe the data collection mode (self-administered compared with interviewer-administered and computer-assisted compared with not computer-assisted).
- If the survey employs a mixed-mode design, indicate which mode was used at which phase or for which part of the respondents.

Survey Instrument
- Describe the general topics of the questionnaire.
- If derived variables are constructed (scales, indices), explain their construction.
- If available, give psychometric information on the dimensionality, reliability and validity of scales and indices.
- If special instruments are used, include a self-contained description.
- If a pretest is conducted, report the fieldwork dates, mode of data collection, sampling method, interview duration, number of interviews and interviewers, and outcomes, e.g., changes in the final questionnaire.

Fieldwork
- Provide information on the field dates, number of interviews, and interview duration.
- If the survey was interviewer-administered, provide additional information on the number, experience, and characteristics of the interviewers.
- Provide information on contact attempts and times (time and day).
- Describe the content of interviewer training.
- If applicable, describe interviewer monitoring and the measures of ex-post checks of interviews.

Response Rate
- Report the appropriate contact rates, cooperation rates, response rates and refusal rates.
- Report the recruitment rate, the profile rate, the completion rate, and the cumulative response rate for probability-based Internet panels.
- Report the participation rate for nonprobability Internet panels.
- In the case of longitudinal surveys, document the initial response rate, the wave-specific response rates, and panel attrition.
- If possible, compare the sample characteristics with the known characteristics of the population.

Data Processing
- Describe how the data were edited and cleaned, including any problems and corrections that have been undertaken.
- Report how open answers were coded, and document category schemes and inter-coder reliability.
- Describe the creation of weights, and provide descriptive statistics on any weighting variable.

Data Protection and Ethical Issues
- Describe the procedure for obtaining informed consent from the research subjects by the researchers or a fieldwork company and the way that personal information and data were handled in the project or by a fieldwork company.

References

  1. OECD. OECD Principles and Guidelines for Access to Research Data from Public Funding; OECD Publications: Paris, France, 2007.
  2. Open Science Collaboration. Estimating the reproducibility of psychological science. Science 2015, 349, aac4716.
  3. Nosek, B.A.; Alter, G.; Banks, G.C.; Borsboom, D.; Bowman, S.D.; Breckler, S.J.; Buck, S.; Chambers, C.D.; Chin, G.; Christensen, G.; et al. Promoting an open research culture. Science 2015, 348, 1422–1425.
  4. Houtkoop, B.L.; Chambers, C.; Macleod, M.; Bishop, D.V.M.; Nichols, T.E.; Wagenmakers, E.-J. Data Sharing in Psychology: A Survey on Barriers and Preconditions. Adv. Methods Pract. Psychol. Sci. 2018, 1, 70–85.
  5. Key, E.M. How Are We Doing? Data Access and Replication in Political Science. PS Political Sci. Politics 2016, 49, 268–272.
  6. Kozlowski, W. Funding Agency Responses to Federal Requirements for Public Access to Research Results. Bull. Am. Soc. Inf. Sci. Technol. 2014, 40, 26–30.
  7. Levenstein, M.C.; Lyle, J.A. Data: Sharing Is Caring. Adv. Methods Pract. Psychol. Sci. 2018, 1, 95–103.
  8. Biemer, P.P.; Lyberg, L.E. Introduction to Survey Quality; Wiley: Hoboken, NJ, USA, 2003.
  9. Blasius, J.; Thiessen, V. Assessing the Quality of Survey Data; Sage: London, UK, 2012.
  10. Weisberg, H.F. The Total Survey Error Approach: A Guide to the New Science of Survey Research; University of Chicago Press: Chicago, IL, USA, 2005.
  11. Callegaro, M.; Villar, A.; Yeager, D.S.; Krosnick, J.A. A Critical Review of Studies Investigating the Quality of Data Obtained with Online Panels Based on Probability and Nonprobability Samples. In Online Panel Research: A Data Quality Perspective; Callegaro, M., Baker, R., Bethlehem, J., Göritz, A.S., Krosnick, J.A., Lavrakas, P.J., Eds.; Wiley: Chichester, UK, 2014; pp. 23–53.
  12. Smith, T.W. Refining the Total Survey Error Perspective. Int. J. Public Opin. Res. 2011, 23, 464–484.
  13. APA Publications and Communications Board Working Group on Journal Article Reporting Standards. Reporting Standards for Research in Psychology: Why Do We Need Them? What Might They Be? Am. Psychol. 2008, 63, 839–851.
  14. American Association for Public Opinion Research (AAPOR). The AAPOR Code of Professional Ethics and Practices; American Association for Public Opinion Research: Lenexa, KS, USA, 2015.
  15. Von Elm, E.; Altman, D.G.; Egger, M.; Pocock, S.J.; Gøtzsche, P.C.; Vandenbroucke, J.P.; Strobe, I. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement: Guidelines for Reporting Observational Studies. PLoS Med. 2007, 4, e296.
  16. Eurostat. ESS Handbook for Quality Reports—2014 Edition; Publications Office of the European Union: Luxembourg, 2015.
  17. Groves, R.M.; Fowler, F.J.; Couper, M.; Lepkowski, J.M.; Singer, E.; Tourangeau, R. Survey Methodology, 2nd ed.; Wiley: Hoboken, NJ, USA, 2009.
  18. Vardigan, M.B.; Granda, P. Archiving, Documentation, and Dissemination. In Handbook of Survey Research, 2nd ed.; Marsden, P.V., Wright, J.D., Eds.; Emerald Group: Bingley, UK, 2010; pp. 707–729.
  19. Taylor, M.F. Dissemination Issues for Panel Studies: Metadata and Documentation. In Researching Social and Economic Change: The Uses of Household Panel Studies; Rose, D., Ed.; Routledge: London, UK, 2000; pp. 146–162.
  20. Mohler, P.P.; Pennell, B.-E.; Hubbard, F. Survey Documentation: Toward Professional Knowledge Management in Sample Surveys. In International Handbook of Survey Methodology; de Leeuw, E.D., Hox, J.J., Dillman, D.A., Eds.; Erlbaum: New York, NY, USA, 2008; pp. 403–420.
  21. Corti, L.; Van den Eynden, V.; Bishop, L.; Woollard, M. Managing and Sharing Research Data: A Guide to Good Practice; Sage: London, UK, 2014.
  22. Valliant, R.; Dever, J.A.; Kreuter, F. Practical Tools for Designing and Weighting Survey Samples; Springer: New York, NY, USA, 2013.
  23. American Association for Public Opinion Research (AAPOR). Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys, 9th ed.; American Association for Public Opinion Research: Lenexa, KS, USA, 2016.
  24. Baker, R.; Blumberg, S.J.; Brick, J.M.; Couper, M.P.; Courtright, M.; Dennis, J.M.; Dillman, D.; Frankel, M.R.; Garland, P.; Groves, R.M.; et al. AAPOR Report on Online Panels. Public Opin. Q. 2010, 74, 711–781.
  25. Couper, M.P.; Bosnjak, M. Internet Surveys. In Handbook of Survey Research, 2nd ed.; Marsden, P.V., Wright, J.D., Eds.; Emerald Group: Bingley, UK, 2010; pp. 527–550.
  26. ESOMAR. 28 Questions to Help Research Buyers of Online Samples; European Society for Opinion and Market Research: Amsterdam, The Netherlands, 2012.
  27. Hillygus, D.S.; Jackson, N.M.; Young, M. Professional Respondents in Nonprobability Online Panels. In Online Panel Research: A Data Quality Perspective; Callegaro, M., Baker, R., Bethlehem, J., Göritz, A.S., Krosnick, J.A., Lavrakas, P.J., Eds.; Wiley: Chichester, UK, 2014; pp. 219–237.
  28. Fowler, F.J. Survey Research Methods, 4th ed.; Sage: Thousand Oaks, CA, USA, 2009.
  29. De Leeuw, E.D. To Mix or Not to Mix Data Collection Modes in Surveys. J. Off. Stat. 2005, 21, 233–255.
  30. Dillman, D.A.; Messer, B.L. Mixed-Mode Surveys. In Handbook of Survey Research, 2nd ed.; Marsden, P.V., Wright, J.D., Eds.; Emerald Group: Bingley, UK, 2010; pp. 551–574.
  31. Alwin, D.F. How Good is Survey Measurement? Assessing the Reliability and Validity of Survey Measures. In Handbook of Survey Research, 2nd ed.; Marsden, P.V., Wright, J.D., Eds.; Emerald Group: Bingley, UK, 2010; pp. 405–434.
  32. Campanelli, P. Testing Survey Questions. In International Handbook of Survey Methodology; de Leeuw, E.D., Hox, J.J., Dillman, D.A., Eds.; Erlbaum: New York, NY, USA, 2008; pp. 176–200.
  33. Willis, G.B. Cognitive Interviewing: A Tool for Improving Questionnaire Design; Sage: Thousand Oaks, CA, USA, 2005.
  34. Koch, A.; Blom, A.G.; Stoop, I.A.L.; Kappelhof, J. Data Collection Quality Assurance in Cross-National Surveys: The Example of the ESS. Methods Data Anal. 2009, 3, 219–247.
  35. Malter, F. Fieldwork Management and Monitoring in SHARE Wave Four. In SHARE Wave 4: Innovations and Methodology; Börsch-Supan, A., Malter, F., Eds.; Munich Center for the Economics of Aging (MEA): Munich, Germany, 2013; pp. 124–139.
  36. Groves, R.M.; Peytcheva, E. The Impact of Nonresponse Rates on Nonresponse Bias: A Meta-Analysis. Public Opin. Q. 2008, 72, 167–189.
  37. Lynn, P.; Lugtig, P.J. Total Survey Error for Longitudinal Surveys. In Total Survey Error in Practice; Biemer, P., de Leeuw, E., Eckman, S., Edwards, B., Kreuter, F., Lyberg, L.E., Tucker, N.C., West, B.T., Eds.; Wiley: Hoboken, NJ, USA, 2017; pp. 279–298.
  38. Callegaro, M.; DiSogra, C. Computing Response Metrics for Online Panels. Public Opin. Q. 2008, 72, 1008–1032.
  39. ICPSR. Guide to Social Science Data Preparation and Archiving, 5th ed.; Inter-University Consortium for Political and Social Research: Ann Arbor, MI, USA, 2012.
  40. Biemer, P.P.; Christ, S.L. Weighting Survey Data. In International Handbook of Survey Methodology; de Leeuw, E.D., Hox, J.J., Dillman, D.A., Eds.; Erlbaum: New York, NY, USA, 2008; pp. 317–341.
  41. Wirth, H. Analytical Potential Versus Data Confidentiality—Finding the Optimal Balance. In The Sage Handbook of Survey Methodology; Wolf, C., Joye, D., Smith, T.W., Fu, Y.-C., Eds.; Sage: London, UK, 2016; pp. 488–501.
  42. Faden, R.R.; Beauchamp, T.L. A History and Theory of Informed Consent; Oxford University Press: New York, NY, USA, 1986.
  43. McNeeley, S. Sensitive Issues in Surveys: Reducing Refusals While Increasing Reliability and Quality of Responses to Sensitive Survey Items. In Handbook of Survey Methodology for the Social Sciences; Gideon, L., Ed.; Springer: New York, NY, USA, 2012; pp. 377–396.
  44. European Commission. European Textbook on Ethics in Research; Publications Office of the European Union: Luxembourg, 2010.
  45. Schaeffer, N.C.; Dykema, J.; Maynard, D.W. Interviewers and Interviewing. In Handbook of Survey Research, 2nd ed.; Marsden, P.V., Wright, J.D., Eds.; Emerald Group: Bingley, UK, 2010; pp. 437–470.
  46. Groves, R.M. Survey Errors and Survey Costs; Wiley: New York, NY, USA, 1989.
¹ Depending on the source and usage of weights, this step might alternatively be included in the data processing section of the report (see below).
Table 1. Key questions in preparing a methodological report.

Question | Report Section | Sources of Error
For what purpose were the data collected? | Objective and Design | (none)
How were the respondents selected? | Target Population and Sampling | Coverage error; Sampling error; Unit nonresponse
How were the data collected? | Mode of Data Collection | Mode effects
What information was collected? | Survey Instrument | Respondent-related measurement error; Item nonresponse
Who collected the data, when and where? | Fieldwork | Interviewer-related measurement error; House effects
How were the data edited, coded, and weighted? | Data Processing | Processing errors; Adjustment errors
Were provisions of data protection laws respected? | Data Protection | Ignoring legal issues
Note. Sources of error are largely based on Weisberg [10].
