#### *3.1. Surveying*

From the 1800s through the 1990s, the primary method that federal, state, and local governments used to compile large education datasets was surveying. Historians estimate that early Sumerian societies conducted censuses of their populations around 3200 B.C. to distribute resources and to plan levees and canals that would ensure an adequate water and food supply [22]. In modern societies, many countries have censuses written into their founding government documents, including the United States, Australia, several South American countries, and most of the European Union [22]. Other developing and developed countries, such as China and India, began collecting and publishing census data on a mass scale in the 1990s, and globally, nearly all forms of the census have included questions about educational attainment and the number of school-aged children in the household [22].

However, multiple issues arise when survey methods are used to build educational datasets equitably. First, organizations such as the OECD and the European Union now gather data online through Internet-based questionnaires and other Internet technologies. By contrast, many developing nations, especially in rural areas, lack access to high-speed Internet, or to any Internet at all, that would facilitate effective and efficient data collection. Moreover, many developing nations have large populations spread across rural, sparsely populated areas, rendering robust and equitable data collection nearly impossible in countries within the Latin American and Caribbean region and in Burundi, Uganda, and Nepal, where rural populations comprise over 80% of the overall citizenry in each country [7].

Beyond geographic and technological limitations, many developing nations' governments lack the human or financial resources to employ the survey developers, census takers, and data architects needed to administer the work and disseminate its results. For instance, the United States begins hiring for its decennial census two years before its administration, usually hiring over 200,000 temporary workers to complete the work [23]. Moreover, the U.S. Department of Education specifically created the National Center for Education Statistics to liaise with schools to gather and disseminate educational data [15]. By contrast, many developing nations do not have the resources to create such offices and protocols to gather consistent, representative, and reliable education data at any time interval, much less on a yearly basis, as is the status quo in the United States and many other developed nations.

Finally, a wealth of education data is tied to government funding or grant administration, which requires educational organizations to report data to their funding agency, usually a local-, state-, or federal-level entity. Although this method is not surveying in the typical sense, institutions of higher education must complete yearly reports that often take the form of a questionnaire. For instance, in the United States, the distribution of federal student aid to postsecondary students is mediated by the U.S. Department of Education through a program called Title IV, which authorizes U.S. institutions of higher education to administer financial aid programs using federal funds. Because federal student aid accounts for most student financing in the United States, over 6000 institutions of higher education participate in Title IV. To participate, institutions must regularly report education data to the U.S. Department of Education on the amount and type of aid that their students receive, as well as on students' academic progress indicators [24]. Here, nationally representative educational datasets are created in part by federal programs that require institutional data reporting, yet many countries may not have such policy mechanisms in place to gather these data.

#### *3.2. Technologically Mediated Data Sharing Agreements*

One of the largest international data-sharing platforms is the Statistical Data and Metadata eXchange (SDMX), sponsored by seven international organizations: the Bank for International Settlements (BIS), the European Central Bank (ECB), Eurostat (Statistical Office of the European Union), the International Monetary Fund (IMF), the Organization for Economic Cooperation and Development (OECD), the United Nations Statistical Division (UNSD), and the World Bank. SDMX is a technology and data-sharing initiative that "aims at standardising and modernising the mechanisms and processes for the exchange of statistical data and metadata among international organisations and their member countries" [25] (para. 3). Extending the survey work performed by individual nations, SDMX allows large international organizations to integrate their data into an even larger repository, enabling unique collaborations, such as the OECD working with the International Monetary Fund to better understand how international monetary policies may affect low-GDP nations.

However, developing nations that cannot perform the national-level survey work that lays the foundation for international data sharing cannot reap the benefits of international platforms such as SDMX. Efforts such as SDMX may therefore further stratify educational datasets across nations, because developed nations can already gather their own national-level data and also benefit from international data sharing, collaboration, and joint policy development. As a result, it is critical for developed nations to scaffold the efforts of developing nations to begin national-level survey work, so that developing nations can participate in international data-sharing agreements such as SDMX.
