Web Mining of Online Resources for German Labor Market Research and Education: Finding the Ground Truth?
Abstract
1. Introduction
2. Related Work
3. Method
3.1. Data Schemata
3.2. Web Resources
Listing 1. GET request for a page of occupations from BERUFENET.
berufe=$(curl -m 60 -H "X-API-Key: d672172b-f3ef-4746-b659-227c39d95acf" "https://rest.arbeitsagentur.de/infosysbub/bnet/pc/v1/berufe?suchwoerter=*&page=0")
Listing 2. GET request for details on an occupation from BERUFENET.
berufeinfo=$(curl -m 60 -H "X-API-Key: d672172b-f3ef-4746-b659-227c39d95acf" "https://rest.arbeitsagentur.de/infosysbub/bnet/pc/v1/berufe/15322")
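Listings 1 and 2 differ only in the path below a shared base URL. A minimal sketch of how such requests could be parameterised follows. The function names `bnet_page_url`, `bnet_detail_url`, and `bnet_get` and the variable `BNET_API_KEY` are hypothetical helpers introduced for illustration; only the endpoint paths, the `X-API-Key` header, and the timeout are taken from the listings.

```shell
# Base URL as used in Listings 1 and 2.
BNET_BASE="https://rest.arbeitsagentur.de/infosysbub/bnet/pc/v1"

# Build the URL for one result page of occupations (hypothetical helper).
# $1: zero-based page number
bnet_page_url() {
  printf '%s/berufe?suchwoerter=*&page=%s\n' "$BNET_BASE" "$1"
}

# Build the URL for the details of a single occupation (hypothetical helper).
# $1: BERUFENET ID, e.g. 15322
bnet_detail_url() {
  printf '%s/berufe/%s\n' "$BNET_BASE" "$1"
}

# Send the GET request with a 60 s timeout.
# Assumption: the caller has exported BNET_API_KEY beforehand.
# $1: URL built by one of the helpers above
bnet_get() {
  curl -m 60 -H "X-API-Key: ${BNET_API_KEY}" "$1"
}
```

With `BNET_API_KEY` set, `berufe=$(bnet_get "$(bnet_page_url 0)")` would issue the same request as Listing 1, and `bnet_get "$(bnet_detail_url 15322)"` the one from Listing 2.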
- BERUFENET (https://github.com/bundesAPI/berufenet-api);
- AUSBILDUNGSSUCHE (https://github.com/bundesAPI/ausbildungssuche-api);
- WEITERBILDUNGSSUCHE (https://github.com/bundesAPI/weiterbildungssuche-api);
- STUDIENSUCHE (https://github.com/bundesAPI/studiensuche-api);
- SPRACHFÖRDERUNG (https://github.com/bundesAPI/berufssprachkurssuche-api);
- COACHINGUNDAKTIVIERUNG (https://github.com/bundesAPI/coachingangebote-api).
- JOBSUCHE (https://github.com/bundesAPI/jobsuche-api);
- BEWERBERBÖRSE (https://github.com/bundesAPI/bewerberboerse-api);
- ENTGELTATLAS (https://github.com/bundesAPI/entgeltatlas-api);
- NEWPLAN (https://github.com/bundesAPI/newskills-api).
- Ausbildungsordnungen (https://www.bibb.de/dienst/berufesuche/de/index_berufesuche.php/) with classification (KldB 2010), statistical data, and a genealogy of occupations;
- Fortbildungsordnungen (https://www.bibb.de/dienst/berufesuche/de/index_berufesuche.php/; see Figure 3 for an example) with classification (KldB 2010) and statistical data;
- Berufesuche (https://www.bibb.de/de/40.php).
3.3. Methods
- “Fachinformatiker/in—Daten- und Prozessanalyse”/KldB/DKZ-code “B 43112-919”;
- “Data Scientist”/KldB/DKZ-code “B 43104-132”.
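Codes written like the two above appear to combine a leading letter, a five-digit KldB 2010 class, and a three-digit suffix, which together give the eight digits used for matching. A minimal sketch of how such a code could be normalised in the shell follows. The function names are hypothetical, the split into five plus three digits is an assumption read off the examples, and no meaning is assigned to the leading letter.

```shell
# Reduce a code such as "B 43112-919" to its digits (hypothetical helper).
# tr -cd keeps only digits and the trailing newline.
dkz_digits() {
  printf '%s\n' "$1" | tr -cd '0-9\n'
}

# Return the first five digits, assumed here to be the KldB 2010 class.
dkz_kldb5() {
  dkz_digits "$1" | cut -c1-5
}

# Succeed only if the code reduces to exactly eight digits.
dkz_is_valid() {
  case "$(dkz_digits "$1")" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `dkz_kldb5 'B 43112-919'` yields the five-digit class of the first code, which could then be compared with the class of the second code to see that the two occupations fall into different KldB classes.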
4. Results
4.1. BIBB Education Pathways
4.2. BA Education Pathways
- “IT Project Management”;
- “IT Service Management; IT Infrastructure Library (ITIL)”;
- “Marketing”;
- “Sales”;
- “Controlling”;
- “Business organization, work study”;
- “Employee leadership, teamwork, leadership”.
- “Acquisition”;
- “Business administration”;
- “Controlling”;
- “Information technology, computer technology”;
- “IT coordination”;
- “IT organization”;
- “Calculation”;
- “Cost and performance accounting”;
- “Customer service, care”;
- “Marketing”;
- “Human resources”.
4.3. Interoperability
- The most efficient way of relating information from BERUFENET to other data sources or to classification systems appears to be the eight-digit KldB/DKZ codes, which are stored in BERUFENET and in many other data sources (e.g., AUSBILDUNGSSUCHE; see Figure 1). BA data sources that do not contain a KldB/DKZ code can often be related to one by matching short occupation titles; note, however, that short occupation titles, unlike BERUFENET IDs or the eight-digit variants of KldB/DKZ codes, do not distinguish between training for an occupation and the occupational activity itself.
- Results of the JOBSUCHE have an attribute “beruf” that contains occupational titles that could be matched with BERUFENET’s short occupation titles; the JOBSUCHE API does not seem to provide KldB/DKZ codes.
- Results of the AUSBILDUNGSSUCHE have an attribute “abschlussbezeichnung” that contains training job designations that could (after removing HTML tags) be matched with BERUFENET’s short occupation titles.
- Results of the STUDIENSUCHE have an attribute “Studienfaecher”, which contains one or more course designations that could be matched with BERUFENET’s short occupation titles.
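The title matching described in the list above could be sketched as follows: HTML tags are stripped from a designation, and the result is compared case-insensitively against a lookup file of short occupation titles. The function names, the tab-separated layout of the lookup file, and the single sample row are hypothetical and serve only as an illustration; real BERUFENET short titles may be spelled differently.

```shell
# Strip HTML tags and surrounding whitespace from a string.
# This is a simple pattern substitution, not a full HTML parser.
strip_html() {
  printf '%s\n' "$1" |
    sed -e 's/<[^>]*>//g' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# Look up a (possibly HTML-wrapped) title in a tab-separated file
# with lines of the form "short title<TAB>KldB/DKZ code".
# Prints the code and returns 0 on a match, returns 1 otherwise.
# $1: title as delivered by an API attribute, $2: path to lookup file
match_title() {
  title=$(strip_html "$1")
  awk -F '\t' -v t="$title" '
    tolower($1) == tolower(t) { print $2; found = 1; exit }
    END { exit !found }
  ' "$2"
}

# Hypothetical lookup file with one illustrative row.
titles_file=$(mktemp)
printf 'Data Scientist\tB 43104-132\n' > "$titles_file"
```

Calling `match_title '<b>Data Scientist</b>' "$titles_file"` would print the code stored for that title. Exact comparison of this kind misses spelling variants, so a production pipeline would probably need additional normalisation.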
5. Conclusions and Outlook
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
BA | German Federal Employment Agency (Bundesagentur für Arbeit) |
BIBB | Federal Institute for Vocational Education and Training (Bundesinstitut für Berufsbildung) |
CVET | Continuing Vocational Education and Training |
DKZ | Documentation number, “Dokumentationskennziffer” |
ESCO | European Skills/Competences, Qualifications and Occupations |
GLMO | German Labor Market Ontology |
ISCO | International Standard Classification of Occupations |
KldB | German Classification of Occupations, “Klassifikation der Berufe” |
KSAO | Knowledge, Skills, Abilities, and Other components |
OJA | Online Job Advertisements |
VET | Vocational Education and Training |
References
1. Fischer, A.; Hecker, K.; Wittig, W. Arbeitsmarktbedarfsanalyse zu beruflichen Kompetenzen und Teilqualifikationen: Eine repräsentative Unternehmensbefragung; Forschungsinstitut Betriebliche Bildung (F-BB): Nürnberg, Germany, 2020.
2. Fischer, A.; Jöchner, A.; Pabst, C.; Lorenz, S.; Schley, T. KI-Basierte Personalisierung Berufsbezogener Weiterbildung: Ein Praxisleitfaden für Bildungsanbieter; wbv-Verlag: Nürnberg, Germany, 2023.
3. ESCO Handbook: European Skills, Competences, Qualifications and Occupations, 2nd ed.; Publication Office of the European Union: Luxembourg, 2019.
4. Helmrich, R.; Tiemann, M.; Troltsch, K.; Lukowski, F.; Neuber-Pohl, C.; Lewalder, A.C.; Gunturk-Kuhl, B. Digitalisierung der Arbeitslandschaften: Keine Polarisierung der Arbeitswelt, Aber Beschleunigter Strukturwandel und Arbeitsplatzwechsel; Wissenschaftliche Diskussionspapiere; Federal Institute for Vocational Education and Training (BIBB): Bonn, Germany, 2016; Number 180.
5. Fischer, A.; Hilse, P.; Schütt-Sayed, S. Rahmenlehrpläne–Spiegel der Bedeutung nachhaltiger Entwicklung. In Zum Konzept der Nachhaltigkeit in Arbeit, Beruf und Bildung—Stand in Forschung und Praxis; Barbara Budrich: Nürnberg, Germany, 2023; pp. 281–302.
6. Graf, L.; Lohse, A.P. Advanced skill formation between vocationalization and academization: The governance of professional schools and dual study programmes in Germany. In Governance Revisited: Challenges and Opportunities for Vocational Education and Training; Gonon, P., Bürgi, R., Eds.; Peter Lang Group AG: Lausanne, Switzerland, 2021.
7. Dikau, J. Rechtliche und organisatorische Bedingungen der beruflichen Weiterbildung. In Handbuch der Berufsbildung; Springer: Wiesbaden, Germany, 1995; pp. 427–440.
8. Bauer, R.; Bauer, R. Die Debatte über die Zukunft der dualen Berufsausbildung. In Verberuflichung von Weiterbildung und die Zukunft der Dualen Berufsausbildung: Eine Berufssoziologische Analyse am Beispiel des Kraftfahrzeuggewerbes; Springer: Wiesbaden, Germany, 2000; pp. 21–84.
9. Dutt, A.; Ismail, M.A.; Herawan, T. A systematic review on educational data mining. IEEE Access 2017, 5, 15991–16005.
10. Mohamad, S.K.; Tasir, Z. Educational data mining: A review. Procedia-Soc. Behav. Sci. 2013, 97, 320–324.
11. Romero, C.; Ventura, S. Educational data mining: A survey from 1995 to 2005. Expert Syst. Appl. 2007, 33, 135–146.
12. Kovalev, S.; Kolodenkova, A.; Muntyan, E. Educational data mining: Current problems and solutions. In Proceedings of the 2020 V International Conference on Information Technologies in Engineering Education (Inforino), Moscow, Russia, 14–17 April 2020; pp. 1–5.
13. Buchmann, M.; Buchs, H.; Gnehm, A.S. Occupational Inequality in Wage Returns to Employer Demand for Types of Information and Communications Technology (ICT) Skills: 1991–2017. Kölner Z. Soziol. Sozialpsychol. 2020, 72, 455–482.
14. Settelmeyer, A.; Bremser, F.; Lewalder, A.C. Migrationsbedingte Mehrsprachigkeit—Ein „Plus“ beim Übergang von der Schule in den Beruf. In Interkulturelle und Sprachliche Bildung im Mehrsprachigen Übergang Schule-Beruf; Waxmann: Münster, Germany, 2017; pp. 135–150.
15. Ningrum, P.K.; Pansombut, T.; Ueranantasun, A. Text mining of online job advertisements to identify direct discrimination during job hunting process: A case study in Indonesia. PLoS ONE 2020, 15, e0233746.
16. Smirnov, I. Estimating educational outcomes from students’ short texts on social media. EPJ Data Sci. 2020, 9, 27.
17. Ortmann, T.T.; Bönke, D.H.; Hammer, L. Bessere Perspektiven bei Jobwechseln. Zur Ähnlichkeit Beruflicher Übergänge; Gieselmann: Gütersloh, Germany, 2023.
18. Degenhardt, S. Kompetenzen für eine digitalisierte Arbeitswelt–Anforderungen an Aus-und Weiterbildung. In Digitaler Wandel in der Sozialwirtschaft; Nomos Verlagsgesellschaft mbH & Co.: Baden-Baden, Germany, 2018; pp. 259–272.
19. Kreuzer, C. Visualisierung der Opportunity Recognition-Kompetenz von Industriekaufleuten. Z. Berufs Wirtsch. 2018, 114, 247–271.
20. Beręsewicz, M.; Pater, R. Inferring Job Vacancies from Online Job Advertisements; Publications Office of the European Union: Luxembourg, 2021.
21. Khaouja, I.; Kassou, I.; Ghogho, M. A survey on skill identification from online job ads. IEEE Access 2021, 9, 118134–118153.
22. Carnevale, A.P.; Jayasundera, T.; Repnikov, D. Understanding Online Job Ads Data; Technical Report; Center on Education and the Workforce, Georgetown University: Washington, DC, USA, 2014.
23. Ros, R.; Van Erp, M.; Rijpma, A.; Zijdeman, R. Mining Wages in Nineteenth-Century Job Advertisements: The Application of Language Resources and Language Technology to study Economic and Social Inequality. In Proceedings of the Workshop about Language Resources for the SSH Cloud; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 27–32.
24. Gnehm, A.S.; Clematide, S. Text zoning and classification for job advertisements in German, French and English. In Proceedings of the Fourth Workshop on Natural Language Processing and Computational Social Science; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 83–93.
25. Buchmann, M.; Buchs, H.; Busch, F.; Clematide, S.; Gnehm, A.S.; Müller, J. Swiss Job Market Monitor: A Rich Source of Demand-Side Micro Data of the Labour Market. Eur. Sociol. Rev. 2022, 38, 1001–1014.
26. Hermes, J.; Schandock, M. Stellenanzeigenanalyse in der Qualifikationsentwicklungsforschung. In Die Nutzung Maschineller Lernverfahren zur Klassifikation von Textabschnitten; Forschungsinstitut Betriebliche Bildung (F-BB): Nürnberg, Germany, 2016.
27. Ziegler, M.; Horstmann, K.; Wehner, C. Machbarkeitsstudie: Teilqualifikationen in Online-Jobanzeigen (OJA); Humboldt-Universität zu Berlin: Berlin, Germany, 2022.
28. Janser, M. The Greening of Jobs in Germany: First Evidence from a Text Mining Based Index and Employment Register Data; Technical report, IAB-Discussion Paper; Institut für Arbeitsmarkt- und Berufsforschung (IAB): Nürnberg, Germany, 2018.
29. Binnewitt, J.; Schnepf, T. Join us to turn the wor(l)d greener!—Investigating online apprenticeship advertisements’ reference to environmental sustainability. In Zum Konzept der Nachhaltigkeit in Arbeit, Beruf und Bildung—Stand in Forschung und Praxis; Federal Institute for Vocational Education and Training (BIBB): Bonn, Germany, 2022.
30. Ziegler, P. Zur Verwendung von Berufsinformation im Hinblick auf Matching in Deutschland und Österreich; Technical report, AMS Info; Leibniz Information Centre for Economics: Hamburg, Germany, 2012.
31. Li, N.; Kang, B.; De Bie, T. SkillGPT: A RESTful API service for skill extraction and standardization using a Large Language Model. arXiv 2023, arXiv:2304.11060.
32. Bhola, A.; Halder, K.; Prasad, A.; Kan, M.Y. Retrieving skills from job descriptions: A language model based extreme multi-label classification framework. In Proceedings of the 28th International Conference on Computational Linguistics, Online, 9–13 December 2020; pp. 5832–5842.
33. Khaouja, I.; Mezzour, G.; Carley, K.M.; Kassou, I. Building a soft skill taxonomy from job openings. Soc. Netw. Anal. Min. 2019, 9, 1–19.
34. International Labour Office. The Feasibility of Using Big Data in Anticipating and Matching Skills Needs; International Labour Office: Geneva, Switzerland, 2020.
35. Stops, M.; Bächmann, A.C.; Glassner, R.; Janser, M.; Matthes, B.; Metzger, L.J.; Müller, C.; Seitz, J. Machbarkeitsstudie Kompetenz-Kompass: Teilprojekt 2: Beobachtung von Kompetenzanforderungen in Stellenangeboten; Bundesministerium für Arbeit und Soziales: Berlin, Germany, 2020.
36. Fischer, A.; Neubert, J.C. The multiple faces of complex problems: A model of problem solving competency and its implications for training and assessment. J. Dyn. Decis. Mak. 2015, 1, 6.
37. Fischer, A.; Hecker, K.; Pfeiffer, I. Berufliche Kompetenzen von Geflüchteten erkennen? Exemplarische Befunde zur Kompetenzmessung im Bereich der Metallbearbeitung und Metallverarbeitung. Z. Weiterbildungsforschung 2019, 42, 115–131.
38. Bundesagentur für Arbeit. Band 1: Systematischer und Alphabetischer Teil mit Erläuterungen; Bundesagentur für Arbeit: Nuremberg, Germany, 2010.
39. Paulus, W.; Matthes, B. The German classification of occupations 2010–structure, coding and conversion table. FDZ-Methodenreport 2013, 8, 2013.
40. Dörpinghaus, J.; Binnewitt, J.; Hein, K. Lessons from Continuing Vocational Training Courses for Computer Science Education. In Proceedings of the ITiCSE 2023: Innovation and Technology in Computer Science Education, Turku, Finland, 7–12 July 2023; p. 636.
41. Dörpinghaus, J.; Binnewitt, J.; Winnige, S.; Hein, K.; Krüger, K. Towards a German labor market ontology: Challenges and applications. Appl. Ontol. 2023, 18, 343–365.
42. Dörpinghaus, J.; Samray, D.; Helmrich, R. Challenges of Automated Identification of Access to Education and Training in Germany. Information 2023, 14, 524.
43. Fechner, R.; Dörpinghaus, J.; Firll, A. Classifying Industrial Sectors from German Textual Data with a Domain Adapted Transformer. In Proceedings of the 2023 18th Conference on Computer Science and Intelligence Systems (FedCSIS), Warsaw, Poland, 17–20 September 2023; pp. 463–470.
44. le Vrang, M.; Papantoniou, A.; Pauwels, E.; Fannes, P.; Vandensteen, D.; De Smedt, J. ESCO: Boosting job matching in Europe with semantic interoperability. Computer 2014, 47, 57–64.
45. González, L.; García-Barriocanal, E.; Sicilia, M.A. Entity Linking as a Population Mechanism for Skill Ontologies: Evaluating the Use of ESCO and Wikidata. In Proceedings of the 14th International Conference, MTSR 2020, Madrid, Spain, 2–4 December 2020; pp. 116–122.
46. Kitto, K.; Sarathy, N.; Gromov, A.; Liu, M.; Musial, K.; Buckingham Shum, S. Towards skills-based curriculum analytics: Can we automate the recognition of prior learning? In Proceedings of the LAK ’20: 10th International Conference on Learning Analytics and Knowledge, Frankfurt, Germany, 23–27 March 2020; pp. 171–180.
47. Fareri, S.; Melluso, N.; Chiarello, F.; Fantoni, G. SkillNER: Mining and mapping soft skills from any text. Expert Syst. Appl. 2021, 184, 115544.
48. Neutel, S.; de Boer, M.H. Towards Automatic Ontology Alignment using BERT. In Proceedings of the AAAI 2021 Spring Symposium on Combining Machine Learning and Knowledge Engineering (AAAI-MAKE 2021), Palo Alto, CA, USA, 22–24 March 2021.
49. Fischer, A. Toot 111039750735796601 on Chaos.Social. Technical Report. 2023. Available online: https://chaos.social/@AFischer1985/111039750735796601 (accessed on 1 January 2024).
50. Schimpl-Neimanns, B. Mikrodaten-Tools: Umsetzung der Berufsklassifikation von Blossfeld auf die Mikrozensen 1973–1998; GESIS—Leibniz-Institut für Sozialwissenschaften: Mannheim, Germany, 2003.
51. Brauns, H.; Steinmann, S.; Haun, D. Die Konstruktion des Klassenschemas nach Erikson, Goldthorpe und Portocarero (EGP) am Beispiel nationaler Datenquellen aus Deutschland, Großbritannien und Frankreich. Zuma Nachrichten 2000, 24, 8–63.
52. Ganzeboom, H. Questions and Answers about ISEI-08. Stand 2010, 13, 2016.
53. Ganzeboom, H.B.; Treiman, D.J. Internationally comparable measures of occupational status for the 1988 International Standard Classification of Occupations. Soc. Sci. Res. 1996, 25, 201–239.
54. Güntürk-Kuhl, B. Die Taxonomie der Arbeitsmittel des BIBB; Federal Institute for Vocational Education and Training (BIBB): Bonn, Germany, 2017.
55. Kuppe, A.M.; Lorig, B.; Schwarz, H.; Stöhr, A. Ausbildungsordnungen und wie sie Entstehen; Federal Institute for Vocational Education and Training (BIBB): Bonn, Germany, 2015.
Research Question | Literature |
---|---|
Common data structures | |
Specific structures | [17,33,38,39] |
General structures, ontologies | [41] |
Methods for data extraction | |
German OJAs | [24,25,42] |
Skill extraction | [36,37] |
Qualification development | [26,27] |
Greening of job index (GOJI) | [28,29] |
Other metadata like industrial sectors | [43] |
Data Set | Source | Initial Format | Historical Data | Data Records |
---|---|---|---|---|
BERUFENET (API) | BA | JSON | | 3569 |
AUSBILDUNGSSUCHE (API) | BA | JSON | | >24 K |
STUDIENSUCHE (API) | BA | JSON | | >15 K |
WEITERBILDUNGSSUCHE (API) | BA | JSON | | >5 M |
SPRACHFÖRDERUNG (API) | BA | JSON | | >35 K |
COACHINGUNDAKTIVIERUNG (API) | BA | JSON | | >100 K |
JOBSUCHE (API) | BA | JSON | | >900 K |
BEWERBERBÖRSE (API) | BA | JSON | | >1.7 M |
ENTGELTATLAS (API) | BA | JSON | | <3569 |
NEWPLAN (API) | BA | JSON | | <3569 |
Higher Education Degrees [DBA] | BA | CSV | | 797 |
Occupations (KldB) [DBA] | BA | CSV, XML | X | 33,802 |
Continuing professional development [DBA] | BA | CSV | | 542 |
Continuing professional development [BS] | BIBB | CSV, PDF | X | 218 |
Vocational Education [BS] | BIBB | CSV, PDF | X | 357 |
Skills [DBA] | BA | XML | | 9078 |
Tools [54] | BIBB | CSV | | 10,978 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fischer, A.; Dörpinghaus, J. Web Mining of Online Resources for German Labor Market Research and Education: Finding the Ground Truth? Knowledge 2024, 4, 51-67. https://doi.org/10.3390/knowledge4010003