Synthetic Subject Generation with Coupled Coherent Time Series Data
Abstract
1. Introduction
2. Related Work
3. Proposed Approaches
3.1. Synthetic Subjects with Real Time Series Data
3.2. Synthetic Subjects with Synthetic Time Series Data
4. Implementation
4.1. Dataset Selection
4.2. EDA and Data Preprocessing
4.3. Time Series Feature Extraction
4.4. Synthetic Subject Generation
4.5. Time Series Coupling with Synthetic Subject
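Algorithm 1. Coupling method.

    import numpy as np

    # Assumption: stats, synth_stats and synth_info are pandas DataFrames.
    # stats holds the time series features of each real test plus its 'ID_test';
    # synth_stats holds the same feature columns for each synthetic subject.
    def coupler(synth_row):
        best_dst = None  # smallest distance found so far
        best_id = None   # ID_test of the closest real test
        for _, curr in stats.iterrows():
            temp = curr.drop('ID_test').astype(float) - synth_row
            dst = np.sum(np.absolute(temp))  # L1 (Manhattan) distance
            # Check for "no distance yet" first, then compare
            if best_dst is None or dst < best_dst:
                best_dst = dst
                best_id = curr['ID_test']  # ID of the current row, not the whole column
        return best_id

    assigned_ID = []
    for _, synth_row in synth_stats.iterrows():
        assigned_ID.append(coupler(synth_row))
    synth_info['ID_test'] = assigned_ID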
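The coupling step in Algorithm 1 amounts to a nearest-neighbour search under the sum of absolute differences (L1 distance). The following is a minimal, self-contained sketch of that idea, not the authors' implementation. The helper name `couple_nearest`, the feature column names (`hr_mean`, `speed_max`), and all toy values are assumptions introduced for illustration. Only `ID_test`, `stats`, `synth_stats`, and `synth_info` come from the algorithm itself.

```python
import numpy as np
import pandas as pd


def couple_nearest(stats: pd.DataFrame, synth_stats: pd.DataFrame,
                   id_col: str = "ID_test") -> list:
    """Return, for each synthetic feature row, the ID of the closest real test.

    Hypothetical helper: closeness is the L1 distance over the shared
    feature columns, mirroring sum(absolute(curr - synth_row)).
    """
    feature_cols = [c for c in stats.columns if c != id_col]
    real = stats[feature_cols].to_numpy(dtype=float)         # (n_real, n_features)
    synth = synth_stats[feature_cols].to_numpy(dtype=float)  # (n_synth, n_features)
    # Pairwise L1 distances via broadcasting: shape (n_synth, n_real).
    dist = np.abs(synth[:, None, :] - real[None, :, :]).sum(axis=2)
    # argmin keeps the first minimum on ties, like a strict "<" in a loop.
    nearest = dist.argmin(axis=1)
    return stats[id_col].to_numpy()[nearest].tolist()


# Toy data: feature names and values are invented for illustration only.
stats = pd.DataFrame({
    "ID_test": ["t1", "t2", "t3"],
    "hr_mean": [120.0, 150.0, 135.0],
    "speed_max": [12.0, 18.0, 15.0],
})
synth_stats = pd.DataFrame({
    "hr_mean": [148.0, 121.0],
    "speed_max": [17.5, 12.5],
})
synth_info = pd.DataFrame({"Age": [30, 41]})

# Attach the ID of the closest real test to each synthetic subject.
synth_info["ID_test"] = couple_nearest(stats, synth_stats)
```

In this sketch, nothing prevents several synthetic subjects from being coupled to the same real test. Because the distance is an unweighted sum, features on larger scales dominate it unless they are normalised beforehand; whether the original pipeline normalises features is not stated here.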
4.6. Validation of Subjects
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Age | Weight | Height | Humidity | Temperature | Sex | ID | ID_test |
Time | Speed | HR | VO2 | VCO2 | RR | VE | ID_test | ID |
Variable | Real Data | Synthetic Data (SD) |
---|---|---|
Age | 28.95 ± 10.19 | 27.67 ± 9.94 |
Weight | 73.14 ± 11.96 | 72.12 ± 11.56 |
Height | 174.82 ± 7.99 | 174.39 ± 7.73 |
Humidity | 48.14 ± 8.54 | 45.4 ± 6.86 |
Temperature | 22.82 ± 2.79 | 23.92 ± 1.5 |
Sex | Male (n = 806) Female (n = 104) | Male (n = 810) Female (n = 110) |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Larrea, X.; Hernandez, M.; Epelde, G.; Beristain, A.; Molina, C.; Alberdi, A.; Rankin, D.; Bamidis, P.; Konstantinidis, E. Synthetic Subject Generation with Coupled Coherent Time Series Data. Eng. Proc. 2022, 18, 7. https://doi.org/10.3390/engproc2022018007