Multi-Output Regression with Generative Adversarial Networks (MOR-GANs)
Abstract
1. Introduction
1.1. Related Work
1.2. Contributions and Outline
2. Methods
2.1. Data Generation
2.2. Gaussian Process Regression
2.3. Generative Adversarial Networks
2.4. Wasserstein Generative Adversarial Networks
2.5. Regression with WGAN
- Random input: Random variables are assigned to the latent space, from which the generator of the WGAN yields a realistic output: an n-tuple of the independent and dependent variables associated with the regression problem. By sampling the generator many times, the probability density function learned by the generator can be assessed. The value of the independent variable(s) cannot be controlled, however, as they are an output of the generator (see the sketch after this list). Although random inputs allow us to see the distribution learned by the generator, having the facility to constrain the independent variables is an important feature.
- Constrained input: An algorithm is used in conjunction with the (trained) WGAN to find predictions of the dependent variable(s) for given value(s) of the independent variable(s). This results in behaviour similar to that of a GPR, where the independent variables are inputs (and can be prescribed) and the outputs are the dependent variables. An inherent property of a trained WGAN is that both independent and dependent variables are contained within the output of the generator. Using the constrained input method described here, a WGAN can therefore make a prediction for any combination of known and unknown variables, with the independent variables being treated in the same way as the dependent variables. A GPR, however, can only make predictions for the particular set of dependent variables it was trained on, given the set of independent variables it was trained on.
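The following is a minimal sketch of the random-input mode in TensorFlow/Keras (Keras is the framework cited by the paper). The generator architecture below is a hypothetical stand-in for a trained WGAN generator, not the paper's network; only the sampling procedure itself is taken from the description above.

```python
import tensorflow as tf

latent_dim = 3  # latent space dimension used for most experiments in the paper

# Hypothetical stand-in for the trained WGAN generator, which maps latent
# variables z to a joint sample (x, y) of the regression variables.
generator = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(latent_dim,)),
    tf.keras.layers.Dense(2),  # output: one (x, y) pair
])

# Sample the latent space many times and push the samples through the
# generator; a histogram of the outputs approximates the density the
# generator has learned.
z = tf.random.normal([10_000, latent_dim])
samples = generator(z).numpy()  # each row is one (x, y) sample
# Note that x cannot be prescribed in this mode: it is an output of the generator.
```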
Algorithm 1 Prediction Function. Built to be used in conjunction with the trained WGAN, to constrain the independent variable of the regression problem.
Require: the desired value of the independent variable x*, initial values of the latent variables z^(0), the trained generator G, the number of iterations N.
for i = 1, …, N do
    (x^(i), y^(i)) ← G(z^(i−1))    ▹ Output of GAN from latent space at iteration i
    ℒ^(i) ← ‖x^(i) − x*‖²    ▹ Work out mismatch between GAN output and desired value
    z^(i) ← z^(i−1) − α ∇_z ℒ^(i)    ▹ Adjust latent space by backpropagating mismatch
end for
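Below is a minimal sketch of Algorithm 1 in TensorFlow/Keras, treating the first component of the generator's output as the independent variable x. The stand-in generator, the squared-error mismatch and the use of Adam for the latent update are illustrative assumptions; only the loop structure (generate, measure the mismatch, backpropagate into the latent variables) comes from the algorithm.

```python
import tensorflow as tf

latent_dim = 3

# Hypothetical stand-in for the trained WGAN generator (as in the previous sketch).
generator = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(latent_dim,)),
    tf.keras.layers.Dense(2),  # output: one (x, y) pair
])

def predict_constrained(generator, x_star, n_iters=500, lr=1e-2):
    """Optimise the latent variables so that the generator's x-output matches
    the desired value x_star, then return the full (x, y) output."""
    z = tf.Variable(tf.random.normal([1, latent_dim]))  # initial latent variables
    opt = tf.keras.optimizers.Adam(learning_rate=lr)
    for _ in range(n_iters):
        with tf.GradientTape() as tape:
            output = generator(z)  # output of the GAN from the latent space
            # mismatch between the GAN's x-output and the desired value
            mismatch = tf.reduce_sum((output[:, 0] - x_star) ** 2)
        grads = tape.gradient(mismatch, [z])
        opt.apply_gradients(zip(grads, [z]))  # adjust the latent space
    return generator(z)

prediction = predict_constrained(generator, x_star=0.5)  # (x, y) with x ≈ 0.5
```

Because the mismatch can be computed on any subset of the generator's outputs, the same loop can constrain any combination of known variables and recover the unknown ones, which is the property the constrained-input bullet contrasts with GPR.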
2.6. WGAN Architectures
2.7. Visualisation
2.8. Statistical Analysis
3. Results from Synthetic Datasets
3.1. Training
Algorithm 2 WGAN with gradient penalty and sample-wise optimisation. All experiments in the paper used the default values λ = 10 and n_critic = 5. This algorithm is a modified version of the one presented by Gulrajani et al. [13].
Require: the gradient penalty coefficient λ, the number of critic iterations per generator iteration n_critic, the batch size m, Adam hyperparameters α, β₁, β₂.
Require: initial critic parameters w₀, initial generator parameters θ₀.
while θ has not converged do
    for t = 1, …, n_critic do
        for i = 1, …, m do
            Sample real data x ~ P_r, latent variable z ~ p(z), and a random number ε ~ U[0, 1].
            x̃ ← G_θ(z)
            x̂ ← εx + (1 − ε)x̃
            L^(i) ← D_w(x̃) − D_w(x) + λ(‖∇_x̂ D_w(x̂)‖₂ − 1)²
        end for
        w ← Adam(∇_w (1/m) Σ_{i=1}^{m} L^(i), w, α, β₁, β₂)
    end for
    Sample a batch of latent variables {z^(i)}_{i=1}^{m} ~ p(z).
    θ ← Adam(∇_θ (1/m) Σ_{i=1}^{m} −D_w(G_θ(z^(i))), θ, α, β₁, β₂)
end while
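Below is a minimal sketch of the critic and generator updates of Algorithm 2 in TensorFlow/Keras. The network architectures, the learning rate and the Adam decay rates (set here to the defaults of Gulrajani et al. [13]) are assumptions, and the paper's sample-wise optimisation modification is not reproduced; the sketch shows only the standard WGAN-GP steps.

```python
import tensorflow as tf

latent_dim, data_dim, lam = 3, 2, 10.0  # lam: gradient penalty coefficient

# Hypothetical stand-in networks for the critic D and the generator G.
critic = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(data_dim,)),
    tf.keras.layers.Dense(1),
])
generator = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(latent_dim,)),
    tf.keras.layers.Dense(data_dim),
])
# Adam settings follow Gulrajani et al. [13]; the paper's exact values may differ.
c_opt = tf.keras.optimizers.Adam(1e-4, beta_1=0.0, beta_2=0.9)
g_opt = tf.keras.optimizers.Adam(1e-4, beta_1=0.0, beta_2=0.9)

def critic_step(real):
    """One critic update: Wasserstein loss plus gradient penalty."""
    m = tf.shape(real)[0]
    z = tf.random.normal([m, latent_dim])      # latent variables z ~ p(z)
    eps = tf.random.uniform([m, 1])            # random number ε ~ U[0, 1]
    with tf.GradientTape() as tape:
        fake = generator(z)
        x_hat = eps * real + (1.0 - eps) * fake  # interpolate real/generated
        with tf.GradientTape() as gp_tape:
            gp_tape.watch(x_hat)
            d_hat = critic(x_hat)
        g = gp_tape.gradient(d_hat, x_hat)       # ∇_x̂ D(x̂)
        penalty = tf.reduce_mean((tf.norm(g, axis=1) - 1.0) ** 2)
        loss = (tf.reduce_mean(critic(fake)) - tf.reduce_mean(critic(real))
                + lam * penalty)
    grads = tape.gradient(loss, critic.trainable_variables)
    c_opt.apply_gradients(zip(grads, critic.trainable_variables))

def generator_step(m):
    """One generator update on a fresh batch of latent variables."""
    z = tf.random.normal([m, latent_dim])
    with tf.GradientTape() as tape:
        loss = -tf.reduce_mean(critic(generator(z)))
    grads = tape.gradient(loss, generator.trainable_variables)
    g_opt.apply_gradients(zip(grads, generator.trainable_variables))
```

In a training loop, `critic_step` would be called n_critic = 5 times per call to `generator_step`, matching the structure of Algorithm 2.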
3.2. Single-Output Regression with Random Input Values
3.2.1. 1D Uni-Modal Examples
3.2.2. 1D Multi-Modal Examples
3.2.3. Confidence of Solutions from the Critic
3.2.4. 2D Uni- and Multi-Modal Examples
3.3. Single-Output Regression with Constrained Input Values
3.4. Multi-Output Regression with MOR-GAN
3.4.1. 1D Eye Dataset with Covariance
3.4.2. Co-Varying Spiral Dataset
4. Silver Nanoparticle Data
5. Execution Time of Method
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Nomenclature
Section 2 and Algorithm 2 from Section 3 | 
---|---
G, D | generator and discriminator (or critic) networks (for GANs, D is referred to as the discriminator; for WGANs, it is referred to as the critic)
z | latent variables
L, V | the loss function and the function describing the two-player min-max game
x, x̃ | samples from the real and the generated data
P_r, P_g | distributions of the real data and the generated data
p(z) | distribution of the latent variables
W | Wasserstein distance between distributions
γ | a joint distribution
{f : ‖f‖_L ≤ 1} | set of 1-Lipschitz functions
x̂ | a linear combination of a real sample and a generated sample (at which the gradient penalty will be imposed)
λ | gradient penalty coefficient
ℒ | mismatch between the desired (partial) output of the GAN and the actual (partial) output of the GAN
x, y | independent and dependent variables
x*, y* | particular values of the independent and dependent variables
ε | random number
U | uniform probability distribution
α | learning rate
β₁, β₂ | optimiser hyperparameters
n_critic | number of iterations of the critic
m | batch size
N | number of iterations

Section 3 | 
---|---
x, y, z | independent and dependent variables
 | independent and dependent variables (multi-output datasets)
θ | angle
 | a scalar controlling the amount of noise
 | random variable (noise) sampled from a Gaussian distribution with mean μ and standard deviation σ
h | distance function
References
- Borchani, H.; Varando, G.; Bielza, C.; Larrañaga, P. A survey on multi-output regression. WIREs Data Min. Knowl. Discov. 2015, 5, 216–233.
- Xu, D.; Shi, Y.; Tsang, I.W.; Ong, Y.S.; Gong, C.; Shen, X. Survey on Multi-Output Learning. IEEE Trans. Neural Netw. Learn. Syst. 2020, 31, 2409–2429.
- Rasmussen, C.E. Gaussian Processes in machine learning. In Advanced Lectures on Machine Learning; Springer: Berlin/Heidelberg, Germany, 2004; Volume 3176, pp. 63–71.
- Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Nets. arXiv 2014, arXiv:1406.2661.
- Kazeminia, S.; Baur, C.; Kuijper, A.; van Ginneken, B.; Navab, N.; Albarqouni, S.; Mukhopadhyay, A. GANs for Medical Image Analysis. Artif. Intell. Med. 2020, 109, 101938.
- Wang, K.; Gou, C.; Duan, Y.; Lin, Y.; Zheng, X.; Wang, F.Y. Generative adversarial networks: Introduction and outlook. IEEE/CAA J. Autom. Sin. 2017, 4, 588–598.
- Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv 2015, arXiv:1511.06434.
- Kunfeng, W.; Yue, L.; Yutong, W.; Fei-Yue, W. Parallel imaging: A unified theoretical framework for image generation. In Proceedings of the 2017 Chinese Automation Congress, CAC 2017, Jinan, China, 20–22 October 2017; pp. 7687–7692.
- Zhang, K.; Kang, Q.; Wang, X.; Zhou, M.; Li, S. A visual domain adaptation method based on enhanced subspace distribution matching. In Proceedings of ICNSC 2018—the 15th IEEE International Conference on Networking, Sensing and Control, Zhuhai, China, 27–29 March 2018; pp. 1–6.
- Jolaade, M.; Silva, V.L.; Heaney, C.E.; Pain, C.C. Generative Networks Applied to Model Fluid Flows. In Proceedings of the International Conference on Computational Science, London, UK, 21–23 June 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 742–755.
- Salimans, T.; Goodfellow, I.; Zaremba, W.; Cheung, V.; Radford, A.; Chen, X. Improved Techniques for Training GANs. arXiv 2016, arXiv:1606.03498.
- Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein GAN. arXiv 2017, arXiv:1701.07875.
- Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A. Improved Training of Wasserstein GANs. arXiv 2017, arXiv:1704.00028.
- Barnett, S.A. Convergence Problems with Generative Adversarial Networks (GANs). arXiv 2018, arXiv:1806.11382.
- Aggarwal, K.; Kirchmeyer, M.; Yadav, P.; Keerthi, S.S.; Gallinari, P. Regression with Conditional GAN. arXiv 2019, arXiv:1905.12868.
- McDermott, M.B.A.; Yan, T.; Naumann, T.; Hunt, N.; Suresh, H.; Szolovits, P.; Ghassemi, M. Semi-Supervised Biomedical Translation with Cycle Wasserstein Regression GANs. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
- Schulz, E.; Speekenbrink, M.; Krause, A. A tutorial on Gaussian process regression: Modelling, exploring, and exploiting functions. J. Math. Psychol. 2018, 85, 1–16.
- Rasmussen, C.; Williams, C. Gaussian Processes for Machine Learning; MIT Press: Cambridge, MA, USA, 2006.
- Silva, V.L.; Heaney, C.E.; Li, Y.; Pain, C.C. Data Assimilation Predictive GAN (DA-PredGAN): Applied to determine the spread of COVID-19. arXiv 2021, arXiv:2105.07729.
- Wang, S.; Tarroni, G.; Qin, C.; Mo, Y.; Dai, C.; Chen, C.; Glocker, B.; Guo, Y.; Rueckert, D.; Bai, W. Deep generative model-based quality control for cardiac MRI segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Lima, Peru, 4–8 October 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 88–97.
- Le, Q.V.; Smola, A.J.; Canu, S. Heteroscedastic Gaussian process regression. In Proceedings of ICML 2005—the 22nd International Conference on Machine Learning, Bonn, Germany, 7–11 August 2005; ACM Press: New York, NY, USA, 2005; pp. 489–496.
- Kim, H.C.; Lee, J. Clustering based on Gaussian processes. Neural Comput. 2007, 19, 3088–3107.
- Kolmogorov, A.N. Interpolation and extrapolation of stationary random sequences. In Selected Works of A. N. Kolmogorov; Springer: Dordrecht, The Netherlands, 1992.
- Wiener, N. Extrapolation, Interpolation and Smoothing of Stationary Time Series; MIT Press: Cambridge, MA, USA, 1949.
- Sacks, J.; Welch, W.J.; Mitchell, T.J.; Wynn, H.P. Design and Analysis of Computer Experiments; Institute of Mathematical Statistics: Hayward, CA, USA, 1989.
- GPy. GPy: A Gaussian Process Framework in Python. 2012. Available online: http://github.com/SheffieldML/GPy (accessed on 20 December 2020).
- Chollet, F. Keras. 2015. Available online: https://github.com/fchollet/keras (accessed on 20 December 2020).
- Smirnov, N.V. On the estimation of the discrepancy between empirical curves of distribution for two independent samples. Bull. Math. Univ. Moscou 1939, 2, 3–14.
- Mann, H.B.; Whitney, D.R. On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 1947, 18, 50–60.
- Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nat. Methods 2020, 17, 261–272.
- Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved Training of Wasserstein GANs. In Advances in Neural Information Processing Systems 30; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2017; pp. 5767–5777.
- Michaeloudes, C.; Seiffert, J.; Chen, S.; Ruenraroengsak, P.; Bey, L.; Theodorou, I.G.; Ryan, M.; Cui, X.; Zhang, J.; Shaffer, M.; et al. Effect of silver nanospheres and nanowires on human airway smooth muscle cells: Role of sulfidation. Nanoscale Adv. 2020, 2, 5635–5647.
- Quadros, M.E.; Marr, L.C. Silver nanoparticles and total aerosols emitted by nanotechnology-related consumer spray products. Environ. Sci. Technol. 2011, 45, 10713–10719.
- Benn, T.; Cavanagh, B.; Hristovski, K.; Posner, J.D.; Westerhoff, P. The Release of Nanosilver from Consumer Products Used in the Home. J. Environ. Qual. 2010, 39, 1875–1882.
- Silva, V.L.S.; Heaney, C.E.; Pain, C.C. GAN for time series prediction, data assimilation and uncertainty quantification. arXiv 2021, arXiv:2105.13859.
Dataset | Dimension | Noise | Distribution | Output | Input Type | Section
---|---|---|---|---|---|---
sine wave | 1D | ✓ | uni-modal | single output | random | 3.2.1
heteroscedastic | 1D | ✓ | uni-modal | single output | random | 3.2.1
circle | 1D | ✓ | multi-modal | single output | random | 3.2.2
sine wave with lines | 1D | ✓ | multi-modal | single output | random | 3.2.2
distance | 2D | ✗ | uni-modal | single output | random | 3.2.4
helix | 2D | ✓ | multi-modal | single output | random | 3.2.4
sine wave | 1D | ✓ | uni-modal | single output | constrained | 3.3
heteroscedastic | 1D | ✓ | uni-modal | single output | constrained | 3.3
circle | 1D | ✓ | multi-modal | single output | constrained | 3.3
eye | 1D | ✗ | multi-modal | multi-output | constrained | 3.4.1
spiral | 2D | ✗ | uni-modal | multi-output | constrained | 3.4.2
Hyperparameters | Single-Output | Multi-Output
---|---|---
Learning rate |  | 
Number of critic iterations per generator iteration | 5 | 5
Batch size | 100 | 32
Latent space dimension | 3 | 3 (a different value was used for the spiral problem)
Adam optimiser hyperparameters β₁, β₂ (decay rates of moving averages) |  | 
Gradient penalty hyperparameter λ | 10 | 10
Layer | Kernel Size | Strides | Padding | Use Bias
---|---|---|---|---
Conv2D_1 |  |  | same | True
Conv2D_2 |  |  | same | True
Conv2D_transpose_1 |  |  | same | False
Conv2D_transpose_2 |  |  | same | False
Conv2D_transpose_3 |  |  | same | False
Conv2D_3 |  |  | same | True
Conv2D_4 |  |  | same | True
Molecule | Specific Surface Area (m² g⁻¹)
---|---
Ag+ | 4.4
s-AgNWs | 4.6
50 nm AgNSs | 6
20 nm AgNSs | 40.4
Dataset | GPR (s) | WGAN—Random (s) | WGAN—Constrained (s)
---|---|---|---
sine wave |  |  | 
heteroscedastic |  |  | 
circle |  |  | 
helix |  |  | 
silver nanoparticle |  |  | 
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).