TaijiGNN: A New Cycle-Consistent Generative Neural Network for High-Quality Bidirectional Transformation between RGB and Multispectral Domains
Abstract
1. Introduction
1.1. The Background
1.2. The Contributions
- We proposed a new neural network, TaijiGNN (Taiji Generative Neural Network), whose cycle-consistent architecture converts the problem of comparing images from two different domains into comparing images within the same domain, providing a solid theoretical foundation for translation between two distinct image domains.
- We designed multilayer perceptron (MLP) generators in place of the conventional convolutional generators, and showed experimentally that our MLP generators outperform traditional CNN-based generators for spectral super-resolution.
- We eliminated the traditional CycleGAN identity loss, so that the model can handle domains of different dimensionality, and introduced two consistency losses into TaijiGNN to improve training performance (see the sketch after this list).
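To make the first and third contributions concrete, the following minimal TensorFlow sketch shows how the two consistency losses compare reconstructed samples against originals of the same domain. The names (`generator_g`, `generator_f`) and the L1 distance are our illustrative assumptions, not the paper's exact code:

```python
import tensorflow as tf

# Sketch of the two consistency losses, assuming Keras-style generators
# and an L1 distance (both assumptions, not the paper's verbatim code).
def consistency_losses(generator_g, generator_f, x, y, lam=1.0):
    y_hat = generator_g(x, training=True)         # G: X -> Y
    x_hat = generator_f(y, training=True)         # F: Y -> X
    x_cycled = generator_f(y_hat, training=True)  # X -> Y -> X
    y_cycled = generator_g(x_hat, training=True)  # Y -> X -> Y
    # Each term compares two samples from the SAME domain, which is the
    # key point of the cycle-consistent architecture.
    loss_x = tf.reduce_mean(tf.abs(x - x_cycled))
    loss_y = tf.reduce_mean(tf.abs(y - y_cycled))
    return lam * (loss_x + loss_y)
```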
2. Related Work
3. The Proposed Method
3.1. The Problem Analysis
3.2. The Proposed Approach
3.3. The Detailed Implementation of the Approach
3.3.1. The Model Architecture
3.3.2. The Model Training
4. Experiments
4.1. The Experiment Requisites
4.2. The CAVE Dataset
4.2.1. The Dataset Processing and the Hyperparameters Setting
4.2.2. The Qualitative Evaluation
4.2.3. The Quantitative Measurement
4.3. The ICVL Dataset
4.3.1. The Dataset Processing and the Hyperparameters Setting
4.3.2. The Qualitative Evaluation
4.3.3. The Quantitative Measurement
5. Limitations
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. TaijiGNN Performance Summary
|  | CAVE (CPU) | CAVE (GPU) | ICVL (CPU) | ICVL (GPU) |
| --- | --- | --- | --- | --- |
| Training image size | RGB (16, 512, 512, 3); MSI (16, 512, 512, 31) | RGB (16, 512, 512, 3); MSI (16, 512, 512, 31) | RGB (16, 512, 512, 3); MSI (16, 512, 512, 31) | RGB (16, 512, 512, 3); MSI (16, 512, 512, 31) |
| Average one-epoch training time (s) | 113 | 8 | 113 | 7 |
| Inferencing image size | RGB (16, 512, 512, 3); MSI (16, 512, 512, 31) | RGB (16, 512, 512, 3); MSI (16, 512, 512, 31) | RGB (1, 1392, 1300, 3); MSI (1, 1392, 1300, 31) | RGB (1, 1392, 1300, 3); MSI (1, 1392, 1300, 31) |
| Average image inferencing time (s), RGB2MSI | 9.5 | 1.1 | 4.1 | 0.4 |
| Average image inferencing time (s), MSI2RGB | 9.2 | 1.1 | 4.1 | 0.4 |
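Because the generators are per-pixel MLPs, an image batch such as the RGB (16, 512, 512, 3) tensors above can be flattened into pixel rows before inference and reshaped back afterwards. A hedged sketch under that assumption (the function name is ours; the 10,240 batch size mirrors the batch_size reported in the hyperparameter tables):

```python
import numpy as np

# Flatten an image batch (N, H, W, C) into per-pixel rows, push the rows
# through the per-pixel MLP generator, then restore the image layout.
# `generator` is assumed to be a Keras model mapping 3 -> 31 channels.
def translate_batch(generator, rgb_batch: np.ndarray) -> np.ndarray:
    n, h, w, c = rgb_batch.shape
    pixels = rgb_batch.reshape(-1, c)                  # (N*H*W, 3)
    msi_pixels = generator.predict(pixels, batch_size=10240)
    return msi_pixels.reshape(n, h, w, -1)             # (N, H, W, 31)
```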
References
- Wikipedia. Multispectral Image. Available online: https://en.wikipedia.org/wiki/Multispectral_image (accessed on 4 August 2021).
- Dwight, J.G.; Weng, C.Y.; Coffee, R.E.; Pawlowski, M.E.; Tkaczyk, T.S. Hyperspectral image mapping spectrometry for retinal oximetry measurements in four diseased eyes. Int. Ophthalmol. Clin. 2016, 56, 25.
- Verdú, S.; Vásquez, F.; Grau, R.; Ivorra, E.; Sánchez, A.J.; Barat, J.M. Detection of adulterations with different grains in wheat products based on the hyperspectral image technique: The specific cases of flour and bread. Food Control 2016, 62, 373–380.
- Edelman, G.; Gaston, E.; van Leeuwen, T.; Cullen, P.; Aalders, M. Hyperspectral imaging for non-contact analysis of forensic traces. Forensic Sci. Int. 2012, 223, 28–39.
- Edelman, G.; Aalders, M. Photogrammetry using visible, infrared, hyperspectral and thermal imaging of crime scenes. Forensic Sci. Int. 2018, 292, 181–189.
- Sun, T.; Jung, C.; Fu, Q.; Han, Q. NIR to RGB Domain Translation Using Asymmetric Cycle Generative Adversarial Networks. IEEE Access 2019, 7, 112459–112469.
- Parkkinen, J.; Jaaskelainen, T.; Kuittinen, M. Spectral representation of color images. In Proceedings of the 9th International Conference on Pattern Recognition, Rome, Italy, 14–17 November 1988; Volume 2, pp. 933–935.
- Chen, H.; Liu, Y. Chapter 2—Teeth. In Advanced Ceramics for Dentistry; Shen, J.Z., Kosmač, T., Eds.; Butterworth-Heinemann: Oxford, UK, 2014; pp. 5–21.
- Wikipedia. Taiji (Philosophy). 2020. Available online: https://en.wikipedia.org/wiki/Taiji_(philosophy) (accessed on 4 August 2021).
- Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
- Perera, P.; Abavisani, M.; Patel, V.M. In2I: Unsupervised Multi-Image-to-Image Translation Using Generative Adversarial Networks. In Proceedings of the 2018 24th International Conference on Pattern Recognition (ICPR), Beijing, China, 20–24 August 2018.
- Stiebel, T.; Koppers, S.; Seltsam, P.; Merhof, D. Reconstructing Spectral Images from RGB-Images Using a Convolutional Neural Network. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; pp. 1061–10615.
- Can, Y.B.; Timofte, R. An efficient CNN for spectral reconstruction from RGB images. arXiv 2018, arXiv:1804.04647.
- Nguyen, R.M.; Prasad, D.K.; Brown, M.S. Training-based spectral reconstruction from a single RGB image. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; Springer: Zurich, Switzerland, 2014; pp. 186–201.
- Arad, B.; Ben-Shahar, O. Sparse Recovery of Hyperspectral Signal from Natural RGB Images. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; pp. 19–34.
- Choi, I.; Jeon, D.S.; Nam, G.; Gutierrez, D.; Kim, M.H. High-Quality Hyperspectral Reconstruction Using a Spectral Prior. ACM Trans. Graph. 2017, 36, 218.
- Xiong, Z.; Shi, Z.; Li, H.; Wang, L.; Liu, D.; Wu, F. HSCNN: CNN-Based Hyperspectral Image Recovery from Spectrally Undersampled Projections. In Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), Venice, Italy, 22–29 October 2017; pp. 518–525.
- Shi, Z.; Chen, C.; Xiong, Z.; Liu, D.; Wu, F. HSCNN+: Advanced CNN-Based Hyperspectral Recovery from RGB Images. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; pp. 1052–10528.
- Arad, B.; Ben-Shahar, O.; Timofte, R.; Van Gool, L.; Zhang, L.; Yang, M.H. NTIRE 2018 Challenge on Spectral Reconstruction from RGB Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; pp. 1042–1051.
- Kaya, B.; Can, Y.B.; Timofte, R. Towards Spectral Estimation from a Single RGB Image in the Wild. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Korea, 27–28 October 2019.
- Gwn Lore, K.; Reddy, K.K.; Giering, M.; Bernal, E.A. Generative Adversarial Networks for Spectral Super-Resolution and Bidirectional RGB-to-Multispectral Mapping. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 16–17 June 2019; pp. 926–933.
- Arad, B.; Timofte, R.; Ben-Shahar, O.; Lin, Y.T.; Finlayson, G.; Givati, S.; Li, J.; Wu, C.; Song, R.; Li, Y.; et al. NTIRE 2020 Challenge on Spectral Reconstruction from an RGB Image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 14–19 June 2020.
- Li, J.; Wu, C.; Song, R.; Li, Y.; Liu, F. Adaptive Weighted Attention Network with Camera Spectral Sensitivity Prior for Spectral Reconstruction from RGB Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 14–19 June 2020.
- Wikia. CIE 1931 Color Space. Available online: https://psychology.wikia.org/wiki/CIE_1931_color_space (accessed on 4 August 2021).
- Google. TensorFlow Tutorials. Available online: https://www.tensorflow.org/tutorials/generative/cyclegan (accessed on 4 August 2021).
- Yasuma, F.; Mitsunaga, T.; Iso, D.; Nayar, S. Generalized Assorted Pixel Camera: Post-Capture Control of Resolution, Dynamic Range and Spectrum; Technical Report; Columbia University: New York, NY, USA, 2008.
- Ng, A. Machine Learning Yearning. Available online: https://github.com/ajaymache/machine-learning-yearning (accessed on 4 August 2021).
| Symbol | Meaning |
| --- | --- |
| x | Original sample from domain X |
| y | Original sample from domain Y |
| x̂ | Sample generated by generator F and paired with y |
| ŷ | Sample generated by generator G and paired with x |
| X | Domain X |
| Y | Domain Y |
| x̃ | Sample generated by generator F and unpaired with y |
| ỹ | Sample generated by generator G and unpaired with x |
| G | Generator; translates samples from domain X into domain Y |
| F | Generator; translates samples from domain Y into domain X |
| D_X | Discriminator of domain X |
| D_Y | Discriminator of domain Y |
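With this notation, the overall objective can be written out. The form below is our hedged reconstruction, assuming L1 consistency terms with a shared weight λ (the weight swept in the tables near the end of this section); it is not a verbatim copy of the paper's equations:

```latex
\begin{aligned}
\mathcal{L}_{\mathrm{cyc}}(G,F) &= \mathbb{E}_{x \sim X}\!\left[\lVert F(G(x)) - x \rVert_{1}\right]
  + \mathbb{E}_{y \sim Y}\!\left[\lVert G(F(y)) - y \rVert_{1}\right], \\
\mathcal{L}(G,F,D_X,D_Y) &= \mathcal{L}_{\mathrm{GAN}}(G,D_Y) + \mathcal{L}_{\mathrm{GAN}}(F,D_X)
  + \lambda\,\mathcal{L}_{\mathrm{cyc}}(G,F).
\end{aligned}
```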
Generator G (RGB to MSI):

| Layer | Input | Output | Activation Function |
| --- | --- | --- | --- |
| Dense | RGB (3) | 512 | Leaky ReLU |
| Dense | 512 | 512 | Leaky ReLU |
| Dense | 512 | MSI (31) | tanh |
Generator F (MSI to RGB):

| Layer | Input | Output | Activation Function |
| --- | --- | --- | --- |
| Dense | MSI (31) | 512 | Leaky ReLU |
| Dense | 512 | 512 | Leaky ReLU |
| Dense | 512 | RGB (3) | tanh |
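The two architecture tables translate almost line for line into Keras. A minimal sketch, assuming TensorFlow 2.x and the layer sizes from the tables (the helper name `make_generator` is ours):

```python
import tensorflow as tf
from tensorflow.keras import Sequential, layers

def make_generator(in_dim: int, out_dim: int) -> Sequential:
    """Three-layer per-pixel MLP: two Dense(512) + LeakyReLU blocks,
    then a tanh-activated output layer, matching the tables above."""
    return Sequential([
        layers.Input(shape=(in_dim,)),
        layers.Dense(512),
        layers.LeakyReLU(),
        layers.Dense(512),
        layers.LeakyReLU(),
        layers.Dense(out_dim, activation="tanh"),
    ])

generator_g = make_generator(3, 31)   # G: RGB(3)  -> MSI(31)
generator_f = make_generator(31, 3)   # F: MSI(31) -> RGB(3)
```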
| Hardware | Model |
| --- | --- |
| Central processing unit | Intel Xeon E5-1620 (3.5 GHz) |
| Graphics processing unit | NVIDIA TITAN Xp |
| RAM | Samsung DDR4, 32 GB |
| Mechanical hard disk drive | Westwood, 4 TB |
| SSD | Intel, 512 GB |
| Software | Version |
| --- | --- |
| Operating system | Ubuntu 18.04.5 LTS |
| Programming language | Python 3.6.9 |
| Machine learning framework | TensorFlow (GPU) 2.3.1 |
| GPU programming framework | NVIDIA CUDA 11.2.0 |
|  | Berk | Kin | Arad | Ours |
| --- | --- | --- | --- | --- |
| Method | CNNs | cGANs | Sparse coding | TaijiGNN |
| Training:testing split | – | 50%:50% | – | 50%:50% |
| RMSE (0–255) | – | 8.0622 | 5.4 | 5.656 |
| RMSE (0–1) | 0.038 | – | – | 0.022 |
| SSIM | 0.94 | – | – | 0.975 |
| PSNR | 28.78 | – | – | 33.91 |
|  | Berk | Kin | Ours |
| --- | --- | --- | --- |
| Method | CNNs | cGANs | TaijiGNN |
| Training:testing split | – | 50%:50% | 50%:50% |
| RMSE (0–255) | 2.55 | 5.649 | 1.727 |
| RMSE (0–1) | 0.038 | – | 0.0068 |
| SSIM | 0.94 | – | 0.99 |
| PSNR | 28.78 | – | 45.07 |
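For reference, the metrics reported in the two tables above can be computed as in the sketch below (assuming float images scaled to [0, 1]; `tf.image` supplies PSNR and SSIM, and the 0–255 RMSE is simply the 0–1 RMSE scaled by 255):

```python
import tensorflow as tf

# Metric sketch for a predicted vs. ground-truth image batch (N, H, W, C),
# both float32 and scaled to [0, 1].
def evaluate(pred: tf.Tensor, truth: tf.Tensor) -> dict:
    rmse_01 = tf.sqrt(tf.reduce_mean(tf.square(pred - truth)))
    return {
        "RMSE (0-1)":   float(rmse_01),
        "RMSE (0-255)": float(rmse_01) * 255.0,
        "PSNR": float(tf.reduce_mean(tf.image.psnr(pred, truth, max_val=1.0))),
        "SSIM": float(tf.reduce_mean(tf.image.ssim(pred, truth, max_val=1.0))),
    }
```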
| Author | Arad et al. [15] | Zhiwei et al. [17] | Ours |
| --- | --- | --- | --- |
| Method | Sparse coding | HSCNN | TaijiGNN |
| Training:testing split | – | 141:59 | 32 (train):68 (validation):101 (test) |
| Evaluation metric | rRMSE | rRMSE | rRMSE |
| Park subset | 0.0589 | 0.0371 | 0.0347 |
| Urban subset | 0.0617 | 0.0388 | 0.0335 |
| Indoor subset | 0.0507 | 0.0638 | 0.0495 |
| Plant-life subset | 0.0469 | 0.0445 | 0.0434 |
| Rural subset | 0.0354 | 0.0331 | 0.0369 |
| Subset average | 0.0507 | 0.0435 | 0.0396 |
| Complete set | 0.0756 | 0.0388 | 0.0358 |
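The rRMSE in this table normalizes the reconstruction error by the ground-truth intensity. One common form from the spectral-reconstruction literature (our assumption; the paper's exact variant may differ) is:

```latex
\mathrm{rRMSE} \;=\; \sqrt{\frac{1}{N}\sum_{i=1}^{N}
  \left(\frac{I^{\mathrm{est}}_{i} - I^{\mathrm{gt}}_{i}}{I^{\mathrm{gt}}_{i}}\right)^{2}},
\qquad i \text{ ranging over all pixels and spectral bands.}
```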
TaijiGNN, train (32 samples):test (169 samples):

| Metric | SSIM | PSNR | rRMSE | RMSE | RMSE_INT |
| --- | --- | --- | --- | --- | --- |
| MSI to RGB | 1.000 | 54.994 | 0.004 | 0.002 | 0.492 |
| RGB to MSI | 0.999 | 44.665 | 0.029 | 0.007 | 1.700 |
| λ | PSNR | SSIM | RMSE (0–1) | RMSE (0–255) |
| --- | --- | --- | --- | --- |
| 0.01 | 20.149 | 0.6944 | 0.1073 | 27.368 |
| 0.1 | 33.383 | 0.9692 | 0.0247 | 6.302 |
| 0.5 | 33.821 | 0.9731 | 0.0232 | 5.915 |
| 1 | 33.683 | 0.9727 | 0.0234 | 6.010 |
| 2 | 33.782 | 0.9728 | 0.0234 | 5.957 |
| 3 | 33.767 | 0.9728 | 0.0232 | 5.966 |

CAVE dataset, train_epoch = 300, batch_size = 10,240.
| λ | PSNR | SSIM | RMSE (0–1) | RMSE (0–255) |
| --- | --- | --- | --- | --- |
| 0.01 | 16.672 | 0.7793 | 0.1569 | 40.013 |
| 0.1 | 38.261 | 0.9784 | 0.0136 | 3.467 |
| 0.5 | 46.318 | 0.9919 | 0.0054 | 1.381 |
| 1 | 46.273 | 0.9911 | 0.0056 | 1.423 |
| 2 | 50.557 | 0.9924 | 0.0034 | 0.875 |
| 3 | 50.955 | 0.9953 | 0.0032 | 0.809 |

CAVE dataset, train_epoch = 300, batch_size = 10,240.