A Light-Field Video Dataset of Scenes with Moving Objects Captured with a Plenoptic Video Camera
Abstract
1. Introduction
Contributions
- Video Sequence Dataset: Unlike typical light-field image datasets, which consist of individual frames, our dataset provides a 300-frame sequence for each light-field video. The additional temporal data aids interpolation and supports the development of algorithms that exploit consecutive frames, which image-based datasets cannot offer.
- Uncontrolled Speed Features: Our dataset captures different objects moving at various uncontrolled speeds. Accurate prediction is crucial in ML and deep learning; scenes with controlled speed make prediction easier but less reliable. Because our dataset includes scenes with uncontrolled speeds and behaviours, it mirrors real-world unpredictability and leads to more robust and valid algorithm performance.
2. Related Work
2.1. Light-Field Datasets
2.2. Light-Field Quality Assessment
3. Capture Configuration
4. Content Characterisation Methodologies
4.1. Motion Vector
4.2. SI, TI, and CF
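As an illustration of the SI and TI indicators named in this heading, the following is a minimal sketch that assumes the commonly used ITU-T P.910 definitions: SI as the maximum over time of the spatial standard deviation of the Sobel-filtered luma frame, and TI as the maximum over time of the spatial standard deviation of the difference between consecutive luma frames. The function names and the NumPy-only Sobel filter are illustrative assumptions and are not taken from the article; CF is not covered here.

```python
import numpy as np


def sobel_magnitude(frame):
    """Sobel gradient magnitude of a 2-D luma frame (interior pixels only)."""
    f = np.asarray(frame, dtype=np.float64)
    # Horizontal gradient: right column minus left column, weighted 1-2-1.
    gx = (f[:-2, 2:] + 2 * f[1:-1, 2:] + f[2:, 2:]
          - f[:-2, :-2] - 2 * f[1:-1, :-2] - f[2:, :-2])
    # Vertical gradient: bottom row minus top row, weighted 1-2-1.
    gy = (f[2:, :-2] + 2 * f[2:, 1:-1] + f[2:, 2:]
          - f[:-2, :-2] - 2 * f[:-2, 1:-1] - f[:-2, 2:])
    return np.hypot(gx, gy)


def spatial_information(frames):
    """SI (assumed P.910 form): max over frames of std of the Sobel magnitude."""
    return max(float(np.std(sobel_magnitude(f))) for f in frames)


def temporal_information(frames):
    """TI (assumed P.910 form): max over time of std of the frame difference.

    Requires at least two frames.
    """
    fs = [np.asarray(f, dtype=np.float64) for f in frames]
    return max(float(np.std(b - a)) for a, b in zip(fs, fs[1:]))
```

For a light-field video, such functions would typically be applied to the luma plane of each view or of the rendered sequence. Which of these the authors used is not stated in this outline.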
5. Analysis of the Content Characterisation Results
5.1. Motion Vectors Results Analysis
5.2. SI, TI, and CF Results Analysis
6. Light-Field Dataset Objective Quality Assessment
6.1. Light-Field Video Encoding Procedure
6.2. PSNR and SSIM Quality Metrics
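For reference, PSNR between a reference frame and its decoded counterpart follows the standard definition, PSNR = 10 log10(MAX² / MSE). The sketch below assumes 8-bit content (peak value 255) and uses a hypothetical helper name; it is not the authors' evaluation code, and SSIM is omitted because its windowing parameters are not given in this outline.

```python
import numpy as np


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    ref = np.asarray(reference, dtype=np.float64)
    dis = np.asarray(distorted, dtype=np.float64)
    if ref.shape != dis.shape:
        raise ValueError("frames must have the same shape")
    mse = np.mean((ref - dis) ** 2)
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))
```

For a video sequence, a per-sequence score is commonly obtained by averaging the per-frame PSNR values. Whether the article averages per frame or per view is an assumption left open here.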
6.3. Objective Quality Assessment Discussion
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AV1 | AOMedia Video 1 |
AVC | Advanced Video Coding |
CF | Colourfulness |
fps | frames per second |
HEVC | High Efficiency Video Coding |
HSV | Hue-Saturation-Value |
HSL | Hue-Saturation-Lightness |
Mbps | Megabits per second |
ML | Machine Learning |
PSNR | Peak Signal-to-Noise Ratio |
RGB | Red, Green, Blue |
SI | Spatial Information |
SR | Super Resolution |
SSIM | Structural Similarity Index |
TI | Temporal Information |
VVC | Versatile Video Coding |
VP9 | Video Codec 9 |
References
- Wu, G.; Masia, B.; Jarabo, A.; Zhang, Y.; Wang, L.; Dai, Q.; Chai, T.; Liu, Y. Light field image processing: An overview. IEEE J. Sel. Top. Signal Process. 2017, 11, 926–954. [Google Scholar] [CrossRef]
- Levoy, M.; Hanrahan, P. Light field rendering. In Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, New Orleans, LA, USA, 4–9 August 1996; pp. 31–42. [Google Scholar]
- Wang, Y.; Wang, L.; Liang, Z.; Yang, J.; Timofte, R.; Guo, Y.; Jin, K.; Wei, Z.; Yang, A.; Guo, S.; et al. NTIRE 2023 challenge on light field image super-resolution: Dataset, methods and results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 1320–1335. [Google Scholar]
- Schambach, M.; Heizmann, M. A multispectral light field dataset and framework for light field deep learning. IEEE Access 2020, 8, 193492–193502. [Google Scholar] [CrossRef]
- Jin, J.; Hou, J.; Chen, J.; Kwong, S. Light field spatial super-resolution via deep combinatorial geometry embedding and structural consistency regularization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2260–2269. [Google Scholar]
- ITU-T Recommendation H.264: Advanced Video Coding for Generic Audiovisual Services. 2003. Available online: https://www.itu.int/rec/T-REC-H.264 (accessed on 20 April 2024).
- ITU-T Recommendation H.265: High Efficiency Video Coding. 2013. Available online: https://www.itu.int/rec/T-REC-H.265 (accessed on 20 April 2024).
- Internet Engineering Task Force (IETF) RFC 7741, WebM Project: VP9 Bitstream Specification. 2016. Available online: https://www.webmproject.org/vp9/ (accessed on 20 April 2024).
- Alliance for Open Media: AV1 Bitstream and Decoding Process Specification. 2021. Available online: https://aomedia.org/av1-features/ (accessed on 20 April 2024).
- Wieckowski, A.; Brandenburg, J.; Hinz, T.; Bartnik, C.; George, V.; Hege, G.; Helmrich, C.; Henkel, A.; Lehmann, C.; Stoffers, C.; et al. VVenC: An Open And Optimized VVC Encoder Implementation. In Proceedings of the IEEE International Conference on Multimedia Expo Workshops (ICMEW), Abu Dhabi, United Arab Emirates, 25–28 October 2020; pp. 1–2. [Google Scholar] [CrossRef]
- Wieckowski, A.; Hege, G.; Bartnik, C.; Lehmann, C.; Stoffers, C.; Bross, B.; Marpe, D. Towards A Live Software Decoder Implementation For The Upcoming Versatile Video Coding (VVC) Codec. In Proceedings of the 2020 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 25–28 October 2020; pp. 3124–3128. [Google Scholar] [CrossRef]
- Guillo, L.; Jiang, X.; Lafruit, G.; Guillemot, C. Light Field Video Dataset Captured by a R8 Raytrix Camera (with Disparity Maps); Technical Report; International Telecommunication Union (ITU): Geneva, Switzerland, 2018. [Google Scholar]
- Wang, B.; Peng, Q.; Wang, E.; Han, K.; Xiang, W. Region-of-interest compression and view synthesis for light field video streaming. IEEE Access 2019, 7, 41183–41192. [Google Scholar] [CrossRef]
- Shafiee, E.; Martini, M.G. Datasets for the quality assessment of light field imaging: Comparison and future directions. IEEE Access 2023, 11, 15014–15029. [Google Scholar] [CrossRef]
- Wanner, S.; Meister, S.; Goldluecke, B. Datasets and benchmarks for densely sampled 4D light fields. Vision Model. Vis. 2013, 13, 225–226. [Google Scholar]
- Tao, M.W.; Hadap, S.; Malik, J.; Ramamoorthi, R. Depth from combining defocus and correspondence using light-field cameras. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 673–680. [Google Scholar]
- Li, N.; Ye, J.; Ji, Y.; Ling, H.; Yu, J. Saliency detection on light field. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 2806–2813. [Google Scholar]
- Yoon, Y.; Jeon, H.G.; Yoo, D.; Lee, J.Y.; So Kweon, I. Learning a deep convolutional network for light-field image super-resolution. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Santiago, Chile, 7–13 December 2015; pp. 24–32. [Google Scholar]
- Rerabek, M.; Ebrahimi, T. New light field image dataset. In Proceedings of the 8th International Conference on Quality of Multimedia Experience (QoMEX), Lisbon, Portugal, 6–8 June 2016. [Google Scholar]
- Paudyal, P.; Olsson, R.; Sjöström, M.; Battisti, F.; Carli, M. SMART: A Light Field Image Quality Dataset. In Proceedings of the 7th International Conference on Multimedia Systems, Wörthersee, Austria, 10–13 May 2016; pp. 1–6. [Google Scholar]
- Sabater, N.; Boisson, G.; Vandame, B.; Kerbiriou, P.; Babon, F.; Hog, M.; Gendrot, R.; Langlois, T.; Bureller, O.; Schubert, A.; et al. Dataset and pipeline for multi-view light-field video. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 30–40. [Google Scholar]
- Honauer, K.; Johannsen, O.; Kondermann, D.; Goldluecke, B. A dataset and evaluation methodology for depth estimation on 4D light fields. In Computer Vision–ACCV 2016, Proceedings of the 13th Asian Conference on Computer Vision, Taipei, Taiwan, 20–24 November 2016; Revised Selected Papers, Part III 13; Springer: Berlin/Heidelberg, Germany, 2017; pp. 19–34. [Google Scholar]
- Shekhar, S.; Kunz Beigpour, S.; Ziegler, M.; Chwesiuk, M.; Paleń, D.; Myszkowski, K.; Keinert, J.; Mantiuk, R.; Didyk, P. Light-field intrinsic dataset. In Proceedings of the British Machine Vision Conference 2018 (BMVC), Newcastle, UK, 3–6 September 2018. [Google Scholar]
- de Faria, S.M.; Filipe, J.N.; Pereira, P.M.; Tavora, L.M.; Assuncao, P.A.; Santos, M.O.; Fonseca-Pinto, R.; Santiago, F.; Dominguez, V.; Henrique, M. Light field image dataset of skin lesions. In Proceedings of the 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019. [Google Scholar]
- Hu, X.; Wang, C.; Pan, Y.; Liu, Y.; Wang, Y.; Liu, Y.; Zhang, L.; Shirmohammadi, S. 4DLFVD: A 4D light field video dataset. In Proceedings of the 12th ACM Multimedia Systems Conference, Chengdu, China, 20–24 October 2021; pp. 287–292. [Google Scholar]
- Sheng, H.; Cong, R.; Yang, D.; Chen, R.; Wang, S.; Cui, Z. UrbanLF: A comprehensive light field dataset for semantic segmentation of urban scenes. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 7880–7893. [Google Scholar] [CrossRef]
- Wang, T.C.; Zhu, J.Y.; Kalantari, N.K.; Efros, A.A.; Ramamoorthi, R. Light field video capture using a learning-based hybrid imaging system. ACM Trans. Graph. 2017, 36, 1–13. [Google Scholar] [CrossRef]
- Sakamoto, T.; Kodama, K.; Hamamoto, T. A study on efficient compression of multi-focus images for dense light-field reconstruction. In Proceedings of the 2012 Visual Communications and Image Processing, IEEE, San Diego, CA, USA, 27–30 November 2012; pp. 1–6. [Google Scholar]
- Marwah, K.; Wetzstein, G.; Bando, Y.; Raskar, R. Compressive light field photography using overcomplete dictionaries and optimized projections. ACM Trans. Graph. 2013, 32, 1–12. [Google Scholar] [CrossRef]
- Tambe, S.; Veeraraghavan, A.; Agrawal, A. Towards motion aware light field video for dynamic scenes. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 1009–1016. [Google Scholar]
- Perra, C. On the coding of plenoptic raw images. In Proceedings of the 2014 22nd Telecommunications Forum Telfor (TELFOR), IEEE, Belgrade, Serbia, 25–27 November 2014; pp. 850–853. [Google Scholar]
- Choudhury, C.; Tarun, Y.; Rajwade, A.; Chaudhuri, S. Low bit-rate compression of video and light-field data using coded snapshots and learned dictionaries. In Proceedings of the 2015 IEEE 17th International Workshop on Multimedia Signal Processing (MMSP), Xiamen, China, 6–11 June 2015; pp. 1–6. [Google Scholar]
- Thomaz, L.A.; Santos, J.M.; Astola, P.; de Faria, S.M.; Assunçao, P.A.; de Carvalho, M.B.; da Silva, E.A.; Pagliari, C.L.; Tabus, I.; Pereira, M.P.; et al. Visually lossless compression of light fields. In Proceedings of the European Light Field Imaging Workshop, Borovets, Bulgaria, 4–6 June 2019. [Google Scholar]
- Hedayati, E.; Havens, T.C.; Bos, J.P. Light field compression by residual CNN-assisted JPEG. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Virtual Conference, 18–22 July 2021; pp. 1–9. [Google Scholar]
- Shidanshidi, H.; Safaei, F.; Li, W. Estimation of signal distortion using effective sampling density for light field-based free viewpoint video. IEEE Trans. Multimed. 2015, 17, 1677–1693. [Google Scholar] [CrossRef]
- Singh, M.; Rameshan, R.M. Learning-based practical light field image compression using a disparity-aware model. In Proceedings of the 2021 Picture Coding Symposium (PCS), Virtual Conference, 29 June–2 July 2021; pp. 1–5. [Google Scholar]
- Avramelos, V.; De Praeter, J.; Van Wallendael, G.; Lambert, P. Light field image compression using versatile video coding. In Proceedings of the 2019 IEEE 9th International Conference on Consumer Electronics (ICCE-Berlin), Berlin, Germany, 8–11 September 2019; pp. 70–75. [Google Scholar]
- JPEG Pleno Light Field Datasets. Available online: https://plenodb.jpeg.org/lf/pleno_lf (accessed on 14 May 2024).
- Amirpour, H.; Pinheiro, A.; Pereira, M.; Lopes, F.J.; Ghanbari, M. Efficient light field image compression with enhanced random access. ACM Trans. Multimed. Comput. Commun. Appl. 2022, 18, 1–18. [Google Scholar] [CrossRef]
- Barina, D.; Solony, M.; Chlubna, T.; Dlabaja, D.; Klima, O.; Zemcik, P. Comparison of light field compression methods. Multimed. Tools Appl. 2022, 81, 2517–2528. [Google Scholar] [CrossRef]
- IEEE P3333.1.4-WG; IEEE P3333.1.4 Recommended Practice on the Quality Assessment of Light Field Imaging. IEEE Standards Association: Piscataway, NJ, USA, 2022.
- Kara, P.A.; Tamboli, R.R.; Shafiee, E.; Martini, M.G.; Simon, A.; Guindy, M. Beyond perceptual thresholds and personal preference: Towards novel research questions and methodologies of quality of experience studies on light field visualization. Electronics 2022, 11, 953. [Google Scholar] [CrossRef]
- Raytrix Light Field Camera. Available online: http://www.raytrix.de/ (accessed on 20 April 2024).
- Raytrix Light Field Camera Sensor Details. Available online: https://raytrix.de/products/ (accessed on 20 May 2024).
- Raytrix Light Field Camera Field of View Details. Available online: https://raytrix.de/examples/ (accessed on 20 May 2024).
- Raytrix Light Field Camera Depth Resolution. Available online: https://raytrix.de/technology-2/ (accessed on 20 May 2024).
- Lucas, B.D.; Kanade, T. An iterative image registration technique with an application to stereo vision. In Proceedings of the IJCAI’81: 7th International Joint Conference on Artificial Intelligence, Vancouver, BC, Canada, 24–28 August 1981; Volume 2. [Google Scholar]
- Danielsson, P.E. Euclidean distance mapping. Comput. Graph. Image Process. 1980, 14, 227–248. [Google Scholar] [CrossRef]
- Javidi, K.; Martini, M.G.; Kara, P.A. KULF-TT53: A Display-Specific Turntable-Based Light Field Dataset for Subjective Quality Assessment. Electronics 2023, 12, 4868. [Google Scholar] [CrossRef]
- Zhang, R.; Regunathan, S.L.; Rose, K. Video coding with optimal inter/intra-mode switching for packet loss resilience. IEEE J. Sel. Areas Commun. 2000, 18, 966–976. [Google Scholar] [CrossRef]
- ITU-T Study Group 16. Working Practices Using Objective Metrics for Evaluation of Video Coding Efficiency Experiments; Technical Report; International Telecommunication Union: Geneva, Switzerland, 2020. [Google Scholar]
- Wang, Z.; Simoncelli, E.P.; Bovik, A.C. Multiscale structural similarity for image quality assessment. In Proceedings of the Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, Pacific Grove, CA, USA, 9–12 November 2003; IEEE: Piscataway, NJ, USA; Volume 2, pp. 1398–1402. [Google Scholar]
- Martini, M. A simple relationship between SSIM and PSNR for DCT-based compressed images and video: SSIM as content-aware PSNR. In Proceedings of the 2023 IEEE 25th International Workshop on Multimedia Signal Processing (MMSP), Poitiers, France, 27–29 September 2023; pp. 1–5. [Google Scholar]
- Viola, I.; Ebrahimi, T. VALID: Visual quality assessment for light field images dataset. In Proceedings of the 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), Cagliari, Italy, 29 May–1 June 2018; pp. 1–3. [Google Scholar]
- Zizien, A.; Fliegel, K. LFDD: Light field image dataset for performance evaluation of objective quality metrics. In Proceedings of the Applications of Digital Image Processing XLIII, SPIE, Online, 24 August–4 September 2020; Volume 11510, pp. 671–683. [Google Scholar]
- Kiran Adhikarla, V.; Vinkler, M.; Sumin, D.; Mantiuk, R.K.; Myszkowski, K.; Seidel, H.P.; Didyk, P. Towards a quality metric for dense light fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 58–67. [Google Scholar]
- Ahmad, W.; Palmieri, L.; Koch, R.; Sjöström, M. Matching light field datasets from plenoptic cameras 1.0 and 2.0. In Proceedings of the 2018-3DTV-Conference: The True Vision-Capture, Transmission and Display of 3D Video (3DTV-CON), Helsinki, Finland, 3–5 June 2018; pp. 1–4. [Google Scholar]
- Shi, L.; Zhao, S.; Zhou, W.; Chen, Z. Perceptual evaluation of light field image. In Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; pp. 41–45. [Google Scholar]
- Feng, X.; Ma, Y.; Gao, L. Compact light field photography towards versatile three-dimensional vision. Nat. Commun. 2022, 13, 3333. [Google Scholar] [CrossRef] [PubMed]
Reference | Dataset Format | Plenoptic/Nonplenoptic | Camera |
---|---|---|---|
Javidi et al. [49] | Images | Nonplenoptic | Canon 77D |
Rerabek et al. [19] | Images | Plenoptic | Lytro Illum |
Sabater et al. [21] | Videos | Camera rig | IDS CMOSIS CMV2000 |
Shekhar et al. [23] | Images | Nonplenoptic | Canon EOS 6D and 5D as well as Sony Alpha 7 R |
De Faria et al. [24] | Images | Plenoptic | Raytrix R42 |
Hu et al. [25] | Videos | Camera matrix | monocular video cameras |
Guillo et al. [12] | Videos | Plenoptic | Raytrix R8 |
Paudyal et al. [20] | Images | Plenoptic | Lytro Illum |
Viola et al. [54] | Images | Plenoptic | Lytro Illum |
Zizien et al. [55] | Images | Plenoptic | Raytrix R5 |
Adhikarla et al. [56] | Images | Nonplenoptic | Canon EOS 5D |
Ahmad et al. [57] | Images | Plenoptic | Lytro Illum and Raytrix R29 |
KULFR8 | Videos | Plenoptic | Raytrix R8 |
Reference | Characterisation | AVC | HEVC | VP9 | AV1 | VVC | PSNR | SSIM |
---|---|---|---|---|---|---|---|---|
Javidi et al. [49] | ✔ | ✔ | ✔ | ✔ | ✔ | - | ✔ | ✔ |
Rerabek et al. [19] | - | - | - | - | - | - | - | - |
Sabater et al. [21] | - | - | - | - | - | - | - | - |
Shekhar et al. [23] | - | - | - | - | - | - | - | - |
De Faria et al. [24] | ✔ | - | - | - | - | - | - | - |
Hu et al. [25] | - | - | - | - | - | - | - | - |
Guillo et al. [12] | - | - | - | - | - | - | - | - |
Paudyal et al. [20] | ✔ | - | - | - | - | - | - | - |
Viola et al. [54] | - | - | ✔ | ✔ | - | - | ✔ | ✔ |
Shi et al. [58] | ✔ | - | ✔ | - | - | - | - | - |
Zizien et al. [55] | - | ✔ | ✔ | ✔ | ✔ | - | ✔ | ✔ |
Adhikarla et al. [56] | - | - | ✔ | - | - | - | - | - |
Ahmad et al. [57] | - | - | - | - | - | - | - | - |
KULFR8 | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Javidi, K.; Martini, M.G. A Light-Field Video Dataset of Scenes with Moving Objects Captured with a Plenoptic Video Camera. Electronics 2024, 13, 2223. https://doi.org/10.3390/electronics13112223