Hierarchical Fusion Using Subsets of Multi-Features for Historical Arabic Manuscript Dating
Abstract
:1. Introduction
- A novel approach for fusing multifeatures: the fusion approach is proposed describing a hierarchical structure based on subsets of the multifeatures;
- Exploring the type of subset selection: we try to cover some of the states of the selected features. A comparison of the states is reported in the paper;
- The first attempt: this work is the first attempt to conduct an investigation after introducing the KERTAS dataset to the best of the authors’ knowledge;
- Improved accuracy for historical manuscript dating: we show that the proposed techniques deliver better performance compared with that of the dating methods based on traditional feature fusions. Additionally, our approach obtains promising results compared to the same fusion method, while all features are considered simultaneously.
2. Related Works
2.1. Datasets
2.2. Automated Date Estimation from Handwriting
2.3. Fusion Methods
3. Methodology
3.1. Database
3.2. Preprocessing and Feature Extraction Methods
3.3. Hierarchical Fusion Approach
Algorithm 1 Feature extraction algorithm based on hierarchical fusion approach. |
Input: Raw features (views) , , regularization parameter , . Output: Fused features .
|
3.4. Classification
4. Experimental Results
4.1. Setting
4.2. Results
The Impact of Different Setups
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Aiolli, F.; Ciula, A. A case study on the System for Paleographic Inspections (SPI): Challenges and new developments. Comput. Intell. Bioeng. 2009, 196, 53–66. [Google Scholar]
- Hamid, A.; Bibi, M.; Siddiqi, I.; Moetesum, M. Historical manuscript dating using textural measures. In Proceedings of the 2018 International Conference on Frontiers of Information Technology (FIT), Islamabad, Pakistan, 17–19 December 2018; pp. 235–240. [Google Scholar]
- Feng, C.M.; Xu, Y.; Li, Z.; Yang, J. Robust Classification with Sparse Representation Fusion on Diverse Data Subsets. arXiv 2019, arXiv:1906.11885. [Google Scholar]
- Le Bourgeois, F.; Kaileh, H. Automatic metadata retrieval from ancient manuscripts. In International Workshop on Document Analysis Systems; Springer: Berlin/Heidelberg, Germany, 2004; pp. 75–89. [Google Scholar]
- Feuerverger, A.; Hall, P.; Tilahun, G.; Gervers, M. Using statistical smoothing to date medieval manuscripts. In Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen; Institute of Mathematical Statistics: London, UK, 2008; pp. 321–331. [Google Scholar]
- Fecker, D.; Asi, A.; Pantke, W.; Märgner, V.; El-Sana, J.; Fingscheidt, T. Document writer analysis with rejection for historical arabic manuscripts. In Proceedings of the 2014 14th International Conference on Frontiers in Handwriting Recognition, Hersonissos, Greece, 1–4 September 2014; pp. 743–748. [Google Scholar]
- Garain, U.; Parui, S.; Paquet, T.; Heutte, L. Machine dating of handwritten manuscripts. In Proceedings of the Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), Curitiba, Brazil, 23–26 September 2007; Volume 2, pp. 759–763. [Google Scholar]
- Tilahun, G. Statistical Methods for Dating Collections of Historical Documents; University of Toronto: Toronto, ON, Canada, 2011. [Google Scholar]
- He, S.; Sammara, P.; Burgers, J.; Schomaker, L. Towards style-based dating of historical documents. In Proceedings of the 2014 14th International Conference on Frontiers in Handwriting Recognition, Hersonissos, Greece, 1–4 September 2014; pp. 265–270. [Google Scholar]
- Sulaiman, A.; Omar, K.; Nasrudin, M.F. A database for degraded Arabic historical manuscripts. In Proceedings of the 2017 6th International Conference on Electrical Engineering and Informatics (ICEEI), Langkawi, Malaysia, 25–27 November 2017; pp. 1–6. [Google Scholar]
- Wahlberg, F.; Wilkinson, T.; Brun, A. Historical manuscript production date estimation using deep convolutional neural networks. In Proceedings of the 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), Shenzhen, China, 23–26 October 2016; pp. 205–210. [Google Scholar]
- Cloppet, F.; Eglin, V.; Helias-Baron, M.; Kieu, C.; Vincent, N.; Stutzmann, D. Icdar2017 competition on the classification of medieval handwritings in latin script. In Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, 9–15 November 2017; Volume 1, pp. 1371–1376. [Google Scholar]
- Fiel, S.; Kleber, F.; Diem, M.; Christlein, V.; Louloudis, G.; Nikos, S.; Gatos, B. Icdar2017 competition on historical document writer identification (historical-wi). In Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, 9–15 November 2017; Volume 1, pp. 1377–1382. [Google Scholar]
- Shor, P. The leon levy dead sea scrolls digital library. the digitization project of the dead sea scrolls. In Digital Humanities in Biblical, Early Jewish and Early Christian Studies; Brill: Leiden, The Netherlands, 2014; pp. 9–20. [Google Scholar]
- Rahiche, A.; Hedjam, R.; Al-maadeed, S.; Cheriet, M. Historical documents dating using multispectral imaging and ordinal classification. J. Cult. Herit. 2020, 45, 71–80. [Google Scholar] [CrossRef]
- Adam, K.; Baig, A.; Al-Maadeed, S.; Bouridane, A.; El-Menshawy, S. KERTAS: Dataset for automatic dating of ancient Arabic manuscripts. Int. J. Doc. Anal. Recognit. (IJDAR) 2018, 21, 283–290. [Google Scholar] [CrossRef] [Green Version]
- He, S.; Samara, P.; Burgers, J.; Schomaker, L. A multiple-label guided clustering algorithm for historical document dating and localization. IEEE Trans. Image Process. 2016, 25, 5252–5265. [Google Scholar] [CrossRef]
- He, S.; Samara, P.; Burgers, J.; Schomaker, L. Image-based historical manuscript dating using contour and stroke fragments. Pattern Recognit. 2016, 58, 159–171. [Google Scholar] [CrossRef]
- Wahlberg, F.; Mårtensson, L.; Brun, A. Large scale style based dating of medieval manuscripts. In Proceedings of the 3rd International Workshop on Historical Document Imaging and Processing, Nancy, France, 22 August 2015; pp. 107–114. [Google Scholar]
- Li, Y.; Genzel, D.; Fujii, Y.; Popat, A.C. Publication date estimation for printed historical documents using convolutional neural networks. In Proceedings of the 3rd International Workshop on Historical Document Imaging and Processing, Nancy, France, 22 August 2015; pp. 99–106. [Google Scholar]
- Vincent, L. Google book search: Document understanding on a massive scale. In Proceedings of the Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), Curitiba, Brazil, 23–26 September 2007; Volume 2, pp. 819–823. [Google Scholar]
- Hamid, A.; Bibi, M.; Moetesum, M.; Siddiqi, I. Deep Learning Based Approach for Historical Manuscript Dating. In Proceedings of the 2019 International Conference on Document Analysis and Recognition (ICDAR), Sydney, NSW, Australia, 20–25 September 2019; pp. 967–972. [Google Scholar]
- Studer, L.; Alberti, M.; Pondenkandath, V.; Goktepe, P.; Kolonko, T.; Fischer, A.; Liwicki, M.; Ingold, R. A comprehensive study of ImageNet pre-training for historical document image analysis. In Proceedings of the 2019 International Conference on Document Analysis and Recognition (ICDAR), Sydney, NSW, Australia, 20–25 September 2019; pp. 720–725. [Google Scholar]
- Clanuwat, T.; Bober-Irizar, M.; Kitamoto, A.; Lamb, A.; Yamamoto, K.; Ha, D. Deep learning for classical Japanese literature. arXiv 2018, arXiv:1812.01718. [Google Scholar]
- Simistira, F.; Seuret, M.; Eichenberger, N.; Garz, A.; Liwicki, M.; Ingold, R. Diva-hisdb: A precisely annotated large dataset of challenging medieval manuscripts. In Proceedings of the 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), Shenzhen, China, 23–26 October 2016; pp. 471–476. [Google Scholar]
- Dhali, M.A.; Jansen, C.N.; de Wit, J.W.; Schomaker, L. Feature-extraction methods for historical manuscript dating based on writing style development. Pattern Recognit. Lett. 2020, 131, 413–420. [Google Scholar] [CrossRef]
- Shao, L.; Liu, L.; Yu, M. Kernelized multiview projection for robust action recognition. Int. J. Comput. Vis. 2016, 118, 115–129. [Google Scholar] [CrossRef] [Green Version]
- Li, J.; Zhang, B.; Lu, G.; Zhang, D. Generative multi-view and multifeature learning for classification. Inf. Fusion 2019, 45, 215–226. [Google Scholar] [CrossRef]
- Kan, M.; Shan, S.; Zhang, H.; Lao, S.; Chen, X. Multi-view discriminant analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 188–194. [Google Scholar] [CrossRef] [PubMed]
- Wang, W.; Arora, R.; Livescu, K.; Bilmes, J. On deep multi-view representation learning. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 1083–1092. [Google Scholar]
- Bahrampour, S.; Nasrabadi, N.M.; Ray, A.; Jenkins, W.K. Multimodal task-driven dictionary learning for image classification. IEEE Trans. Image Process. 2015, 25, 24–38. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Abavisani, M.; Patel, V.M. Multimodal sparse and low-rank subspace clustering. Inf. Fusion 2018, 39, 168–177. [Google Scholar] [CrossRef]
- Yang, M.; Zhang, L.; Zhang, D.; Wang, S. Relaxed collaborative representation for pattern classification. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 2224–2231. [Google Scholar]
- Yuan, X.T.; Liu, X.; Yan, S. Visual classification with multitask joint sparse representation. IEEE Trans. Image Process. 2012, 21, 4349–4360. [Google Scholar] [CrossRef] [PubMed]
- Li, J.; Zhang, D.; Li, Y.; Wu, J.; Zhang, B. Joint similar and specific learning for diabetes mellitus and impaired glucose regulation detection. Inf. Sci. 2017, 384, 191–204. [Google Scholar] [CrossRef]
- Li, J.; Zhang, B.; Zhang, D. Joint discriminative and collaborative representation for fatty liver disease diagnosis. Expert Syst. Appl. 2017, 89, 31–40. [Google Scholar] [CrossRef]
- Liu, H.; Liu, L.; Le, T.D.; Lee, I.; Sun, S.; Li, J. Nonparametric sparse matrix decomposition for cross-view dimensionality reduction. IEEE Trans. Multimed. 2017, 19, 1848–1859. [Google Scholar] [CrossRef]
- Gui, J.; Tao, D.; Sun, Z.; Luo, Y.; You, X.; Tang, Y.Y. Group sparse multiview patch alignment framework with view consistency for image classification. IEEE Trans. Image Process. 2014, 23, 3126–3137. [Google Scholar]
- Li, B.; Yuan, C.; Xiong, W.; Hu, W.; Peng, H.; Ding, X.; Maybank, S. Multi-view multi-instance learning based on joint sparse representation and multi-view dictionary learning. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2554–2560. [Google Scholar] [CrossRef] [Green Version]
- Zhao, Z.; Lu, H.; Deng, C.; He, X.; Zhuang, Y. Partial multi-modal sparse coding via adaptive similarity structure regularization. In Proceedings of the 24th ACM International Conference on Multimedia, Amsterdam, The Netherlands, 15–19 October 2016; ACM: New York, NY, USA, 2016; pp. 152–156. [Google Scholar]
- Li, S.Y.; Jiang, Y.; Zhou, Z.H. Partial multi-view clustering. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, Québec City, QC, Canada, 27–31 July 2014. [Google Scholar]
- Lee, D.D.; Seung, H.S. Learning the parts of objects by non-negative matrix factorization. Nature 1999, 401, 788. [Google Scholar] [CrossRef]
- Djeddi, C.; Siddiqi, I.; Souici-Meslati, L.; Ennaji, A. Text-independent writer recognition using multi-script handwritten texts. Pattern Recognit. Lett. 2013, 34, 1196–1202. [Google Scholar] [CrossRef]
- Bulacu, M.; Schomaker, L. Text-independent writer identification and verification using textural and allographic features. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 701–717. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Brink, A.; Smit, J.; Bulacu, M.; Schomaker, L. Writer identification using directional ink-trace width measurements. Pattern Recognit. 2012, 45, 162–171. [Google Scholar] [CrossRef]
- Siddiqi, I.; Vincent, N. Text independent writer recognition using redundant writing patterns with contour-based orientation and curvature features. Pattern Recognit. 2010, 43, 3853–3865. [Google Scholar] [CrossRef]
- Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; Volume 1, pp. 886–893. [Google Scholar] [CrossRef] [Green Version]
- Chen, J.; Cao, H.; Prasad, R.; Bhardwaj, A.; Natarajan, P. Gabor features for offline Arabic handwriting recognition. In Proceedings of the 9th IAPR International Workshop on Document Analysis Systems, Boston, MA, USA, 9–11 June 2010; pp. 53–58. [Google Scholar]
- Assayony, M.O.; Mahmoud, S.A. Recognition of Arabic handwritten words using Gabor-based bag-of-features framework. Int. J. Comput. Digit. Syst. 2018, 7, 35–42. [Google Scholar] [CrossRef]
- Elleuch, M.; Hani, A.; Kherallah, M. Arabic handwritten script recognition system based on HOG and gabor features. Int. Arab J. Inf. Technol. 2017, 14, 639–646. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Cotter, S.F.; Rao, B.D.; Engan, K.; Kreutz-Delgado, K. Sparse solutions to linear inverse problems with multiple measurement vectors. IEEE Trans. Signal Process. 2005, 53, 2477–2488. [Google Scholar] [CrossRef]
- Zhang, H.; Zhang, Y.; Nasrabadi, N.M.; Huang, T.S. Joint-structured-sparsity-based classification for multiple-measurement transient acoustic signals. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 2012, 42, 1586–1598. [Google Scholar] [CrossRef]
- Rakotomamonjy, A. Surveying and comparing simultaneous sparse approximation (or group-lasso) algorithms. Signal Process. 2011, 91, 1505–1526. [Google Scholar] [CrossRef] [Green Version]
- Parikh, N.; Boyd, S. Proximal algorithms. Found. Trends Optim. 2014, 1, 127–239. [Google Scholar] [CrossRef]
- Van der Maaten, L.; Hinton, G. Visualizing data using t-sne. J. Mach. Learn. Res. 2008, 9, 2579–2605. [Google Scholar]
Key Century | Number of Documents | Training | Testing |
---|---|---|---|
1 | 60 | 48 | 12 |
2 | 47 | 37 | 10 |
3 | 144 | 116 | 28 |
4 | 592 | 474 | 118 |
5 | 164 | 132 | 32 |
6 | 119 | 95 | 24 |
7 | 184 | 147 | 37 |
8 | 110 | 88 | 22 |
9 | 153 | 123 | 30 |
10 | 73 | 59 | 14 |
11 | 169 | 135 | 34 |
12 | 147 | 118 | 29 |
13 | 119 | 95 | 24 |
14 | 17 | 14 | 3 |
Methods | Unsupervised MAE (%) | Accuracy (%) | Supervise MAE (%) | Accuracy (%) |
---|---|---|---|---|
Gabor | 50.40 | 45.71 | 35.65 | 66.66 |
Hinge | 49.21 | 47.61 | 37.31 | 61.90 |
Hog | 52.80 | 43.80 | 37.35 | 61.90 |
ResNet | 43.80 | 55.23 | 33.30 | 69.52 |
Concatenated | 39.35 | 61.90 | 31.50 | 71.42 |
features | ||||
Ours | 31.95 | 71.25 | 26.90 | 82.50 |
State | Unsupervised (%) | Supervise (%) |
---|---|---|
A1 | 64.28 | 75.47 |
A2 | 64.95 | 75.45 |
A3 | 67.65 | 76.85 |
A4 | 69.22 | 80.95 |
A5 | 71.25 | 82.50 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Adam, K.; Al-Maadeed, S.; Akbari, Y. Hierarchical Fusion Using Subsets of Multi-Features for Historical Arabic Manuscript Dating. J. Imaging 2022, 8, 60. https://doi.org/10.3390/jimaging8030060
Adam K, Al-Maadeed S, Akbari Y. Hierarchical Fusion Using Subsets of Multi-Features for Historical Arabic Manuscript Dating. Journal of Imaging. 2022; 8(3):60. https://doi.org/10.3390/jimaging8030060
Chicago/Turabian StyleAdam, Kalthoum, Somaya Al-Maadeed, and Younes Akbari. 2022. "Hierarchical Fusion Using Subsets of Multi-Features for Historical Arabic Manuscript Dating" Journal of Imaging 8, no. 3: 60. https://doi.org/10.3390/jimaging8030060
APA StyleAdam, K., Al-Maadeed, S., & Akbari, Y. (2022). Hierarchical Fusion Using Subsets of Multi-Features for Historical Arabic Manuscript Dating. Journal of Imaging, 8(3), 60. https://doi.org/10.3390/jimaging8030060