
Research on Algorithm for Authenticating the Authenticity of Calligraphy Works Based on Improved EfficientNet Network

Guangzhou Institute of Advanced Technology, Guangzhou 511458, China
* Authors to whom correspondence should be addressed.
Appl. Sci. 2024, 14(1), 295; https://doi.org/10.3390/app14010295
Submission received: 29 November 2023 / Revised: 20 December 2023 / Accepted: 26 December 2023 / Published: 28 December 2023

Abstract

Calligraphy works have high artistic value, but forgery is rampant, and the authentication of traditional calligraphy relies heavily on calligraphers’ subjective judgment. Spurred by recent developments in neural networks, this paper proposes a method for authenticating calligraphy works based on an improved EfficientNet network. The method uses a character box algorithm to efficiently extract individual calligraphy characters, which are then augmented and used as the model’s training set. During training, CBAM and Self-Attention modules enhance the attention mechanism of the EfficientNet network. The trained model is used to judge the similarity of calligraphy works; it is tested on authentic works, imitations, and works by other calligraphers, and compared with other networks. The experimental results demonstrate that the proposed method effectively authenticates calligraphy works, and that the improved CBAM-EfficientNet and SA-EfficientNet networks achieve better authentication performance.

1. Introduction

Calligraphy, one of China’s traditional art forms, possesses high collectible value in the art world, especially works by renowned calligraphy masters, whose value is difficult to estimate. In June 2019, Wu Guanzhong’s work “Lion Grove” sold for 143.75 million CNY, setting a new record for calligraphy and painting auction prices [1]. However, because the prices of calligraphy works are high, counterfeiting in calligraphy art continues to emerge [2]. Indeed, modern counterfeit calligraphy artworks are produced at a very high level, significantly harming the calligraphy collection market.
Traditional methods for distinguishing authentic calligraphy rely mainly on manual and physical authentication. Calligraphers rely on experience to determine authenticity, which is highly subjective, cannot be quantified, and is prone to error [3]. Physical methods mainly include seal authentication and paper composition analysis. Seal authentication determines whether the seal on a work is consistent with previous works [4], while paper composition analysis detects whether the oxidizable components inside the paper have undergone the chemical reactions expected over time. However, with the continuous advancement of counterfeiting technology, computer scanning and mechanical reproduction can replicate authentic seals almost perfectly, rendering seal authentication ineffective. Counterfeiters can also defeat paper composition analysis by purchasing paper from the same era to create forged works.
Current research on calligraphy focuses on literature and the arts. In computer science, research on calligraphy primarily addresses character recognition and generation, while research on authenticity identification remains relatively scarce. For instance, Ji uses seven invariant moments to identify calligraphy authenticity [5], and Zhang et al. extract calligraphy skeletons using generative adversarial networks [6]. Mai et al. recognize calligraphy characters with an improved DenseNet network [7], Dai generates calligraphy characters using generative adversarial networks [8], and Ji achieves calligraphy content and style recognition using deep learning and label power sets [9]. Moreover, Kang et al. extract calligraphy character features with high-resolution networks and detect character areas using scale prediction and spatial information prediction branches, classifying calligraphy characters and their boundaries to achieve character detection [10]. Pan et al. use graph neural networks to compare the similarity between two calligraphy characters [11].
With the development and wide application of neural networks, handwriting recognition with neural networks has become a research focus. Compared to ballpoint-pen and pencil script, calligraphy has strong artistic qualities, diverse forms, and distinctive personal characteristics. In terms of character morphology, calligraphy strokes vary in tilt and thickness in ways that distinguish them from hard-pen script. After years of training, calligraphers develop relatively stable and unique variations in character form, and significant differences exist in how different calligraphers execute the same stroke. This writing technique is the calligrapher’s stylistic signature and the source of artistry in calligraphy works; these unique form variations can effectively differentiate one calligrapher’s works from another’s, and imitators focus precisely on reproducing them.
This paper proposes a method for authenticating calligraphy works based on the original works and imitations of two famous calligraphers held by a well-known art museum. First, a character box algorithm, built on the projection method and a centroid algorithm, efficiently captures individual calligraphy characters; the captured characters are then augmented to create a calligraphy training dataset. Second, the EfficientNetv2-S network structure is enhanced with the CBAM module and the Self-Attention module to improve the network’s attention mechanism. The improved CBAM-EfficientNet and SA-EfficientNet networks are trained, and the trained networks determine whether a work exhibits the calligrapher’s distinctive strokes based on learned features and similarity scores. The results are compared with mainstream image-classification networks such as EfficientNetv2-S, revealing that the improved models effectively authenticate calligraphy works and have significant implications for calligraphy authentication.

2. Calligraphy Dataset

The size of the calligraphy dataset directly influences the network’s judgment performance. However, relatively few calligraphy training datasets are available, especially datasets specific to individual calligraphers, so producing the dataset is particularly important. This paper uses individual calligraphy characters as the training dataset for the network model. The character box algorithm extracts individual characters from calligraphy works, and data augmentation is applied using small-angle rotation, enlargement, reduction, noise, and binary-threshold adjustment, giving the network a sufficient quantity of training data.

2.1. Character Box Algorithm

To efficiently locate each character in a calligraphy work and extract it with a bounding box, this study adopts a character segmentation approach. Traditional character segmentation methods include projection, connected component analysis, character matching, and neural networks. However, as an artistic script, calligraphy varies widely in form, making fixed character matching difficult, and the scarcity of calligraphy datasets makes neural-network-based segmentation challenging. Therefore, this paper proposes a character box algorithm that combines connected component analysis with the projection method, using the centroid points of connected components.

2.1.1. Projection Algorithm

The writing style of calligraphy differs from that of other art forms: calligraphy works are relatively monochromatic and mostly written vertically. Exploiting these characteristics, the projection algorithm can effectively extract character boxes. Binarization separates the foreground and background of the artwork, and individual characters can then be segmented by accumulating the foreground projection values along the vertical and horizontal directions of the image.
Using the vertical-writing feature of calligraphy, the foreground of the image is projected onto the horizontal X-axis, giving the accumulated value function $X$. Figure 1a shows that the projected accumulated values are periodic, and this periodicity can be exploited for segmentation. Assuming the image size is $n \times m$, the image function $f(x_i, y_i)$ is given by Equation (1):

$$f(x_i, y_i) = \begin{cases} 1, & \text{foreground} \\ 0, & \text{background} \end{cases}$$
The X-axis foreground projection cumulative value function $X$ is given by Equation (2):

$$X(x) = \sum_{i=0}^{m} f(x, y_i)$$
For the function $X$, the zeros whose derivative is greater than 0 are recorded as $U_{x_i}$, and the zeros whose derivative is less than 0 are recorded as $W_{x_i}$. To prevent the radicals of a character from being separated, a deviation value $b_x$ is set: if $U_{x_{i+1}} - W_{x_i} < b_x$, then $U_{x_{i+1}}$ and $W_{x_i}$ are not treated as segmentation coordinates. $U_{x_i}$ and $W_{x_i}$ serve as the starting and ending coordinates of the segmentation points, as shown in Figure 1a, yielding Figure 1b after segmentation.
Similarly, projecting the foreground values of each segmented column onto the Y-axis yields the cumulative foreground value function $Y$, as shown in Equation (3):

$$Y(y) = \sum_{j=U_{x_i}}^{W_{x_i}} f(x_j, y)$$
The zeros of $Y$ whose derivative is greater than 0 are recorded as $U_{y_j}$, and those whose derivative is less than 0 as $W_{y_j}$. A deviation value $b_y$ is then set: if $U_{y_{j+1}} - W_{y_j} < b_y$, then $U_{y_{j+1}}$ and $W_{y_j}$ are not included as segmentation coordinates. $U_{y_j}$ and $W_{y_j}$ serve as the starting and ending coordinates for segmenting the image, as depicted in Figure 1b.
From Figure 1a,b we obtain, for each calligraphy character, the minimum foreground coordinate $y_{min} = U_{y_j}$ and the maximum coordinate $y_{max} = W_{y_j}$ in the Y-axis direction.
As depicted in Figure 1c, the minimum foreground coordinate $x_{min}$ and the maximum coordinate $x_{max}$ in the X-axis direction are then computed, giving the character box coordinates $(x_{min}, y_{min})$ and $(x_{max}, y_{max})$.
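To make the procedure concrete, the following is a minimal NumPy sketch of the projection segmentation in Equations (1)–(3). The function names, the foreground convention (binary image with foreground = 1), and the deviation values `b_x` and `b_y` are illustrative assumptions, not the authors’ implementation.

```python
import numpy as np

def find_runs(profile, b):
    """Start/end indices of foreground runs in a 1-D projection profile,
    merging gaps narrower than the deviation value b so that radicals
    stay with their character."""
    mask = profile > 0
    padded = np.concatenate(([False], mask, [False]))
    diff = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(diff == 1)   # U coordinates (profile rises from zero)
    ends = np.flatnonzero(diff == -1)    # W coordinates (profile falls to zero)
    if len(starts) == 0:
        return []
    merged = [[starts[0], ends[0]]]
    for s, e in zip(starts[1:], ends[1:]):
        if s - merged[-1][1] < b:        # gap smaller than deviation b: same character
            merged[-1][1] = e
        else:
            merged.append([s, e])
    return merged

def character_boxes(binary, b_x=10, b_y=10):
    """binary: uint8 array with foreground = 1, background = 0 (Equation (1))."""
    boxes = []
    for x0, x1 in find_runs(binary.sum(axis=0), b_x):      # Equation (2): column profile
        col = binary[:, x0:x1]
        for y0, y1 in find_runs(col.sum(axis=1), b_y):     # Equation (3): row profile
            ys, xs = np.nonzero(col[y0:y1])
            boxes.append((x0 + xs.min(), y0 + ys.min(),   # (x_min, y_min)
                          x0 + xs.max(), y0 + ys.max()))  # (x_max, y_max)
    return boxes
```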
Although the projection algorithm is computationally simple, it efficiently separates the required characters from the whole work and is well suited to vertically written calligraphy. However, it also has limitations and cannot completely solve character extraction: many calligraphy works are not arranged neatly, and calligraphers often write freely, so large character deformations and span variations are common. Figure 2 shows a work with a large stroke span and characters overlapping in the vertical direction. When computing the foreground accumulation function $X$ along the X-axis, the algorithm cannot find the corresponding zero points and fails to locate the character box coordinates of individual characters.

2.1.2. Connected Domain Algorithm

Connected component analysis is an image processing technique that groups adjacent pixels into regions and analyzes each pixel group as a unit. Approaches include thresholding, region growing, graph-based segmentation, and morphology-based methods, each analyzing the image by a different criterion such as pixel intensity, region growth, graph structure, or morphology.
Morphology-based connected component analysis is an effective segmentation algorithm that groups pixels into connected regions with the same attributes and then analyzes properties such as area, perimeter, centroid, and shape. Its speed and simplicity have made it widely used in practice.
Figure 3 illustrates morphological connected component segmentation based on the distances between centroids of different connected regions, which yields the character boxes (Figure 3b). The morphology-based algorithm also performs well on calligraphy characters, as it is unaffected by character deformation and adapts to the complex morphology of calligraphy character segmentation.
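As an illustration, one plausible realization of the morphology-based analysis uses OpenCV’s connected component labeling; the preliminary dilation and its 5 × 5 kernel are assumptions chosen here to merge nearby stroke fragments before labeling.

```python
import cv2
import numpy as np

def connected_component_boxes(binary):
    """Sketch of morphology-based connected component analysis.
    binary: uint8 image with foreground = 255. Returns a bounding box and
    centroid per connected region (label 0 is the background)."""
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    merged = cv2.dilate(binary, kernel)          # merge nearby stroke fragments
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(merged, connectivity=8)
    boxes = []
    for i in range(1, n):
        x, y, w, h, area = stats[i]              # left, top, width, height, area
        boxes.append(((x, y, x + w, y + h), tuple(centroids[i])))
    return boxes
```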

2.2. Centroid Algorithm

The disadvantage of the projection-based character box algorithm is that accumulating foreground values discards geometric distance information in the plane when locating individual characters. This causes significant offsets in individual characters’ strokes and degrades the overall partitioning. To overcome this problem, this paper introduces a centroid algorithm to supplement calligraphy character segmentation.
The centroid algorithm is a supplement to the projection algorithm. Its principle relies on the centroid points of individual strokes of a character being relatively close, while the centroid points of different characters’ strokes are relatively far apart. By calculating the centroid points of each stroke, setting a nearest point distance threshold and a relative distance threshold, and considering strokes within the specified threshold as part of the same character, calligraphy characters can be effectively segmented.

2.2.1. Centroid Calculation

This is a multi-stage process: the character box algorithm first categorizes the images, and then binarization and the Canny edge detection operator are applied to the image regions that the character box algorithm cannot fully segment. This yields the edge point information of each stroke $m$, and the area $A_m$ of each stroke is computed with the Gaussian shoelace formula, Equation (4):

$$A_m = \frac{1}{2} \sum_{i=1}^{n} \left( x_{m_i} y_{m_{i+1}} - x_{m_{i+1}} y_{m_i} \right)$$

where $m$ indexes the strokes and $n$ is the number of edge points of the $m$-th stroke, with $i \in \{1, 2, \ldots, n\}$. The indices wrap around: when $i+1 = n+1$ we set $i+1 = 1$, so the edge points are traversed as a closed loop; for example, $(x_{m_n}, y_{m_n})$ denotes the coordinates of the $n$-th edge point of the $m$-th stroke. By Equations (5) and (6), the centroid $(C_{x_m}, C_{y_m})$ of the $m$-th stroke is:

$$C_{x_m} = \frac{1}{6 A_m} \sum_{i=1}^{n} \left( x_{m_i} + x_{m_{i+1}} \right) \left( x_{m_i} y_{m_{i+1}} - x_{m_{i+1}} y_{m_i} \right)$$

$$C_{y_m} = \frac{1}{6 A_m} \sum_{i=1}^{n} \left( y_{m_i} + y_{m_{i+1}} \right) \left( x_{m_i} y_{m_{i+1}} - x_{m_{i+1}} y_{m_i} \right)$$
The effect is illustrated in Figure 4a, where white dots represent the centroid points.
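A short sketch of Equations (4)–(6), assuming the edge points of a stroke arrive as an ordered closed contour (e.g., from `cv2.findContours` after Canny, squeezed to an (n, 2) array):

```python
import numpy as np

def stroke_centroid(pts):
    """Centroid of one stroke from its ordered edge points.
    pts: (n, 2) array of (x, y) coordinates; indices wrap so that
    point n+1 is point 1, as in the text."""
    x, y = pts[:, 0].astype(float), pts[:, 1].astype(float)
    x1, y1 = np.roll(x, -1), np.roll(y, -1)   # x_{i+1}, y_{i+1} with wraparound
    cross = x * y1 - x1 * y                   # x_i*y_{i+1} - x_{i+1}*y_i
    A = 0.5 * cross.sum()                     # signed area, Equation (4)
    cx = ((x + x1) * cross).sum() / (6 * A)   # Equation (5)
    cy = ((y + y1) * cross).sum() / (6 * A)   # Equation (6)
    return cx, cy
```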

2.2.2. Character Segmentation

The distance between each pair of centroid points is calculated as Equation (7):

$$D(a, b) = \sqrt{ \left( C_{x_a} - C_{x_b} \right)^2 + \left( C_{y_a} - C_{y_b} \right)^2 }$$
In most cases, the centroid distances between strokes of the same character are small; however, characters composed of many strokes can produce larger centroid distances. By setting a nearest-point distance threshold $d_1$ and a relative distance threshold $d_2$, only the single nearest point within $d_1$ and all points within $d_2$ are taken as strokes of the same character. Typically, $d_2 < d_1$.
As depicted in Figure 4, the following operations are performed:
  1. Take only the nearest centroid point within distance $d_1$ and recognize it as a stroke centroid of the same character; points may be mutually nearest to each other. Figure 4b illustrates the resulting centroid connections.
  2. Recognize all centroid points using threshold $d_2$: centroids within distance $d_2$ of each other are recognized as stroke centroids of the same character. Figure 4c illustrates the resulting connections.
  3. Combine the recognition results from (1) and (2) to obtain the set of centroid points belonging to a single character. From this set, compute the minimum foreground coordinates $(x_{min}, y_{min})$ and maximum coordinates $(x_{max}, y_{max})$ to obtain the character box covering all strokes of the character, as depicted in Figure 4d; a sketch of this grouping follows the list.
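A minimal sketch of the two-threshold grouping, using a union-find over the pairwise centroid distances of Equation (7); the threshold values `d1` and `d2` are illustrative assumptions (the paper does not report them):

```python
import numpy as np

def group_strokes(centroids, d1=60.0, d2=40.0):
    """Assign a character label to each stroke centroid. Rule 1: union with the
    single nearest point if it is within d1; rule 2: union with every point
    within d2 (typically d2 < d1). Returns one label per stroke."""
    pts = np.asarray(centroids, dtype=float)
    n = len(pts)
    parent = list(range(n))

    def find(i):                              # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)  # Equation (7)
    np.fill_diagonal(dist, np.inf)
    for i in range(n):
        j = int(np.argmin(dist[i]))           # rule 1: single nearest point within d1
        if dist[i, j] < d1:
            union(i, j)
        for k in np.flatnonzero(dist[i] < d2):  # rule 2: all points within d2
            union(i, int(k))
    return [find(i) for i in range(n)]
```

The character box of step 3 then follows from the minimum and maximum stroke extents within each group.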
The centroid algorithm effectively uses the geometric information of the foreground to divide characters, preventing the failures of character box recognition caused by large character deformation. However, it also has disadvantages: when processing images with many characters, suitable thresholds $d_1$ and $d_2$ are hard to determine. Testing shows the algorithm works well when the image contains few characters, so it is used to supplement the character box algorithm.
In data processing, this paper combines the projection algorithm with the centroid algorithm. The projection algorithm first computes the character boxes, with a threshold width $W$ and height $H$; if a box has width $x_{max} - x_{min} > W$ or height $y_{max} - y_{min} > H$, centroid calculation is applied to that box, as depicted in Figure 5. Compared to the character box algorithm alone, the centroid algorithm effectively improves segmentation accuracy, and combining the two achieves better segmentation of calligraphy characters.
As Figure 5 shows, the character box algorithm based solely on the projection method performs poorly, capturing only 36 of 277 characters. The algorithm based on connected component analysis does relatively better, capturing 173 characters, while the proposed algorithm combining the projection method and the centroid algorithm captures 233 characters. This markedly improves the efficiency of generating the dataset required for network training.

2.3. Data Augmentation

Model training requires a large amount of data. Although we segment single characters for dataset production, the available calligraphy works still fall short of the data volume needed to train a model. We therefore use various data augmentation methods to expand the dataset and to remove the influence of slight rotation, scaling, and the different paper types calligraphers use during creation.
As illustrated in Figure 6, the augmentation methods include rotation, enlargement, reduction, salt noise, Gaussian noise, and binary-threshold adjustment. Image zooming enlarges the image and then resizes it back to its original dimensions, eliminating the character edge noise introduced by scaling during training. Binary-threshold adjustment varies the threshold over the 0–255 range according to the background color depth of the calligraphy while keeping the character shape clear, excluding the edge interference caused by different thresholds. We wrote a data augmentation program based on OpenCV; the generated dataset consists of single-channel 224 × 224 images with white characters on a black background.
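The paper’s augmentation program is OpenCV-based; the following sketch shows what one pass over the listed methods might look like, with all angles, scales, noise levels, and thresholds as illustrative assumptions:

```python
import cv2
import numpy as np

def augment(char_img):
    """One augmented variant per method. char_img: 224x224 uint8 grayscale,
    white character on black background."""
    h, w = char_img.shape
    out = {}
    # Small-angle rotation about the image center (3 degrees, illustrative).
    M = cv2.getRotationMatrix2D((w / 2, h / 2), 3, 1.0)
    out["rotate"] = cv2.warpAffine(char_img, M, (w, h))
    # Zoom: enlarge, then resize back to the original dimensions.
    big = cv2.resize(char_img, None, fx=1.2, fy=1.2)
    out["zoom"] = cv2.resize(big, (w, h))
    # Gaussian noise.
    noise = np.random.normal(0, 10, char_img.shape)
    out["gauss"] = np.clip(char_img + noise, 0, 255).astype(np.uint8)
    # Salt noise: flip ~1% of pixels to white.
    salt = char_img.copy()
    salt[np.random.rand(h, w) < 0.01] = 255
    out["salt"] = salt
    # Binary-threshold adjustment (threshold value illustrative, range 0-255).
    _, out["thresh"] = cv2.threshold(char_img, 100, 255, cv2.THRESH_BINARY)
    return out
```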
The calligraphy data used in this paper comprise regular script works by the well-known Calligrapher A, regular and cursive script works by Calligrapher B, works by other calligraphers, and a large number of regular and cursive script works collected from the internet. After augmentation, the datasets contain 300,752 characters of Calligrapher A’s regular script, 85,406 characters of Calligrapher B’s regular script, 128,811 characters of Calligrapher B’s cursive script, 443,015 characters of other calligraphers’ regular script, and 348,472 characters of other calligraphers’ cursive script. These datasets are used to train binary classification networks that authenticate calligraphy works.

3. Network Design

EfficientNet [12], proposed by the Google Brain team in 2019, has shown strong performance in image classification in recent years. This paper trains the EfficientNetv2-S network on the calligraphy dataset. Our modifications improve the attention mechanism, introducing the CBAM and Self-Attention modules to enhance the model’s generalization ability and achieve higher accuracy in calligraphy authentication.

EfficientNetv2-S Network and Attention Module Improvement

EfficientNetv2-S is inspired by the characteristics of ResNet and employs MBConv and Fused-MBConv blocks for higher accuracy and faster inference. The MBConv block consists of an expansion (1 × 1) convolution, a depth-wise separable convolution, and the SE attention module; multiple Fused-MBConv and MBConv blocks are stacked to form the EfficientNetv2-S network.
Zhang et al. improved CNN models for calligraphy style classification using the CBAM attention module and achieved better classification results than with the SE attention module [13]. Therefore, based on the EfficientNetv2-S network, the CBAM attention module and the Self-Attention module are used to replace the SE attention module in the MBConv blocks, yielding the CBAM-EfficientNet and SA-EfficientNet networks. The CBAM module is a widely used attention mechanism in image processing that adaptively weights each channel and spatial position of the image, allowing the model to learn the most important features and exhibit good generalization [14]. The Self-Attention module identifies the relationship between the features at each position in the image and all other positions: it computes correlation scores between each position and every other position and weights the other positions’ features accordingly. Its advantage is that it establishes global dependencies between different positions in the image, which helps the network learn the overall features of calligraphy characters [15]. The structures of both modules are depicted in Figure 7. Note that the attention mechanism of the original EfficientNetv2-S network is SE attention, while the new networks employ Self-Attention and CBAM.
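For concreteness, here is a minimal CBAM block following the channel-then-spatial design of Woo et al. [14]; the paper does not state its framework, so PyTorch is an assumption, and inserting this in place of the SE module inside each MBConv block is a schematic reading of the improvement described above.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Minimal CBAM: channel attention followed by spatial attention."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.mlp = nn.Sequential(                        # shared channel MLP
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        # Channel attention: shared MLP over average- and max-pooled descriptors.
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention: conv over channel-wise average and max maps.
        s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))
```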
This paper employs the regular script (Kaishu) and cursive script (Caoshu) works of the two calligraphers to create training sets. We train the EfficientNetv2-S [16], CBAM-EfficientNet, and SA-EfficientNet networks and compare them horizontally with InceptionResNetv2 [17], Inceptionv3 [18], MobileNetv3-L [19], ResNet50 [20], and MobileNet [21]. The training platform is an i9-13900K CPU and an RTX 4090 GPU with 128 GB of memory. The batch size is 64 for 20 epochs; the optimizer is Adam, and the loss function is binary cross-entropy.
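A schematic training setup matching the stated hyperparameters might look as follows; the framework (PyTorch/torchvision), the learning rate, the logits-based loss variant, and the stand-in data are all assumptions, and the baseline `efficientnet_v2_s` stands in for the improved networks:

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import efficientnet_v2_s

# Stand-in data; in the paper this is the augmented single-character dataset
# (the single-channel 224 x 224 images expanded to the 3 channels this model expects).
images = torch.rand(64, 3, 224, 224)
labels = torch.randint(0, 2, (64,)).float()
loader = DataLoader(TensorDataset(images, labels), batch_size=64, shuffle=True)

model = efficientnet_v2_s(num_classes=1)      # baseline; the paper swaps SE for CBAM/Self-Attention
criterion = nn.BCEWithLogitsLoss()            # binary cross-entropy on logits
optimizer = optim.Adam(model.parameters())    # learning rate not reported; Adam default used

for epoch in range(20):                       # 20 epochs, as stated
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x).squeeze(1), y)
        loss.backward()
        optimizer.step()
```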

4. Experimental Results

Research indicates that forgery of calligraphy works usually focuses on imitating the calligrapher’s handwriting style. Because the forger’s own handwriting style differs fundamentally from the author’s, the forger cannot maintain a stable imitation, and the written characters are typically unnatural and not smooth [22]; a forged work may contain some characters similar to the authentic ones while others differ. This paper uses the model to estimate the probability that each character is authentic and takes the average over the whole work as the work’s authenticity probability. This strategy tests whether the calligraphic style of the inspected work stays within a high-probability and relatively stable range. The authenticity probability of a work is defined by Equation (8):

$$A = \frac{1}{n} \sum_{i=1}^{n} a_i \times 100\%$$
where A represents the overall authenticity probability of the work, a i denotes the authenticity probability of an individual calligraphy character as determined by the model, and n represents the number of characters in the work.
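Equation (8) amounts to a simple average of the per-character scores; for example:

```python
def work_authenticity(char_probs):
    """Equation (8): mean per-character authenticity probability, as a percentage."""
    return 100.0 * sum(char_probs) / len(char_probs)

# Illustrative values: three characters scored 0.98, 0.91, and 0.95 give ~94.67%.
print(work_authenticity([0.98, 0.91, 0.95]))
```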
To verify the model’s effectiveness, the authentic works of Calligraphers A and B were employed as the authentic test set, and the works of other calligraphers were employed as the counterfeit test set. Calligrapher C was invited to imitate authentic works to test regular and cursive script models.

4.1. Calligrapher A’s Regular Script Model Testing Results

For Calligrapher A’s regular script model, this paper collected three authentic works by Calligrapher A: a 243-character work (Authentic 1), a 287-character work (Authentic 2), and a 20-character work in the large regular script (daikai) style (Authentic 3). As negative samples, a 51-character imitation by another calligrapher and a 28-character work by a different calligrapher were collected.
Table 1 shows that, for Calligrapher A’s model, Authentic 1 and Authentic 2 are small regular script works, while Authentic 3 is a large regular script work. Since most regular script works in the training set are small, all networks identify small regular script works effectively. Moreover, SA-EfficientNet scores higher on Authentic 2 than the other networks, and all models achieve good authenticity discrimination for the imitation and the works of other calligraphers.
Large regular script differs somewhat in character shape from small regular script, though the overall calligraphic style is similar; it is typically used in works with larger characters, such as couplets. Because the training set focuses on small regular script, whether a model can identify the same calligrapher’s large regular script works also tests whether it has learned the calligrapher’s writing style. Table 1 shows that SA-EfficientNet discriminates large regular script works better than the other models, with a similarity of 59.233%, demonstrating outstanding generalization.
Among the tested models, all except ResNet50, whose discrimination is unsatisfactory, learn Calligrapher A’s style well. In particular, SA-EfficientNet achieves both good discrimination and relatively good generalization, performing well even on large regular script styles it has not learned. The SA-EfficientNet network, improved with the Self-Attention module, thus generalizes better and performs well in calligraphy authenticity identification.

4.2. Calligrapher B’s Regular Script Model Testing Results

For Calligrapher B’s regular script model, this paper collected one 1015-character authentic work by Calligrapher B (Authentic 1) and one 18-character imitation by another calligrapher. Two works by different calligraphers were also collected: one with 24 characters (Other Work 1) and one with 28 characters (Other Work 2).
In Table 2, Authentic 1 refers to the authentic calligraphy work by Calligrapher B in regular script style. As a comparison, the imitation piece is a work by Calligrapher C imitating the writing style of Calligrapher B, while the remaining two works are calligraphy works by other calligraphers.
According to the test results in Table 2, all models except MobileNetv3-L converge well during training. Among them, the improved SA-EfficientNet model achieves a similarity of 90.997% on Authentic 1, a significant improvement over the 81.762% and 85.717% achieved by EfficientNetv2-S and CBAM-EfficientNet, respectively. Furthermore, compared to InceptionResNetv2 and MobileNet, which perform comparably on Authentic 1, SA-EfficientNet discriminates imitations better: it scores only 5.556% on the imitation, effectively distinguishing authentic from imitation pieces, while InceptionResNetv2 and MobileNet score 10.999% and 16.674%, respectively.
In conclusion, in the testing of Calligrapher B’s regular script model, the improved CBAM-EfficientNet shows better discrimination ability than EfficientNetv2-S, while SA-EfficientNet further improves the discrimination performance based on the CBAM-EfficientNet network.

4.3. Calligrapher B’s Cursive Script Model Testing Results

For Calligrapher B’s cursive script model, this paper collected two authentic works by Calligrapher B, with 49 characters (Authentic 1) and 84 characters (Authentic 2). For comparison, two works by different calligraphers were also collected: one with 147 characters (Other Work 1) and the other with 395 characters (Other Work 2).
In Table 3, Authentic 1 and Authentic 2 are cursive script works by Calligrapher B, while the others are works by other calligraphers. Calligraphers write more freely in cursive script than in regular script: although cursive works show a more distinct personal style, the characters deform more and their shapes are more variable and abstract, posing greater challenges for network-based authentication.
Based on the results in Table 3, all models except MobileNetv3-L, which failed to converge, trained well. CBAM-EfficientNet shows a slight improvement over EfficientNetv2-S, while SA-EfficientNet achieves better authentication than both, with similarities of 85.562% and 78.411% on Authentic 1 and Authentic 2, respectively.
In summary, the improved CBAM-EfficientNet and SA-EfficientNet models both performed well, accurately learning the abstract image features in the dataset for the challenging task of distinguishing calligraphy authenticity. The CBAM module has more parameters than the Self-Attention module, bringing the overall CBAM-EfficientNet model to 5.5 GB, whereas the SA-EfficientNet model is only 891 MB; the Self-Attention structure thus achieved better results with fewer parameters than CBAM.
Among the tested models, MobileNetv3-L is the lightest, with the fewest parameters and a model size of only 36 MB. When training Calligrapher A’s regular script model, there was enough data for it to learn features and converge. However, with the more limited training data for Calligrapher B’s regular and cursive script models, it could not converge and therefore could not complete training and testing. Moreover, in authenticating Calligrapher B’s authentic works, model accuracy was lower than for Calligrapher A’s model. Distinguishing the authenticity of calligraphy works therefore requires both a sufficient number of network parameters and sufficient training data.

5. Conclusions

The authentication of calligraphy works is a task that heavily relies on the subjective view of calligraphers and does not have clear mathematical indicators to quantify the similarity of calligraphy works. Furthermore, calligraphers intentionally add variations in character forms during creation to pursue artistic value in the overall work. Therefore, using neural networks to identify the authenticity of calligraphy works is difficult and poses a significant challenge to model algorithms.
This paper proposes an algorithm for identifying the authenticity of calligraphy works. Projection and centroid algorithms extract individual calligraphy characters from the works, and data augmentation expands the calligraphy dataset. Based on the EfficientNetv2-S model, the CBAM and Self-Attention modules improve the network’s attention. The improved CBAM-EfficientNet and SA-EfficientNet networks are trained and compared with the EfficientNetv2-S, InceptionResNetv2, Inceptionv3, MobileNetv3-L, ResNet50, and MobileNet networks. The regular script (Kaishu) and cursive script (Caoshu) models are tested on authentic calligraphy, works by other calligraphers, and imitations.
The experimental results show that the models learn calligraphers’ personal style characteristics from the dataset and can identify authentic works, imitations, and works by other calligraphers in the test set. The improved CBAM-EfficientNet and SA-EfficientNet, based on the EfficientNetv2-S model, achieve better authentication results than the original model on this task. Additionally, SA-EfficientNet achieves better results than CBAM-EfficientNet with a smaller model size and, overall, outperforms the other networks. The attention-module-based improvements therefore benefit calligraphy authenticity identification.
The methods described in this paper mainly target works with pronounced personal style, such as calligraphy, but they can be extended to other fields. Using neural networks to identify individual handwriting has promising applications in the judiciary, criminal investigation, banking, and cultural relic authentication.

Author Contributions

Conceptualization, X.J., J.C. and Z.H.; methodology, W.W. and H.Y.; software, X.W. and Z.H.; validation, X.W., H.Y. and Z.H.; formal analysis, X.W., H.Y. and J.C.; investigation, J.C., X.J. and Z.H.; resources, X.W., J.C. and Z.H.; data curation, J.C., Z.H. and W.W.; writing—original draft preparation, X.J., J.C. and Z.H.; writing—review and editing, W.W., Z.H. and H.Y.; visualization, X.W. and J.C.; supervision, W.W., H.Y. and X.W.; project administration, W.W. and Z.H.; funding acquisition, W.W. and H.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the National Key Research and Development Project of China (grant number: 2018YFA0902900), the Basic Research Program of Guangzhou City of China (grant number 202201011692), and the Guangdong Water Conservancy Science and Technology Innovation Project (grant number 2023-03).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors would like to express their thanks to the Guangzhou Institute of Advanced Technology for helping them with the experimental characterization.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Meng, Y. Ancient painting and calligraphy collecting traditions and current cultural consumption. Art Obs. 2019, 12, 76–77. [Google Scholar]
  2. Zhu, J. Research on the Prediction of Calligraphy and Painting Market Index Based on Internet Search Data. Master’s Thesis, Zhejiang University of Finance and Economics, Hangzhou, China, 2019. [Google Scholar]
  3. Pang, S. Computer-Aided Identification of Authenticity of Chinese Calligraphy. Master’s Thesis, Xi’an University of Architecture and Technology, Xi’an, China, 2018. [Google Scholar]
  4. Liang, S. Analyze the role of seal recognition in the identification of calligraphy and painting. Cult. Relics Identif. Apprec. 2021, 23, 96–98. [Google Scholar] [CrossRef]
  5. Ji, J. Research on calligraphy authenticity identification method based on invariant moment. J. Eng. Math. 2022, 39, 196–208. [Google Scholar]
  6. Zhang, Z.; Chen, J.; Qian, X. Calligraphic character skeleton extraction based on improved conditional generation adversarial network. Comput. Eng. 2023, 49, 272–279. [Google Scholar] [CrossRef]
  7. Mai, G.; Liang, Y.; Pan, J. Calligraphy font recognition algorithm based on improved DenseNet network. Comput. Syst. Appl. 2022, 31, 253–259. [Google Scholar] [CrossRef]
  8. Dai, W. Research on Calligraphy Text Generation Based on Generative Adversarial Network. Master’s Thesis, Chongqing University of Technology, Chongqing, China, 2021. [Google Scholar] [CrossRef]
  9. Ji, X. Research on Calligraphy Character Content and Style Recognition Based on Deep Learning. Master’s Thesis, Xi’an University of Electronic Science and Technology, Xi’an, China, 2023. [Google Scholar] [CrossRef]
  10. Kang, J.; Wu, Y.; Xia, Z.; Feng, X. Application of Deep Convolution Neural Network Algorithm in Detecting Traditional Calligraphy Characters. In Proceedings of the 2022 International Conference on Image Processing and Media Computing (ICIPMC), Xi’an, China, 27–29 May 2022; pp. 12–16. [Google Scholar]
  11. Pan, G.; Yang, Y.; Li, M.; Hu, X.; Huang, W.; Wang, J.; Wang, Y. A Graph based Calligraphy Similarity Compare Model. In Proceedings of the 2021 IEEE 21st International Conference on Software Quality, Reliability and Security Companion (QRS-C), Haikou, China, 6–10 December 2021; pp. 395–400. [Google Scholar]
  12. Tan, M.; Le, Q. Efficientnet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 6105–6114. [Google Scholar]
  13. Zhang, J.; Yu, W.; Wang, Z.; Li, J.; Pan, Z. Attention-enhanced CNN for chinese calligraphy styles classification. In Proceedings of the 2021 IEEE 7th International Conference on Virtual Reality (ICVR), Foshan, China, 20–22 May 2021; pp. 352–358. [Google Scholar]
  14. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. Cbam: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
  15. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems (NIPS), Long Beach, CA, USA, 3–9 December 2017. [Google Scholar] [CrossRef]
  16. Tan, M.; Le, Q. Efficientnetv2: Smaller models and faster training. In Proceedings of the International Conference on Machine Learning, PMLR, Online, 18–24 July 2021; pp. 10096–10106. [Google Scholar]
  17. Peng, C.; Liu, Y.; Yuan, X.; Chen, Q. Research of image recognition method based on enhanced inception-ResNet-V2. Multimed. Tools Appl. 2022, 81, 34345–34365. [Google Scholar] [CrossRef]
  18. Stefenon, S.F.; Yow, K.C.; Nied, A.; Meyer, L.H. Classification of distribution power grid structures using inception v3 deep neural network. Electr. Eng. 2022, 104, 4557–4569. [Google Scholar] [CrossRef]
  19. Zhao, L.; Wang, L. A new lightweight network based on MobileNetV3. KSII Trans. Internet Inf. Syst. 2022, 16, 1–15. [Google Scholar]
  20. Alzamily, J.Y.I.; Ariffin, S.B.; Abu-Naser, S.S. Classification of Encrypted Images Using Deep Learning–Resnet50. J. Theor. Appl. Inf. Technol. 2022, 100, 6610–6620. [Google Scholar]
  21. Kim, M.; Kwon, Y.; Kim, J.; Kim, Y. Image classification of parcel boxes under the underground logistics system using CNN MobileNet. Appl. Sci. 2022, 12, 3337. [Google Scholar] [CrossRef]
  22. Zeng, B. Research on Chinese Calligraphy Authenticity Identification Method Based on Image Recognition. Master’s Thesis, Xi’an University of Architecture and Technology, Xi’an, China, 2016. [Google Scholar]
Figure 1. Projection algorithm. (a) Horizontal foreground projection, (b) vertical foreground projection, (c) minimum bounding box.
Figure 2. The limitations of the character box algorithm: (a) the original calligraphy work and (b) the processed image.
Figure 3. Morphology-based connected domain segmentation of images. (a) Connected area; (b) character segmentation.
Figure 4. Effectiveness of the centroid algorithm: (a) stroke centroid points, (b) closest centroid points, (c) similar centroid points, (d) algorithm result.
Figure 5. Comparison of the effects of the three character box algorithms. (a) Original work, (b) projection algorithm, (c) connected domain algorithm, (d) projection algorithm and centroid algorithm.
Figure 6. Data augmentation.
Figure 7. Network structure of EfficientNetv2-S, CBAM-EfficientNet, and SA-EfficientNet.
Table 1. Calligrapher A’s regular script model testing results.

| Algorithm | Authentic 1 | Authentic 2 | Authentic 3 | Imitation | Others |
|---|---|---|---|---|---|
| EfficientNetv2-S | 98.877% | 93.911% | 34.856% | 4.003% | 15.598% |
| CBAM-EfficientNet | 98.633% | 93.922% | 35.900% | 7.937% | 13.557% |
| SA-EfficientNet | 98.846% | 95.611% | 59.233% | 17.265% | 21.714% |
| InceptionResNetv2 | 98.963% | 90.910% | 40.519% | 0.153% | 15.888% |
| Inceptionv3 | 98.276% | 90.918% | 40.503% | 5.587% | 18.011% |
| MobileNetv3-L | 98.278% | 90.930% | 41.143% | 15.760% | 21.630% |
| ResNet50 | 97.131% | 64.502% | 23.942% | 13.726% | 38.338% |
| MobileNet | 98.634% | 90.914% | 30.208% | 1.961% | 18.306% |
Table 2. Calligrapher B’s regular script model testing results.

| Algorithm | Authentic 1 | Imitation | Other Work 1 | Other Work 2 |
|---|---|---|---|---|
| EfficientNetv2-S | 81.762% | 11.098% | 0% | 0.011% |
| CBAM-EfficientNet | 85.717% | 5.556% | 0% | 0% |
| SA-EfficientNet | 90.997% | 5.556% | 0% | 0.006% |
| InceptionResNetv2 | 92.584% | 10.999% | 0.373% | 0% |
| Inceptionv3 | 88.614% | 5.556% | 0% | 0% |
| MobileNetv3-L | - | - | - | - |
| ResNet50 | 88.950% | 16.660% | 3.890% | 0.429% |
| MobileNet | 91.385% | 16.674% | 0% | 0% |
Table 3. Calligrapher B’s cursive script model testing results.

| Algorithm | Authentic 1 | Authentic 2 | Other Work 1 | Other Work 2 |
|---|---|---|---|---|
| EfficientNetv2-S | 80.326% | 66.867% | 13.779% | 0.008% |
| CBAM-EfficientNet | 82.855% | 68.028% | 21.169% | 0% |
| SA-EfficientNet | 85.562% | 78.411% | 25.588% | 0.048% |
| InceptionResNetv2 | 84.846% | 70.616% | 18.885% | 0% |
| Inceptionv3 | 87.958% | 67.722% | 15.060% | 0% |
| MobileNetv3-L | - | - | - | - |
| ResNet50 | 86.218% | 79.113% | 25.254% | 0.253% |
| MobileNet | 80.000% | 69.813% | 10.448% | 0% |

