AI-Assisted Restoration of Yangshao Painted Pottery Using LoRA and Stable Diffusion

Zhang, Xinyi

doi:10.3390/heritage7110295

Academy of Arts & Design, Tsinghua University, Beijing 100084, China

Heritage 2024, 7(11), 6282-6309; https://doi.org/10.3390/heritage7110295

Submission received: 14 September 2024 / Revised: 23 October 2024 / Accepted: 30 October 2024 / Published: 8 November 2024

Abstract

This study is concerned with the restoration of painted pottery images from the Yangshao period. The objective is to enhance the efficiency and accuracy of the restoration process for complex pottery patterns. Conventional restoration techniques encounter difficulties in accurately and efficiently reconstructing intricate designs. To address this issue, the study proposes an AI-assisted restoration workflow that combines Stable Diffusion models (SD) with Low-Rank Adaptation (LoRA) technology. By training a LoRA model on a dataset of typical Yangshao painted pottery patterns and integrating image inpainting techniques, the accuracy and efficiency of the restoration process are enhanced. The results demonstrate that this method provides an effective restoration tool while maintaining consistency with the original artistic style, supporting the digital preservation of cultural heritage. This approach also offers archaeologists flexible restoration options, promoting the broader application and preservation of cultural heritage.

Keywords:

cultural heritage; artificial intelligence; Yangshao; painted pottery; image restoration; LoRA; stable diffusion; image inpainting

1. Introduction

The restoration of cultural heritage, particularly painted pottery from the Yangshao culture (circa 5000–3000 BCE), poses a unique challenge due to the intricate patterns and historical significance of these artifacts. Conventional restoration techniques predominantly depend on manual expertise, which can result in prolonged and inefficient processes. The painted pottery is adorned with bird-shaped, floral, and geometric motifs, and its restoration requires methods that preserve the inherent symmetry, balance, and continuity of the original designs. Consequently, there is an increasing need for more effective, automated restoration techniques that preserve the aesthetic and historical authenticity of these objects.

Machine learning has shown significant potential in archaeological research, particularly in classifying and restoring ceramic images [1,2,3,4,5,6,7]. For example, convolutional neural networks (CNNs) have exhibited efficacy in the automated identification and categorization of pottery patterns, thereby improving the accuracy of image analysis [8,9,10]. However, these methods rely on extensive pre-classified image datasets for training, and the quality and diversity of the training data impose constraints on the model’s performance. This limitation is particularly evident when dealing with rare historical relics or intricate patterns, where the model’s recognition and matching capabilities are often inadequate. Moreover, these models typically necessitate substantial manual intervention and considerable computational resources, posing significant challenges for practical application [11].

At the same time, bespoke artificial intelligence systems have been utilized for the restoration of cultural heritage. Sizyakin et al. [12] demonstrated the significant potential of deep learning in virtual painting restoration by improving the practical performance of GAN models through the training of custom GAN architectures. These models were employed to repair depressions and cracks caused by the aging of oil paintings. Similarly, Jiang et al. [13] concentrated on collecting images of specific damaged areas in oil paintings based on restoration algorithms and digital restoration techniques.

The advent of generative artificial intelligence (AI), exemplified by models such as Stable Diffusion and DALL-E, has ushered in a new era of possibilities for the automated restoration of cultural heritage [7]. Inpainting, a fundamental component of Stable Diffusion, is employed to generate images and reconstruct missing or damaged regions, offering a more adaptable and scalable approach to restoration [14,15,16]. However, when dealing with specialized artworks and intricate patterns, the standard restoration capabilities of inpainting frequently prove inadequate in terms of accuracy.

LoRA is a technique designed for the efficient fine-tuning of large pre-trained models. It achieves this by significantly reducing the computational and storage requirements through low-rank matrix decomposition, while maintaining high model performance [17]. LoRA enables efficient model fine-tuning with limited data, thereby enhancing the model’s capacity to transfer styles and patterns effectively [18,19].

From a broader perspective, the expertise of archaeologists and restorers plays a vital role in the intricate process of cultural heritage restoration. While AI is a helpful assistant, it should not overshadow the invaluable insights that humans contribute. For example, ongoing research highlights the importance of human judgment in assessing AI-generated images. Ultimately, the effectiveness of these images relies heavily on the user’s knowledge and critical thinking skills [20].

In both scientific research and educational contexts, achieving a harmonious balance between the remarkable capabilities of AI and the irreplaceable wisdom of human intellect is essential. This philosophy aligns with the principles articulated by Cesare Brandi, who argued that restoration is not merely about reconstructing physical artifacts; it also involves honoring their aesthetic and historical significance. Brandi’s perspective echoes the values of the Venice Charter, which advocates for preserving the authenticity and recognizability of cultural heritage [21].

The potential of AI technologies in reconstructing intricate historical patterns is vast and promising. However, archaeologists and restorers must still conduct comprehensive evaluations of AI-generated images. As one expert pointed out, “AI technologies can enhance the accuracy and efficiency of conservation efforts; however, ethical concerns, such as inauthentic restoration and erosion of cultural sensitivity, must be addressed by human expertise” [22]. Despite the impressive outputs of AI, they can occasionally fail to capture the intricate historical nuances and cultural significance embedded in artifacts. This reality highlights the importance of granting professionals the authority to halt the restoration process and reassess AI-generated outcomes when they fall short of meeting historical or cultural expectations.

One noteworthy example is the restoration of Rembrandt’s The Night Watch. In this case, parts of the painting that had been removed over time were reconstructed using AI technology. This collaborative approach enabled the painting to be displayed once more at the Rijksmuseum in Amsterdam, reviving an essential piece of history. Although AI played a crucial role in the reconstruction, it was the expertise of human specialists that ensured the restored sections aligned with the original artwork’s authenticity [23]. Throughout the process, archaeologists and restorers retained the ultimate decision-making authority, ensuring that the AI-generated results met rigorous standards of historical, cultural, and artistic integrity.

Diffusion models can achieve impressive results, and the gap in computational speed between diffusion models and generative adversarial networks (GANs) is gradually narrowing. Nevertheless, notable differences persist in their respective applications. For example, a recent study investigated the potential of Stable Diffusion models for the generation of detailed and visually appealing architectural facades, with a view to integrating this technology into cultural heritage conservation within the field of architecture. The study compared the performance of diffusion models with GANs, with diffusion models demonstrating superior results in terms of detail and quality [24]. Even so, the application of generative techniques continues to present several challenges, including the issue of “object expansion” in complex scenes, which adversely affects the realism of the images that are restored [16]. At a macro level, more research is needed to verify the accuracy of AI-generated images, which largely depends on the user’s knowledge and critical thinking [20]. Therefore, a reasonable balance must be achieved between the intellectual power of human thought and the technological capabilities of artificial intelligence, whether for scientific research or educational purposes.

This study is based on the Inpainting model of Stable Diffusion and proposes a novel restoration workflow combining Stable Diffusion with LoRA technology to generate images in a specified style. In addition, this research introduces a highly efficient, low-technical-threshold method for pattern restoration, specifically aimed at supporting archaeological restoration, art history research, and engaging cultural heritage enthusiasts. The method of restoring specific patterns through training style-specific LoRA models has also been applied in cases of cultural heritage preservation and utilization [25,26]. By training a LoRA model for Yangshao pottery patterns and integrating it into the Inpainting model, this study addresses the limitations of traditional and machine learning-based restoration methods. This approach enhances accuracy, maintains stylistic consistency, and provides a flexible and efficient AI-assisted solution for the preservation of cultural heritage. Comparative experiments and qualitative assessments demonstrate the efficacy of the workflow in restoring complex pottery designs while adhering to established restoration principles. To prevent ambiguity, this paper will uniformly use the term “Restoration” to denote the repair effects obtained through the methods of Reconstruction and Re-creation. This terminology will encompass all relevant restoration processes, including supplementary restoration methods, ensuring a clearer understanding of artifact restoration.

2. Theoretical and Technical Foundations

2.1. Analysis of Yangshao Pottery Patterns

The Miaodigou type of the Yangshao culture, which dates to the Neolithic period (circa 4000–3000 BCE), is renowned for its highly distinctive and elaborately decorated painted pottery. The pottery is typically painted with black, red, or brown pigments on a buff or light-colored clay surface. The earliest Yangshao pottery from the Banpo culture is characterized by the prevalence of fish motifs. During the subsequent Miaodigou phase, both realistic and abstract bird motifs became dominant features. These motifs reflect both artistic creativity and symbolic meanings. As posited by [27], the bird motifs from the Miaodigou phase can be classified into three simplified categories: (1) motifs comprising triangular arcs with dots within circular apertures, triangular arcs with dots and supplementary arcs or lines, and linear elements with dots; (2) “Xiyin patterns” (Maltese crosses) or “hanging arcs”, which consist of dots and arcs or lines; and (3) dot and hooked arc motifs (Figure 1).

Wang [29] proposed that while fish motifs appeared to decline during the Miaodigou phase, they merged with bird motifs, creating a unique blended pattern style. Li [30] further supported this concept of simplified bird motifs and explored common patterns such as spiral and hooked patterns, Xiyin patterns, and paired bird motifs in his study of painted pottery from the Miaodigou culture. This research not only underscored the significance of bird motifs but also prompted a systematic classification of these designs within the academic community.

Jin [31] conducted a comprehensive analysis of bird motifs on Miaodigou-Type pottery, thereby filling gaps in previous studies on motif categories and structural forms. This expanded understanding of the symbolic meanings behind the Miaodigou-Type motifs has further deepened our knowledge of the spiritual beliefs of the Yangshao society (Figure 2).

Miaodigou pottery is commonly characterized by voluminous, wide-mouthed jars, bowls, and basins with smooth and meticulously polished surfaces. The decorations, applied with precision and skill, are often arranged symmetrically, which exemplifies the potters’ proficiency in achieving equilibrium and devising designs. These ceramics played a pivotal role in the daily and ceremonial lives of the Yangshao people, offering invaluable insights into their social, cultural, and religious practices.

2.2. Traditional Restoration Methods and Their Limitations

In China, two primary methods are employed in the field of cultural heritage restoration. The first method follows the principle of “restoration to original condition”, emphasizing minimal intervention, recognizability, and reversibility, while striving to maintain the original state using techniques consistent with the initial restoration process. This method also stipulates that all restorations must be reversible. The second method is grounded in the Venice Charter of 1964, which outlines international principles for cultural heritage restoration [21]. These principles are derived primarily from the Italian restoration school led by Cesare Brandi. Brandi emphasized the importance of recognizability in restoration, advocating for modern additions that harmoniously integrate with the original structure while remaining distinctly identifiable. This method prioritizes the preservation and presentation of historical and aesthetic values, serving as a widely accepted framework for contemporary restoration and preservation efforts [32,33].

Traditional pottery restoration techniques rely heavily on the expertise of practitioners. Specialists often reproduce the original patterns and appearances of pottery through hand-painted reconstructions [27]. Many cultural heritage artifacts lack detailed graphic records, compelling restorers to rely on experience for speculation and redrawing, typically limited to reproducing patterns or sketching outlines with pencils in the restored areas (Figure 3). Archaeological restoration techniques aim to preserve the original appearance of pottery artifacts, ensuring they remain as close as possible to their excavated state. Consequently, materials such as plaster are commonly used to fill in missing parts. To maintain the original appearance of the pottery, patterns are generally not applied directly onto the ceramic, aligning with the principle of reversibility in artifact restoration.

Despite the high artistic value of manual restoration, which allows for personalized approaches based on the uniqueness of each artifact, this method presents several significant limitations. First, the manual restoration process is extremely time-consuming. For pottery artifacts that have been buried underground for extended periods, the accumulation of impurities can lead to brittleness and peeling paint. Before beginning restoration work, technicians conduct a thorough external examination of the pottery, measuring its size, weight, morphological features, and overall structure. In cases of extensively damaged artifacts or those with complex patterns, the restoration process can take months or even years.

Second, manual restoration relies heavily on the experience and skill level of the restorer, leading to variability in restoration quality due to individual differences. Given the complexity of artifact damage and the irreversibility of historical information, minor errors can result in a loss of the artifact’s original historical value. Additionally, when restorers attempt to redraw intricate patterns or enhance colors, they risk introducing unintended deviations in artistic style, creating visual discrepancies between the restored artifact and the original, which impacts its historical authenticity.

Another significant limitation of traditional manual restoration is the challenge of reconstructing missing parts. For artifacts with complex geometric patterns and intricate details, restorers not only require proficient hand-painting skills but also need to review extensive literature, study pattern variations and cultural meanings from different periods, and access comprehensive databases of pottery patterns. However, current resources in the Chinese pottery field are limited to specific publications, such as The Atlas of Chinese Painted Pottery [27] and The Complete Collection of Painted Pottery Unearthed in China [28,29], with relatively few targeted databases available.

The introduction of modern scientific technologies, including phase-assisted optical 3D scanning, multi-spectral imaging, X-ray fluorescence, and laser Raman spectroscopy [34,35], has laid more precise and scientific foundations for artifact restoration. Nevertheless, these technologies also have limitations. For instance, they prioritize restoration over reconstruction, allowing for the analysis of only existing patterns and materials, and cannot directly reconstruct lost patterns or missing sections.

In contrast, the integration of AI technologies provides new avenues to tackle this challenge. By utilizing Stable Diffusion and Inpainting techniques, AI can automatically generate missing patterns and, in combination with LoRA technology, perform style transfer to ensure that the generated patterns closely align with the style of the original artifacts. Machine learning-based image restoration technologies offer greater flexibility and adaptability, thereby preventing irreversible physical damage. Furthermore, this approach significantly improves restoration efficiency while reducing dependence on the intervention of specialized personnel.

2.3. Stable Diffusion

Stable Diffusion is a cutting-edge image generation technology based on diffusion models. Its core principle entails beginning with random noise and progressively refining it to produce clear images. The model learns to eliminate noise from distorted images, ultimately restoring them to their original state. The operation of the Stable Diffusion model can be viewed as a gradual denoising process, organized into two primary phases:

Training Phase: During this phase, the model learns to restore data, such as images, from noise. It begins by introducing noise to the data distribution until the images become unrecognizable and resemble pure noise. The model then reverses this process, learning to recover the original data from total noise.

Generation Phase: In this phase, the model starts with pure noise and progressively denoises it to generate images resembling the training data distribution. Unlike previous diffusion models, Stable Diffusion conducts the diffusion process in a latent space rather than in high-dimensional pixel space. This approach enhances generation speed and significantly reduces computational complexity. Additionally, it enables conditional generation, providing precise control over the output via input text or image prompts. This renders Stable Diffusion especially effective for tasks such as image restoration and style transfer, particularly when handling complex patterns and fine details [36].
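The two phases described above can be illustrated with a minimal sketch of the forward (noising) process that the model learns to reverse. The sketch assumes a DDPM-style linear noise schedule with commonly used default constants and a four-value toy "image"; the function names, schedule constants, and toy data are illustrative assumptions rather than settings reported in this study. The last helper indicates why diffusion in latent space is cheaper, assuming the 8× spatial downsampling and four latent channels of Stable Diffusion v1 models.

```python
import math
import random


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for a linear noise schedule (assumed)."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Jump straight to step t: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * v + noise * e for v, e in zip(x0, eps)]
    # During training, the network is asked to predict eps from (xt, t).
    return xt, eps


def latent_compression(height=512, width=512, channels=3, factor=8, latent_channels=4):
    """Ratio of pixel-space values to latent-space values for one image."""
    pixel_values = height * width * channels
    latent_values = (height // factor) * (width // factor) * latent_channels
    return pixel_values / latent_values


rng = random.Random(0)
alpha_bars = make_alpha_bars()
x0 = [0.5, -0.25, 1.0, 0.0]                           # toy "image" of four values
x_early, _ = add_noise(x0, 10, alpha_bars, rng)       # still close to x0
x_late, _ = add_noise(x0, 999, alpha_bars, rng)       # almost pure noise
print(alpha_bars[10], alpha_bars[999])
print(latent_compression())
```

Under these assumptions, a 512 × 512 RGB image holds 48 times as many values as its latent representation, which is where much of the reduction in computational cost comes from.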

In the field of cultural heritage restoration, Stable Diffusion can generate images that match the original style based on the contextual information of the images, effectively filling in missing parts and ensuring the accuracy and consistency of the restoration. A study has demonstrated that the use of diffusion models to generate images based on the descriptive conditions of artworks can enhance the performance of visual recognition tasks in the field of cultural heritage. This data augmentation method enhances the efficacy of visual recognition models by augmenting the diversity and volume of training data [37]. In comparison to other tools such as DALL·E 2 and Midjourney, Stable Diffusion enables the precise modification of generated images through user interfaces such as WebUI (1.10.0) and ComfyUI (v0.2.1), thereby making it an optimal choice for specific restoration requirements. The most notable advantage of Stable Diffusion is its capacity to reduce time and effort expenditure [38]. However, despite these advantages, Stable Diffusion faces limitations when it comes to generation stability and maintaining content consistency [39].

2.4. LoRA Technology Analysis

As the size of models in Stable Diffusion continues to increase, the costs and resource demands associated with fine-tuning these large models have also risen. To address the need for rapid and efficient fine-tuning of large models for specific tasks, several fine-tuning methods have been developed, including Adapter Tuning, Prefix Tuning, and LoRA. Among these methods, LoRA is particularly noteworthy for its effectiveness in low-resource fine-tuning of large language models (LLMs) [40].

The core principle of LoRA is to achieve efficient fine-tuning of the model by introducing low-rank matrix perturbations based on a pre-trained model, without the need to update all the weights of the model. Specifically, LoRA uses two trainable low-rank matrices, A and B, to represent perturbations to the original weight matrix W_pretrained. The final fine-tuned weight matrix is calculated as follows:

W_finetuned = W_pretrained + BA

Here, the rank of the matrices A and B is much smaller than the dimensions of the original weight matrix, which significantly reduces the number of trainable parameters [17,41,42]. The main innovation of LoRA is its ability to dramatically reduce computational resources and memory usage during training: it lowers GPU memory demands and, in some cases, reduces the number of trainable parameters by a factor of up to 10,000. LoRA does not add latency during inference, since the product BA can be merged into the pre-trained weights before deployment. Additionally, it can be combined with other fine-tuning techniques without complex model adjustments. These features make LoRA particularly suitable for fine-tuning large-scale models under limited resources.
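The update W_finetuned = W_pretrained + BA can be made concrete with a small sketch in plain Python. It is an illustration under stated assumptions, not the training code used in this study: the helper names, the toy 6 × 5 layer, the rank of 2, and the 768 × 768 layer with rank 8 in the parameter count are all hypothetical. Initializing B to zero follows the original LoRA proposal, so the adapted layer starts out identical to the pre-trained one. The optional scale argument is also an assumption; it defaults to 1.0 to match the formula as written above.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def lora_merge(w_pretrained, b, a, scale=1.0):
    """Return W_pretrained + scale * (B @ A) without modifying the inputs."""
    delta = matmul(b, a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(w_pretrained, delta)
    ]


def trainable_params(d_out, d_in, rank):
    """Trainable values for full fine-tuning versus a rank-r LoRA update."""
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


rng = random.Random(0)
d_out, d_in, rank = 6, 5, 2                       # toy sizes (hypothetical)
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]   # frozen weights
b = [[0.0] * rank for _ in range(d_out)]          # B starts at zero
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]

w_merged = lora_merge(w, b, a)                    # equals w while B is still zero
full, lora = trainable_params(768, 768, 8)
print(full, lora, full / lora)
```

Because the merged matrix has the same shape as the original one, the adapted layer costs nothing extra at inference time. For the hypothetical 768 × 768 layer, a rank-8 update trains 12,288 values instead of 589,824.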

A recent study introduced Laplace-LoRA, a Bayesian method applied to the LoRA parameters of large language models. This method improves calibration and mitigates overconfidence by estimating uncertainty, particularly in models fine-tuned on small datasets [43]. LoRA itself efficiently fine-tunes specific layers within diffusion models, significantly improving training efficiency, particularly in environments with limited resources, which makes it well suited to complex restoration tasks.

2.5. Analysis of LoRA’s Suitability

The primary advantage of LoRA lies in its capacity to handle patterns that exhibit regularity and repetition. Because it fine-tunes a pre-trained model, it can learn the features of a specific task while retaining knowledge of the source domain, thereby reducing the likelihood of overfitting or forgetting. For instance, common motifs in Yangshao pottery, including bird patterns, fish patterns, and geometric designs, typically exhibit highly regular structures, rendering LoRA particularly adept at capturing and reproducing these stylistic features.

By fine-tuning a pre-trained model, LoRA can generate images that are highly consistent with the original style, even when working with limited data and computational resources.

In the restoration of Yangshao pottery, the low-rank decomposition approach employed by LoRA enables the maintenance of high restoration accuracy while markedly reducing computational costs. This is particularly crucial for the restoration of rare and intricate pottery patterns, where a high level of detail fidelity is necessary. In practical applications, LoRA can also be combined with Stable Diffusion to further enhance the quality and speed of image restoration. This combination not only enables the efficient restoration of complex patterns but also reduces reliance on manual intervention, thereby making the restoration process more automated and reliable.

In conclusion, LoRA offers an efficient and innovative solution for the restoration of Yangshao pottery images, reducing the necessity for parameter updates and computational overhead. It demonstrates remarkable efficacy in resource-limited contexts, offering substantial advantages and pioneering advances in the domain of archaeological image restoration.

A research case study on the use of LoRA for Chu (楚) lacquerware art successfully trained an intelligent model reflecting the stylistic characteristics of Chu lacquerware using the LoRA framework [25]. The history of Chu lacquerware dates to the Spring and Autumn and Warring States periods (770–221 BCE), featuring patterns with a distinct artistic style emblematic of ancient Chinese designs. These forms are complex, ornate, and diverse, yet they adhere to traditional principles, such as symmetry, bilateral relationships, and four-way continuity. The study indicates that LoRA can accurately restore the unique geometric patterns and natural motifs of Chu lacquerware in style transfer tasks, ensuring that the generated patterns align closely with the original design while preserving intricate details. The model was further integrated with Stable Diffusion to facilitate intelligent design practices, generating lacquer patterns that reflect Chu culture and extending their application to the realm of industrial design. This approach contributes to the preservation and transmission of Chu lacquer art from a modern perspective.

Drawing insights from the study on blue calico pattern generation based on an improved stable diffusion model [44], a method was proposed for generating blue printed cloth patterns by combining LoRA with Stable Diffusion to enable automated generation through text-to-image and image-to-image techniques. In this study, LoRA addressed challenges such as limited datasets for blue printed cloth patterns and difficulties in training generative models. LoRA achieved this by injecting fine-tuned low-rank matrices into the layers of the Transformer while freezing the pre-trained model’s weights. This approach enables fine-tuning with minimal computational resources while preserving the defining characteristics of blue printed cloth and generating new patterns. Compared to traditional full-model fine-tuning methods, LoRA significantly reduces GPU memory requirements while maintaining consistency with the characteristics of blue printed cloth, resulting in high-quality pattern generation. This makes LoRA particularly suitable for tasks involving small datasets, enabling the generation of new patterns even with limited data and resources.

These cases illustrate that combining the LoRA model with the Stable Diffusion model offers substantial advantages in preserving cultural heritage with regional characteristics and distinct styles. Users can develop custom LoRA models tailored to specific pattern styles, making it an exceptionally efficient tool for heritage restorers dealing with complex systems and extensive datasets. Therefore, the technical advantages of LoRA will be further emphasized in this study’s restoration efforts.

One of the key strengths of this study is the large quantity of pottery unearthed from the Miaodigou site, representing around 14.02% of the total ceramics found, which provides a substantial dataset. The use of LoRA technology enables effective processing of this rich pattern data through low-rank matrix decomposition, thereby ensuring high model performance even with limited data. Miaodigou-type pottery features highly uniform and repetitive motifs, including dots, arcs, and triangular shapes that form bird and fish combinations.

LoRA’s flexibility and adaptability offer substantial advantages in resource efficiency, particularly in archaeological restoration contexts where computational power and storage are often limited. Unlike traditional GAN models and CNN fine-tuning methods, which require updating the entire model’s weights, LoRA employs low-rank matrix decomposition to efficiently train models. This reduces the need for high-end computational resources, allowing effective processing even on simplified hardware.

While LoRA automates certain procedures and reduces the reliance on manual input, expert supervision remains essential to preserving historical authenticity. Traditional restoration techniques play an indispensable role in maintaining accuracy. LoRA’s effectiveness stems from its ability to fine-tune pre-trained models using low-rank matrix adjustments, enabling it to adapt efficiently to different styles and tasks. This approach minimizes human error and enhances reliability. When combined with Stable Diffusion’s automation capabilities, the restoration process becomes more streamlined, enabling the generation of complex patterns and accurate style transfers.

This integrated approach demonstrates significant benefits when restoring cultural heritage items with distinct regional characteristics and specific styles. By simplifying hardware requirements and improving restoration efficiency, LoRA offers an efficient, reliable, and practical solution, particularly in resource-limited settings.

2.6. Inpainting

Inpainting serves as a natural extension of the Stable Diffusion model, designed primarily for filling in or reconstructing parts of an image that may be missing or damaged. This technique has become invaluable in various applications, including restoring old photographs, content-aware filling, and creative image editing. In cultural heritage preservation and art restoration, Inpainting plays a crucial role, especially when working with images of ancient artifacts. For example, it can effectively compensate for the wear and tear that artifacts endure over time, bringing them back to life for future generations to appreciate.

Stable Diffusion operates as a generative model, creating complete images through a stepwise denoising process. In image Inpainting, it focuses on reconstructing missing areas by drawing on the context provided by the existing parts of the image. The process begins with the user masking a section of the image and then supplying text prompts that describe the desired content to be filled in. Following this, Stable Diffusion merges these prompts with the unmasked regions to create new content that blends seamlessly with the original. This method showcases the model’s adaptability: it not only restores lost segments but also maintains the original style of the image. This versatility makes it a powerful tool in digital restoration, ensuring that cultural heritage artifacts and artworks are preserved accurately and authentically for public viewing.
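The masking step described above can be sketched with a toy compositing function: pixels outside the mask are taken from the original image, and only the masked region is replaced by generated content. This is a simplified illustration in pixel space with hypothetical names and made-up values. In an actual latent-diffusion pipeline, the equivalent blending is typically done on latents during denoising, or the mask is passed to a dedicated inpainting checkpoint as an additional input.

```python
def composite(original, generated, mask):
    """Keep original pixels where mask == 0; take generated pixels where mask == 1."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


# Toy 3 x 3 grayscale "sherd": zeros in the lower-right corner mark the lost region.
original = [[10, 10, 10],
            [10, 0, 0],
            [10, 0, 0]]
mask = [[0, 0, 0],
        [0, 1, 1],
        [0, 1, 1]]
# Stand-in for model output; a real model would condition on the prompt
# and on the unmasked context.
generated = [[7, 7, 7],
             [7, 9, 9],
             [7, 9, 9]]

restored = composite(original, generated, mask)
for row in restored:
    print(row)
```

Restricting changes to the masked region is what keeps the surviving parts of the artifact image untouched, in line with the principle of minimal intervention.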

2.7. ComfyUI Workflow

ComfyUI is a user interface tool designed for generative AI models like Stable Diffusion. It simplifies the image generation process and enhances user control [45]. It allows users to build complex workflows by integrating foundational models (checkpoints) with LoRA and ControlNet applications. This enables precise adjustments in areas such as noise control, style fine-tuning, and image restoration via the KSampler node. This control mechanism helps ensure that the final images closely align with the user’s expectations.
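A workflow of this kind can be written down as a node graph. The sketch below expresses one possible inpainting graph as a Python dictionary in the style of ComfyUI's API workflow format, with a small helper that checks the wiring. It is not the workflow used in this study. The node class names and input names follow ComfyUI's built-in nodes as commonly documented and should be checked against the installed version. All file names, prompts, and sampler settings are hypothetical placeholders.

```python
import json

# Each link is [source_node_id, output_index]. All values are placeholders.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "inpainting_checkpoint.safetensors"}},  # hypothetical
    "2": {"class_type": "LoraLoader",
          "inputs": {"model": ["1", 0], "clip": ["1", 1],
                     "lora_name": "yangshao_patterns.safetensors",       # hypothetical
                     "strength_model": 0.8, "strength_clip": 0.8}},
    "3": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["2", 1],
                     "text": "Yangshao painted pottery, Miaodigou bird motif"}},
    "4": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["2", 1], "text": "blurry, modern, text"}},
    "5": {"class_type": "LoadImage",
          "inputs": {"image": "damaged_pot.png"}},                        # hypothetical
    "6": {"class_type": "VAEEncodeForInpaint",
          "inputs": {"pixels": ["5", 0], "mask": ["5", 1],
                     "vae": ["1", 2], "grow_mask_by": 6}},
    "7": {"class_type": "KSampler",
          "inputs": {"model": ["2", 0], "positive": ["3", 0], "negative": ["4", 0],
                     "latent_image": ["6", 0], "seed": 0, "steps": 30, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "8": {"class_type": "VAEDecode",
          "inputs": {"samples": ["7", 0], "vae": ["1", 2]}},
    "9": {"class_type": "SaveImage",
          "inputs": {"images": ["8", 0], "filename_prefix": "restored"}},
}


def dangling_links(graph):
    """Return (node_id, input_name) pairs whose link points at a missing node."""
    missing = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            if isinstance(value, list) and value[0] not in graph:
                missing.append((node_id, name))
    return missing


print(dangling_links(workflow))            # an empty list means the wiring is consistent
print(len(json.dumps(workflow)), "characters of JSON")
```

In this arrangement, changing the seed on the sampler node while keeping everything else fixed would yield alternative restorations of the same masked region.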

In this study, ComfyUI offers several key advantages. Firstly, the targeted restoration capabilities of the software allow users to address damaged or missing sections of images, which is of value in the context of archaeological restoration. Furthermore, ComfyUI supports multi-model integration, such as Stable Diffusion and LoRA, allowing users to switch seamlessly based on task requirements. It also enables multi-task processing and batch generation, significantly improving the efficiency of large-scale image restoration. Due to its extensible nature, ComfyUI can continually improve with technological advancements, making it an ideal tool for image generation and restoration.

Moreover, ComfyUI provides a variety of restoration alternatives, allowing archaeologists to select the optimal result from a range of AI-generated options. The flexibility afforded by the ability to select specific restoration methods allows for a more tailored approach to each artifact.

3. Method and Materials

3.1. Methodology

This study aims to establish a comprehensive process for repairing complex patterns using LoRA (Figure 4). The methodology employed was data-driven, with the objective of capturing and generalizing the intricate characteristics of these ancient artifacts. A diverse set of representative pottery images, including examples of fish, birds, and geometric motifs from the Middle to Late Yangshao period, was assembled for use in training the model.

Using these image datasets, specific LoRA models were trained, while the Inpainting model within Stable Diffusion was employed to restore the missing sections of the pottery.

To enhance the restoration process, a ComfyUI workflow was integrated into the procedure, providing a modular and adaptable interface for controlling various aspects of image generation. This workflow permitted precise adjustments to noise control, style consistency, and image reconstruction, so that the restored images closely resembled the original artwork. The combination of LoRA and Inpainting facilitated an efficient restoration process with minimal manual intervention, thereby boosting both the speed and accuracy of the procedure.

3.2. Experiment Design

This study primarily focuses on training LoRA to reconstruct and restore specific pottery pattern styles, leading to the development of two workflow designs: one using Inpainting as the primary restoration tool, and the other combining LoRA with the existing approach. This comparison seeks to highlight LoRA’s unique contributions and advantages.

(1): The study selected ten complete images of Yangshao pottery with intact shapes and patterns. Missing parts (Manual Mask) were intentionally added using Photoshop (20.0.7), with the alterations marked in green. The modified images were then processed using two distinct restoration techniques: the image Inpainting model within Stable Diffusion and a combined approach of Inpainting and LoRA. The restored images were subsequently compared with the original ones, focusing on the differences in the reconstructed missing parts. This comparison highlights LoRA’s contributions through style transfer and fine-tuning in image reconstruction. To evaluate the effectiveness of image restoration, the study employed both quantitative and qualitative methods. Metrics including Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), mean differences, and standard deviations were utilized to assess image quality and structural consistency.
(2): Damaged pottery fragments were selected as restoration subjects. Since these pottery shards had missing pieces, the final restoration results could not be compared with original images, lacking an absolute evaluation standard; therefore, a qualitative assessment was employed. Restoration results were based on qualitative evaluations from research experts and case analyses to ensure that restored images adhered to professional standards of fidelity, logic, and artistry.
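The manual masks in step (1) are marked in green, so one plausible way to turn such an edited image into a binary inpainting mask is a colour threshold. The sketch below is an assumption about how this could be done, not the procedure used in the study: the exact green value, the `tolerance`, and the function name `mask_from_green` are all hypothetical.

```python
def mask_from_green(image, marker=(0, 255, 0), tolerance=30):
    """Return a binary mask (1 = missing) for pixels close to the marker colour.

    image is a nested list of (R, G, B) tuples with 0-255 channels.
    """
    def is_marker(pixel):
        return all(abs(c - m) <= tolerance for c, m in zip(pixel, marker))

    return [[1 if is_marker(px) else 0 for px in row] for row in image]


# A 2 x 2 toy image: two pottery-coloured pixels and two green-marked pixels.
image = [
    [(180, 90, 40), (0, 255, 0)],
    [(10, 250, 5), (200, 120, 60)],
]
mask = mask_from_green(image)
```

A tolerance is included because image editing and compression rarely leave a perfectly uniform marker colour.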

3.3. Image Data Collection

The image data utilized in this study were primarily derived from two sources: The Complete Collection of Painted Pottery Unearthed in China—Henan Volume [28] and The Complete Collection of Painted Pottery Unearthed in China—Shaanxi Volume [29]. From these sources, 100 high-resolution and representative images of Miaodigou pottery were selected, and all images were standardized to a resolution of 512 × 512 pixels. The images included both intact pottery and partially restored pottery with missing patterns, which were filled in using white plaster.

3.4. LoRA Model Training

3.4.1. Image Annotation

In this study, the LoRA model training was supported by Kohya_ss (v24.1.6), a comprehensive toolkit developed by the community for training, fine-tuning, and optimizing large-scale language models (LLMs) and diffusion models. Specifically, Kohya_ss offers a user-friendly interface and a range of features designed for fine-tuning Stable Diffusion and LoRA training. This accessibility allows users with limited GPU memory to efficiently fine-tune models [46]. In this study, 47 complete examples were selected for training the LoRA model in Kohya_ss.

To annotate these images, the study utilized the BLIP (Bootstrapping Language–Image Pre-training) Captioning tool, which is integrated within the Kohya framework. Each image was initially labeled as “A clay pot with a pattern on it”. Given that the objective of the study is to examine stylistic training, it was deemed prudent to avoid over-editing the annotation labels, thereby preventing any potential interference with the model’s capacity to accurately learn from the images. Only minimal manual adjustments were made to the annotations, with the term “Yangshao” added as a prompt to guide the model without disrupting its learning process.
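The minimal caption adjustment described above can be sketched as follows. The text does not state where in the caption “Yangshao” was placed or how caption files were stored, so prepending the term and writing one `.txt` file per image are assumptions, as are the helper names `tag_caption` and `write_captions`.

```python
from pathlib import Path
import tempfile


def tag_caption(caption, trigger="Yangshao"):
    """Prepend the trigger word unless the caption already contains it."""
    if trigger.lower() in caption.lower():
        return caption
    return f"{trigger}, {caption}"


def write_captions(image_dir, caption, trigger="Yangshao"):
    """Write one .txt caption next to every .png image; return the files written."""
    written = []
    for image_path in sorted(Path(image_dir).glob("*.png")):
        caption_path = image_path.with_suffix(".txt")
        caption_path.write_text(tag_caption(caption, trigger), encoding="utf-8")
        written.append(caption_path)
    return written


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "pot_01.png").touch()   # stand-in for a real image file
    files = write_captions(tmp, "A clay pot with a pattern on it")
    demo_caption = files[0].read_text(encoding="utf-8")
```

Keeping the BLIP caption otherwise untouched matches the stated goal of not over-editing the annotations.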

3.4.2. Training Process

The training parameters were adjusted to suit the distinctive characteristics of the pottery. The batch size was set to four, and training ran for 100 epochs, with a checkpoint saved every 10 epochs. The clip_skip parameter was set to 2 to balance detail retention against the diversity of generated images. To conserve memory, the precision was set to fp16, and the learning rate was set at 0.0001 to keep training stable and efficient.
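To give a sense of scale, the reported settings can be turned into an approximate step count. This is a back-of-the-envelope sketch: the per-image repeat count is not given in the text and is assumed to be 1 here, and the last partial batch of each epoch is assumed to count as a full step.

```python
import math

num_images = 47    # complete examples used for LoRA training (Section 3.4.1)
batch_size = 4
epochs = 100
save_every = 10    # checkpoint interval in epochs, as read from the text
repeats = 1        # ASSUMPTION: per-image repeat count is not stated

steps_per_epoch = math.ceil(num_images * repeats / batch_size)
total_steps = steps_per_epoch * epochs
checkpoint_epochs = list(range(save_every, epochs + 1, save_every))
```

Under these assumptions, each epoch takes 12 optimizer steps and the full run takes 1200; a different repeat count would scale both figures proportionally.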

3.5. Training Results and Data Analysis

The script function of Stable Diffusion’s webUI enabled the introduction of two variables, NUM and WEIGHT, to evaluate the effectiveness of the trained model. This allowed clear observation of the LoRA model’s learning process. The generated results permit a comparison of different models and weights, indicating that model 000007, with a weight in the range of 0.8 to 1.0, is most closely aligned with the actual image and yields the best outcomes (Figure 5). In this range, LoRA demonstrates an understanding of the fundamental elements of Miaodigou-Type painted pottery, including bird patterns, curved triangles, and dots. Moreover, it shows an ability to grasp the positional relationships between these elements. The relationship between loss values and training epochs was analyzed using Kohya’s TensorBoard tool, resulting in three principal loss charts (see Figure 6):
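A comparison grid over NUM and WEIGHT can be sketched as a set of prompts, one per combination. The `<lora:name:weight>` tag is the webUI prompt syntax for applying a LoRA; the checkpoint suffixes, the weight values, the base prompt, and the reading of NUM as a checkpoint suffix are illustrative assumptions rather than the study’s actual settings.

```python
from itertools import product


def lora_tag(name, num, weight):
    """Build a webUI-style LoRA prompt tag for one checkpoint and weight."""
    return f"<lora:{name}-{num}:{weight:.1f}>"


nums = ["000005", "000006", "000007", "000008"]   # hypothetical checkpoint suffixes
weights = [0.6, 0.8, 1.0]                          # hypothetical weight values
base_prompt = "A clay pot with a pattern on it, Yangshao"

# One prompt per (checkpoint, weight) cell of the comparison grid.
grid = [
    f"{base_prompt} {lora_tag('yanhshao_art', num, weight)}"
    for num, weight in product(nums, weights)
]
```

Rendering every cell with the same seed makes differences between cells attributable to the checkpoint and weight alone.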

The charts illustrate loss trends during training, with a slight increase in average loss during the initial warm-up phase (10% of training steps) followed by a significant decline from 0.1 to 0.02, indicating effective optimization. Despite fluctuations in current loss due to parameter updates, the overall trend is downward, reflecting ongoing refinement. The consistent decrease in loss per epoch aligns with the average loss, signaling successful convergence and stability.
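The warm-up phase mentioned above can be sketched as a learning-rate schedule. Only the 10% warm-up fraction and the base rate of 0.0001 come from the text; the linear ramp, the constant rate afterwards, and the 1000-step run used in the example are assumptions.

```python
def learning_rate(step, total_steps, base_lr=1e-4, warmup_fraction=0.10):
    """Linear warm-up to base_lr, then constant.

    The shape of the schedule after warm-up is an assumption; the text
    only states that warm-up covers 10% of the training steps.
    """
    warmup_steps = max(1, int(total_steps * warmup_fraction))
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr


total = 1000   # hypothetical run length
first = learning_rate(0, total)
end_of_warmup = learning_rate(99, total)
later = learning_rate(500, total)
```

Because early updates use a very small rate, the model changes little at first, which is consistent with a loss curve that stays flat or drifts upward before it begins to fall.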

3.6. Setting Up the ComfyUI Workflow

This study employed an inpainting-based workflow to restore images of Yangshao pottery (Figure 7). The process starts by selecting an appropriate inpainting model, such as “dreamshaper_8inpainting”, and loading the VAE model “vae-ft-mse-840000-ema-pruned”. The image is then loaded, and the “MaskEditor” is used to define the restoration area.

The trained LoRA model “yanhshao_art” is loaded to enhance performance, with the strength model and strength clip parameters set between 0.9 and 1.0 to balance sensitivity to image features and overall performance, ensuring the recovery of intricate details.

CLIP text encoding guides image generation. The positive prompt states: “The image depicts a clay pot with a continuous black originalist style (curved triangular pattern: 1.1), photographed in high definition, and is not broken or cracked”. Negative prompts exclude text, watermarks, low quality, and NSFW elements to avoid undesired content.
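The positive prompt uses the `(phrase: weight)` notation, which asks the text encoder to emphasize the enclosed phrase by the given factor (here 1.1). The small parser below is only an illustration of that notation, not ComfyUI’s own implementation; the regular expression and the name `extract_weights` are assumptions, and nested parentheses are not handled.

```python
import re

# Matches "(some phrase: 1.1)" without nested parentheses.
WEIGHTED = re.compile(r"\(([^():]+):\s*([0-9]*\.?[0-9]+)\)")


def extract_weights(prompt):
    """Return (phrase, weight) pairs for every '(phrase: weight)' group."""
    return [(m.group(1).strip(), float(m.group(2))) for m in WEIGHTED.finditer(prompt)]


positive = (
    "The image depicts a clay pot with a continuous black originalist style "
    "(curved triangular pattern: 1.1), photographed in high definition, "
    "and is not broken or cracked"
)
weighted_phrases = extract_weights(positive)
```

Only the curved triangular pattern is emphasized in this prompt; every other phrase keeps the default weight of 1.0.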

Preliminary testing showed that excessive prompt modifications could lead to overfitting, highlighting the need for careful moderation of input data.

To optimize restoration, a dual KSampler configuration is utilized. For KSampler (1), control_after_generate is set to randomize so that each run uses a random seed. The step count ranges from 60 to 100, with a Classifier-Free Guidance (cfg) value of 8.0. The sampler is DPM-Solver++ 2M, a second-order multistep solver (dpmpp_2m), the scheduler is Karras, and the denoising value is set to 0.92. KSampler (2) follows similar settings, with a reduced step count of 20, a cfg value of 4 to 7, and a denoising value of 0.2 to 0.3. The outcome varies by operator expertise, with some pottery pieces achieving the desired result in one pass and others requiring two to five iterations for optimal restoration.
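The two sampler passes can be written down as a pair of settings dictionaries in the style of ComfyUI’s API-format workflow. This is a partial sketch: the node links (model, positive and negative conditioning, latent image) and the seed are omitted, and where the text gives a range, a single value inside that range has been picked for illustration.

```python
first_pass = {
    "class_type": "KSampler",
    "inputs": {
        "steps": 60,                 # text gives a range of 60 to 100
        "cfg": 8.0,
        "sampler_name": "dpmpp_2m",
        "scheduler": "karras",
        "denoise": 0.92,             # near-full regeneration of the masked area
    },
}

second_pass = {
    "class_type": "KSampler",
    "inputs": {
        **first_pass["inputs"],
        "steps": 20,
        "cfg": 5.0,                  # illustrative value from the 4 to 7 range
        "denoise": 0.25,             # illustrative value from the 0.2 to 0.3 range
    },
}


def validate(node):
    """Basic sanity checks on the scalar sampler settings."""
    inputs = node["inputs"]
    if inputs["steps"] <= 0:
        raise ValueError("steps must be positive")
    if not 0.0 < inputs["denoise"] <= 1.0:
        raise ValueError("denoise must lie in (0, 1]")
    return True


both_valid = validate(first_pass) and validate(second_pass)
```

The high denoise value in the first pass lets the model invent the missing pattern, while the low value in the second pass only refines what the first pass produced.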

4. Restoration Results and Evaluation

4.1. Visual-Based Initial Evaluation

A comprehensive comparative experiment was conducted to assess the effectiveness of different restoration techniques for Yangshao pottery images. The objective was to quantify the efficacy of each method in terms of restoring image quality and maintaining pattern consistency. From the perspective of a researcher in craft and art history, the author conducted an initial subjective assessment of the generated results. Upon comparison with the original images, distinct variations in the restoration results were observed, indicating the differing impacts of each technique.

The study selected ten intact Yangshao pottery images, and manual mask sections were added using Photoshop, with the alterations highlighted in green. The modified images were then subjected to two distinct restoration techniques: Inpainting alone and the combination of LoRA and Inpainting. Subsequently, the restored images were compared to the original ones, with a particular focus on the differences in the restored missing sections (Figure 8).

Set 1 entailed the occlusion of the right side of a pottery bowl, encompassing a single bird and a curved line. The outcome of the Inpainting technique was somewhat disorganized, with the addition of two supplementary circular dots and an incomplete restoration of the bird’s structure. In contrast, the LoRA-enhanced Inpainting method proved effective in restoring the bird’s distinctive features with minimal structural discrepancies. Set 2 occluded the typical Dot + Arc + pair of birds’ pattern and part of the curved triangular pattern. The conventional Inpainting resulted in the blurring of the distinguishing dot between the birds and the introduction of ambiguity regarding the triangular pattern. Although LoRA-enhanced Inpainting exhibited minor contour discrepancies and disproportion in the avian figures, it effectively restored the dot and preserved the fundamental structure. Set 3 examined the restoration of a curved triangular pattern and a dot. The Inpainting was unable to fully restore the triangular pattern and misplaced the dot. In contrast, the LoRA-enhanced Inpainting approach resulted in clearer edges, a more complete structure, and accurate dot placement. Set 4 occluded the Dot + Arc + pair of birds’ pattern and part of the pottery body. Both techniques yielded similar results, but the LoRA-enhanced Inpainting achieved a more natural restoration of the pottery surface.

In Set 5, the restoration that relied solely on Inpainting failed to recover the central dot in the double-bird motif, whereas incorporating LoRA resulted in a more accurate restoration. Set 6, which focused on the triangular arc shape of a single bird, revealed Inpainting’s difficulty in capturing the design, leading to deformation and misrepresentation. Through the deep modeling of multiple arc patterns, LoRA successfully generated shapes closely resembling the original image. In Set 7, the advantages of LoRA became even more apparent. Inpainting introduced inconsistencies when reshaping the missing sections, creating decorative elements reminiscent of modern deconstructivism and cubism. In contrast, LoRA effectively reconstructed the arc triangle motif, crucial for achieving high fidelity to the Yangshao–Miaodigou type pottery style.

In Sets 8 and 9, Inpainting demonstrated limitations in reconstructing complex patterns, especially in its control and placement of dots, resulting in noticeable deviations. Since precise dot placement is essential, LoRA’s fine-tuning capabilities allowed a closer alignment with the original appearance after multiple iterations. In Set 10, both techniques achieved a certain level of restoration; however, LoRA showed minor flaws in the smoothness of arcs.

Overall, LoRA’s restoration technique demonstrated significant advantages in maintaining structural integrity, edge clarity, and the accuracy of complex patterns. When dealing with intricate details and arc elements, LoRA achieved more precise and natural reconstructions, reducing blurriness and distortion. However, minor imperfections in specific areas, such as arc smoothness, persisted. As an auxiliary restoration approach, LoRA requires the expertise of operators and researchers to correct these deviations.

4.2. Quantitative Evaluation

PSNR, SSIM, Mean Difference, and Standard Deviation of Difference are four key metrics used to evaluate the quality of restored images. In this study, these metrics were applied to compare two restoration methods (Inpainting alone and LoRA combined with Inpainting) against the original images to assess the degree of restoration achieved.

PSNR measures pixel-level differences, reflecting overall image quality, while SSIM focuses on structural integrity and perceptual quality, indicating visual similarity. Mean Difference evaluates deviations in brightness and color between the restored and original images, and Standard Deviation of Difference gauges the stability and consistency of local variations in the restored images.
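The four metrics can be sketched in plain Python for small grayscale images. Several choices here are assumptions, since the text does not define them precisely: differences are taken as absolute values, the standard deviation is the population form, and SSIM is computed over the whole image as a single window rather than with the usual sliding window, so its values will differ from those of standard library implementations.

```python
import math


def _flatten(image):
    return [float(v) for row in image for v in row]


def psnr(original, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    a, b = _flatten(original), _flatten(restored)
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return math.inf if mse == 0 else 10.0 * math.log10(max_value ** 2 / mse)


def difference_stats(original, restored):
    """Mean and population standard deviation of absolute pixel differences."""
    diffs = [abs(x - y) for x, y in zip(_flatten(original), _flatten(restored))]
    mean = sum(diffs) / len(diffs)
    std = math.sqrt(sum((d - mean) ** 2 for d in diffs) / len(diffs))
    return mean, std


def global_ssim(original, restored, max_value=255.0):
    """SSIM over the whole image as one window (a simplification)."""
    a, b = _flatten(original), _flatten(restored)
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    c1, c2 = (0.01 * max_value) ** 2, (0.03 * max_value) ** 2
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )


reference = [[0, 50], [100, 150]]
shifted = [[10, 60], [110, 160]]   # every pixel brighter by 10
```

For the uniformly brightened example, the mean difference is 10 and its standard deviation is 0, which shows how the two difference statistics separate a global brightness shift from locally inconsistent errors.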

By integrating these metrics, the restoration effects of Inpainting and LoRA + Inpainting can be comprehensively assessed across multiple dimensions, including pixel-level details, structural integrity, global characteristics, and local consistency.

4.3. Quantitative Results Analysis

The results demonstrate that the LoRA + Inpainting method exhibited a notable superiority over the conventional Inpainting technique (see Figure 9 and Figure 10). A visual comparison of the restored images revealed that the LoRA-enhanced approach demonstrated superior detail restoration and pattern continuity, particularly for complex designs. The PSNR and SSIM scores provided further support for these findings. To illustrate, in the first set, the PSNR rose from 30.88 (Inpainting only) to 32.76 (LoRA + Inpainting), while the SSIM increased from 0.934 to 0.941. This evidence supports the conclusion that the LoRA-based method yields superior overall image quality while preserving structural integrity.
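The PSNR figures quoted above can be read in terms of mean squared error. Because PSNR equals 10·log10(MAX²/MSE), a gain of Δ dB at the same peak value corresponds to an MSE ratio of 10^(−Δ/10). The short calculation below applies this to the two reported values; it assumes both were computed with the same peak value.

```python
psnr_inpainting = 30.88   # dB, Inpainting only (reported above)
psnr_lora = 32.76         # dB, LoRA + Inpainting (reported above)

gain_db = psnr_lora - psnr_inpainting
mse_ratio = 10 ** (-gain_db / 10)    # MSE(LoRA + Inpainting) / MSE(Inpainting)
mse_reduction = 1 - mse_ratio
```

A gain of about 1.88 dB therefore corresponds to roughly a 35% lower mean squared error against the original image.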

Mean Difference quantifies the average pixel value difference between two images; smaller values indicate fewer discrepancies. In specific sets, LoRA combined with Inpainting shows a marginally higher Mean Difference than Inpainting only. For example, in Set 4, the Mean Difference for LoRA + Inpainting is 0.669, while Inpainting shows a value of 0.651. This variation may result from LoRA prioritizing structural optimization over pixel-level precision. Conversely, in other sets, such as Set 10, the Mean Difference for LoRA + Inpainting is significantly lower, demonstrating its effectiveness in restoring pixel details. Although Mean Difference captures average pixel-level deviations, it does not fully represent the overall structural similarity between images.

The Standard Deviation of Difference measures fluctuations in pixel differences within localized areas of restored images, with smaller values indicating improved local consistency. The table indicates that LoRA combined with Inpainting outperformed Inpainting only in specific datasets, such as Set 1 and Set 8, though the overall differences are relatively minor. For instance, in Set 8, the Standard Deviation for LoRA + Inpainting is 2.019, which is slightly lower than Inpainting’s 2.041, indicating a minor advantage in local consistency with the LoRA-enhanced method. Nevertheless, in some sets, Inpainting performs better, successfully achieving good structural and shape restoration despite minor differences in pixel values, including color or brightness. While these variations may not be visually noticeable, they become more pronounced when calculating the Standard Deviation of Difference.

These metrics collectively evaluate pixel-level accuracy (PSNR) and visual structural similarity (SSIM), while also reflecting the performance of restored images concerning brightness and color consistency (Mean Difference) and local stability (Standard Deviation of Difference). The LoRA-enhanced restoration technique shows substantial improvements in overall restoration quality and exhibits considerable potential in managing details while maintaining structural integrity.

4.4. Comparison of Graphical Effects with OpenCV TELEA and DeepFill v2

This section evaluates the performance of alternative restoration methods for Yangshao pottery images, focusing on two approaches: OpenCV TELEA and DeepFill v2 (GAN-based). OpenCV TELEA is a well-established image Inpainting algorithm that relies on neighborhood information, making it particularly effective for simple damage restoration in scenes with minimal detail. In contrast, DeepFill v2 is a contemporary restoration method that utilizes GANs and excels at addressing complex image damage, particularly in extensive areas of loss and detail recovery. Consequently, the choice of these two methods facilitates a comprehensive comparison between traditional algorithms and advanced deep-learning technologies in image restoration.

The experiment performed a comparative analysis using five sets of Yangshao pottery image data. These datasets include damaged images, original images, and the mask images used during the restoration process. Each set of images was repaired using the OpenCV TELEA and DeepFill methods (Figure 11). The experiment documented performance metrics, including repair time for each method, GPU utilization, and memory consumption. After the repairs, the PSNR and SSIM for each method were calculated to quantify restoration effectiveness. The Mean Difference and Standard Deviation of Difference were also used to analyze pixel-level differences between the restored and original images (Figure 12a). Together, these data allow a comprehensive comparison of the strengths and weaknesses of the two methods in restoring Yangshao pottery images.

In terms of restoration quality, the combination of Stable Diffusion, LoRA, and Inpainting clearly outperforms the other techniques, as shown in Figure 12b. OpenCV TELEA produces excessively blurry edges and filled content, which compromises its effectiveness in pottery restoration. DeepFill, while capable of predicting the structure and outer contours of patterns, delivers subpar restoration quality, resulting in a noticeable gap when compared to the original image. Both OpenCV TELEA and DeepFill are suitable for general image restoration tasks, such as enhancing photo clarity, removing watermarks, and eliminating unwanted objects. In these contexts, balancing restoration effectiveness with repair time becomes a key evaluation criterion. However, in the specialized field of cultural heritage image restoration, substantial differences in repair time should not be prioritized. The primary focus should be on the accuracy and reliability of the restoration (as indicated by metrics like SSIM), as well as the accuracy of the image’s structure and trends.

4.5. Qualitative Evaluation

In this study, we selected ten images of naturally damaged pottery for processing using two distinct restoration techniques: the image Inpainting model in Stable Diffusion and a combined approach of Inpainting with LoRA. The outcomes of both methods will be compared for qualitative evaluation. Given the significant missing fragments in these pottery pieces, fully restoring their original state is unfeasible; furthermore, the incompleteness of the original images or artifacts complicates the establishment of clear comparative standards. This situation renders quantitative assessment impractical; consequently, we have adopted qualitative evaluation methods for a more comprehensive analysis of the restoration effects. The evaluation process incorporates the following elements:

(1): Expert Qualitative Assessment

Expert qualitative assessment plays a vital role in the evaluation process. This study invited experts from archaeology, artifact restoration, pottery craft, and art history to review and evaluate the restoration outcomes from a professional standpoint. These specialists, leveraging their extensive experience and profound understanding of cultural heritage, conducted a visual examination of the restored images. Their primary focus was on the fidelity of the restored areas, color coordination, and overall consistency of the image style. The qualitative feedback from the specialists emphasized not only the technical quality of the restoration but also the alignment of the image with its historical context, ensuring that the restoration effects align with the characteristics and artistic style of the cultural period to which the pottery belongs.

(2): Case Analysis in Conjunction with Historical Materials

Case analysis and artistic evaluation represent another crucial aspect of this qualitative assessment. We compared the restored images with pottery and other artifacts from the same cultural period and similar styles to evaluate the rationale and artistry of the restoration. This process entails a thorough examination of the pottery’s patterns, colors, and shapes, ensuring that the restored sections visually integrate with the undamaged areas. By referencing artifacts from the same period, we infer and reconstruct the missing portions of the image, ensuring that the restoration results embody both historical authenticity and artistic expression consistent with the unique style of the cultural heritage.

Qualitative evaluation guarantees the fidelity of restored images and validates the rationale and effectiveness of the restoration work through expert judgment and comparative analysis of artifact cases, even without original standards. Furthermore, the restored images adhere to professional standards of logic and artistry, avoiding “over-restoration” or elements inconsistent with historical facts, thus ensuring high academic and aesthetic consistency. By integrating expert feedback with case analysis, the evaluation provides a robust foundation for assessing the restoration effects of pottery images. Even when quantitative assessment is not fully applicable, it guarantees the scientific integrity and artistic value of the restoration work.

The experts conducted an evaluation of the restored images based on four primary criteria: edge clarity, color fidelity, pattern consistency, and structural integrity. Each criterion was evaluated on a scale of 1 (very poor) to 5 (excellent), thereby ensuring a standardized and objective assessment.

Edge Clarity: This criterion assesses whether the edges of elements in the restored image are clear and sharp, avoiding blurriness or distortion. Edge clarity is vital for retaining image details and directly impacts the visual consistency between restored and original areas.

Color Fidelity: This criterion evaluates whether the colors in the restored areas match those in the original parts. Accurate color representation is essential in artifact restoration. Experts verify color fidelity by comparing the restored colors with original parts or reference literature.

Pattern Consistency: This criterion examines whether the patterns in the restored areas align with the original design style. For pottery artifacts with intricate patterns, coherence and symmetry are crucial indicators of restoration quality.

Structural Integrity: This criterion evaluates whether the restored image maintains overall structural consistency with the original. It assesses whether the restored areas conform to the object’s overall structure, avoiding deformation or inconsistencies.
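The four-criterion scoring scheme can be summarized per image by averaging the experts’ ratings. The text does not state how individual ratings were aggregated, so averaging is an assumption, and the three sets of ratings below are made-up values for illustration only, not data from the study.

```python
from statistics import mean

CRITERIA = (
    "edge_clarity",
    "color_fidelity",
    "pattern_consistency",
    "structural_integrity",
)


def summarize(ratings):
    """Average 1-5 ratings per criterion across experts, after range checks."""
    for rating in ratings:
        for criterion in CRITERIA:
            if not 1 <= rating[criterion] <= 5:
                raise ValueError(f"{criterion} must be between 1 and 5")
    return {c: mean(r[c] for r in ratings) for c in CRITERIA}


# Made-up ratings from three hypothetical experts for one restored image.
ratings = [
    {"edge_clarity": 4, "color_fidelity": 5, "pattern_consistency": 3, "structural_integrity": 5},
    {"edge_clarity": 5, "color_fidelity": 5, "pattern_consistency": 2, "structural_integrity": 4},
    {"edge_clarity": 4, "color_fidelity": 5, "pattern_consistency": 4, "structural_integrity": 4},
]
summary = summarize(ratings)
```

Reporting the spread of ratings alongside the mean would additionally show where experts disagree, which matters most for criteria such as pattern consistency.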

4.6. Qualitative Results and Analysis

According to the experts’ evaluations of the restored images, the performance of the LoRA + Inpaint technique demonstrates notable variations across the four primary criteria (Figure 13). The scores for edge clarity predominantly fall between 4 and 5, indicating that the edge details of the restored images were largely recovered, although a few instances yielded slightly lower scores. Color fidelity was particularly impressive, with nearly all images achieving a score of 5, showcasing that this technique is highly accurate in color restoration and maintains consistency with the original images. The scores for pattern consistency displayed considerable fluctuations, with some images earning only 2–3 points, indicating that in certain instances, the restoration technique did not fully preserve or accurately generate the details and symmetry of the original patterns. Conversely, other images received scores of 5, reflecting successful restoration outcomes.

The structural integrity scores were consistently high, generally ranging from 4 to 5, indicating that the restored images maintain a coherent and intact overall structure (Figure 14). In summary, the LoRA + Inpaint technique excels in color fidelity and structural integrity, yet there is variability in edge clarity and pattern consistency, particularly in areas with complex or severely damaged patterns that require further optimization.

4.7. Case Study

In this section, the G8-2 pottery from Gaoling County, Xi’an, Shaanxi Province [29], was selected for a detailed restoration and style reconstruction analysis, as illustrated in Figure 15. By comparing the effects of different restoration techniques on Yangshao period pottery, the varying outcomes of such treatments were highlighted.

The G8-2 pottery exemplifies the distinctive features of the Miaodigou type within the Yangshao culture, showcasing an elegant shape and distinct decorative patterns, despite having considerable areas of loss and damage. The damaged areas have been preliminarily identified as featuring a triangular arc bird motif. Consistent with prior restoration methods, the damaged areas were selected using a mask, and restoration was performed using two techniques: Inpainting and Inpainting combined with LoRA. The results indicate that relying solely on Inpainting led to noticeable errors and distortions in restoring the original patterns and textures of the pottery, failing to reproduce the primary missing element—the triangular arc motif. Instead, it merely extended the existing line patterns.

In comparison, the restoration methodology that integrated LoRA technology markedly enhanced the quality of the restoration. The restored pottery more closely resembled its original state, particularly in the highlighted red box, where the triangular arc motif in the single bird pattern exhibited a high degree of detail preservation. Comparisons with other excavated pottery, including W1 (Gaoling County, Xi’an, Shaanxi), H776 (Gaoling County, Xi’an, Shaanxi), and H86 (Quanhu Village, Weinan, Shaanxi), are highly relevant [29]. Notably, W1 and H776, originating from the same district, serve as valuable comparative references. These samples also feature single bird motifs with distinctly pronounced triangular arc patterns. The resemblance between the missing sections of G8-2 and those of W1 and H776 strengthens the reliability of this restoration, further supporting LoRA’s effectiveness in managing complex archaeological restoration tasks.

5. Discussion: The Role of AI in Archaeological Restoration

5.1. Challenges and Limitations

While the experimental results presented in this study show significant improvements in efficiency and high-precision generation capabilities through LoRA and Stable Diffusion techniques for restoring painted pottery patterns from the Yangshao culture, AI still faces limitations in handling certain complex cultural heritage restoration tasks. AI technology relies on pre-trained models and extensive datasets for pattern learning, enabling effective restoration of repetitive patterns. However, in more complex historical contexts or culturally sensitive areas, AI may lack the nuanced judgment and cultural interpretation that human experts possess.

AI relies on input data and predefined patterns, lacking the ability to perceive subtle cultural nuances. Despite advancements in contextual understanding within Natural Language Processing (NLP), image generation models such as Stable Diffusion still lack true awareness of cultural context. Specifically, deep learning-based image generation models such as Stable Diffusion, including LoRA-adapted variants, analyze extensive datasets of existing images to identify and generate similar features. However, in cultural heritage restoration, these technologies reveal limitations, particularly regarding data bias, insufficient contextual awareness, and sensitivity to details.

The effectiveness of AI models relies on the quality and diversity of the training data. If the training data are inadequate or lack representation of specific cultural or artistic styles, the generated images may deviate from the original style. In heritage restoration, overfitting may result in overly standardized or repetitive patterns in AI-generated imagery, failing to capture the handcrafted details, gradients, and irregularities. Most generative models, including GANs and Vector Quantized-Variational AutoEncoder 2 (VQ-VAE-2), primarily focus on pixel-level feature processing and lack an understanding of higher-level semantic symbols. AI lacks comprehension of cultural contexts and historical backgrounds; it can replicate geometric shapes and colors but struggles to grasp the symbolic meanings and cultural significance of certain motifs. This limitation may lead to the misinterpretation of patterns with religious or ritual significance as mere decorative designs, failing to convey their deeper cultural implications.

AI image generation models, including GANs and diffusion models, introduce randomness into the image generation process to diversify results and avoid strict replication of training data. However, AI lacks the sensitivity inherent to human restorers. While it can accurately generate geometric structures, it struggles with subtle transitions in color gradients, resulting in stiffness in areas where fine tonal transitions are prevalent in Yangshao culture. This limitation highlights AI’s inadequacy in perceiving cultural and historical details.

When it comes to protecting cultural heritage, it is crucial to recognize the ethical concerns and limitations of AI technology. Certain subjective decisions require human expertise and sensitivity to preserve the authenticity and integrity of the artwork. This highlights the indispensable role of human experts in the evaluation and restoration processes, as they can address AI’s shortcomings in understanding cultural context and artistic expression, ensuring that restoration outcomes remain true to historical and cultural authenticity. To address these challenges, methods such as Stable Diffusion, LoRA, and ControlNet can be employed for optimization, effectively combining hand-drawn restoration with AI-based reconstruction. This integrated approach helps prevent excessive overfitting in AI-generated images and enhances the flexibility and precision of the restoration process.
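One simple way to picture how hand-drawn work and AI-based reconstruction can be combined is mask-based compositing: AI output is accepted only inside a region the restorer has marked, and every other pixel is taken from the original image. The sketch below is illustrative only and is not the ComfyUI workflow used in this study. The function name `composite_restoration`, the flat pixel lists, and the soft-mask convention (0 keeps the original, 1 takes the generated value, intermediate values blend the two) are all assumptions made for this example.

```python
def composite_restoration(original, generated, mask):
    """Blend generated pixels into the original, weighted by a restorer-drawn mask.

    All three arguments are flat sequences of equal length. Mask values are
    assumed to lie in [0, 1]: 0 keeps the original pixel, 1 takes the
    generated pixel, and values in between blend the two linearly.
    """
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("original, generated, and mask must have the same length")

    blended = []
    for o, g, m in zip(original, generated, mask):
        if not 0.0 <= m <= 1.0:
            raise ValueError("mask values must lie in [0, 1]")
        blended.append(m * g + (1.0 - m) * o)
    return blended


# Hypothetical 4-pixel grayscale strip: only the two middle pixels are edited.
original = [10, 20, 30, 40]
generated = [200, 210, 220, 230]
mask = [0.0, 0.5, 1.0, 0.0]
print(composite_restoration(original, generated, mask))  # [10.0, 115.0, 220.0, 40.0]
```

Because pixels outside the mask are copied unchanged, the restorer decides exactly where generated content may appear, and a soft mask edge gives a gradual transition rather than a hard seam.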

5.2. Enhancing Efficiency and Well-Being Through AI

Numerous studies have shown that AI technology provides significant support to archaeologists and restorers, reducing their workload and improving overall efficiency. For instance, “AI can help museums manage collections more effectively, enabling archaeologists to identify, classify, and catalog artifacts more quickly” [23]. By automating large-scale restoration tasks, restorers can reduce the need for intensive manual operations, thereby lowering the risk of fatigue and errors, minimizing occupational burnout, and improving job satisfaction and mental health.

What sets this study apart is its dual perspective: it is approached from the standpoints of both an art historian and a creator of painted pottery. This dual identity provides a profound understanding of the potential benefits of AI in restoration, particularly in enhancing work efficiency and the quality of artistic creation. As an interdisciplinary practice grounded in pottery history and theory as well as the practical needs of the field, the study allowed the author to experience the advantages of this workflow directly in terms of time savings, improved accuracy, provision of diverse samples, and even enhanced creative inspiration.

For art historians and cultural heritage restorers, the work goes beyond routine tasks and requires creativity and critical thinking. AI-based restoration is not only a technological advancement but also a means to support and inspire restorers, promoting the ongoing preservation and innovation of cultural heritage.

6. Conclusions

The integration of LoRA with Stable Diffusion represents a notable advancement in the restoration of Yangshao pottery, addressing the limitations inherent in traditional methods. This research demonstrates that automating complex pattern restoration can enhance both efficiency and accuracy, allowing for the preservation of the unique characteristics of these ancient artifacts. The LoRA-enhanced workflow contributes to a better understanding of original artistic styles, facilitating the generation of high-quality images that accurately reflect the intricate details of the pottery. Comparative experiments indicate a clear superiority of the LoRA + Stable Diffusion approach over conventional techniques, particularly in terms of edge clarity, structural integrity, and fidelity to original designs. Metrics such as PSNR and SSIM confirm that the restored images exhibit significantly improved quality compared to those achieved through previous Inpainting methods. Additionally, qualitative assessments from experts highlight the enhanced visual appeal and authenticity of the restored artifacts, underscoring the role of technological integration in archaeological restoration. Building on these findings, the contributions and significance of this paper are as follows:

(1): Development of a New Workflow for Pottery Pattern Generation and Restoration: This study presents an innovative restoration workflow based on generative AI technology. This method meets the complex restoration requirements of pottery surface patterns, offering an efficient solution for cultural heritage preservation. Additionally, this research presents a highly efficient, user-friendly method for pattern restoration, specifically designed to facilitate archaeological restoration, enrich art history research, and engage cultural heritage enthusiasts. This method can serve as a supplementary approach to traditional restoration techniques.
(2): Training the Yangshao Painted Pottery Style LoRA Model: This study successfully trained a LoRA-based generation tool by collecting and analyzing a large dataset of pottery images and patterns, enabling the production of pottery patterns that closely align with the original artistic style. This tool integrates seamlessly into the existing Inpainting workflow, enhancing restoration accuracy while ensuring both artistic style and historical authenticity.
(3): Validation of the ComfyUI Workflow’s Effectiveness: The reliability and accuracy of the ComfyUI workflow in restoring missing pottery sections were validated through comparative experiments and qualitative evaluations. The workflow minimizes uncertainty in manual operations while enhancing the efficiency and flexibility of archaeologists throughout the restoration process.
(4): Expansion of Image Restoration Technology’s Applications: This restoration method extends beyond cultural heritage preservation and can be widely applied to low-cost image restoration scenarios. This approach offers research institutions and museums an easy-to-use and efficient restoration tool, facilitating the widespread adoption and digital preservation of cultural heritage.
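The PSNR comparison cited in these conclusions can be illustrated with a short sketch. The code below computes PSNR with its standard definition, 10·log10(MAX² / MSE), together with the mean and standard deviation of per-pixel differences. It is a minimal example under stated assumptions, not the evaluation script used in this study: the function names, the flat 8-bit pixel lists, and the use of absolute differences for the difference statistics are assumptions. SSIM is omitted because it requires windowed local statistics.

```python
import math
from statistics import mean, pstdev


def psnr(original, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(restored):
        raise ValueError("images must contain the same number of pixels")
    mse = mean((a - b) ** 2 for a, b in zip(original, restored))
    if mse == 0:
        return math.inf  # identical images: no error to measure
    return 10.0 * math.log10(max_value ** 2 / mse)


def difference_stats(original, restored):
    """Mean and population standard deviation of absolute per-pixel differences.

    Using absolute (rather than signed) differences is an assumption here.
    """
    if len(original) != len(restored):
        raise ValueError("images must contain the same number of pixels")
    diffs = [abs(a - b) for a, b in zip(original, restored)]
    return mean(diffs), pstdev(diffs)


# Hypothetical 4-pixel grayscale example: a higher PSNR means the restored
# image is closer to the reference.
reference = [0, 128, 255, 64]
restored = [0, 130, 250, 64]
print(round(psnr(reference, restored), 2))
print(difference_stats(reference, restored))
```

In practice an array library would be used on full images; plain lists are used here only to keep the example dependency-free.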

Despite these advancements, this study acknowledges ongoing challenges associated with AI-assisted applications in cultural heritage. The reliance on extensive and diverse training datasets is crucial for optimal model performance, suggesting that future research should focus on expanding these datasets to encompass a wider range of styles and periods. While automated processes can reduce the need for manual intervention, the expertise of human restorers remains essential for evaluating and ensuring the cultural and historical accuracy of the restored images.

Looking forward, the implications of this research go beyond the restoration of Yangshao pottery. The methodologies developed can be adapted for other cultural artifacts, facilitating a more systematic application of AI technologies in heritage preservation. As the field evolves, it is crucial to balance the efficiency of AI-driven methods with the nuanced understanding offered by human expertise, fostering a collaborative approach to restoration.

In summary, this study highlights the effectiveness of LoRA and SD models in restoring Yangshao painted pottery. Future research should prioritize the scalability of these methods across diverse cultural contexts, emphasizing the need for interdisciplinary collaboration between technology and heritage studies. By refining these restoration techniques, we can better protect and promote cultural heritage, enabling a wider audience to engage with and connect to these artifacts.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/heritage7110295/s1, Restoration of manually masked images using Inpainting and LoRA + Inpainting methods.

Funding

This research received no external funding.

Data Availability Statement

The data for this study is available at https://zenodo.org/records/14026630 (accessed on 29 October 2024).

Conflicts of Interest

The author declares no conflict of interest.

References

1. Marie, I.; Qasrawi, H. Virtual assembly of pottery fragments using moiré surface profile measurements. J. Archaeol. Sci. 2005, 32, 1527–1533.
2. Aoulalay, A.; El Makhfi, N.; Abounaima, M.C.; Massar, M. Classification of Moroccan decorative patterns based on machine learning algorithms. In Proceedings of the 2020 IEEE 2nd International Conference on Electronics, Control, Optimization and Computer Science (ICECOCS), Kenitra, Morocco, 2–3 December 2020; pp. 1–7.
3. Chetouani, A.; Treuillet, S.; Exbrayat, M.; Jesset, S. Classification of engraved pottery sherds mixing deep-learning features by compact bilinear pooling. Pattern Recognit. Lett. 2020, 131, 1–7.
4. Cardarelli, L. A deep variational convolutional autoencoder for unsupervised features extraction of ceramic profiles: A case study from central Italy. J. Archaeol. Sci. 2022, 144, 105640.
5. Kuntitan, P.; Chaowalit, O. Using deep learning for the image recognition of motifs on the Center of Sukhothai Ceramics. Curr. Appl. Sci. Technol. 2022, 22, 2.
6. Argyrou, A.; Agapiou, A.; Papakonstantinou, A.; Alexakis, D.D. Comparison of machine learning pixel-based classifiers for detecting archaeological ceramics. Drones 2023, 7, 578.
7. Spennemann, D.H.R. Generative artificial intelligence, human agency and the future of cultural heritage. Heritage 2024, 7, 3597–3609.
8. Ma, J.; Peng, Y.; Cheng, W.; Qiu, M.; Nie, Y. Identification method of ancient ceramics revision. In Proceedings of the 2021 8th IEEE International Conference on Cyber Security and Cloud Computing (CSCloud)/2021 7th IEEE International Conference on Edge Computing and Scalable Cloud (EdgeCom), Washington, DC, USA, 26–28 June 2021; pp. 213–218.
9. Liu, Q. Technological innovation in the recognition process of Yaozhou Kiln ware patterns based on image classification. Soft Comput. 2023, 1–10.
10. Ling, Z.; Delnevo, G.; Salomoni, P.; Mirri, S. Findings on machine learning for identification of archaeological ceramics: A systematic literature review. IEEE Access 2024, 12, 100167–100185.
11. Bickler, S.H. Machine learning identification and classification of historic ceramics. Archaeology 2018, 20, 20–32.
12. Sizyakin, R.; Voronin, V.; Pižurica, A. Virtual restoration of paintings based on deep learning. In Proceedings of the Fourteenth International Conference on Machine Vision (ICMV 2021), Rome, Italy, 8–12 November 2021; pp. 422–432.
13. Jiang, D.; Li, P.; Xie, H. Research into digital oil painting restoration algorithm based on image acquisition technology. In Proceedings of the 2022 International Conference on 3D Immersion, Interaction and Multi-sensory Experiences (ICDIIME) 2022, Madrid, Spain, 27–29 June 2022; pp. 65–68.
14. Guillemot, C.; Le Meur, O. Image inpainting: Overview and recent advances. IEEE Signal Process. Mag. 2013, 31, 127–144.
15. Conde, J.; González, M.; Martínez, G.; Moral, F.; Merino-Gómez, E.; Reviriego, P. How stable is stable diffusion under recursive inpainting (RIP)? arXiv 2024, arXiv:2407.09549.
16. Corneanu, C.; Gadde, R.; Martinez, A.M. Latentpaint: Image inpainting in latent space with diffusion models. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision 2024, Waikoloa, HI, USA, 3–8 January 2024; pp. 4334–4343.
17. Hu, E.J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; Chen, W. Lora: Low-rank adaptation of large language models. arXiv 2021, arXiv:2106.09685.
18. Hartley, Z.K.J.; Lind, R.J.; Pound, M.P.; French, A.P. Domain targeted synthetic plant style transfer using stable diffusion LoRA and ControlNet. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024, Seattle, WA, USA, 16–22 June 2024; pp. 5375–5383.
19. Levin, A.O.; Belov, Y.S. A study on the application of using hypernetwork and low-rank adaptation for text-to-image generation based on diffusion models. In Proceedings of the 2024 6th International Youth Conference on Radio Electronics, Electrical and Power Engineering (REEPE), Moscow, Russia, 29 February–2 March 2024; pp. 1–5.
20. Fareed, M.W.; Bou Nassif, A.; Nofal, E. Exploring the potentials of artificial intelligence image generators for educating the history of architecture. Heritage 2024, 7, 1727–1753.
21. Brandi, C. Teoria del Restauro; Ed. di Storia e Letteratura: Roma, Italy, 1963.
22. Ghaith, K. AI integration in cultural heritage conservation–Ethical considerations and the human imperative. Int. J. Emerg. Disruptive Innov. Educ. VISIONARIUM 2024, 2, 6.
23. Pasikowska-Schnass, M.; Lim, Y.S. Artificial Intelligence in the Context of Cultural Heritage and Museums: Complex Challenges and New Opportunities; Technical Report PE 747.120; European Parliamentary Research Service: Brussels, Belgium, 2023.
24. Lyu, Z.; Li, Z.; Wu, Z. Research on image-to-image generation and optimization methods based on diffusion model compared with traditional methods: Taking façade as the optimization object. In Proceedings of the International Conference on Computational Design and Robotic Fabrication, Shanghai, China, 24 July 2023; pp. 35–50.
25. Hou, Y.; Peng, H.; Liu, Y. Digital Inheritance of Intangible Cultural Heritage Based on the LoRA Model: A Case Study of Chu Lacquerware. Des. Art Study 2024, 14, 14–18.
26. Xu, S.; Zhang, J.; Li, Y. Knowledge-driven and diffusion model-based methods for generating historical building facades: A case study of traditional Minnan residences in China. Information 2024, 15, 344.
27. Zhang, P. The Atlas of Chinese Painted Pottery; Cultural Relics Publishing House: Beijing, China, 1990; pp. 184–186.
28. Liu, H.; Ma, X.; Gu, W. The Complete Collection of Painted Pottery Unearthed in China (Henan Volume); Science Press & Longmen Bookstore: Beijing, China, 2021.
29. Wang, W.; Sun, Z. The Complete Collection of Painted Pottery Unearthed in China (Shaanxi Volume); Science Press & Longmen Bookstore: Beijing, China, 2021.
30. Li, X. Fish and bird combination images on Miaodigou-type painted pottery from the Yangshao Culture. Archaeology 2021, 8, 71–81.
31. Jin, X. Interpretation and study of bird patterns on Miaodigou-type painted pottery. Huaxia Archaeol. 2023, 6, 70–82.
32. Li, J. The Protection and Restoration of Cultural Heritage: A Comparative Study of Theoretical Models. Art Stud. 2006, 2, 102–117.
33. Yu, X.; Chen, D.; Liu, Y. A Preliminary Study on Digital Restoration Measures of Textiles based on the Concept of "Restore the Old as Original": A Case Study on Plain Weave. Art Des. (Theory) 2024, 2, 89–92.
34. Zhao, L.; Chen, H.; Zhao, H.; Dong, J.; Li, Q. A Scientific Research of the Painted Potteries of the Yangshao Culture from the Miao-Di-Gou Site. Spectrosc. Spectr. Anal. 2018, 38, 1420–1429.
35. Wu, J. Application of nondestructive testing analysis technology in the study of painted pottery. Identif. Apprec. Cult. Relics 2022, 15, 56–59.
36. Nichol, A.; Dhariwal, P.; Ramesh, A.; Shyam, P.; Mishkin, P.; McGrew, B.; Sutskever, I.; Chen, M. Glide: Towards photorealistic image generation and editing with text-guided diffusion models. arXiv 2021, arXiv:2112.10741.
37. Cioni, D.; Berlincioni, L.; Becattini, F.; Del Bimbo, A. Diffusion-based augmentation for captioning and retrieval in cultural heritage. In Proceedings of the IEEE/CVF International Conference on Computer Vision 2023, Paris, France, 2–3 October 2023; pp. 1707–1716.
38. Yıldırım, E. Text to image artificial intelligence in a basic design studio: Spatialization from novel. In Proceedings of the 4th International Scientific Research and Innovation Congress, Istanbul, Turkey, 30–31 July 2022; pp. 24–25.
39. Sun, L.; Wu, R.; Zhang, Z.; Yong, H.; Zhang, L. Improving the stability of diffusion models for content consistent super-resolution. arXiv 2023, arXiv:2401.00877.
40. Ma, S.; Xu, H.; Li, C.; Geng, W.; Shen, H.; Li, M. Painting style simulation method based on fine-tuning paradigm for large models. Comput. Appl. 2024, 44, 268–272.
41. Biderman, D.; Ortiz, J.G.; Portes, J.; Paul, M.; Greengard, P.; Jennings, C.; King, D.; Havens, S.; Chiley, V.; Frankle, J. Lora learns less and forgets less. arXiv 2024, arXiv:2405.09673.
42. Ćulafić, I.; Šćekić, Z.; Dejan, B.; Popović, T.; Jovović, I. Output manipulation via LoRA for generative AI. In Proceedings of the 2024 23rd International Symposium INFOTEH-JAHORINA (INFOTEH), East Sarajevo, Bosnia and Herzegovina, 20–22 March 2024; pp. 1–4.
43. Yang, A.X.; Robeyns, M.; Wang, X.; Aitchison, L. Bayesian low-rank adaptation for large language models. arXiv 2023, arXiv:2308.13111.
44. Wang, Z.; Jia, X.; Ran, E.; Xu, C. Blue Calico Pattern Generation Based on an Improved Stable Diffusion Model. J. Optoelectron.·Laser 2024, 35, 1–11. Available online: https://link.cnki.net/urlid/12.1182.o4.20240604.1447.013 (accessed on 5 June 2024).
45. ComfyUI. Comfyui: The Most Powerful and Modular Diffusion Model GUI, API, and Backend with a Graph/Nodes Interface. Available online: https://github.com/comfyanonymous/ComfyUI (accessed on 29 October 2024).
46. Bmaltais. Kohya’s GUI. Available online: https://github.com/bmaltais/kohya_ss (accessed on 10 October 2024).

Figure 1. (a) Simplified bird motifs in Miaodigou-Type pottery [27]. (b) H106, Neolithic Period, Miaodigou-Type, unearthed in 2002 from the Miaodigou site, Hanzhuang Village, Sanmenxia City, Henan Province [28].

Figure 2. Analysis of bird motifs in Miaodigou-Type painted pottery [31].

Figure 3. (a) Basin, Middle Yangshao culture in the Neolithic Age. It was discovered at the site H278 of Miaodigou, Hanzhuang Village, Sanmenxia City, Henan Province, China [28]. (b) Basin, Middle Yangshao culture in the Neolithic Age. It was discovered at the Longgang Temple site H1, Nanzheng County, Hanzhong Prefecture, Shaanxi Province, China [29].

Figure 4. Yangshao pottery pattern restoration process.

Figure 5. Comparison of weights of multiple models generated during LoRA training.

Figure 6. Loss value changes during the LoRA training process, as plotted in TensorBoard.

Figure 7. ComfyUI workflow of the Yangshao painted pottery image restoration process.

Figure 8. The restoration of artificially removed sections (Sets 1–3). For the complete catalog, refer to Supplementary Materials.

Figure 9. Comparison of PSNR, SSIM, Mean Difference, and Standard Deviation of Difference values between “After Inpainting” and “After LoRA + Inpainting”.

Figure 10. Line chart of average PSNR, SSIM, Mean Difference, and Standard Deviation of Difference values for “Original Image/After Inpainting” and “Original Image/After LoRA + Inpainting”.

Figure 11. Comparison of the restoration effect of SD + LoRA + Inpainting with OpenCV TELEA and DeepFill v2.

Figure 12. (a) Comparison of the average restoration time, GPU and memory usage, PSNR, and SSIM of the three methods. (b) Bar charts comparing metrics across the image Inpainting methods.

Figure 13. Average scores given by five experts on the criteria of edge clarity, color fidelity, pattern consistency, and structural integrity.

Figure 14. Expert evaluation score radar chart.

Figure 15. The restoration process and comparison images of G8-2 pottery.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zhang, X. AI-Assisted Restoration of Yangshao Painted Pottery Using LoRA and Stable Diffusion. Heritage 2024, 7, 6282-6309. https://doi.org/10.3390/heritage7110295
