1. Introduction
Coal is a mixture of organic macerals, including vitrinite, liptinite, and inertinite, and inorganic minerals [1,2]. Analyzing coal photomicrographs and conducting maceral analysis provide valuable insights into the coal's composition, quality, and potential applications. Advanced microscopy techniques, such as scanning transmission electron microscopy [3], enable the acquisition of high-quality photomicrographs. Nevertheless, these precise instruments require skilled operators and, owing to their high cost, are of limited availability in most laboratories. Conventional microscopes or older photomicrographs often suffer from limited resolution, particularly in laboratories with constrained funding. Re-acquiring photomicrographs with advanced microscopes involves substantial expense and labor-intensive procedures, such as sieving, molding, and polishing. Recently, with the rapid development of computer vision, machine learning methods have been introduced to identify maceral groups in photomicrographs [4,5,6]. To effectively exploit existing low-resolution photomicrographs, this study proposes a novel strategy to enhance their resolution. The approach uses an improved generative adversarial network (GAN) to obtain high-quality images, allowing precise maceral identification and analysis. This method presents a promising solution to overcome the limitations of conventional techniques and improve the performance of coal photomicrograph analysis.
The purpose of super-resolution (SR) is to convert a low-resolution image with coarse details into the corresponding high-resolution image with better visual quality and refined details. Single-Image Super-Resolution (SISR) is an important branch of SR, which aims to determine the mapping function between a low-resolution image and a high-resolution image and to reconstruct the corresponding high-resolution image. Although promising results have been achieved, it remains a challenging problem in computer vision. Filtering approaches, such as linear and bicubic interpolation and Lanczos resampling, are classical methods of enhancing resolution and can reconstruct images quickly and straightforwardly. Freedman et al. introduced a filter based on local self-similarity observations to search for similar patches, but its performance is suboptimal in cluttered regions with fine details [7]. These filtering methods tend to oversimplify the SISR problem and result in a loss of image details [8]. Additionally, they may produce over-smoothed textures in the reconstructed images.
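As an illustration of how such filtering upscalers operate, the following minimal Python sketch implements bilinear upscaling on a grayscale image stored as nested lists. This is illustrative only and is not part of any cited method's code; bicubic and Lanczos filters follow the same pattern with wider (4- and 6-tap) interpolation kernels.

```python
def bilinear_upscale(img, scale=2):
    """Upscale a grayscale image (list of rows of floats) by a factor of `scale`.

    Each output pixel is a weighted average of the four nearest source
    pixels; bicubic/Lanczos differ only in using wider weighting kernels.
    """
    h, w = len(img), len(img[0])
    H, W = h * scale, w * scale
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            # Map the output coordinate back into the source grid.
            sy = min(y / scale, h - 1)
            sx = min(x / scale, w - 1)
            y0, x0 = int(sy), int(sx)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            fy, fx = sy - y0, sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            out[y][x] = top * (1 - fy) + bot * fy
    return out
```

Because every output pixel is a local weighted average, the result is smooth but cannot recover high-frequency detail, which is exactly the loss-of-detail limitation noted above.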
With the rapid advance of machine learning, deep learning has attracted increasing attention in computer vision and medical signal analysis and is widely used in the super-resolution field [9,10]. The Super-Resolution CNN (SRCNN) employs a neural network to solve the SISR problem and demonstrates a strong capability for learning rich features from big data in an end-to-end manner [8]. However, its shallow architecture restricts its performance. Very Deep Super-Resolution (VDSR), with 20 residual layers, improves super-resolution reconstruction but incurs much higher computational cost [11,12]. The Fast Super-Resolution Convolutional Neural Network (FSRCNN) has a relatively shallow network structure consisting of four convolution layers and one deconvolution layer [13]. It was demonstrated to be faster than SRCNN while producing better reconstructed image quality. Although significant improvements in accuracy and reconstruction speed were achieved, a critical problem at that time was that the resulting high-resolution images often had poor visual quality, especially at large upscaling factors. In addition, the loss functions of these methods largely focus on minimizing the mean squared error (MSE) between the restored image and the ground truth. Such methods aim to maximize the Peak Signal-to-Noise Ratio (PSNR) and may ignore high-frequency information, leading to over-smoothed results.
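The PSNR objective mentioned above is a direct function of the MSE, which is why minimizing MSE and maximizing PSNR are equivalent. A minimal, purely illustrative sketch:

```python
import math

def psnr(reference, restored, max_val=255.0):
    """Peak Signal-to-Noise Ratio between two same-sized grayscale images.

    PSNR = 10 * log10(MAX^2 / MSE). A network trained to minimize MSE
    therefore scores well on PSNR even when its output looks over-smoothed,
    since the metric is blind to how high-frequency error is distributed.
    """
    flat_ref = [p for row in reference for p in row]
    flat_res = [p for row in restored for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_res)) / len(flat_ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_val ** 2 / mse)
```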
To address this concern, researchers have mainly improved reconstruction performance from the perspectives of the loss function and the network structure. Ledig et al. developed a novel framework for super-resolution based on the generative adversarial network (SRGAN) and proposed a perceptual loss function in place of the traditional MSE loss [14]. It calculates the difference between the generated and real images in the feature space of a pretrained VGG network, thus preserving more realistic details and textures. The experimental results indicate that SRGAN achieves finer texture details than SRCNN, even at large upscaling factors. The core concept of the Multi-Agent Diverse GAN (MAD-GAN) involves the simultaneous training of multiple generators, each tasked with producing a set of related yet not entirely consistent samples, thus facilitating more efficient data processing [15]. During training, the discriminator assesses the images generated by the generators and assigns each generator a reward score, which serves as an indicator of the quality of the generated samples. Consequently, the optimization objective for the generators encompasses not only minimizing the differences from real samples but also maximizing the diversity among the generated samples. This characteristic endows MAD-GAN with the ability to generate high-quality samples while maintaining sample diversity, effectively preventing the generation of excessively similar samples. Wang et al. made significant improvements to the key components of SRGAN by introducing residual connections within the dense blocks, which facilitates the flow of fine-detail features to the deep layers of the network. They further preserved detail information by removing batch normalization [16]. Compared with SRGAN, the proposed enhanced SRGAN (ESRGAN) achieves better perceptual quality, with more realistic and natural textures. RFB-ESRGAN emerged as the winning solution for the image super-resolution reconstruction task in the NTIRE 2020 challenge [17]. It integrates the distinctive traits of the RFB-Net and ESRGAN models, introducing receptive field blocks into the feature extraction network to enhance the capture of global and local features within the images. This incorporation of receptive field blocks notably benefits the handling of objects and textures at various scales.
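The structure of a perceptual loss such as SRGAN's can be illustrated with a toy sketch: compute the MSE between feature maps rather than between raw pixels. Here a fixed horizontal-gradient filter stands in for the pretrained VGG feature extractor purely to keep the example self-contained; the actual SRGAN loss uses deep VGG activations.

```python
def feature_map(img):
    """Toy 'feature extractor': horizontal gradients of a grayscale image.

    SRGAN uses pretrained VGG activations here; a fixed gradient filter is
    substituted only to keep this sketch self-contained.
    """
    return [[img[y][x + 1] - img[y][x] for x in range(len(img[0]) - 1)]
            for y in range(len(img))]

def mse(a, b):
    """Mean squared error between two same-sized 2D arrays (lists of rows)."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)

def perceptual_loss(generated, target):
    """Distance measured in feature space rather than pixel space."""
    return mse(feature_map(generated), feature_map(target))
```

Under this toy extractor, two images differing only by a constant brightness shift have zero perceptual loss but a large pixel-wise MSE, illustrating why feature-space losses emphasize texture and structure over exact intensity agreement.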
While deep learning models have shown remarkable performance in SISR, the networks recently proposed for general images are not well suited to the reconstruction of coal photomicrographs. These recently proposed deep neural networks perform well on common images, such as those in the DIV2K and Flickr2K datasets [18]. DIV2K is the foremost dataset for training super-resolution reconstruction models, renowned for its high quality; it encompasses 800 training images, 100 validation images, and 100 test images. Flickr2K is a large extended dataset of 2650 2K images originating from the renowned image-sharing platform Flickr. However, applying these networks to coal photomicrographs can introduce undesired artifacts and degrade performance. Coal photomicrographs predominantly consist of black and gray tones; although the macerals do differ in appearance, their details and textures are not as complex as those of natural images. Considering these unique characteristics, we specifically designed a novel framework based on an improved GAN to enhance the resolution of coal photomicrographs without introducing unwanted artifacts.
The developed super-resolution model trains faster and has fewer parameters than state-of-the-art models. The main contributions of the proposed method are as follows:
- 1.
Given the unique characteristics of coal photomicrographs, which set them apart from traditional images, we have specifically designed a lightweight generative adversarial network to enhance the resolution of these photomicrographs. Experimental results indicate that the proposed method surpasses state-of-the-art GAN-based methods.
- 2.
We propose a novel residual block called the Wide Residual Block (WRB), designed to enhance the neural network’s non-linear fitting ability and feature extraction capabilities while minimizing computational load. By integrating WRBs into the network architecture, the modified network is able to produce smoother and more continuous restoration effects without introducing artifacts, outperforming networks utilizing traditional residual blocks.
- 3.
We utilize a pyramid attention block that can be seamlessly integrated into existing super-resolution networks. This block significantly improves the performance of super-resolution models by enhancing their ability to capture important feature relationships across multiple scales. The related code and dataset are publicly available at the following website:
https://github.com/Jackson-LIMU/SR-IGAN (accessed on 30 January 2023).
The rest of this paper is organized as follows.
Section 2 introduces the architecture of the networks.
Section 3 presents the details and evaluation metrics of the experiments.
Section 4 shows the experiment results of the proposed methods and comparison with existing methods.
Section 5 presents the conclusion of this paper.
5. Conclusions
Macerals, the organic components of coal, represent different types of plant material transformed to varying degrees during coal formation. By analyzing coal photomicrographs and conducting maceral analysis, researchers and industry professionals gain valuable insights into coal's composition, quality, and potential applications. As a result, coal photomicrograph analysis plays a critical role in assessing coal quality and advancing environmentally friendly mining practices. However, obtaining high-resolution coal photomicrographs is a cumbersome and costly process. To effectively exploit the information in low-resolution photomicrographs, we propose a lightweight network designed to enhance their resolution using an improved GAN-based super-resolution method. We propose a novel wide residual block in which sub-residual modules replace the BN layers of the conventional residual block. The removal of BN layers improves performance, particularly by preventing distortion of the image's contrast. In comparison with recently proposed GAN-based SR strategies, this architecture not only simplifies the generator network but also avoids the unpleasant artifacts introduced by BN. In addition, we embed a five-level pyramid attention block in the middle of the generator to adaptively capture the long-range correlations between features. The module is fully differentiable and can serve as a common building block of SR networks to enhance image restoration performance. Moreover, we introduce a global average pooling layer before the second-to-last convolutional layer of the discriminator to avoid overfitting. We also employ an additional loss function, referred to as TV-loss, to suppress noise during training. Collectively, these improvements contribute to the generation of images with more natural textures, finer sharpness, and intricate details.
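As an illustration, the TV-loss term can be sketched as follows. This is a minimal anisotropic per-image version in plain Python; the exact formulation used in our released code may differ.

```python
def tv_loss(img):
    """Anisotropic total variation of a grayscale image (list of rows).

    Sums absolute differences between horizontally and vertically adjacent
    pixels. Adding this term to the training objective penalizes
    high-frequency noise in the generator's output while leaving smooth
    regions almost untouched.
    """
    h, w = len(img), len(img[0])
    horiz = sum(abs(img[y][x + 1] - img[y][x])
                for y in range(h) for x in range(w - 1))
    vert = sum(abs(img[y + 1][x] - img[y][x])
               for y in range(h - 1) for x in range(w))
    return horiz + vert
```

A noisy checkerboard-like patch scores higher than a patch with a single smooth edge, which is the behavior that makes the term act as a noise suppressor.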
We evaluate the performance of the proposed method and state-of-the-art methods on 84 coal photomicrographs. The experimental results, both quantitative and qualitative, validate the effectiveness of the proposed approach.
It should be noted that the proposed method still has a few limitations. The main open question is whether the high-resolution images produced by the generator comply with geological rules. In addition, more photomicrographs are required to evaluate the generalization ability of the proposed method. Both of these aspects require further validation through real-world applications. To address these concerns, we plan to invite experienced geologists to conduct subjective evaluation (MOS) tests on the generated coal photomicrographs to assess their perceptual quality. Additionally, we aim to incorporate domain knowledge into the super-resolution reconstruction network to provide superior performance and ensure that the reconstruction results are comprehensible to humans.