**3. Results**

#### *3.1. Model Training*

Multiple model training sessions revealed that training an orthophoto patch encoder is inseparable from preparing generator and discriminator neural networks that are complex enough to learn all the features present in the input orthophoto. The generator has to produce high-quality artificial images, and the discriminator has to determine whether an image is artificially generated. This directly influences the following:


The initially chosen BiGAN architecture borrows many concepts from earlier networks such as the deep convolutional GAN (DCGAN) [48], which, due to their simplicity, are not suitable for processing complex or large images. Their usefulness in the analysis of aerial imagery is therefore limited. Although BiGAN offered all of the features identified earlier as required, it could not process an orthophoto patch larger than 32 pixels in either dimension. This was a severe limitation because, at the given ground sample distance of 25 cm per pixel, a 32 px × 32 px patch covers only about 8 m × 8 m, i.e., roughly 64 m². Consequently, the processed patch did not carry enough detail to allow a reliable assessment of the similarity between real and artificial images. Attempts to increase the maximum input size led to replacing the default BiGAN generator and discriminator models with other network types based on deep residual blocks [49] and inception modules [50]. The overall architecture of the resulting generator and discriminator pair resembled BigGAN [51].
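The ground-coverage figures quoted above follow from simple arithmetic on the patch size and the ground sample distance. The short sketch below reproduces that calculation; the function name `patch_ground_coverage` and its parameters are illustrative assumptions and are not part of the original experiments.

```python
def patch_ground_coverage(patch_px: int, gsd_m: float) -> tuple[float, float]:
    """Return (side length in metres, area in square metres) of a square patch.

    patch_px: patch side length in pixels (hypothetical parameter name)
    gsd_m:    ground sample distance in metres per pixel
    """
    side_m = patch_px * gsd_m   # ground length covered by one patch side
    area_m2 = side_m * side_m   # square patch, so area is side squared
    return side_m, area_m2


if __name__ == "__main__":
    GSD_M = 0.25  # 25 cm per pixel, as stated in the text
    # 32 px is the original BiGAN limit; 512 px is the largest size mentioned later.
    for px in (32, 512):
        side, area = patch_ground_coverage(px, GSD_M)
        print(f"{px} px patch: {side:.0f} m x {side:.0f} m = {area:.0f} m^2")
```

Under these assumptions, a 32 px patch spans 8 m per side (64 m²), whereas a 512 px patch spans 128 m per side (16,384 m²), which illustrates why the larger input size was pursued.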

After multiple experiments, the authors confirmed that, although the network could generate images of up to 512 px × 512 px, it was not capable of learning a reliable bidirectional mapping between the image and the latent space. The reason was that the encoder architecture was too weak in comparison with its powerful counterparts. This problem was addressed and mitigated in the paper on large-scale adversarial representation learning, which describes the big bidirectional generative adversarial network (BigBiGAN) [44], by introducing intermediate discriminators and proposing a stronger encoder model (Supplementary Materials).
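For reference, the bidirectional mapping discussed here is commonly formalized through the minimax objective of the original BiGAN formulation. The notation below (generator $G$, encoder $E$, discriminator $D$, data distribution $p_X$, latent prior $p_Z$) follows the BiGAN literature and is an assumption with respect to this study; the exact loss used in the experiments may differ.

$$
\min_{G,\,E}\;\max_{D}\; V(D, E, G) =
\mathbb{E}_{x \sim p_X}\big[\log D\big(x, E(x)\big)\big]
+ \mathbb{E}_{z \sim p_Z}\big[\log\big(1 - D\big(G(z), z\big)\big)\big]
$$

The discriminator scores joint pairs $(x, E(x))$ and $(G(z), z)$ rather than images alone, so a weak encoder $E$ limits the quality of the learned mapping even when $G$ and $D$ are powerful, which is consistent with the behavior reported above.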
