For rates up to 2 bits/pixel, the proposed architectures Ballé(2018)-laplacian-N128-M192 (with the simplified entropy model) and Ballé(2018)-s-laplacian-N64-M192 (combining the reduction of the number of filters to *N* = 64 and the simplified Laplacian entropy model) are compared with the non-variational reference method Ballé(2017)-non-parametric-N192 [13], with the variational reference method Ballé(2018)-hyperprior-N128-M192 [16], with its version after reduction of the number of filters Ballé(2018)-s-hyperprior-N64-M192, with the architecture denoted Ballé(2018)-non-parametric-N128-M192 (combining the main auto-encoder of [16] and the non-parametric entropy model of [13]), and with its version after reduction of the number of filters Ballé(2018)-s-non-parametric-N64-M192. Table 4 shows that the coding part complexity of Ballé(2018)-s-laplacian-N64-M192 is 13% lower than that of Ballé(2018)-s-hyperprior-N64-M192.

**Table 4.** Reduction of the encoder complexity induced by the simplified entropy model on the coding part (case of rates up to 2 bits/pixel).


Figure 10 shows the rate-distortion performance averaged over the validation dataset for the trained models, for both the MSE and MS-SSIM quality measures. Recall that the architectures were trained for MSE only. The proposed simplified entropy model (Ballé(2018)-s-laplacian-N64-M192) achieves an intermediate performance between the variational model (Ballé(2018)-s-hyperprior-N64-M192) and the non-variational model (Ballé(2018)-s-non-parametric-N64-M192). As expected, due to the entropy model simplification, Ballé(2018)-s-laplacian-N64-M192 underperforms the more general, and thus more complex, Ballé(2018)-s-hyperprior-N64-M192 model. However, the proposed entropy model, although simpler, preserves the adaptability to the input image, unlike the models Ballé(2018)-non-parametric-N128-M192 and Ballé(2017)-non-parametric-N192 [13]. Note that the simplified Laplacian entropy model performs close to the hyperprior model at relatively high rates. One possible explanation for this behaviour is the increased amount of side information required by the hyperprior model [16] at these rates [28].
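To make the simplified entropy model concrete, the following is a minimal sketch of the common single-parameter formulation it relies on: each quantized feature map is modeled by a zero-mean Laplacian whose scale is estimated from the feature itself at encoding time (here via the mean absolute deviation, the maximum-likelihood estimate), and the ideal coding cost is the negative log-probability of each quantized value under the discretized Laplacian. The function names and the estimation details are illustrative, not the paper's exact implementation.

```python
import numpy as np

def laplace_cdf(x, b):
    # CDF of a zero-mean Laplacian distribution with scale b
    return np.where(x < 0, 0.5 * np.exp(x / b), 1.0 - 0.5 * np.exp(-x / b))

def feature_rate_bits(latent, eps=1e-9):
    """Ideal coding cost (in bits) of one quantized feature map under a
    single-parameter Laplacian entropy model, with the scale estimated
    per feature at encoding time (as the adaptive scheme described in
    the text does)."""
    q = np.round(latent)              # scalar quantization of the latents
    b = max(np.mean(np.abs(q)), eps)  # per-feature scale (MLE for a Laplacian)
    # probability mass of each integer bin under the discretized Laplacian
    p = laplace_cdf(q + 0.5, b) - laplace_cdf(q - 0.5, b)
    # ideal arithmetic-coding cost: sum of -log2 probabilities
    return float(np.sum(-np.log2(p + eps)))

# toy feature map drawn from a Laplacian, consistent with the paper's
# observation that most learned features follow this distribution
rng = np.random.default_rng(0)
y = rng.laplace(scale=2.0, size=(64, 64))
bits = feature_rate_bits(y)
print(f"{bits / y.size:.2f} bits/sample")
```

Only the single scale parameter per feature needs to be transmitted as side information, which is what makes this model cheaper than the hyperprior while remaining adaptive to the input image.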

#### 4.7.2. At High Rates

For high rates (above 2 bits/pixel), the proposed architectures Ballé(2018)-laplacian-N192-M320 (with the simplified entropy model) and Ballé(2018)-s-laplacian-N64-M320 (combining the reduction of the number of filters to *N* = 64 and the simplified Laplacian entropy model) are compared with the non-variational reference method Ballé(2017)-non-parametric-N256 [13], with the variational reference method Ballé(2018)-hyperprior-N192-M320 [16], with its version after reduction of the number of filters Ballé(2018)-s-hyperprior-N64-M320, with the architecture denoted Ballé(2018)-non-parametric-N192-M320 (combining the main auto-encoder of [16] and the non-parametric entropy model of [13]), and with its version after reduction of the number of filters Ballé(2018)-s-non-parametric-N64-M320. Figure 11 displays the rate-distortion performance averaged over the validation dataset for the trained models in terms of MSE. The proposed simplified entropy method Ballé(2018)-s-laplacian-N64-M320 achieves an intermediate performance between the variational model (Ballé(2018)-s-hyperprior-N64-M320) and the non-variational model (Ballé(2018)-s-non-parametric-N64-M320), similarly to the models targeting lower rates in Figure 10. Table 5 shows that the coding part complexity of Ballé(2018)-s-laplacian-N64-M320 is around 16% lower than that of Ballé(2018)-s-hyperprior-N64-M320.

**Table 5.** Reduction of the encoder complexity induced by the simplified entropy model on the coding part (case of rates above 2 bits/pixel).


#### 4.7.3. Summary

For either low or high bit rates, the proposed entropy model simplification leads to intermediate performance when compared to the reference architectures [13,16], both in MSE and in MS-SSIM, while it decreases the coding part complexity by more than 10% with respect to [16].

#### *4.8. Discussion About Complexity*

According to the previous performance analysis, the computational time complexity of the proposed method is significantly lower than that of the reference learned compression architecture [16]. However, at around 10 kFLOPs/pixel, the attained complexity remains at least two orders of magnitude higher than those of the CCSDS and JPEG2000 [5] standards. Indeed, the complexity of CCSDS 122.0 is around 140 operations per pixel (without optimizations), or 70 multiply-accumulate (MAC) operations. JPEG2000 is 2 to 3 times more complex, depending on the optimizations. Note however that CCSDS 122.0 dates back to 2008, when onboard technologies were limited to radiation-hardened (Rad-Hard) components dedicated to space, with an objective of 1 Msample/s/W (as specified in the CCSDS 122.0 green book [29]) to process around 50 Mpixels/s. Space technologies currently developed for the next generation of CNES Earth observation satellites rather target 5-10 Msample/s/W. Nowadays, the use of commercial off-the-shelf (COTS) components or of dedicated hardware accelerators is envisioned: based on a thinner silicon technology node, they allow higher processing frequencies with consistently lower consumption. For instance, the Movidius Myriad 2 announces 1 TFLOP/s/W. At 10 kFLOPs/pixel, the current network would thus reach 100 Mpixels/s/W on this component. Therefore, the order of magnitude of the proposed method complexity is not incompatible with an embedded implementation, taking into account the technological leap from the component point of view. Consequently, the complexity increase with respect to CCSDS, which we limited as far as possible, is expected to be affordable after computation device upgrading. Note that, in addition, manufacturers of components dedicated to neural networks (for example, Xilinx) provide software suites to optimize the porting.
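The throughput estimate above is simple unit arithmetic; the following sketch reproduces it using the two figures quoted in the text (the network cost and the announced device efficiency are the approximate values from this discussion, not measured benchmarks):

```python
# Back-of-the-envelope throughput check for the figures quoted above.
network_cost = 10e3  # FLOPs per pixel (proposed network, approximate)
device_eff = 1e12    # FLOP/s per watt (Movidius Myriad 2, as announced)

pixels_per_s_per_w = device_eff / network_cost
print(f"{pixels_per_s_per_w / 1e6:.0f} Mpixels/s/W")  # → 100 Mpixels/s/W
```

This is roughly three orders of magnitude above the 1 Msample/s/W objective of the 2008-era Rad-Hard components, which is why the complexity gap with respect to CCSDS 122.0 can be expected to be absorbed by the device upgrade.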
Finally, before on board implementation, a network compression (including pruning, quantization, or tensor decomposition for instance) can be envisioned. However, this is out of the scope of this paper.

#### **5. Conclusions**

This paper proposed different solutions to adapt the reference learned image compression models [13,16] to on board satellite image compression, taking into account their computational complexity. We first reduced the number of filters composing the convolutional layers of the analysis and synthesis transforms, applying a special treatment to the bottleneck. The impact of the bottleneck size, under a drastic reduction of the overall number of filters, was investigated. This study identified the lowest global number of filters for each rate. For the sake of completeness, we also called into question the other design options of the reference architectures, especially the parametric activation functions. Second, in order to simplify the entropy model, we performed a statistical analysis of the learned representation. This analysis showed that most features follow a Laplacian distribution. We thus proposed a simplified parametric entropy model involving a single parameter. To preserve the adaptivity, and thus the performance, this parameter is estimated in the operational phase for each feature of the input image. This entropy model, although far simpler than non-parametric or hyperprior models, brings comparable performance. In a nutshell, by combining the reduction of the global number of filters and the simplification of the entropy model, we developed a reduced-complexity compression architecture for satellite images that outperforms CCSDS 122.0-B [6] in terms of rate-distortion trade-off, while maintaining a competitive performance for medium to high rates in comparison with the reference learned image compression models [13,16]. Overall, while more complex than the traditional CCSDS 122.0 and JPEG 2000 standards, the proposed solutions offer a good compromise between complexity and performance. We can thus recommend their use, subject to the availability of suitable on board devices.
Future work will be devoted to hardware considerations regarding the on board implementation.

**Author Contributions:** Conceptualization, M.C. (Marie Chabert), T.O. and C.P.; Formal Analysis, M.C. (Marie Chabert); Investigation, V.A.d.O.; Methodology, V.A.d.O., M.C. (Marie Chabert), T.O. and C.P.; Resources, M.B., C.L., M.C. (Mikael Carlavan), S.H., F.F. and R.C.; Software, V.A.d.O.; Supervision, M.C. (Marie Chabert), T.O. and C.P.; Validation, M.B., C.L., M.C. (Mikael Carlavan), S.H., F.F. and R.C.; Writing—original draft, V.A.d.O. and M.C. (Marie Chabert); Writing—review & editing, T.O. and C.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work has been carried out under the financial support of the French space agency CNES and Thales Alenia Space. Part of this work has been funded by the Institute for Artificial and Natural Intelligence Toulouse (ANITI) under grant agreement ANR-19-PI3A-0004.

**Data Availability Statement:** Restrictions apply to the availability of these data. Data was obtained from CNES as partner of this study.

**Acknowledgments:** Experiments presented in this paper were carried out using the OSIRIM platform that is administered by IRIT and supported by CNRS, the Region Midi-Pyrenées, the French Government, ERDF (see http://osirim.irit.fr/site/en).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


PCA Principal component analysis

ReLU Rectified Linear Unit
