**4. Crystallization Measurement Method**

The process of the proposed image measurement method is presented in Figure 3. The detailed steps are described in the following subsections.

**Figure 3.** Flow chart of the proposed image measurement.

#### *4.1. Crystal Image Preprocessing*

Online image acquisition is subject to noise and blur, which degrade the quality of most crystal images. Guided filtering [27] is therefore adopted as the denoising method. The guided filter is an edge-preserving filter, so information around image edges is well preserved. Let *m<sub>i</sub>* denote pixel *i* of the input image and *q<sub>i</sub>* the corresponding pixel of the output image; the output is defined by

$$q\_i = \frac{1}{|\omega|} \sum\_{k:\, i \in \omega\_k} \left( \alpha\_k m\_i + \beta\_k \right) \tag{2}$$

where *k* indexes the local square window *ω<sub>k</sub>*, taken as 31 × 31 pixels in the input image, |*ω*| is the number of pixels in the window, and (*α<sub>k</sub>*, *β<sub>k</sub>*) are coefficients that are constant within *ω<sub>k</sub>* and are obtained by

$$(\alpha\_k, \beta\_k) = \arg\min\_{\alpha\_k, \beta\_k} \sum\_{i \in \omega\_k} \left( \left( \alpha\_k m\_i + \beta\_k - m\_i \right)^2 + \varepsilon \alpha\_k^2 \right) \tag{3}$$

where *ε* is the regularization parameter.
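As an illustration of Equations (2) and (3), the following plain-Python sketch applies the filter with the input image as its own guidance. In that case the minimizer of Equation (3) has the closed form *α<sub>k</sub>* = *σ<sub>k</sub>*<sup>2</sup>/(*σ<sub>k</sub>*<sup>2</sup> + *ε*) and *β<sub>k</sub>* = (1 − *α<sub>k</sub>*)*μ<sub>k</sub>*, where *μ<sub>k</sub>* and *σ<sub>k</sub>*<sup>2</sup> are the mean and variance of the input inside *ω<sub>k</sub>*. The function names, the default value of `eps`, and the clipping of windows at the image border are assumptions made for this sketch and are not taken from the paper.

```python
def box_mean(img, r):
    """Mean over a (2r+1) x (2r+1) window centred on each pixel.

    Windows are clipped at the image border (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        ys = range(max(0, y - r), min(h, y + r + 1))
        for x in range(w):
            xs = range(max(0, x - r), min(w, x + r + 1))
            total = sum(img[yy][xx] for yy in ys for xx in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


def self_guided_filter(m, r=15, eps=1e-2):
    """Guided filter with the input image m used as its own guidance.

    r=15 corresponds to the 31 x 31 window; eps is the regularization
    parameter (its default value here is hypothetical).
    """
    h, w = len(m), len(m[0])
    mean_m = box_mean(m, r)
    mean_mm = box_mean([[v * v for v in row] for row in m], r)
    alpha = [[0.0] * w for _ in range(h)]
    beta = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Window variance; clamp tiny negative values from rounding.
            var = max(mean_mm[y][x] - mean_m[y][x] ** 2, 0.0)
            a = var / (var + eps)                  # closed form of Eq. (3)
            alpha[y][x] = a
            beta[y][x] = (1.0 - a) * mean_m[y][x]
    # Eq. (2): average the coefficients of all windows that contain pixel i.
    mean_a = box_mean(alpha, r)
    mean_b = box_mean(beta, r)
    return [[mean_a[y][x] * m[y][x] + mean_b[y][x] for x in range(w)]
            for y in range(h)]
```

In flat regions the window variance is small relative to `eps`, so *α<sub>k</sub>* approaches 0 and the output approaches the local mean. Near edges the variance dominates, *α<sub>k</sub>* approaches 1, and the pixel is left nearly unchanged.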

#### *4.2. Crystal Image Segmentation*

Because the calibration method computes the pixel equivalent from clear images, only clearly imaged crystals provide accurate size information. In this work, the U-net model is improved for online crystal images. Like the original U-net, it is composed of a down-sampling path (encoder) and an up-sampling path (decoder) [28], as shown in Figure 4. The improved network consists of 11 layer groups, and a parametric rectified linear unit (PReLU) [29] is adopted as the activation function to improve the fitting ability. The first convolution layer uses a 1 × 1 convolution kernel to extract features from the input crystal image. The Res Block shown in Figure 5, which comprises two convolutional operations, batch normalization modules, and ELU activation functions, then extracts further image features and deepens the network. After each Res Block in the down-sampling path, a 2 × 2 max pooling layer down-samples the feature map, halving its resolution. In the up-sampling path, each up-interpolation block uses bilinear interpolation. Each Res Block in the up-sampling path, which contains two 3 × 3 convolutions, is connected to the corresponding Res Block of the down-sampling path: the up-sampled feature map is concatenated with the encoder feature map through a skip connection, so that the network can fully fuse shallow features with deep semantic information. After the final 1 × 1 convolution layer, the Sigmoid activation function maps the response values to (0, 1) pixel by pixel.

**Figure 4.** Improved U-net network structure.

**Figure 5.** Res Block.
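The resolution changes described above (2 × 2 max pooling halves the feature map, bilinear up-interpolation restores it, and the Sigmoid maps responses to (0, 1)) can be illustrated with a small plain-Python sketch on a single-channel feature map. This is not the network itself: convolutions, Res Blocks, and training are omitted, the function names are hypothetical, and the corner-aligned sampling used for the bilinear interpolation is an assumption, since the paper does not state which convention is used.

```python
import math


def max_pool_2x2(fmap):
    """2 x 2 max pooling with stride 2; halves height and width."""
    h, w = len(fmap), len(fmap[0])
    return [[max(fmap[y][x], fmap[y][x + 1],
                 fmap[y + 1][x], fmap[y + 1][x + 1])
             for x in range(0, w - 1, 2)]
            for y in range(0, h - 1, 2)]


def bilinear_upsample_2x(fmap):
    """Double height and width by bilinear interpolation.

    Corner-aligned sampling is assumed: the four corner values of the
    input are reproduced exactly at the corners of the output.
    """
    h, w = len(fmap), len(fmap[0])
    out_h, out_w = 2 * h, 2 * w
    out = []
    for yo in range(out_h):
        sy = yo * (h - 1) / (out_h - 1)        # source row coordinate
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for xo in range(out_w):
            sx = xo * (w - 1) / (out_w - 1)    # source column coordinate
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = fmap[y0][x0] * (1 - fx) + fmap[y0][x1] * fx
            bottom = fmap[y1][x0] * (1 - fx) + fmap[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def sigmoid(v):
    """Map a response value to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


# Encoder feature map (4 x 4), pooled to 2 x 2, then up-sampled to 4 x 4.
encoder_map = [[float(4 * y + x) for x in range(4)] for y in range(4)]
pooled = max_pool_2x2(encoder_map)
upsampled = bilinear_upsample_2x(pooled)
# Skip connection: stack the encoder map and the up-sampled map as channels.
fused = [encoder_map, upsampled]
```

Because the up-sampled map has the same spatial size as the encoder map at that level, the two can be stacked along the channel axis, which is what the skip connection relies on.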

To address the imbalance between positive and negative pixels in the image, the loss function [30] is defined as

$$J\_{\rm loss} = C\_{\rm loss} + D\_{\rm loss} \tag{4}$$

where *C*<sub>loss</sub> is the cross-entropy loss and *D*<sub>loss</sub> is the Dice coefficient loss.
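A minimal sketch of Equation (4) on flattened pixel lists is given below. The paper does not spell out the two terms, so this sketch assumes the common definitions: the mean binary cross-entropy for *C*<sub>loss</sub> and one minus the smoothed Dice coefficient for *D*<sub>loss</sub>. The function names, the clamping constant `eps`, and the smoothing constant `smooth` are hypothetical.

```python
import math


def cross_entropy_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy; pred holds Sigmoid outputs in (0, 1)."""
    total = 0.0
    for p, y in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)        # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(pred)


def dice_loss(pred, target, smooth=1.0):
    """One minus the smoothed Dice coefficient (assumed definition)."""
    intersection = sum(p * y for p, y in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_loss(pred, target):
    """J_loss = C_loss + D_loss, as in Eq. (4)."""
    return cross_entropy_loss(pred, target) + dice_loss(pred, target)
```

The Dice term depends on the overlap between the predicted and labelled crystal regions rather than on the pixel count, which is why it is commonly paired with cross-entropy when background pixels far outnumber crystal pixels.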

Image augmentation is used to increase the training sample size and strengthen the generalization ability of the network. Random rotation, scaling, and translation are used to enhance image diversity, and the brightness and contrast of the images are adjusted to reduce the influence of uneven lighting and to highlight edge features.

#### *4.3. Crystal Growth Measurement*

The two-dimensional sizes (i.e., length and width) of β-form LGA crystals are measured as the length and width of the minimum enclosing rectangle [31] of each imaged crystal. Let *l*<sub>a</sub> and *w*<sub>a</sub> denote the numbers of pixels along the length and the width, respectively, and let *γ*<sub>e</sub> be the pixel equivalent obtained with the calibration method [10]. The physical length *x<sub>l</sub>* and the physical width *x<sub>w</sub>* are given by

$$\begin{cases} x\_l = \gamma\_\text{e} l\_\text{a} \\ x\_w = \gamma\_\text{e} w\_\text{a} \end{cases} \tag{5}$$

Equation (5) yields the 2D crystal sizes from which the CSD is produced. A crystal population containing a large number of particles exhibits statistical characteristics, so it is meaningful to estimate its size distribution. Generally, a log-normal probability density estimate can be used to smooth the CSD [32] and to represent the current size condition of the crystal population.

For the length, assumed to follow the log-normal distribution LN(*μ<sub>l</sub>*, *σ<sub>l</sub>*<sup>2</sup>), the likelihood function of the *n* measured lengths *x<sub>l</sub>*(*i*), *i* = 1, …, *n*, is defined as

$$L(\mu\_l, \sigma\_l^2) = \prod\_{i=1}^n \frac{1}{\sqrt{2\pi}\sigma\_l x\_l(i)} \exp\left\{-\frac{\left(\ln x\_l(i) - \mu\_l\right)^2}{2\sigma\_l^2}\right\} \tag{6}$$

The likelihood equations are:

$$\begin{cases} \frac{\partial \ln L(\mu\_l, \sigma\_l^2)}{\partial \mu\_l} = \frac{1}{\sigma\_l^2} \sum\_{i=1}^n \left( \ln x\_l(i) - \mu\_l \right) = 0\\ \frac{\partial \ln L(\mu\_l, \sigma\_l^2)}{\partial \sigma\_l^2} = -\frac{n}{2\sigma\_l^2} + \frac{1}{2\sigma\_l^4} \sum\_{i=1}^n \left( \ln x\_l(i) - \mu\_l \right)^2 = 0 \end{cases} \tag{7}$$

By solving Equation (7), the parameters (*μ<sub>l</sub>*, *σ<sub>l</sub>*<sup>2</sup>) are estimated as

$$\hat{\mu}\_{l} = \frac{1}{n} \sum\_{i=1}^{n} \ln x\_{l}(i) \tag{8}$$

$$\hat{\sigma}\_{l}^{2} = \frac{1}{n} \sum\_{i=1}^{n} \left( \ln x\_{l}(i) - \frac{1}{n} \sum\_{j=1}^{n} \ln x\_{l}(j) \right)^{2} \tag{9}$$

Then LN(*μ̂<sub>l</sub>*, *σ̂<sub>l</sub>*<sup>2</sup>) can be computed from the online images acquired in a predefined time window [8] to represent the length distribution of the crystal population.
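The chain from pixel counts to the fitted distribution (Equations (5), (8), and (9)) can be sketched in a few lines of plain Python. The function names are hypothetical, and the pixel counts and pixel equivalent in the usage lines are made-up illustrative numbers, not measurements from the paper.

```python
import math


def physical_sizes(pixel_counts, gamma_e):
    """Eq. (5): convert pixel counts to physical sizes."""
    return [gamma_e * n for n in pixel_counts]


def lognormal_mle(sizes):
    """Eqs. (8) and (9): maximum-likelihood estimates (mu_hat, sigma2_hat)."""
    logs = [math.log(x) for x in sizes]
    n = len(logs)
    mu_hat = sum(logs) / n
    sigma2_hat = sum((v - mu_hat) ** 2 for v in logs) / n
    return mu_hat, sigma2_hat


def lognormal_pdf(x, mu, sigma2):
    """Log-normal density, i.e. one factor of the product in Eq. (6)."""
    return (math.exp(-(math.log(x) - mu) ** 2 / (2.0 * sigma2))
            / (x * math.sqrt(2.0 * math.pi * sigma2)))


# Illustrative values only: lengths in pixels and a pixel equivalent.
lengths_px = [120, 150, 98, 210, 175]
lengths = physical_sizes(lengths_px, gamma_e=0.5)
mu_hat, sigma2_hat = lognormal_mle(lengths)
```

Note that Equation (9) divides by *n*, not *n* − 1, so the variance estimate is the maximum-likelihood one and not the unbiased sample variance.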

The growth parameter of the crystal population size is an important factor for crystallization detection. The growth rate of the mean size, which is traditionally used, may not reflect the evolution of the size distribution well because of the noise introduced by extreme sizes. Instead, the length *x<sub>l</sub>*<sup>max</sup> at which the distribution *P*(*x<sub>l</sub>*) reaches its maximum is computed as

$$x\_{l}^{\max} = \arg\max\_{x\_{l}} P(x\_{l}) \tag{10}$$
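If *P*(*x<sub>l</sub>*) is taken to be the fitted log-normal density LN(*μ̂<sub>l</sub>*, *σ̂<sub>l</sub>*<sup>2</sup>), which is an assumption here because Equation (10) does not fix the form of *P*, the maximizer has a closed form. Setting the derivative of the log-density to zero gives

$$\frac{\mathrm{d} \ln P(x\_l)}{\mathrm{d} x\_l} = -\frac{1}{x\_l} - \frac{\ln x\_l - \hat{\mu}\_l}{\hat{\sigma}\_l^2 x\_l} = 0 \quad \Longrightarrow \quad x\_l^{\max} = \exp\left( \hat{\mu}\_l - \hat{\sigma}\_l^2 \right)$$

Under this assumption, no numerical search over *x<sub>l</sub>* is needed.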

The growth rate of length *R<sub>l</sub>* is defined as

$$R\_l = \frac{\text{Diff}(x\_{l,t\_1}^{\text{max}}, x\_{l,t\_2}^{\text{max}})}{T\_{P\_1 P\_2}} \tag{11}$$

where Diff is the difference function between *x*<sup>max</sup><sub>*l*,*t*<sub>1</sub></sub> of *P*<sub>*t*<sub>1</sub></sub> and *x*<sup>max</sup><sub>*l*,*t*<sub>2</sub></sub> of *P*<sub>*t*<sub>2</sub></sub>, and *T*<sub>*P*<sub>1</sub>*P*<sub>2</sub></sub> is the time interval between time points *t*<sub>1</sub> and *t*<sub>2</sub>.
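Equations (10) and (11) can be sketched as follows, under two assumptions that the paper does not state explicitly. First, *P* is taken to be the fitted log-normal density, whose maximum lies at exp(*μ* − *σ*<sup>2</sup>). Second, Diff is taken to be the later value minus the earlier one, so that growth gives a positive rate. The function names and the numbers in the usage lines are hypothetical.

```python
import math


def lognormal_mode(mu, sigma2):
    """Eq. (10) for a log-normal density: argmax is exp(mu - sigma2)."""
    return math.exp(mu - sigma2)


def growth_rate(mode_t1, mode_t2, interval):
    """Eq. (11), with Diff assumed to be (later value - earlier value)."""
    if interval <= 0:
        raise ValueError("interval between t1 and t2 must be positive")
    return (mode_t2 - mode_t1) / interval


# Illustrative parameters of the fitted distributions at t1 and t2.
mode_1 = lognormal_mode(mu=4.60, sigma2=0.04)
mode_2 = lognormal_mode(mu=4.75, sigma2=0.05)
rate = growth_rate(mode_1, mode_2, interval=60.0)
```

The unit of the resulting rate is the size unit of the pixel equivalent divided by the time unit chosen for the interval.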

Similarly, the width distribution LN(*μ<sub>w</sub>*, *σ<sub>w</sub>*<sup>2</sup>) and the growth rate of width *R<sub>w</sub>* are obtained in the same way.
