**A Computationally Efficient Mean Sound Speed Estimation Method Based on an Evaluation of Focusing Quality for Medical Ultrasound Imaging**

#### **Jaejin Lee 1, Yangmo Yoo 1,2, Changhan Yoon 3,\* and Tai-kyong Song 1,\***


Received: 4 October 2019; Accepted: 16 November 2019; Published: 18 November 2019

**Abstract:** Generally, ultrasound receive beamformers calculate focusing time delays by assuming a fixed sound speed in human tissue (e.g., 1540 m/s). However, phase distortions occur due to variations of sound speed in soft tissues, resulting in degradation of image quality. Thus, an optimal estimation of sound speed is required in order to improve image quality. Implementation of real-time sound speed estimation is challenging due to high computational and hardware complexities. In this paper, an optimal sound speed estimation method that requires few hardware resources is presented. In the proposed method, the optimal mean sound speed is determined by measuring the amplitude variance of pre-beamformed radio-frequency (RF) data. The proposed method was evaluated with phantom and in vivo experiments, and implemented on a Virtex-4 FPGA with Xilinx ISE 12.4 using VHDL. Experimental results indicate that the proposed method could estimate the optimal mean sound speed and enhance spatial resolution with a negligible increase in hardware resource usage.

**Keywords:** phase aberration; receive beamforming; spatial resolution; sound speed estimation

#### **1. Introduction**

Digital dynamic receive beamforming has been adopted for improving spatial resolution and contrast-to-noise ratios (CNRs) in medical ultrasound imaging [1–3]. In digital receive beamforming, a constant sound speed in human tissue (e.g., 1540 m/s) is typically assumed when generating dynamic receive focusing phase delays. However, phase distortions are introduced due to the variations of sound speed that occur in soft tissue, leading to defocusing and consequent degradation in image quality [4]. Moreover, the degradation of image quality can significantly reduce diagnostic capability in breast or obese patient imaging, since the sound speed in fatty tissues (e.g., 1450 m/s) is lower than the assumed value (e.g., 1540 m/s) [5].

Various phase aberration correction methods have previously been proposed to compensate for the phase distortions [6–13]. Representatively, cross-correlation-based [6–9] and speckle-brightness-based [10] phase aberration correction methods have been proposed. Recently, methods combined with adaptive beamforming methods have been proposed [11–13] that can improve the quality of whole images, but the implementation of these methods is still challenging due to their high computational complexity [13,14]. In the nearest-neighbor cross-correlation (NNCC) method [6,12,13], the number of multiplications is expressed approximately as *Nmult* = (*N* − 1) × *Kimage* × *Limage* × *M*, where *N*, *Limage* and *Kimage* are the number of channels, scanlines and samples per scanline in the whole image, respectively, and *M* is the total number of samples that contribute to the cross-correlation function. For an abdominal image with depth of 160 mm, the number of multiplications is approximately 1.3 billion × *M* when *N* = 128, *Limage* = 256 and *Kimage* = 4k; thus, the implementation of these methods in real time would be challenging.

Using the correct mean sound speed can be an alternative solution for minimizing the phase distortions and enhancing the image quality in medical ultrasound imaging. Various algorithms for estimating sound speed have been proposed [14–20]. Among them, beam-tracking-based sound speed estimation methods using two transducers have been proposed [15,16]. These methods can provide accurate local sound speeds; however, their clinical application is limited since they require two transducers. A direct estimation method that estimates the sound speed based on the best fit of one-way geometric delay patterns from the pre-beamformed radio-frequency (RF) channel data has also been proposed [17,18]. However, the performance of this method is sensitive to the transducer array geometry (i.e., the transducer's position) [18].

More recently, region-of-interest (ROI)-based optimal mean sound speed estimation methods have been proposed as a viable solution [14,19,20]. In these methods, the optimal mean sound speed that can produce the best focusing performance in the ROI is estimated rather than the actual sound speed in a specific tissue type. In one approach, the optimal mean sound speed is estimated using a lateral spatial fast Fourier transform (FFT) magnitude of beamformed data as the focusing quality factor [19]. Phase variations across the pre-beamformed RF channel data and coherence factors have also been proposed as focusing quality factors to estimate the mean sound speed [14,20]. These methods have shown that image quality in the ROI can be improved by using the optimal mean sound speed in beamforming.

In this paper, we present a hardware-efficient optimal mean sound speed estimation method in which the focusing quality factor is measured by computing the minimum average sum of the absolute difference (MASAD) of pre-beamformed RF channel data, thereby enhancing the spatial resolution in medical ultrasound imaging. Rather than performing full phase aberration correction, the proposed method estimates the mean sound speed that can provide improved image quality in the ROI for real-time imaging. The proposed method was evaluated with phantom and in vivo experiments, and was implemented on a Virtex-4 FPGA (Field Programmable Gate Array) chip (Xilinx Instrument, San Jose, CA) with Xilinx ISE 12.4 using VHDL.

#### **2. Materials and Methods**

#### *2.1. Minimum Average Sum of Absolute Difference (MASAD)*

In ultrasound imaging systems, the dynamic receive focusing delays are computed based on the geometry of an ultrasound transducer and receive focusing points to adjust phase differences at each channel. The dynamic receive focusing delay of the *nth* element at (*xn*, *zn*), for a focal point at (*x*, *z*), is calculated by

$$\tau_n = \frac{\sqrt{(x - x_n)^2 + (z - z_n)^2} + R}{c} \tag{1}$$

where *R* is the distance between the transducer center and the focal point, and *c* is the assumed sound speed in soft tissues (e.g., 1540 m/s) [21]. After applying the receive focusing delay in Equation (1), all RF channel data are coherently aligned when the assumed fixed sound speed is equal to the actual sound speed in the propagation medium. However, when the sound speed of the medium differs from the assumed one, phase distortion cannot be avoided even when receive dynamic focusing is employed. Since the phase is directly related to the amplitude change of the RF data, these phase distortions cause amplitude variations across the channels [7].
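As a minimal sketch of Equation (1), the delay for one element can be computed as below. The function name `receive_focusing_delay` is hypothetical, and the sketch assumes the transducer center is at the origin (so that *R* is the distance from the origin to the focal point), with coordinates in meters; the source does not state these conventions.

```python
import math


def receive_focusing_delay(x, z, x_n, z_n, c=1540.0):
    """Delay in seconds for the element at (x_n, z_n) and focal point (x, z).

    Assumption (not from the source): the transducer center is the origin,
    so R = sqrt(x^2 + z^2). Coordinates are in meters, c in m/s.
    """
    r = math.hypot(x, z)                 # R: transducer center to focal point
    path = math.hypot(x - x_n, z - z_n)  # focal point to the n-th element
    return (path + r) / c


# Focal point 80 mm deep on the array axis, element 10 mm off-center.
tau_1540 = receive_focusing_delay(0.0, 0.08, 0.01, 0.0, c=1540.0)
tau_1450 = receive_focusing_delay(0.0, 0.08, 0.01, 0.0, c=1450.0)
print(f"delay at 1540 m/s: {tau_1540 * 1e6:.2f} us")
print(f"delay at 1450 m/s: {tau_1450 * 1e6:.2f} us")
```

A lower assumed sound speed gives a longer delay for the same geometry, which is why a mismatch between assumed and actual speed misaligns the channel data.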

In the proposed method, the optimal mean sound speed is determined by computing the minimum average amplitude variance between all RF channel data after applying the receive dynamic focusing in the region of interest (ROI). Furthermore, we utilized the sum of the absolute difference in place of computing the variance to reduce the computational complexity. Thus, the cost function of sound speed estimation is defined by

$$c_{opt} = \underset{c = c + c_{incr}}{\arg\min} \left[ \frac{1}{LK} \sum_{l=0}^{L-1} \sum_{k=0}^{K-1} \sum_{n=0}^{N-1} \left| X_n(l,k) - X_{\text{mean}}(l,k) \right| \, w(n) \right] \tag{2}$$

where *N*, *L* and *K* are the number of channels, scanlines and focal points per scanline in the ROI, respectively. *Xn*(*l*, *k*) is the delayed RF data of the *nth* element for the *kth* focal point at the *lth* scanline in the ROI, *Xmean*(*l*, *k*) is the mean value of the delayed RF data over the channels and *w*(*n*) is the window function. The ROI can be set around or beyond the transmit focusing point to minimize the effect of phase distortion in the near field. In the proposed estimation method, the pre-beamformed RF data in the ROI are first captured after freezing the image upon a user's request. With an initial sound speed, the receive focusing delays are calculated using Equation (1). The focusing delays are applied to the captured RF data, and the average sum of the absolute difference (ASAD) in Equation (2) is computed. As indicated in Equation (2), the ASAD values are iteratively computed for the RF data in the ROI while changing the sound speed, and the sound speed providing the minimum ASAD (MASAD) value is taken as the optimal mean sound speed. Then, the estimated optimal mean sound speed is applied to subsequent real-time ultrasound beamforming to achieve enhanced image quality.
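The iteration described above can be sketched in plain Python as follows. This is an illustrative software model under stated assumptions, not the authors' FPGA implementation: `asad`, `estimate_sound_speed` and `toy_delay_and_sample` are hypothetical names, and the toy data generator merely makes the channel mismatch grow with the sound speed error so that the search has a known answer.

```python
def asad(delayed, window):
    """Average sum of absolute differences for one candidate sound speed.

    delayed[l][k][n] is the delayed RF sample of channel n at focal point k
    on scanline l (X_n(l, k) in Equation (2)); window[n] is w(n).
    """
    total = 0.0
    count = 0
    for scanline in delayed:
        for samples in scanline:
            mean = sum(samples) / len(samples)  # X_mean(l, k)
            total += sum(abs(x - mean) * w for x, w in zip(samples, window))
            count += 1
    return total / count  # normalization by L * K


def estimate_sound_speed(delay_and_sample, window,
                         c_init=1400, c_max=1600, c_incr=10):
    """Sweep candidate speeds and return (c_opt, ASAD value per speed).

    delay_and_sample(c) is a hypothetical callback that applies the
    Equation (1) delays for speed c and returns delayed[l][k][n].
    """
    scores = {}
    c = c_init
    while c <= c_max:
        scores[c] = asad(delay_and_sample(c), window)
        c += c_incr
    c_opt = min(scores, key=scores.get)  # speed giving the minimum ASAD
    return c_opt, scores


def toy_delay_and_sample(c, c_true=1460, n_channels=8):
    # Toy stand-in for real RF data: mismatch grows with |c - c_true|.
    error = abs(c - c_true) / 100.0
    samples = [100.0 + error * (n - (n_channels - 1) / 2)
               for n in range(n_channels)]
    return [[samples, samples]]  # L = 1 scanline, K = 2 focal points


c_opt, scores = estimate_sound_speed(toy_delay_and_sample, window=[1.0] * 8)
print(c_opt)  # 1460 for this toy data
```

Replacing the variance with an absolute difference removes the per-sample squaring, which is the source of the hardware saving the method relies on.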

Figure 1a shows a block diagram of a conventional dynamic receive beamformer (DRB) based on fractional delay beamforming architecture [22] with the proposed MASAD block, which is shaded gray. Pre-beamformed RF channel data from the analog-to-digital converter (ADC) were 12 bits wide, which is typical of medical ultrasound imaging systems. The signal-to-quantization-noise ratio (SQNR) of typical ultrasound RF data is 74 dB, as given by SQNR = 6.02*b* + 1.76 (dB), where *b* is the number of ADC bits (i.e., 12 bits) [23]. The block diagram of the proposed MASAD is shown in Figure 1b. As shown in Figure 1b, the MASAD block can be implemented with *N* + 2 registers, *N* + 2 adders, *N* absolute-value units and *N* multipliers, where *N* is the number of channels in the receive beamformer. To implement the proposed MASAD, the input registers (pre-beamformed RF data), window coefficients and output register were 12 bits (12 integral bits), 8 bits (1 sign bit and 7 fractional bits) and 28 bits (21 integral bits and 7 fractional bits), respectively. Since we calculated the ASAD values without any truncation, the error between the floating-point calculation on a PC and the fixed-point calculation on the FPGA was less than 0.03%. The additional hardware for the MASAD block is therefore not a significant burden on ultrasound imaging systems.
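The word lengths above can be checked with a short calculation. The sketch below evaluates the SQNR formula for a 12-bit ADC and quantizes a window coefficient to 8 bits with 7 fractional bits. The helper names are hypothetical, and the two's-complement rounding and saturation behavior is an assumption, since the source gives only the bit widths.

```python
def sqnr_db(bits):
    """Ideal SQNR in dB from the formula in the text: 6.02 * b + 1.76."""
    return 6.02 * bits + 1.76


def quantize_window(w, frac_bits=7):
    """Round w to a signed fixed-point value with 1 sign bit and frac_bits
    fractional bits (assumed two's complement with saturation)."""
    scale = 1 << frac_bits                        # 128 steps per unit
    code = max(-scale, min(scale - 1, round(w * scale)))
    return code / scale


print(f"{sqnr_db(12):.2f} dB")   # 74.00 dB
print(quantize_window(0.54))     # 0.5390625
print(quantize_window(1.0))      # 0.9921875 (saturates at 127/128)
```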

**Figure 1.** (**a**) Block diagram of the conventional dynamic receive beamformer (DRB) with the proposed MASAD and (**b**) the detailed block diagram of MASAD.

#### *2.2. Experiment Setup and Evaluation Metrics*

To evaluate the proposed MASAD method, an ultrasound research system (Vantage 128, Verasonics Inc., Redmond, WA, USA) was used to obtain pre-beamformed RF data using a 128-element convex array transducer (C5-2v, Ultrasonix Inc., British Columbia, Canada). The center frequency and sampling rate were 3.7 and 14.8 MHz, respectively. For the homogeneous medium experiment, a tissue-mimicking phantom (ATS 539, ATS Laboratories Inc., Bridgeport, CT, USA) with a sound speed of 1450 m/s (±1% error) was used. In the inhomogeneous medium experiment, a tissue-mimicking phantom (040 GSE, CIRS Inc., Norfolk, VA, USA), for which the sound speed was 1540 m/s (±1% error), was immersed in deionized water and pre-beamformed RF data were acquired. Note that the sound speed of deionized water at room temperature is 1480 m/s [24,25]. The in vivo data were acquired from the liver of a healthy volunteer. To compute the MASAD, the sound speed was swept from 1400 m/s (*cinit*) to 1600 m/s (*cmax*) in increments of 10 m/s (*cincr*) per iteration.

For quantitative evaluation, the lateral resolution was measured with the full width at half maximum (FWHM) for each of the two cases where the conventional sound speed (i.e., 1540 m/s) and the estimated one are employed. The contrast-to-noise ratio (CNR) values were computed for the cyst region by [26]

$$\text{CNR} = \frac{\left| \mu_{c} - \mu_{s} \right|}{\sqrt{\sigma_{c}^{2} + \sigma_{s}^{2}}} \tag{3}$$

where *μ<sub>s</sub>* and *μ<sub>c</sub>* are the average intensities in the speckle and cyst regions, respectively, and *σ<sub>s</sub>*<sup>2</sup> and *σ<sub>c</sub>*<sup>2</sup> are the variances in each region.
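Equation (3) can be evaluated as in the following minimal sketch. The function name `cnr` and the toy pixel values are illustrative, and the use of the population variance is an assumption because the source does not say which variance estimator was used.

```python
import math
from statistics import fmean, pvariance


def cnr(cyst, speckle):
    """Contrast-to-noise ratio per Equation (3) for two lists of intensities.

    Assumption: population variance (pvariance) is used for both regions.
    """
    mu_c, mu_s = fmean(cyst), fmean(speckle)
    return abs(mu_c - mu_s) / math.sqrt(pvariance(cyst) + pvariance(speckle))


# Toy intensities only; real use would pass the pixels of each region.
print(cnr([1.0, 3.0], [5.0, 7.0]))
```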

The proposed MASAD sound speed estimation method was implemented on an FPGA (Virtex 4, Xilinx, San Jose, CA, USA) chip. In the experiments, the captured pre-beamformed RF data within an ROI were loaded into the custom-built FPGA platform [27], and the computed ASAD values were then transferred to a PC to estimate the optimal sound speed. The hardware complexity was estimated by using Xilinx's ISE 12.4.

#### **3. Results and Discussion**

The results from the homogeneous phantom experiment are shown in Figure 2. Figure 2a shows the B-mode image with a conventional sound speed (i.e., 1540 m/s); the ROIs used to compute ASAD are indicated with white boxes. The ROIs were selected around the transmit focusing point at 80 mm; one ROI contained a strong reflector while the other was a speckle region. Figure 2c shows the normalized ASAD values for each ROI. As can be seen, the estimated mean sound speed was 1460 m/s, which was within the fabrication error of the phantom (i.e., 1450 m/s (±1% error)). The image with the estimated sound speed is shown in Figure 2b. On visual assessment, the improved spatial resolution can be readily identified when the estimated sound speed is used in dynamic receive beamforming. For quantitative comparison, the FWHM of the point target in the ROI was measured as 1.55 mm for the conventional sound speed and 1.05 mm for the estimated one, indicating that the FWHM was improved by 32.3% using the proposed method. The CNR of the cyst indicated with an arrow in Figure 2a was measured for each image and was 2.83 in Figure 2a and 4.20 in Figure 2b; the CNR was improved by 48.4%. This result demonstrates that a higher CNR could be achieved using the estimated sound speed in beamforming.

**Figure 2.** Experiment results of the homogeneous medium. B-mode image with a sound speed of (**a**) 1540 m/s, (**b**) 1460 m/s and (**c**) normalized ASAD values for the region of interest (ROI) of the point target (solid line) and speckle regions (dotted line).

The results of the inhomogeneous medium experiment are shown in Figure 3. In this experiment, the sound speeds were estimated for the ROIs with and without a strong reflector, which are indicated in Figure 3a. The ASAD values for each ROI as a function of sound speed are plotted in Figure 3c,d. As shown in Figure 3c,d, the same sound speed (i.e., 1490 m/s) was estimated for ROI-1, -2 and -3 using the proposed MASAD, and 1500 m/s was estimated for ROI-4. The actual sound speeds for each ROI were calculated by *c* = (*dw* + *dp*) *cwcp* / (*cpdw* + *cwdp*), where *dw* is the propagation distance in water (i.e., 50 mm) and *dp* is the propagation distance in the phantom (i.e., 10, 20, 30 and 40 mm for ROI-1 to -4); *cw* is the sound speed in water (i.e., 1480 m/s) and *cp* is the sound speed in the phantom (i.e., 1540 m/s). The resulting actual sound speeds for the four ROIs were 1490, 1497, 1502 and 1506 m/s, respectively. The discrepancies between the estimation results and actual sound speeds were within the range of fabrication error (i.e., ±1%). The B-mode image with the estimated sound speed (i.e., 1490 m/s) is shown in Figure 3b. As can be seen, the image with the estimated sound speed yielded better image quality than the image obtained with the conventional sound speed (i.e., 1540 m/s). The FWHMs for ROI-1, -2, -3 and -4 with the estimated sound speed were 0.76, 0.69, 0.79 and 1.11 mm, respectively, while those with the conventional sound speed were 1.85, 1.34, 2.1 and 2.21 mm, respectively. The CNR values from the cyst indicated with a white arrow in Figure 3a were 1.89 and 2.53 for sound speeds of 1540 and 1490 m/s, respectively. The FWHM was improved by 54.90% on average and the CNR was improved by 33.9%.
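The layered-medium expression used above for the actual sound speed is the total path length divided by the total travel time through water and then phantom. The short sketch below reproduces the four reference values; the function name `mean_sound_speed` is illustrative, and distances only need to share a unit because they cancel in the ratio.

```python
def mean_sound_speed(d_w, d_p, c_w=1480.0, c_p=1540.0):
    """Mean speed over a water layer (d_w) followed by a phantom layer (d_p).

    Equivalent to (d_w + d_p) / (d_w / c_w + d_p / c_p).
    """
    return (d_w + d_p) * c_w * c_p / (c_p * d_w + c_w * d_p)


for d_p in (10, 20, 30, 40):  # phantom path in mm for ROI-1 to ROI-4
    print(d_p, round(mean_sound_speed(50, d_p)))
# 10 1490
# 20 1497
# 30 1502
# 40 1506
```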

**Figure 3.** Experiment results of the inhomogeneous medium. B-mode image with a sound speed of (**a**) 1540 m/s and (**b**) 1490 m/s, and normalized ASAD values for ROI-1, -2, -3 and -4 for the (**c**) point target and (**d**) speckle regions.

Figure 4a,b shows the liver images constructed by using the conventional and estimated sound speeds, respectively. The ASAD as a function of sound speed is plotted in Figure 4c. As can be seen, the sound speed of 1580 m/s yielded the MASAD, which is within the typical range of sound speed in human livers (i.e., 1550–1600 m/s) [28]. To clearly show the improvement of image quality, the profiles of the blood vessel wall in the white box are plotted in Figure 4d. The image with the estimated sound speed of 1580 m/s produced sharper edges than that with the conventional sound speed (i.e., 1540 m/s). The wall thickness with the conventional sound speed was 2.18 mm while it was 1.60 mm for the estimated one, indicating that the lateral resolution was improved by 26.6%. The CNR values inside the blood vessel were 1.27 and 1.55 for the conventional and estimated sound speeds, respectively, an improvement of 22.0%.

**Figure 4.** In vivo experiment results for the liver. B-mode image with sound speeds of (**a**) 1540 m/s and (**b**) 1580 m/s, (**c**) normalized ASAD values for the ROI and (**d**) cross-section of the vessel wall indicated by a dotted line in Figure 4a.

The time for estimating an optimal sound speed depends on the size of the ROI and is expressed as *test* = *N* × *L* × *K* × *Niter* / *fclk*, where *Niter* is the number of iterations and *fclk* is the system clock frequency used in the ultrasound imaging system. In the above experiments, we used *L* = 10, *K* = 200, *N* = 128, *Niter* = 20 and *fclk* = 40 MHz, and the processing time was 0.128 s. Note that the estimation of the optimal sound speed was conducted on frozen images upon user request and subsequent beamforming was performed using the estimated sound speed; thus, the processing time would not affect real-time operation.
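As a quick check of the processing time quoted above, the expression can be evaluated directly. The function name `estimation_time_s` is illustrative, and the model assumes, as the formula implies, one channel sample processed per clock cycle.

```python
def estimation_time_s(n_channels, n_scanlines, n_focal_points, n_iter, f_clk_hz):
    """Estimation time t_est = N * L * K * N_iter / f_clk, in seconds."""
    return n_channels * n_scanlines * n_focal_points * n_iter / f_clk_hz


t_est = estimation_time_s(128, 10, 200, 20, 40e6)
print(f"{t_est:.3f} s")  # 0.128 s
```

Because the time scales linearly with *L* and *K*, halving the ROI size in either dimension halves the estimation time.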

The hardware complexity of the 32-channel dynamic receive beamformer with the MASAD block was estimated by using Xilinx ISE 12.4. The hardware utilization of the conventional DRB and the proposed DRB with the MASAD block is summarized in Table 1. As listed in Table 1, the hardware utilization of the proposed DRB with the MASAD block increased only slightly over the conventional DRB: by 0.8% and 0.7% in slice LUTs and slice flip-flops, respectively. These results indicate that the proposed method is capable of substantially improving the spatial resolution in medical ultrasound imaging and can be implemented with a nearly negligible increase in hardware complexity.


**Table 1.** Hardware resource utilization of the conventional and proposed beamformers.

#### **4. Conclusions**

In this paper, a hardware-efficient mean sound speed estimation method based on the minimum average sum of the absolute difference was presented for enhancing the spatial resolution of medical ultrasound imaging. In the phantom and in vivo experiments, FWHM and CNR improved by an average of 46.41% and 34.77%, respectively. The proposed method was shown to improve spatial resolution with a negligible increase in hardware complexity. We believe that the proposed method would be a viable solution for estimating optimal sound speeds and could provide improved image quality. Further experiments in various clinical environments should follow to validate the performance of the proposed method.

**Author Contributions:** Conceptualization, supervision, and funding acquisition, C.Y. and T.-K.S.; methodology, J.L. and C.Y.; validation, J.L., Y.Y. and C.Y.; writing – original draft preparation, J.L. and C.Y.; writing – review and editing, C.Y., Y.Y. and T.-K.S.

**Funding:** This work was supported in part by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP; Ministry of Science, ICT & Future Planning) (No. 2019R1A2C1089813) and the R&D program of MOTIE/KEIT (10076675, Development of MR Based High Intensity Focused Ultrasound Systems for Brain and Urinogenital Diseases).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
