Figure 1.
The graphical model. Blue arrows indicate the direction of feature propagation during network training. The orange arrow depicts the basic generation task, and the purple arrow depicts the guided generation process. Input: moving image, fixed image, and diffused fixed image. Output: deformed image, generated image, and guided generated image.
Figure 2.
Architecture of the registration network. The number of output channels is denoted by a for 2D tasks and by b for 3D tasks. For 3D image registration, only two CRBlocks are used before the output. Within the residual blocks, only the first CRBlock changes the number of feature channels; the second CRBlock keeps it unchanged. LeakyReLU with a parameter of 0.2 is used in all CRBlocks in the experiments. All CRBlocks preserve the spatial size of the features and change only the number of channels, because their convolution layers use a kernel size of 3, a stride of 1, and a padding of 1. A time embedding projects the time steps and injects temporal information. The low-pass filtering operation uses a scaling factor of 1/2 in the encoding phase and 2× in the decoding phase, and all interpolation operations use linear interpolation. If the deepest feature map would become smaller than one pixel, its size is set to one pixel. In the decoding path, following the idea of super-resolution, encoded and decoded features of the same scale are concatenated and fed into the CRBlocks to sharpen the image and retain more detailed features.
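The size bookkeeping described in this caption can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and rounding down when halving is an assumption, since the caption states only that a feature map smaller than one pixel is treated as one pixel.

```python
from typing import List


def conv_output_size(size: int, kernel: int = 3, stride: int = 1, padding: int = 1) -> int:
    """Standard convolution output-size formula along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def encoder_sizes(size: int, levels: int, scale: float = 0.5) -> List[int]:
    """Feature size at each encoder level.

    Each level rescales by `scale` (1/2 in the caption). Rounding down is an
    assumption; the clamp to one pixel follows the caption.
    """
    sizes = [size]
    for _ in range(levels):
        size = max(1, int(size * scale))
        sizes.append(size)
    return sizes


# A kernel of 3 with stride 1 and padding 1 leaves the spatial size unchanged.
print(conv_output_size(64))
# A small axis (e.g. a thin 3D volume) bottoms out at one pixel.
print(encoder_sizes(6, 4))
```

With these settings a CRBlock never changes the spatial size, so resolution changes come only from the rescaling steps between levels.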
Figure 3.
Comparison results for 2D facial expression grayscale image registration. Original images (left two columns), deformed images (middle four columns), deformation fields (right four columns), and NMSE/SSIM values for grayscale image registration. Top: Fearful front gaze (moving) to surprised front gaze (fixed). Bottom: Disgusted left gaze (moving) to happy left gaze (fixed).
Figure 4.
Comparison results for 2D facial expression RGB image registration. From top to bottom: surprised front gaze (moving) to fearful front gaze (fixed); sad left gaze (moving) to angry left gaze (fixed). The NMSE/SSIM values below correspond to grayscale image registration.
Figure 5.
Visualization of 2D facial grayscale image generation results. Original image (left), guided generated images with (middle), and guided generated images with and (right), where in represents using a center mask with a width of 40 for smoothing operations. From top to bottom: sad front gaze (moving) to neutral right gaze (fixed), contemptuous front gaze (moving) to angry right gaze (fixed), and happy front gaze (moving) to disgusted left gaze (fixed).
Figure 6.
Comparison of 2D facial expression grayscale image registration and generation results, showing moving–fixed expressions. From top to bottom: contemptuous front gaze–neutral front gaze; angry left gaze–contemptuous left gaze; happy front gaze–contemptuous front gaze; angry front gaze–sad front gaze.
Figure 7.
Continuous registration results for cardiac MRI images. Visualization of patient No. 35 with hypertrophic cardiomyopathy. Left column: original image in . Middle columns: registration results obtained by deforming the moving image to the image. Right columns: corresponding deformation fields.
Figure 8.
Continuous registration results for cardiac MRI images of a normal subject (No. 110), registered from the ED to the ES phase. Left: fixed images. Middle columns: registration results. Right columns: corresponding deformation fields.
Figure 9.
ED–ES registration results for different pathological cases. Cases 70, 120, 10, 40, and 85 are shown, representing NOR, MINF, DCM, HCM, and RV, respectively.
Table 1.
Numerical results for grayscale facial image registration.
Method | NMSE × | SSIM | PSNR | \|Ja(d)\| ≤ 0 |
---|---|---|---|---|
Origin | 0.301 (0.213) | 0.668 (0.100) | 19.692 (3.172) | |
DM | 0.279 (0.100) | 0.643 (0.065) | 19.210 (1.347) | 0.424 (0.047) |
VM | 0.098 (0.037) | 0.828 (0.058) | 23.770 (1.664) | 0.506 (0.013) |
VM-diff | 0.103 (0.038) | 0.823 (0.060) | 23.567 (1.647) | 0.510 (0.021) |
UNIRG | 0.049 (0.030) | 0.859 (0.054) | 27.280 (2.617) | 0.500 (0.026) |
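For readers who want to reproduce columns like NMSE and PSNR, the sketch below uses common textbook definitions. These are assumptions: this section does not state the paper's exact normalization, and the NMSE scale factor in the table header is not recoverable here. Images are taken as flat sequences of intensities in [0, 1], and the function names are hypothetical.

```python
import math
from typing import Sequence


def mse(pred: Sequence[float], ref: Sequence[float]) -> float:
    """Mean squared error between two equally sized images."""
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)


def nmse(pred: Sequence[float], ref: Sequence[float]) -> float:
    """Squared error normalized by the energy of the reference (assumed definition)."""
    num = sum((p - r) ** 2 for p, r in zip(pred, ref))
    den = sum(r ** 2 for r in ref)
    return num / den


def psnr(pred: Sequence[float], ref: Sequence[float], max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(pred, ref)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_val ** 2 / err)


fixed = [0.0, 0.5, 1.0, 0.5]      # toy "fixed" image
deformed = [0.1, 0.5, 0.9, 0.5]   # toy "deformed" image
print(nmse(deformed, fixed), psnr(deformed, fixed))
```

Lower NMSE and higher PSNR both indicate a deformed image closer to the fixed image, which matches the direction of improvement across the rows of the table.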
Table 2.
Evaluation results of cardiac MRI image registration.
Phase | Method | NMSE | PSNR | \|Ja(d)\| ≤ 0 |
---|---|---|---|---|
| Origin | 0.102 (0.168) | 26.307 (6.149) | |
| DM | 0.112 (0.162) | 24.330 (4.775) | 0.498 (0.001) |
| VM | 0.091 (0.178) | 30.068 (6.863) | 0.501 (0.005) |
| VM-diff | 0.095 (0.179) | 28.947 (6.622) | 0.501 (0.006) |
| UNIRG | 0.079 (0.159) | 31.162 (6.690) | 0.506 (0.009) |
| Origin | 0.135 (0.163) | 22.644 (4.292) | |
| DM | 0.144 (0.158) | 21.915 (3.720) | 0.498 (0.001) |
| VM | 0.092 (0.165) | 27.540 (5.251) | 0.498 (0.004) |
| VM-diff | 0.099 (0.166) | 26.499 (4.938) | 0.499 (0.006) |
| UNIRG | 0.079 (0.149) | 28.914 (5.117) | 0.502 (0.009) |
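The last column appears to concern locations where the Jacobian determinant of the deformation is non-positive, which is a common indicator of folding. The sketch below shows one way to compute such a fraction for a 2D displacement field. It is an assumption-laden illustration: the forward-difference scheme, the function name, and the normalization are not taken from the paper, so its output is not expected to match the reported values.

```python
from typing import List

Field = List[List[float]]  # indexed as field[row][col]


def nonpositive_jacobian_fraction(ux: Field, uy: Field) -> float:
    """Fraction of grid cells where det(I + grad u) <= 0.

    ux, uy hold the displacement along x (columns) and y (rows).
    Derivatives use forward differences, so the last row and column are skipped.
    """
    rows, cols = len(ux), len(ux[0])
    folded = 0
    total = 0
    for i in range(rows - 1):
        for j in range(cols - 1):
            dux_dx = ux[i][j + 1] - ux[i][j]
            dux_dy = ux[i + 1][j] - ux[i][j]
            duy_dx = uy[i][j + 1] - uy[i][j]
            duy_dy = uy[i + 1][j] - uy[i][j]
            det = (1 + dux_dx) * (1 + duy_dy) - dux_dy * duy_dx
            total += 1
            if det <= 0:
                folded += 1
    return folded / total


zeros = [[0.0] * 3 for _ in range(3)]
# x -> x - 2x = -x mirrors the grid, so every cell has a negative determinant.
mirror_x = [[-2.0 * j for j in range(3)] for _ in range(3)]
print(nonpositive_jacobian_fraction(zeros, zeros))
print(nonpositive_jacobian_fraction(mirror_x, zeros))
```

The identity deformation has determinant 1 everywhere, while a mirrored grid folds at every cell; real deformation fields fall between these extremes.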
Table 3.
Evaluation results of cardiac MRI mask registration.
Method | Dice | PSNR | NMSE | Time |
---|---|---|---|---|
Origin | 0.708 (0.184) | 10.753 (1.745) | 0.197 (0.091) | |
DM | 0.708 (0.182) | 10.691 (1.671) | 0.202 (0.090) | 0.533 (0.549) |
VM | 0.770 (0.145) | 11.500 (1.981) | 0.279 (0.311) | 0.160 (0.441) |
VM-diff | 0.786 (0.139) | 11.986 (1.855) | 0.235 (0.233) | 0.182 (0.233) |
UNIRG | 0.795 (0.124) | 12.050 (2.126) | 0.227 (0.160) | 0.516 (0.536) |
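The Dice column measures overlap between the deformed moving mask and the fixed mask. A minimal sketch for binary masks follows, using the standard definition 2|A ∩ B| / (|A| + |B|). How the paper averages over cardiac structures and subjects is not stated in this section, so the single-label form and the convention for two empty masks are assumptions.

```python
from typing import Sequence


def dice(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice coefficient of two flat binary masks (entries 0 or 1)."""
    intersection = sum(a * b for a, b in zip(mask_a, mask_b))
    size_sum = sum(mask_a) + sum(mask_b)
    if size_sum == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2 * intersection / size_sum


warped = [1, 1, 0, 0]  # toy deformed moving mask
fixed = [1, 0, 1, 0]   # toy fixed mask
print(dice(warped, fixed))
```

A value of 1 means the masks coincide and 0 means they are disjoint, so higher Dice in the table indicates better anatomical alignment after registration.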
Table 4.
Comparison among UNIRG variants.
ResConv | ResAtten | Resizer | Dice | PSNR | NMSE |
---|---|---|---|---|---|
| | | 0.791 (0.132) | 11.943 (2.191) | 0.255 (0.245) |
✓ | | | 0.783 (0.145) | 11.949 (2.286) | 0.275 (0.335) |
| ✓ | | 0.789 (0.140) | 11.953 (2.187) | 0.284 (0.413) |
| | ✓ | 0.790 (0.129) | 11.975 (2.132) | 0.241 (0.187) |
✓ | ✓ | | 0.789 (0.137) | 11.962 (2.267) | 0.254 (0.235) |
✓ | | ✓ | 0.794 (0.122) | 11.983 (2.064) | 0.230 (0.164) |
| ✓ | ✓ | 0.789 (0.139) | 12.005 (2.254) | 0.272 (0.348) |
✓ | ✓ | ✓ | 0.795 (0.124) | 12.050 (2.126) | 0.227 (0.160) |