#### *5.1. Robustness against Common Image Attacks*

To verify that the algorithm is also robust against common image attacks in the absence of a screen-cam attack, we performed the corresponding experiments. The results are shown in Table 2, which also lists the PSNR and mean structural similarity index (MSSIM) [54] values.
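For reference, the PSNR figure reported for image quality can be computed as in the following minimal sketch. It assumes 8-bit grayscale images stored as lists of rows and a peak value of 255; the function name `psnr` and the toy pixel values are illustrative and are not taken from the experiments. MSSIM [54] is not reproduced here.

```python
import math


def psnr(original, distorted, peak=255.0):
    """Return the PSNR in dB between two equally sized grayscale images.

    Images are lists of rows of pixel values; `peak` is the maximum
    possible pixel value (255 for 8-bit images, an assumption here).
    """
    if len(original) != len(distorted) or any(
        len(a) != len(b) for a, b in zip(original, distorted)
    ):
        raise ValueError("images must have the same shape")
    n_pixels = sum(len(row) for row in original)
    squared_error = sum(
        (a - b) ** 2
        for row_a, row_b in zip(original, distorted)
        for a, b in zip(row_a, row_b)
    )
    mse = squared_error / n_pixels
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example (made-up values): two of four pixels differ by one level.
host = [[100, 100], [100, 100]]
marked = [[101, 100], [100, 99]]
print(round(psnr(host, marked), 2))
```

A higher PSNR indicates a smaller embedding distortion, which is why it is used here as an image quality measure.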

Robustness primarily depends on whether the feature points and the watermark information can both be detected. As shown in Table 2, the algorithm is robust to JPEG attacks and mostly survives JPEG compression at 20%. Because scaling attacks shrink the frame, we restore the scaled images before detection. The algorithm works under a scaling attack with a factor of 0.5 and mostly works with a factor of 0.4. The cropping-off attack in this section refers to a continuous crop from the right; in theory, detection can succeed provided that more than one relatively complete embedded LSFR remains. Because the watermark is repeatedly embedded in each LSFR, we can still detect the watermark information after a 50% cropping-off attack in the experiments. A rotation attack may cause the loss of some feature points, but since only one successful detection is needed, the algorithm remains effective. The algorithm also withstands a 3 × 3 median filter attack. Thus, our watermarking scheme is highly robust to common image attacks.
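Two of the attacks above are simple enough to sketch directly. The following is a minimal illustration, not the code used in the experiments: it assumes grayscale images stored as lists of rows, and the function names and the replicated-edge border handling of the median filter are our own assumptions.

```python
from statistics import median


def crop_off_right(image, fraction):
    """Remove the rightmost `fraction` of columns (continuous crop from the right)."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    width = len(image[0])
    keep = width - int(width * fraction)
    return [row[:keep] for row in image]


def median_filter_3x3(image):
    """Apply a 3 x 3 median filter; borders use replicated edge pixels (assumption)."""
    height, width = len(image), len(image[0])
    filtered = []
    for y in range(height):
        out_row = []
        for x in range(width):
            # Collect the 3 x 3 neighborhood, clamping indices at the borders.
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out_row.append(median(window))
        filtered.append(out_row)
    return filtered


# Toy example (made-up values): an isolated bright pixel is removed by the filter.
spike = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
print(median_filter_3x3(spike))
print(crop_off_right([[1, 2, 3, 4]], 0.5))
```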


**Table 2.** Performance of image quality and robustness against common attacks.

Note: The underlined coefficient represents failed detection.

#### *5.2. Robustness against Screen-Cam*

In this section, we verify the robustness against a screen-cam attack. First, we compare the proposed method with two existing algorithms [21,33]. Because the host images used in those articles differ in size from ours, we use the same host images for all methods here. To make the experimental results less dependent on the host images, we use twelve additional images from the database [52] to verify the performance. The PSNR values of the images generated by the proposed method are controlled to be no lower than those of the other methods, which are around 42 dB. An example of Lena embedded with the different methods is shown in Table 3. All watermarked images are displayed on the screen at their original resolution. The comparison of BER under different shooting conditions is shown in Figure 15. The results show that method [21], which was designed for print-cam, is not applicable to the screen-cam process, whereas the proposed method and method [33] both have good robustness against the screen-cam attack.
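The BER used in this comparison is the fraction of message bits that differ between the embedded and the extracted sequences. A minimal sketch is given below; the function name `bit_error_rate` and the example bit sequences are illustrative only.

```python
def bit_error_rate(embedded, extracted):
    """Return the fraction of positions where the two bit sequences differ."""
    if len(embedded) != len(extracted) or not embedded:
        raise ValueError("sequences must be non-empty and of equal length")
    errors = sum(1 for a, b in zip(embedded, extracted) if a != b)
    return errors / len(embedded)


# Toy example (made-up bits): 2 of 8 bits are flipped.
sent = [1, 0, 1, 1, 0, 0, 1, 0]
received = [1, 0, 0, 1, 0, 1, 1, 0]
print(bit_error_rate(sent, received))  # 0.25
```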

**Table 3.** Images generated by different methods.

In theory, ignoring external interference, the distortions caused by shooting from the horizontal left and from the horizontal right are similar. Shooting at different vertical angles is likewise similar to shooting at different horizontal angles with the image rotated by 90 degrees. Therefore, as shown in Figure 16, the shooting angle ranges from perpendicular to the screen up to 60 degrees to the horizontal left at intervals of 15 degrees. The shooting distance ranges from 40 to 110 cm at intervals of 10 cm. When the shooting angle is 45 or 60 degrees, a shooting distance of 40 cm is too small to capture the entire image, so the distance is selected to be over 50 cm.
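The resulting grid of shooting conditions can be enumerated as in the sketch below. It assumes that the 45 and 60 degree settings start at 50 cm inclusive (our reading of the text, consistent with the 50 cm starting distance mentioned in Section 5.3); the function name is hypothetical.

```python
def shooting_conditions():
    """Enumerate (angle in degrees, distance in cm) pairs for the experiment grid."""
    angles = range(0, 61, 15)        # 0 (perpendicular), 15, 30, 45, 60
    distances = range(40, 111, 10)   # 40, 50, ..., 110 cm
    conditions = []
    for angle in angles:
        for distance in distances:
            # Assumption: at 45 and 60 degrees the 40 cm setting is skipped
            # because the whole image does not fit in the frame.
            if angle >= 45 and distance < 50:
                continue
            conditions.append((angle, distance))
    return conditions


grid = shooting_conditions()
print(len(grid))  # 38 under the assumption above
```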

**Figure 15.** Comparison of different methods for different shooting conditions.

Examples of Lena images recovered from images captured at different angles and distances, together with the BER detected using the secret key *K*<sup>1</sup>, are shown in Table 4. The detection results for the eight images are shown in Figure 16, where the red mark indicates the camera position relative to the screen and the dotted straight line indicates the shooting direction.

As shown in Figure 16, when the horizontal shooting angle is lower than 30 degrees, the watermarks are mostly detected successfully. When the horizontal shooting angle is 45 degrees, the watermark can be detected within a shooting distance of 90 or 100 cm. At a large shooting angle of 60 degrees, the image cannot be well focused, so the watermark information can generally be detected only within a shorter shooting distance of approximately 70 or 80 cm.

**Figure 16.** Watermark detection results against screen-cam attack. (**a**) Lena. (**b**) Baboon. (**c**) Airplane (**d**) Peppers. (**e**) Building. (**f**) Pentagon. (**g**) White House. (**h**) Naval base.

We also tested the performance at other tilted shooting angles with handheld shooting, as shown in Table 5; the scheme also performs well in this setting. Therefore, the proposed algorithm is robust to a screen-cam attack.


**Table 5.** Examples of Handheld Shooting.

#### *5.3. Robustness against Screen-Cam with Additional Common Attacks*

The scheme in [33] needs to record the four vertices, which means it needs to know the original size. Furthermore, the scheme in [21] cannot deal with the cropping attack. However, in a real-life scenario, images may undergo common image processing attacks caused by normal user operations. Therefore, we experimented with several hypothetical scenarios to verify the effectiveness of the proposed algorithm for screen-cam with additional common attacks. We designed four realistic application scenarios where methods [21,33] are not applicable: (a) the Lena image is blocked by a window by 20 percent, which is equivalent to being cropped; (b) the Peppers image is rotated by five degrees and cropped; (c) the Building image is scaled by 80%; (d) the Pentagon image is scaled by 80% and rotated 90 degrees counterclockwise. An example of the four scenarios is shown in Table 6. During watermark detection, we assume that the specifics of the attacks are unknown, which means we do not manually correct the image to its original scale or orientation. The coordinate points used for perspective correction are marked as red dots in Table 6.
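Perspective correction from four marked points amounts to estimating a planar homography. The sketch below shows one standard way to do this with a direct linear solve; it is an illustration under our own assumptions, not the implementation used in the experiments. All function names and the corner coordinates are hypothetical, and a practical system would additionally resample the image with the estimated mapping.

```python
def solve_linear(a, b):
    """Solve a square linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_points(src, dst):
    """Estimate the 8 homography parameters mapping four src points to four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and similarly for v.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs)


def apply_homography(h, point):
    """Map one (x, y) point through the homography parameters h."""
    x, y = point
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical corner positions in a captured photo and the target rectangle.
captured = [(12, 8), (205, 20), (190, 160), (5, 150)]
target = [(0, 0), (200, 0), (200, 150), (0, 150)]
h = homography_from_points(captured, target)
print([tuple(round(c, 3) for c in apply_homography(h, p)) for p in captured])
```

Because only four point pairs are used, the mapping is determined exactly; no knowledge of the original image size or orientation enters the estimate.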


**Table 6.** Examples of Four Hypothetical Scenarios.

Figure 17 shows the detection results of the four scenarios. Figure 17 is laid out in the same way as Figure 16. Because the experimental images differ in size, the shooting distance was adjusted accordingly. Because Scenario (a) and Scenario (b) use the four corner points of the screen for perspective correction, the shooting distance in these experiments starts from 50 cm. In these two scenarios, the watermark detection performance is the same as the detection results for the same host images in Section 5.2. In Scenario (c) and Scenario (d), because the images are scaled, the starting shooting distance can be shortened, and the effective detection distance is also shortened. When the shooting angle is 15 or 30 degrees, the watermark information can be detected at all shooting distances in the experiments. As the shooting angle increases, the detectable shooting distance is substantially reduced. Watermark information can be detected within a shooting distance of 50 cm when the horizontal shooting angle is 60 degrees. Thus, the scaling of the images has a considerable influence on watermark detection in images captured at large angles, but the scheme can still meet practical needs. These results verify that the proposed scheme can handle screen-cam with common attacks.

**Figure 17.** Watermark detection results against screen-cam with common attacks. (**a**–**d**) represent scenarios (**a**–**d**), respectively.

#### *5.4. Applicability and Limitations Analysis*

The proposed scheme works well for most types of images, but it inevitably has limitations. Feature point-based algorithms are limited by the feature point operator itself. For images with simple texture, the feature points are often unstable under severe image quality degradation. Therefore, for such images, the proposed method may not achieve accurate watermark synchronization, which will probably cause watermark detection to fail.

Another limitation is that the proposed scheme is not applicable when the image displayed on the screen is greatly zoomed out before it is captured with a camera. In this case, the displayed image is resampled, which causes a massive loss of image details, and the screen-cam process amplifies this distortion. This matters especially for high-resolution images, which users are most likely to zoom out to view in full. Therefore, the proposed scheme could be used together with access control systems or other specific applications to avoid this situation.

Furthermore, because the motivation of this method is to establish accountability for leakage behavior, the time complexity of the algorithm is not a primary consideration. Nevertheless, time complexity is also one of its limitations. The computation time of watermark embedding consists of two parts: LSFR construction and message embedding. On a personal computer with an Intel Core i7-9700 CPU and 32 GB of RAM, the average computation times of LSFR construction and message embedding for the host images are 7.041 s and 0.106 s, respectively. The Harris–Laplace operation involves multiscale and iterative calculations, which account for most of the computation time. The time complexity of the embedding algorithm is *O*(*Length* · *Width*), where *Length* and *Width* denote the length and width of the image, respectively. Hence, for high-resolution images, the computation time scales with image size. With respect to watermark detection, the process of finding candidate LSFRs is similar to the process of constructing LSFRs. Although the message extraction process iterates the message extraction algorithm within our defined detection range, its computation time is still insignificant compared with that of finding candidate LSFRs. Hence, after the manual perspective correction process, the time complexity of watermark detection is similar to that of watermark embedding. Therefore, considering the user experience, the algorithm is not currently recommended for real-time applications.
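As a rough worked illustration of the stated complexity, suppose the embedding time is dominated by a term proportional to the number of pixels, with a machine-dependent constant $c$ and a uniform scale factor $s$ (both symbols are introduced here for illustration only). Then

$$
T(\mathit{Length}, \mathit{Width}) \approx c \cdot \mathit{Length} \cdot \mathit{Width},
\qquad
\frac{T(s \cdot \mathit{Length},\; s \cdot \mathit{Width})}{T(\mathit{Length}, \mathit{Width})} \approx s^{2},
$$

so under this assumption, doubling both image dimensions would be expected to roughly quadruple the embedding time.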

### **6. Conclusions**

In this paper, a novel feature-based and Fourier-based screen-cam robust watermarking scheme is proposed. The distortions during the screen-cam process are analyzed. To resist possible desynchronization attacks caused by user operations and the screen-cam process, an LSFR construction method, based on the modified Harris–Laplace detector and SURF orientation descriptor, is designed to achieve watermark synchronization. In the proposed message embedding scheme, we repeatedly embed the message sequence in the DFT domain of each selected LSFR to achieve robustness against the screen-cam process. To decrease the quality degradation after embedding and improve the extraction accuracy, we employ a non-rotating embedding method and a preprocessing method to modulate the DFT magnitude coefficients. On the extraction side, we restore the captured image based on the size of the image itself to help improve the detection accuracy. The experiments show that the proposed scheme is highly robust to common image attacks and screen-cam attacks. Compared with existing methods, the proposed scheme further achieves robustness against screen-cam with additional common attacks.

In future research, we aim to investigate automatic detection methods, which have better prospects for practical application. To achieve this goal, screen-cam robust invariants should be further investigated to help design novel local feature-based watermark synchronization methods or to develop novel methods for embedding synchronization watermark messages and detecting them automatically.

**Author Contributions:** Conceptualization, W.C. and N.R.; methodology, W.C., N.R. and C.Z.; software, W.C. and Q.Z.; data curation, W.C., T.S. and A.K.; writing—original draft preparation, W.C. and N.R.; writing—review and editing, Q.Z., T.S. and A.K.; funding acquisition, C.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Natural Science Foundation of China, grant numbers 42071362 and 41971338, and the Natural Science Foundation of Jiangsu Province, grant number BK20191373.

**Conflicts of Interest:** The authors declare no conflict of interest.
