Article
Peer-Review Record

Exposure Bracketing Techniques for Camera Document Image Enhancement

Appl. Sci. 2019, 9(21), 4529; https://doi.org/10.3390/app9214529
by Tao Liu 1, Hao Liu 1,2, Yingying Wu 1, Bo Yin 1,2,* and Zhiqiang Wei 1,2,*
Reviewer 2: Anonymous
Submission received: 31 July 2019 / Revised: 15 October 2019 / Accepted: 18 October 2019 / Published: 25 October 2019
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

The paper addresses an interesting research problem and is well written. However, it contains scattered typos that careful proofreading would remove, and the authors have not connected their work to the large body of research from the document image analysis community.

The paper has two main weaknesses:

There is no literature review, so it is difficult to see how the authors position their work relative to existing work. The authors are encouraged to consult the work of the document image analysis community (principally the ICDAR conference) and the CBDAR community (the CBDAR workshop on camera-based document analysis and recognition). The following can serve as a starting point:

Luqman, M.M., Gomez-Krämer, P. and Ogier, J.M., 2013, August. Mobile phone camera-based video scanning of paper documents. In International Workshop on Camera-Based Document Analysis and Recognition (pp. 164-178). Springer, Cham.

The experimentation seems limited. What is this work on image enhancement trying to achieve? If the objective is 100% OCR accuracy, the method fails. If the objective is an enhanced image, you need to show this with a more relevant experimental evaluation. Also, please provide more details on the dataset. How was it captured, and what are its characteristics?

 

Are the authors willing to make the dataset and code available for academic research?

Author Response

The paper has two main weaknesses:

There is no literature review, so it is difficult to see how the authors position their work relative to existing work. The authors are encouraged to consult the work of the document image analysis community (principally the ICDAR conference) and the CBDAR community (the CBDAR workshop on camera-based document analysis and recognition). The following can serve as a starting point:

In the revised manuscript, we have added the following paragraph (lines 89-102):

“In the context of document image exposure bracketing, the literature we found is really limited. The only paper that discloses document image exposure bracketing techniques can be found in paper [4], where an adaptive image fusion is used. However, there are two limitations in this paper: 1) the author assumes that document images are captured with a fixed camera and hence there is no need to perform image registration. As we will discuss in the next section, projective distortion often happens when capturing document images, and there are geometric disparities among captured images. 2) The author discusses and compares different image fusion strategies, but tone mapping techniques are missing from their comparison. We think it is necessary to incorporate tone mapping in the processing chain as it is one of the mainstream methods in exposure bracketing. It is true there are several papers that discuss about document image registration problem. For example, in paper [5] the classic keypoint feature descriptor-based image registration method is combined with mobile phone sensor data such as the accelerometer and gyroscope sensor data to perform image mosaic reconstruction. However, their registration is more suitable for the image mosaic reconstruction purpose while the registration method we are using is more for exposure bracketing.”

The experimentation seems limited. What is this work on image enhancement trying to achieve? If the objective is 100% OCR accuracy, the method fails. If the objective is an enhanced image, you need to show this with a more relevant experimental evaluation. Also, please provide more details on the dataset. How was it captured, and what are its characteristics?

There are several questions here, and we answer them one by one:

(1) The experimentation seems limited. What is this work on image enhancement trying to achieve? If the objective is 100% OCR accuracy, the method fails. If the objective is an enhanced image, you need to show this with a more relevant experimental evaluation.

We have in fact run several experiments; in this paper we present one typical case that illustrates the value of exposure bracketing. The purpose is 1) to show that exposure bracketing improves the readability of the output image and 2) to demonstrate that exposure bracketing increases OCR accuracy. For 1), Fig. 12, Fig. 13 and Fig. 14 show that exposure bracketing improves text readability. For 2), Table I shows that exposure bracketing improves OCR accuracy. We have made this clearer in the conclusion of the paper (lines 403-407):

“In this paper, we have investigated the potential of exposure bracketing techniques for improving the quality of camera document images. Four state-of-the-art algorithms have been selected for investigation, their technical details are explained and their intermediate results are illustrated. Positive experiment results show that using this technique can not only enhance the text readability but also can lead to OCR accuracy increase. ”

In addition, we note that the author of paper [4] also uses OCR accuracy improvement to demonstrate the value of the exposure bracketing technique.

(2) Also, please provide more details on the dataset. How was it captured, and what are its characteristics?

This is a very good question. We have followed your advice and described the data capture in lines 48-51:

“These three images are captured with Huawei Phone P20 in the same lighting condition, and for the under-exposed case the exposure time is 1/800 s, for the well-exposed case the exposure time is 1/320s, and for the over-exposed case the exposure time is 1/40 s”
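To put the three quoted exposure times in perspective, the sketch below converts them into exposure differences in stops. It is an illustration only, not part of the authors' response: it assumes that aperture and ISO were held constant across the three shots, which the quoted text does not state, and the helper name `stops_between` is hypothetical.

```python
import math

def stops_between(shorter_s, longer_s):
    """Exposure difference in stops between two shutter times (in seconds).

    Assumption: aperture and ISO are identical for both shots, so the
    exposure ratio equals the ratio of the shutter times.
    """
    return math.log2(longer_s / shorter_s)

# Shutter times quoted in the author response (seconds).
under_exposed = 1 / 800
well_exposed = 1 / 320
over_exposed = 1 / 40

print(round(stops_between(under_exposed, well_exposed), 2))  # about 1.32 stops
print(round(stops_between(well_exposed, over_exposed), 2))   # about 3.0 stops
print(round(stops_between(under_exposed, over_exposed), 2))  # about 4.32 stops
```

Under that assumption, the bracket is asymmetric: the over-exposed shot sits about three stops above the well-exposed one, while the under-exposed shot sits a little over one stop below it.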

Are the authors willing to make the dataset and code available for academic research?

This research was funded by Pilot National Laboratory for Marine Science and Technology (Qingdao) Aoshan science and technology innovation project (2016ASKJ07) and Qingdao Municipal Science and Technology Bureau grant number 17-1-1-3-jch. The current output is an intermediate result of our ongoing research, and we will not release any code until we have met all of our research objectives.

Author Response File: Author Response.docx

Reviewer 2 Report

This paper presents an existing technique (exposure bracketing) for camera document image enhancement. I read the manuscript with interest and have identified a few minor issues that need to be addressed, most importantly the structure of the manuscript.

I believe the structure of the manuscript needs a minor change. The overview of the exposure bracketing technique could be moved into the introduction, and I would suggest adding a materials and methods section that describes the exposure bracketing process and HDR document generation. It would also be good to describe the materials used for image acquisition and how the images were acquired.

Introduction:

Lines 30-31: Reference numbers are not sequential: reference 1 is followed by reference 12. I would move that reference to position 2 in the list and cite it as 2 in the introduction.

Line 48: well-exposed (Spell check)

Line 65: The basic idea behind is to create

It was not clear from the manuscript how the irradiance map or the weighting factor was generated.

Line 118: "Among all the geometric models...."

Comment: It would help to name the other geometric models; it was not clear from which set of models the planar homography was chosen. If the models are not named, a reference would be sufficient.

Line 129: It is not clear from the manuscript what the model's 8 parameters are, and no reference is given here.

Line 141 - Line 151 : Is this from reference [3]?

Line 149: Therefore, in we have used SIFT......

Lines 225-232: a reference is missing.

Lines 267-272: a reference is missing.

Line 354 and line 361: The captions of Figures 12 and 13 are the same. One should read "under-exposed".

Line 379: It would be better if the table content fit on one line, or if lines broke at word boundaries.

Author Response

This paper presents an existing technique (exposure bracketing) for camera document image enhancement. I read the manuscript with interest and have identified a few minor issues that need to be addressed, most importantly the structure of the manuscript.

I believe the structure of the manuscript needs a minor change. The overview of the exposure bracketing technique could be moved into the introduction, and I would suggest adding a materials and methods section that describes the exposure bracketing process and HDR document generation. It would also be good to describe the materials used for image acquisition and how the images were acquired.

Thank you for the suggestion. After revising the text in response to the first reviewer's comments, we have kept the original structure, because we found that merging the exposure bracketing overview into the introduction made the paper less clear.

For the data capture, we have revised lines 48-51:

“These three images are captured with Huawei Phone P20 in the same lighting condition, and for the under-exposed case the exposure time is 1/800 s, for the well-exposed case the exposure time is 1/320s, and for the over-exposed case the exposure time is 1/40 s”

Lines 30-31: Reference numbers are not sequential: reference 1 is followed by reference 12. I would move that reference to position 2 in the list and cite it as 2 in the introduction.

We have changed it according to the comment.

Line 48: well-exposed (Spell check)

We have changed it according to the comment.

Line 65: The basic idea behind is to create

It was not clear from the manuscript how the irradiance map or the weighting factor was generated.

We have revised lines 70-71: “For more details on HDR image generation, please refer to [3]”.

Line 118: "Among all the geometric models...."

Comment: It would help to name the other geometric models; it was not clear from which set of models the planar homography was chosen. If the models are not named, a reference would be sufficient.

We have revised lines 135-136: “Among all the geometric models such as affine model, translational model, rotation model and so on.”

Line 129: It is not clear from the manuscript what the model's 8 parameters are, and no reference is given here.

We have added reference [6].
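For readers of this record, the eight parameters the reviewer asks about can be illustrated with the standard parameterization of a planar homography. This is a minimal sketch, not code from the manuscript or from reference [6]: the function name `apply_homography` and the ordering h1..h8 are assumptions made for illustration, and the ninth matrix entry is fixed to 1 to remove the overall scale ambiguity.

```python
def apply_homography(params, point):
    """Map a 2-D point through a planar homography with 8 free parameters.

    The 3x3 matrix is assumed to be
        [[h1, h2, h3],
         [h4, h5, h6],
         [h7, h8, 1 ]]
    where fixing the last entry to 1 leaves exactly 8 degrees of freedom.
    """
    h1, h2, h3, h4, h5, h6, h7, h8 = params
    x, y = point
    w = h7 * x + h8 * y + 1.0  # projective denominator
    return ((h1 * x + h2 * y + h3) / w, (h4 * x + h5 * y + h6) / w)

# A pure translation is the special case h7 = h8 = 0 with an identity 2x2 block.
translation = [1, 0, 5, 0, 1, -2, 0, 0]
print(apply_homography(translation, (1, 1)))  # (6.0, -1.0)

# Non-zero h7 or h8 introduces the perspective effect that the affine,
# translational and rotation models cannot represent.
perspective = [1, 0, 0, 0, 1, 0, 0.5, 0]
print(apply_homography(perspective, (2, 0)))  # (1.0, 0.0)
```

In this parameterization the translational, rotation and affine models named in the revised lines 135-136 are all special cases with h7 = h8 = 0.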

 

Line 141 - Line 151 : Is this from reference [3]?

Yes. The reference’s number is [7] in the revised paper.

 

Line 149: Therefore, in we have used SIFT......

We have added reference [8].

Lines 225-232: a reference is missing.

We have added reference [11].

Lines 267-272: a reference is missing.

We have added references [14] and [15].

Line 354 and line 361: The captions of Figures 12 and 13 are the same. One should read "under-exposed".

We have changed it according to the comment.

 

Line 379: It would be better if the table content fit on one line, or if lines broke at word boundaries.

We have changed it according to the comment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The paper is significantly improved. However, I am not convinced by the authors' response concerning sharing the dataset with the community. For the code, I understand and agree with the authors, but it is not good practice to work on a home-built dataset and keep it private.
