Article
Peer-Review Record

More Plausible Models of Body Ownership Could Benefit Virtual Reality Applications

Computers 2021, 10(9), 108; https://doi.org/10.3390/computers10090108
by Moritz Schubert * and Dominik Endres
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 13 June 2021 / Revised: 6 August 2021 / Accepted: 12 August 2021 / Published: 26 August 2021
(This article belongs to the Special Issue Advances in Seated Virtual Reality)

Round 1

Reviewer 1 Report

Thank you for your manuscript! In the present study the authors discuss the BCIBO model as introduced by Samad et al.

I welcome the authors' work. It is a timely discussion of the modelling approaches to ownership so far. These theoretical approaches are rather scarce, so any contribution in this regard is important. Crucially, the authors shed light on the best-known approach and point out some critical issues with this model. These issues have not been pointed out and tested before, so this work is important for the development of modelling approaches.

The truncated model is a step in the right direction towards addressing the implausible assumptions of the previous model. Thus, seeing the results of the modified model is informative and should inform future research.

I find it somewhat surprising that an essentially flat prior that sums to a probability of 1 would give results identical to those of the model with extremely large priors. I also find it surprising that there are differences between truncated priors with different sigma values. Again, since they are essentially flat and have been normalised to sum to 1, there should be minimal differences across the priors. Looking at the code, I was hoping you could explain some details about the sepIntegral and comIntegral functions in bcibo_class.py.

Based on code implemented on 14th of June

Line 245: Here it appears that you multiply a truncation factor (z) to your normalising constant (n_joint).

Line 252: The likelihood is divided by your normalising constant (n_joint), which in effect means you have divided your likelihood by (z). This should account for the truncation.

Line 254: You multiply the likelihood with the truncation factor (z). Doesn’t this effectively undo line 252?

Could you explain why you multiply the likelihood with the truncation factor? It appears to me that by cancelling out the truncation factor you will undoubtedly obtain the untruncated model again with some floating-point errors thus perhaps explaining why the truncated model is close to the untruncated model.

Similar logic is applied to the comIntegral. In line 321 and 323 the likelihood is multiplied by the truncation factor.
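
The cancellation described in the comments above can be illustrated with a minimal sketch. It is not the code from bcibo_class.py: the function names and the example values are hypothetical, and a single one-dimensional normal prior stands in for the model's likelihood terms. The sketch only shows that dividing a density by the truncation factor z and then multiplying by z again returns the untruncated density.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of an untruncated normal distribution N(mu, sigma) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def truncation_factor(mu, sigma, lower, upper):
    """Probability mass z of N(mu, sigma) that lies inside [lower, upper]."""
    return normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)


def truncated_pdf(x, mu, sigma, lower, upper):
    """Density of N(mu, sigma) truncated to [lower, upper]: divide by z inside the bounds."""
    if x < lower or x > upper:
        return 0.0
    return normal_pdf(x, mu, sigma) / truncation_factor(mu, sigma, lower, upper)


# Hypothetical example values (not the parameters used in the paper).
mu, sigma, lower, upper = 0.0, 1.0, -1.0, 1.0
x = 0.1

z = truncation_factor(mu, sigma, lower, upper)
truncated = truncated_pdf(x, mu, sigma, lower, upper)
untruncated = normal_pdf(x, mu, sigma)

# Multiplying the truncated density by z again cancels the division by z,
# which recovers the untruncated density up to floating-point error.
cancelled = truncated * z
print(truncated, untruncated, cancelled)
```

Inside the bounds the truncated density is larger than the untruncated one because z is smaller than 1; the extra multiplication by z removes exactly that difference.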

It appears the code has since been modified on the 22nd of June.

Now the likelihood is multiplied by z_numerator, which presumably should be close to 1 within the truncation bounds you have chosen and thus does not cancel out the truncation factor as it did previously.

Do the new plots for the truncated model still mimic the results of the untruncated model?

Line 304 ff: I agree that this may be limited to the arm's reach. The reference to Kilteni et al. may, however, not capture that limit. Here I would suggest referring, for example, to concepts such as reaching space or peripersonal space. It is also a further example of how some of the assumptions of the model could be grounded more in actual behavioral results.

In general, despite this being a more theoretical paper, it could use maybe some more references regarding the actual behavioral data on some of these issues. All these parameters, like space, time, etc., have been investigated separately, and it may be helpful to introduce the reader to a few of these studies, for example alignment of the hands, temporal asynchrony effects...

Minor issues:

I would argue that the sensory priors should be centred on 0mm for the spatial priors. I would think that a distribution describing the positions where you would expect the hand to be should be centred on the midline. Additionally, in Körding et al. (2007), which is a precursor to this model, priors are centred on 0 to explain a midline bias in judgements. Similar midline biases are seen in proprioception. I should note, though, that this has little impact on the results due to the essentially uniform priors used.

Even though I agree with the authors' general critique, some formulations may appear too strong, e.g., line 298 "should be abandoned".

Overall, I do find this paper sound and interesting, even though it may speak to a more targeted audience interested in these kinds of approaches. But it is important to stimulate further discussion on the formulation of these theoretical models and how they align with the experimental literature!

Author Response

Thank you for your detailed and insightful response. We are particularly impressed that you took the time to look into the code.

Regarding your comments on the implementation of the truncated model in the code: By the time of the review we had discovered why our (incorrect) implementation of the truncated model behaved, as we said in the previous version of the paper, "curiously" and fixed it in the git commit 0e8aca32 (from 14 June). We submitted a revised version of the paper, but unfortunately it seems not to have reached you in time. The model now produces drastically different results, which we present in the new version of the paper. We are confident that our new implementation of the truncated model is correct, because it passes a sanity check: We increased the sensory priors' truncation bounds stepwise to a point beyond 1035 and, as expected, the resulting plot looks more similar to the one produced by the original model with each step. (As a quick reminder: 1035 is the standard deviation of the truncated prior in the original model.)
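
The idea behind this sanity check can be sketched at the level of a single prior. The following is only an illustration under stated assumptions, not the check the authors ran on the model's output plots: the function name, the standard deviation and the bounds are hypothetical. It shows that a symmetrically truncated normal density approaches the untruncated density as the truncation bounds are widened.

```python
import math


def relative_gap(sigma, half_width):
    """Relative excess of a truncated normal density over the untruncated one.

    The prior N(mu, sigma) is truncated to [mu - half_width, mu + half_width].
    Inside the bounds the truncated density equals the untruncated density
    divided by z, so the relative excess is 1 / z - 1 at every such point.
    """
    z = math.erf(half_width / (sigma * math.sqrt(2.0)))  # mass inside the bounds
    return 1.0 / z - 1.0


sigma = 1.0  # hypothetical prior standard deviation
half_widths = [0.1, 1.0, 3.0, 10.0]  # truncation bounds, widened stepwise
gaps = [relative_gap(sigma, w) for w in half_widths]

for w, gap in zip(half_widths, gaps):
    print(f"half-width {w:5.1f} sigma: relative gap {gap:.3e}")
```

The gap shrinks with every step and becomes negligible once the bounds lie several standard deviations from the mean, which is the behaviour one would expect from a correct truncation.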

"I agree that this may be limited to the arm's reach. The reference to Kilteni et al. may, however, not capture that limit"
We think that the reference to Kilteni is appropriate. The sentence preceding the one that contains the citation reads: "It should be noted that by truncating the proprioceptive prior this variation of the model is not able to account for certain abnormal experiences of embodiment outside of the bounds of proprioception." The elongated arm of Kilteni et al.'s experiment is indeed situated (in part) outside of the bounds of the participant's proprioception.
In case this explanation does not satisfy you, we suggest that the following clarifying footnote could be added:
"Since the elongation of the arm also extends the arm's reach, one could technically say that the illusion is still happening within arm's reach. However, with "arm's reach" we are referring to the reach of the participant's real arm and not of an artificial one."

"In general, despite this being a more theoretical paper, it could use maybe some more references regarding the actual behavioral data on some of these issues."
We concur with you and have moved most of the references to behavioral experiments in the previous version of the paper to a separate "Related Works" section. In there we cover how the model is able to explain proprioceptive drift, referral of touch and synchronicity effects.

"it may be helpful to introduce the reader to a few of these studies, for example alignment of the hands"
We agree that the alignment of the hands is an interesting avenue for future extensions of the model. We mention this in the "Future Research" section and cite Kalckert and Ehrsson (2012) there.

"I would argue that the sensory priors should be centred on 0mm for the spatial priors."
Thanks for the sound argument. It has led us to think more critically about our choice of priors. While we have (partially) maintained what we call an informed prior, we now argue for it in the paper as follows:
"By setting the sensory priors’ mean values to the actual values of the experimental setup we are using an informed prior (Schürmann et al., 2019). This is in contrast to Körding et al. (2007) who first proposed the Bayesian causal inference model. They used an uninformed prior, meaning that they set the sensory priors’ mean values to 0. They did this to implement a “bias to perceive stimuli straight ahead” (p. 3 in Körding et al., 2007). In the context of the RHI this would translate to a bias to perceive stimuli close to the midline.
Schürmann et al. (2019) have argued that it is more appropriate to use an informed prior, because humans constantly update their internal representations based on sensory input. From this perspective, it is likely that by the time of the brush stroke onset the participants have inferred the correct position of the hands. Since participants have no idea when the brush strokes are going to set in, this updating can only occur on the spatial, but not the temporal dimension. Hence, we use an informed prior for the spatial and an uninformed prior for the temporal dimension."

"Even though I agree with the authors' general critique, some formulations may appear too strong, e.g., line 298 'should be abandoned'."
This is a fair point. We have changed said line to "We hold that the 'one size fits all' approach is too simplistic." Furthermore, we tried to find other instances of strong formulations and toned them down.

We hope that addresses your concerns and would like to thank you again for your constructive review.


References:
Kalckert, A., & Ehrsson, H. H. (2012). Moving a Rubber Hand that Feels Like Your Own: A Dissociation of Ownership and Agency. Frontiers in Human Neuroscience, 6. https://doi.org/10.3389/fnhum.2012.00040
Kilteni, K., Normand, J.-M., Sanchez-Vives, M. V., & Slater, M. (2012). Extending body space in immersive virtual reality: A very long arm illusion. PLOS ONE, 7(7), e40867. https://doi.org/10.1371/journal.pone.0040867
Schürmann, T., Vogt, J., Christ, O., & Beckerle, P. (2019). The Bayesian causal inference model benefits from an informed prior to predict proprioceptive drift in the rubber foot illusion. Cognitive Processing, 20(4), 447–457. https://doi.org/10.1007/s10339-019-00928-9

Reviewer 2 Report

The following aspects must be clarified in order to increase the soundness of the paper:

  • It is not clear what the novelty of the paper is: please explain more clearly what the scope of the paper is and in what applications the proposed method can be integrated (not only a general description of the application; please expand on the possible applications)
  • a Related Work section must be added: the methods that are currently described in the Introduction must be moved to a separate section. Advantages and disadvantages of such methods must be presented. Based on these aspects, the proposed method must be introduced showing its novelty compared with the existing methods
  • results must be compared with other existing applications
  • figures must be placed after their first appearance in the text
  • there are parts of the text marked in red that must be coloured in black

Author Response

First of all, we would like to thank the reviewer for the valuable feedback. Below are our responses to the concerns raised.

"It is not clear what the novelty of the paper is: please explain more clearly what the scope of the paper is"
To address this concern we have added several paragraphs in subsection 2.5. To briefly summarize our point: We argue that the model in its current form is not psychologically plausible, i.e. that it's incompatible with what we know about cognition from a psychological perspective. Bayesian cognitive models have been criticized for being underconstrained and therefore too easily fitted to data. We think that the criterion of psychological plausibility offers a much needed constraint on Bayesian cognitive models. We develop some variants of the model that we consider to be more psychologically plausible and test whether they still agree as well with empirical data as the original model does. None of these variants have been discussed in the literature so far.
The following points are not mentioned in the paper, but also relevant to this critique: To our knowledge the width of the sensory priors of the BCIBO model has not been reported in the literature so far. However, this prior width is crucial for a replication of the results reported by Samad et al. (2015).

"please explain more clearly [...] in what applications the proposed method can be integrated"
Body ownership is a crucial part of many VR applications. As an example, we have listed some uses of VR and specifically body ownership within clinical psychology (see "Applications" subsection) to further highlight this connection. We believe that the BCIBO model is a step towards a deeper understanding of body ownership and therefore also benefits VR applications that rely on body ownership.

"results must be compared with other existing applications"
It is hard to compare our results with other papers, because ours is (to the best of our knowledge) one of only two papers that have tackled the BCIBO model on a computational level. Our target outcome was the posterior probability of a common cause. The other paper that has looked at the model from a computational perspective is Schürmann et al. (2019). However, their target outcome was the posterior predictive distribution of the sensory input, i.e. the predicted sensory input after seeing the data. Since the criteria by which our and Schürmann et al.'s paper evaluated the model are different, they cannot be easily compared.
Another paper that evaluated computations for a Bayesian causal inference model of body ownership is Chancel et al. (2021), but their model is different from ours. They only implemented "half" of Samad et al.'s (2015) model, the temporal dimension, while we implemented the full model. Hence, their results are not comparable to ours.
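
For readers unfamiliar with the target outcome named above, the posterior probability of a common cause follows from Bayes' rule. The following is a minimal sketch, assuming the marginal likelihoods of the sensory input under each causal structure have already been computed; the function and parameter names are hypothetical and are not taken from the authors' code.

```python
def posterior_common_cause(like_common, like_separate, prior_common):
    """Posterior probability of a common cause via Bayes' rule.

    like_common:   p(x | C=1), marginal likelihood under a common cause
    like_separate: p(x | C=2), marginal likelihood under separate causes
    prior_common:  p(C=1), prior probability of a common cause
    """
    if not 0.0 <= prior_common <= 1.0:
        raise ValueError("prior_common must lie in [0, 1]")
    # Total probability of the input, summed over both causal structures.
    evidence = like_common * prior_common + like_separate * (1.0 - prior_common)
    if evidence <= 0.0:
        raise ValueError("the input has zero likelihood under both hypotheses")
    return like_common * prior_common / evidence
```

When both causal structures explain the input equally well, the function simply returns the prior; the posterior moves towards a common cause only to the extent that the common-cause likelihood exceeds the separate-causes likelihood.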

"a Related Work section must be added"
Thank you for this feedback. We have restructured the paper in order to include a Related Works section and think that the structure of the paper is much clearer now.

"Advantages and disadvantages of such methods must be presented. Based on these aspects, the proposed method must be introduced showing its novelty compared with the existing methods"
We hope that the new paragraphs in subsection 2.5 mentioned above address this point. To briefly reiterate: One of the crucial disadvantages of Bayesian cognitive models is that they can be underconstrained. Our criterion of psychological plausibility addresses this concern.
In the subsection "Related Works", we mention as an advantage of Bayesian causal inference models that they have been successfully applied to a wide variety of applications such as stimulus localization and speech perception. We also list some of the empirical observations in rubber hand illusion experiments that the model can account for.

"figures must be placed after their first appearance in the text"
This is a very sensible principle to follow; we have made sure that our revised paper adheres to it. However, as an unfortunate side effect Figure 4 and Figure 5 are now directly below each other.

"there are parts of text marked in red that must be coloured in black"
After submitting our original version of the paper we noticed a severe flaw in our implementation of the truncated model. We corrected this mistake and submitted a new version of the paper to the journal. Our submission contained two versions of the paper: one with the entire text in black and one where we marked the differences to the last version of the paper in red. It seems that you have received the latter rendition of the paper.

Thank you again for your comments, which we have tried to implement in this revision.


References:
Chancel, M., Ehrsson, H. H., & Ma, W. J. (2021). Uncertainty-based inference of a common cause for body ownership. OSF Preprints. https://doi.org/10.31219/osf.io/yh2z7
Samad, M., Chung, A. J., & Shams, L. (2015). Perception of Body Ownership Is Driven by Bayesian Sensory Inference. PLOS ONE, 10(2), 1–23. https://doi.org/10.1371/journal.pone.0117178
Schürmann, T., Vogt, J., Christ, O., & Beckerle, P. (2019). The Bayesian causal inference model benefits from an informed prior to predict proprioceptive drift in the rubber foot illusion. Cognitive Processing, 20(4), 447–457. https://doi.org/10.1007/s10339-019-00928-9

Reviewer 3 Report

This paper investigated a Bayesian Causal Inference model of body ownership. The paper is interesting but the article still needs substantial modifications before publication. First of all, the title and keywords should be revised. They are unclear and not related to VR, which is mentioned at the beginning of this paper. Secondly, the paper is difficult to understand. It requires to be proofread by a native speaker. For instance: “the user is situated in the virtual environment via an avatar.” I don’t think this sentence is easy for general readers to understand. “In order for the user to be able to focus on the task”: this sentence can be simplified, etc.

The review on similar studies is missing and the contributions/ significance of this article is unclear.

In figure 2, what is the difference between hand and rubber/ real hand?

Section 2 is based on existing theories, what is the novelty of this article?

How is this study related to VR? Or there is no relationship? It needs to be further elaborated.

Please explain the setup of each experiment and why your results part are related to your research objectives? The whole picture of the experiment is not clear. Why truncated model? Spatial visual input, temporal input are needed?

The application should be discussed at the beginning of the paper. Future works should be in the conclusion section.

Author Response

Thank you for taking the time to review our paper. In the following we are going to outline our revisions in response to your review and hope that you agree that it has led to a significant improvement in the quality of the paper.

"First of all, the title and keywords should be revised."
We have changed the title to "More Plausible Models of Body Ownership Could Benefit Virtual Reality Applications" to increase its relation to VR. As far as the keywords are concerned, we changed "causal inference" to "Bayesian causal inference" (BCI), to point more directly towards the long line of BCI research this paper relates to. Furthermore, we have added "virtual reality" as a keyword, to highlight the virtual reality applications we stress in the paper.

"[The paper] requires to be proofread by a native speaker."
We have attempted to simplify and "straighten out" the language of the paper and hope that it reads more fluently in its current form. This also affected the two sentences you specifically mentioned:
"the user is situated in the virtual environment via an avatar" --> "the user is represented in the virtual environment by an avatar"
“In order for the user to be able to focus on the task, ..." --> "Body ownership over the avatar is often helpful, e.g. to increase the perception of presence in the VR."

"The review on similar studies is missing"
We have added a "Related Works" section in which we point out the connection of the BCIBO to previous rubber hand illusion experiments.

"contributions/ significance of this article is unclear."
To address this concern we have added several paragraphs in subsection 2.5. To briefly summarize our point: We hold that the model in its current form is not psychologically plausible, i.e. that it's incompatible with what we know about cognition from a psychological perspective. Bayesian cognitive models have been criticized for being underconstrained and therefore too easily fitted to data. We think that the criterion of psychological plausibility offers a much needed constraint on Bayesian cognitive models. In the paper we present some variants of the model that we consider to be more psychologically plausible than the original and test whether these variants still agree as well with empirical data as the original model does. None of these variants have been discussed in the literature so far.
The following aspect is not mentioned in the paper, but also relevant to this critique: To our knowledge the width of the sensory priors of the BCIBO model has not been reported in the literature so far. However, it is crucial for the replication of the model as it was originally reported by Samad et al. (2015).

"In figure 2, what is the difference between hand and rubber/ real hand?"
We have extended the caption of Figure 2 to clarify this point of confusion: "Since a common cause only assumes a single hand, there is no need to distinguish between the two hands. Hence, under a common cause the rubber hand is simply referred to as 'Hand'."

"How is this study related to VR? Or there is no relationship? It needs to be further elaborated."
The study relates to VR through the concept of body ownership, which is a crucial part of the VR experience. We have listed some uses of VR and specifically body ownership within clinical psychology (see "Applications" subsection) to further highlight this connection. We believe that the BCIBO model is a step towards a deeper understanding of body ownership and therefore also benefits VR applications that rely on body ownership.

"Please explain the setup of each experiment and why your results part are related to your research objectives?"
We hope that the additional paragraphs in subsection 2.5 mentioned above clarify our research objective. As for the setup of the experiment, we have simplified the language of the Methods section. We hope that this eases understanding. In addition, we have added formal model descriptions in the appendix that follow a notation style common in the field of Bayesian modeling. The style was specifically modeled after the textbook by Lee and Wagenmakers (2013), but the textbook by McElreath (2020) uses a similar one. The intention behind these formal descriptions is to gather all the information regarding the model specifications in one place.

"Spatial visual input, temporal input are needed?"
We are sorry that you were confused on this point. We hope that the streamlined language better explains the meaning of the terms "spatial visual input" and "temporal input".
To directly address the point: The spatial visual and temporal input are an integral part of the BCIBO model. Spatial visual input refers to the visual input from the rubber hand. "Spatial" refers to the fact that the visual input is used to infer the position of the rubber hand. Regarding the temporal input: The temporal congruency of the brush strokes is the driving force behind the multisensory integration process in the rubber hand illusion. By "congruency" we mean that the brush strokes are applied as close to synchrony as possible by the experimenter. Without this congruent temporal input the model would not predict a body ownership illusion.

"The application should be discussed at the beginning of the paper."
We agree that establishing the relevance of the discussed construct early on is important and that listing applications helps in this endeavor. However, we feel that one of the suggested applications, namely the design of HMD hardware, relies on a thorough understanding of the BCIBO model. Specifically, we fear that a reader who is introduced to the BCIBO model through this paper might not understand the following sentence if it is placed in the introduction: "the model is able to quantify the trade-off between the spatial and temporal inaccuracies of the system in terms of the probability of a body ownership illusion". Hence, we think the applications are better placed in the discussion.
However, in case you disagree with our argument please let us know and we will gladly add a brief outline of the BCIBO model in the introduction and move the Applications section up front.

"Future works should be in the conclusion section."
We have moved the "Future Works" section there.


References:
Lee, M. D., & Wagenmakers, E.-J. (2013). Bayesian cognitive modeling: A practical course. Cambridge University Press.
McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan (2nd ed.). Taylor and Francis, CRC Press.
Samad, M., Chung, A. J., & Shams, L. (2015). Perception of Body Ownership Is Driven by Bayesian Sensory Inference. PLOS ONE, 10(2), 1–23. https://doi.org/10.1371/journal.pone.0117178

Round 2

Reviewer 2 Report

Some of my comments were addressed. Also, I consider that information must be added showing that there are few papers (one or two) that tackle the BCIBO model on a computational level; add the aspects in which this paper differs from them.

Figures 2, 3, 4 and 5 still appear before their reference in the text.

Author Response

In the manuscript we submitted we tried to mark all changes compared to the revision version in red in order to make it easier for you to review the new version. In the camera-ready version of the paper these red passages will be switched to black.

We have added information about Schürmann et al. (2019), which (as far as we know) is the one paper that tackles the BCIBO model on a computational level (see ll. 160-177) and contrasted it with our study. We hope this addresses your concern.

We have tried to make sure that all the figures appear after their first mention in the text. In case we have missed something, we are confident that the MDPI Computers typesetters will correct the mistake.

Reviewer 3 Report

There are still some grammatical mistakes; please have the paper proofread by a native speaker. For example: “the use might not feel comfortable…”

More VR and MR papers should be included in order to enhance the content of VR parts of this article, such as: Evaluating the effectiveness of learning design with mixed reality (MR) in higher education. Application of virtual reality (VR) technology for medical practitioners in type and screen (T&S) training.

Contributions need to be highlighted.

Conclusion section needs to be rewritten. It usually starts with the conclusion and the implications of the research, followed by limitations and future work.

Author Response

In the manuscript we submitted we tried to mark all changes compared to the revision version in red in order to make it easier for you to review the new version. In the camera-ready version of the paper these red passages will be switched to black.

The manuscript has been re-read by an English expert. We hope it is acceptable now.

"More VR and MR papers should be included in order to enhance the content of VR parts of this article"
We have expanded the Applications section with regard to VR-related papers. Thanks for pointing us towards two papers. We do not think that MR (or more specifically AR) applications are relevant for our paper, because they tend not to affect body ownership in any way. Hence, we have decided against citing Tang et al. (2020). However, since Tang et al. (2021) used a VR scenario with an avatar, it is highly relevant to the topic discussed in our paper and we have included a reference to it. In the newly revised Applications section we elaborate on which types of VR applications can (in our opinion) benefit from the BCIBO model and for which ones it is less relevant.

"Contributions need to be highlighted."
Following your suggestion, we have explicitly pointed out the main contribution of the paper in the introduction (lines 42-46).

"Conclusion section needs to be rewritten"
In accordance with the Instructions for Authors from MDPI Computers (https://www.mdpi.com/journal/computers/instructions#manuscript) we have termed our two last sections "Discussion" and "Conclusions". We tried to model the structure of the discussion to your specifications. This led us to rename the "Future Works" subsection to "Limitations and Future Work" and address some of the limitations of the study directly.

References:
Tang, Y. M., Au, K. M., Lau, H. C. W., Ho, G. T. S., & Wu, C. H. (2020). Evaluating the effectiveness of learning design with mixed reality (MR) in higher education. Virtual Reality, 24(4), 797–807. https://doi.org/10.1007/s10055-020-00427-9
Tang, Y. M., Ng, G. W. Y., Chia, N. H., So, E. H. K., Wu, C. H., & Ip, W. H. (2021). Application of virtual reality (VR) technology for medical practitioners in type and screen (T&S) training. Journal of Computer Assisted Learning, 37(2), 359–369. https://doi.org/10.1111/jcal.12494
