Figure 1.
Visual representation of the simple Dempster-Shafer example.
The frame of discernment represents the three options that could be observed by the sensors. The powerset column represents all of the combinations that Dempster-Shafer analysis considers. Recall that a set with multiple options signifies the belief that the observed object could be any one of the objects in the set. The two sensors that provide observations (evidence) can detect all objects, but can only detect certain properties of each object. Those detections are shown, along with the belief masses assigned to each element of the powerset, i.e., the Basic Probability Assignment (BPA). The combined mass column shows the powerset again, along with the results of the analysis, which correspond to the greater detail in
Table 1. The highlighted element of the powerset (the red ball) is the element believed to correspond to the true object, based on the Dempster-Shafer analysis.
Figure 2.
Propagating the Dempster-Shafer evidence masses through the transition between nodes. For simplicity, only two options are available at each node. The direction of propagation is represented by the arrows between evidence masses. The arrows between the nodes represent the definition of the network. As can be seen, evidence propagation from node A to node B results in normalized masses. Conversely, evidence propagation from node B to node A results in non-normalized masses. Moreover, the original masses are not recovered if masses are propagated from node A to node B to node A through the same transition.
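The propagation asymmetry described above can be sketched numerically. The transition matrix, the mass values, and the use of the transpose for reverse propagation below are illustrative assumptions, not the paper's exact formulation:

```python
# Hypothetical 2 x 2 transition potential matrix between nodes A and B.
# Each column (indexed by A's options) sums to 1.
T = [[0.7, 0.2],
     [0.3, 0.8]]

def matvec(M, v):
    """Multiply matrix M by vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def transpose(M):
    return [list(row) for row in zip(*M)]

m_A = [0.9, 0.1]  # belief masses at node A

# A -> B: forward propagation is normalized because columns of T sum to 1.
m_B = matvec(T, m_A)
print(round(sum(m_B), 10))       # 1.0

# B -> A: reverse propagation (modeled here with the transpose) is not
# normalized in general.
m_A_back = matvec(transpose(T), m_B)
print(round(sum(m_A_back), 10))  # 0.97

# The round trip A -> B -> A does not recover the original masses.
print([round(x, 3) for x in m_A_back])  # differs from m_A
```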
Figure 3.
An example case in which the conditional probabilities form a minimal, smaller representation of the joint probabilities. The variable d in node B is not influenced by the variables a and b in node A and is, thus, independent of the variables in node A. The conditional probabilities would reflect that relationship by eliminating the middle row of zeros, resulting in fewer values being stored to represent the relationship.
Figure 4.
An example network that starts with no data. This analysis is beyond the scope of Bayesian logic, since Bayesian analysis requires priors. Further, overwriting information is risky in this context, since a full overwrite of information suggests that sufficient data lie behind each update. In other words, the first overwrite would be similar to starting with Bayesian logic after the first update, even though sufficient information for that event has not been observed but merely assumed.
Figure 5.
An example network update that uses Dempster-Shafer updates to combine evidence at each node. The combined value from node A is propagated as evidence through the transition to node B, where it is combined with the directly injected evidence, resulting in the combined mass at node B.
Figure 6.
Case (
1) shows a single parent node with transitions to two child nodes. Each unknown transition can be calculated using the update method described previously via either least squares minimization or linear programming minimization. Moreover, case (
1) reduces to the simplest case of one parent and one child node if node C and the associated transitions are removed. In contrast, case (
2) cannot be solved solely via the described least squares or linear programming methods. The child marginal values are the result of a Dempster-Shafer combination algorithm, which must be part of the method for updating the unknown transition potentials. This case is handled in
Section 3.3.
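For the single parent, single child reduction mentioned above, the least squares phrasing can be sketched as follows. The generating matrix, the random marginals, and the unconstrained solver are assumptions for illustration; the paper's method would additionally enforce non-negativity and column normalization:

```python
import numpy as np

# Sketch: learn an unknown transition T from observed parent/child marginal
# pairs by solving min_T ||T @ M_parent - M_child||_F (plain least squares).
rng = np.random.default_rng(0)

# Hypothetical "true" transition, used only to generate consistent data.
T_true = np.array([[0.8, 0.1],
                   [0.2, 0.9]])

# Five observed parent marginals (columns) and the matching child marginals.
M_parent = rng.dirichlet([1.0, 1.0], size=5).T   # shape (2, 5)
M_child = T_true @ M_parent                      # shape (2, 5)

# Ordinary least squares on the transposed system M_parent.T @ T.T = M_child.T.
T_est = np.linalg.lstsq(M_parent.T, M_child.T, rcond=None)[0].T

print(np.allclose(T_est, T_true, atol=1e-6))  # recovers the generating matrix
```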
Figure 7.
The layout for the traffic signal scenario, which is representative of timed four-way lights without left-turn signals. The grey car is approaching a red-light intersection and estimating how long until the light turns green to determine whether to slow the car. Visibility is limited due to buildings and other obstructions. The cross-walk signal may be visible before the intersection. The cross-traffic light is not visible to the grey vehicle and must be estimated. Cross-traffic density and speed are variable in the simulation and are estimated by the grey vehicle.
Figure 8.
The results of weighting inputs for a Dempster-Shafer network. Given an update on nodes A, B, and F, a no-weight update results in 3 times the amount of experience applied at each node, as shown in part (1) of the figure. With weighting, only the applicable experience is applied at each node, as shown in part (2) of the figure. The direction of the arrows shows the transition of the update experience between nodes in the network. The network is using the standard representation, where moving upwards on the diagram between nodes represents inference.
Figure 9.
Test network used to analyze the performance of the novel Dempster-Shafer network algorithms. This network only includes a single parent for each node. The number after the node name shows the number of options for the node. Nodes with two and three options were used, since these are the more common cases for nodes in a DS network.
Figure 10.
Test results for run time with the single parent network. Within each combination algorithm group, the no-learning method is faster than the un-weighted learning methods, showing the increased burden of the learning calculations. Both the Rayleigh and Weighted Rayleigh methods show the same order of magnitude for no-learning time and un-weighted learning time, suggesting that the combination algorithm is the driving factor for the update time. Dempster-Shafer, Evidential Combination Reasoning (ECR), and Overwrite have faster update times because they do not retain explicit history. Note further that the un-weighted single and un-weighted multi-update methods do not significantly change the update times. Finally, the weighted methods are typically at least as fast as the no-learning method.
Figure 11.
Test results for consistency with single parent networks. The y-axis uses a logarithmic scale. All per-node consistency is low without learning, with the best case being the overwrite case. Since transition potential matrices are not reversible (propagating up through a transition does not guarantee perfect consistency down the transition), the overwrite case has a non-zero consistency test result. However, in all learning cases, the consistency checks return an effectively zero result, equating to perfect consistency to within round-off error.
Figure 12.
Test results for learning with a single parent network. Results vary depending on the combination algorithm used. Except for the Rayleigh, Weighted Rayleigh, and overwrite methods, the learning method resulted in a significant reduction of unknown information over the baseline case of no-learning with preset transition potentials. Deviations in the combination algorithms can be explained through their handling of conflict. High conflict in the Rayleigh and Weighted Rayleigh algorithms means lower assignment of mass to the new focal elements, resulting in higher mass retained in the unknown/complete set. This is a result of the randomized evidence set testing methodology and does not impact the choice of algorithm. Finally, since the overwrite method does not retain previous evidence, propagated evidence through unknown transitions will tend to have higher impact, retaining unknown information.
Figure 13.
Test results for weighted versus unweighted methods in a single parent network after 30 updates of 1 unit of experience each. This test shows a clear difference between the unweighted methods (90 units of experience per node) and the weighted methods (approximately 30 units of experience per node). The unweighted methods clearly suggest that data are reused. While this is not the case (each propagated evidence set is from a different observation), the result is less explainable than that of the weighted methods, calling into question the ability of the network results to be accepted in decision-making scenarios.
Figure 14.
Test network used to analyze the performance of the novel Dempster-Shafer network algorithms. This network only includes child nodes with multiple parents. The number after the node name shows the number of options for the node. Nodes with two and three options were used, since these are the more common cases for nodes in a DS network.
Figure 15.
Test results for run time with a multiple parent network. The root finder primarily works for Murphy’s and Zhang’s combination methods, as expected. For those methods, the root finder shows at least an order of magnitude improvement in run time per node, which translates to significant improvements for larger networks.
Figure 16.
Test results for learning with a multi-parent network. The unknown fractions are similar between the optimization and root finder methods. This result is expected, given that similar solutions should be found. Notably, significantly higher unknown fractions are found for the multiple parent cases than for the single parent cases. This difference is due to the learning method, since identical marginals are first calculated for each parent, resulting in duplicated unknown information being retained significantly longer.
Figure 17.
Test results for weighting with multi-parent networks. Since all cases are weighted, no deviations between cases were expected or observed. All cases show expected total weights per node of approximately 30, confirming the results from the single parent test case in
Figure 13.
Figure 18.
Test failures for the multiple parent cases. The ECR method, Dempster’s Rule, and the overwrite method were not expected to reliably succeed, due to random evidence sets that were not within the bounds required for the reverse solver method to succeed. As expected, these methods tend to fail. The root finder method behaves more consistently: in each set of tests, the root finder either succeeds or fails in all tests, while the optimizer can find solutions that the root finder misses. For Murphy’s Rule and Zhang’s Rule, the root finder shows better performance than the overwrite method.
Figure 19.
Test network used to analyze the performance of the novel Dempster-Shafer network algorithms. This example includes nodes that have both single and multiple parents. The number after the node name shows the number of options for the node. Nodes with two and three options were used, since these are the more common cases for nodes in a DS network.
Figure 20.
Test results for run time for a complex network. In both cases that succeeded, the weighting method significantly decreased run time, as expected. In both cases, the order of magnitude of the run time more closely resembles the multiple parent tests (
Figure 15) than the single parent tests (
Figure 10). This is expected, given that the complex network adds the additional multi-parent calculations. These results also suggest that the root finding method for multi-parents still dominates the single parent solution method.
Figure 21.
Test results for learning for a complex network. There are two points of interest here: i. The weighted, multiple update method displays a higher unknown fraction. This opposes the results seen in the single-parent tests (
Figure 12), suggesting that the multiple parent solution method fares less well when dealing with weighted data; ii. The unknown fraction is between the single-parent tests (
Figure 12) and the multi-parent tests (
Figure 16), which is expected, given that the complex network is a combination of the previous networks.
Table 1.
Dempster-Shafer Evidence Example. The “Powerset” column represents the full set of options to which a belief mass can be assigned. The “Evidence 1” column shows the first evidence set from the sensor that distinguishes shape. The “Evidence 2” column shows a second evidence set from a sensor that distinguishes color. The “Combined” column shows the rounded, combined masses based on Dempster’s Rule, and the “Bel” and “Pl” columns show the Belief and Plausibility functions, respectively, for each of the elements of the powerset of the combined data.
Powerset | Evidence 1 | Evidence 2 | Combined | Bel | Pl |
---|---|---|---|---|---|
Red ball | 0.0 | 0.0 | 0.490 | 0.490 | 0.734 |
Green ball | 0.0 | 0.2 | 0.184 | 0.184 | 0.367 |
Red cube | 0.1 | 0.0 | 0.082 | 0.082 | 0.163 |
(Red ball, Green ball) | 0.8 | 0.0 | 0.163 | 0.837 | 0.918 |
(Green ball, Red cube) | 0.0 | 0.0 | 0.000 | 0.266 | 0.510 |
(Red ball, Red cube) | 0.0 | 0.6 | 0.061 | 0.633 | 0.816 |
(Red ball, Green ball, Red cube) | 0.1 | 0.2 | 0.020 | 1.0 | 1.0 |
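The combined, Bel, and Pl values in Table 1 can be reproduced with a short implementation of Dempster's Rule (set names abbreviated here: RB = red ball, GB = green ball, RC = red cube):

```python
from itertools import product

THETA = frozenset({"RB", "GB", "RC"})

# Mass assignments from the "Evidence 1" and "Evidence 2" columns.
e1 = {frozenset({"RC"}): 0.1, frozenset({"RB", "GB"}): 0.8, THETA: 0.1}
e2 = {frozenset({"GB"}): 0.2, frozenset({"RB", "RC"}): 0.6, THETA: 0.2}

def dempster(m1, m2):
    """Combine two mass assignments with Dempster's Rule."""
    combined, conflict = {}, 0.0
    for (s1, w1), (s2, w2) in product(m1.items(), m2.items()):
        inter = s1 & s2
        if inter:
            combined[inter] = combined.get(inter, 0.0) + w1 * w2
        else:
            conflict += w1 * w2
    # Normalize by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

m = dempster(e1, e2)
print(round(m[frozenset({"RB"})], 3))  # 0.49  (0.490 in the table)
print(round(m[frozenset({"GB"})], 3))  # 0.184

def bel(m, s):  # Belief: total mass of all subsets of s
    return sum(w for t, w in m.items() if t <= s)

def pl(m, s):   # Plausibility: total mass of all sets intersecting s
    return sum(w for t, w in m.items() if t & s)

print(round(bel(m, frozenset({"RB", "GB"})), 3))  # 0.837
print(round(pl(m, frozenset({"GB"})), 3))         # 0.367
```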
Table 2.
Bayesian Probability Example. This example mirrors the Dempster-Shafer (DS) example in
Table 1 as closely as possible for comparison. Because the evidences are direct observations of the priors, the likelihood is 1.0.
| Prior | Likelihood 1 | Posterior | Likelihood 2 | Posterior |
---|---|---|---|---|---|
Red ball | 0.333 | | 0.459 | | 0.623 |
Green ball | 0.333 | | 0.459 | | 0.267 |
Red cube | 0.333 | | 0.081 | | 0.110 |
Incorrect shape | | 0.15 | | | |
Correct shape | | 0.85 | | | |
Incorrect color | | | | 0.3 | |
Correct color | | | | 0.7 | |
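The posteriors in Table 2 can be verified with a direct Bayesian update. The mapping of the shape likelihood (0.85 correct) to both balls and the color likelihood (0.7 correct) to both red objects is inferred from the example's setup:

```python
# Uniform priors over the three objects; each update multiplies by the
# likelihood of the observed property and renormalizes.
objects = ["red ball", "green ball", "red cube"]
prior = {o: 1.0 / 3.0 for o in objects}

def update(prior, likelihood):
    post = {o: prior[o] * likelihood[o] for o in prior}
    norm = sum(post.values())
    return {o: p / norm for o, p in post.items()}

# Evidence 1: the shape sensor reports "ball" (correct shape 0.85).
like1 = {"red ball": 0.85, "green ball": 0.85, "red cube": 0.15}
post1 = update(prior, like1)
print(round(post1["red ball"], 3))  # 0.459

# Evidence 2: the color sensor reports "red" (correct color 0.7).
like2 = {"red ball": 0.7, "green ball": 0.3, "red cube": 0.7}
post2 = update(post1, like2)
print(round(post2["red ball"], 3))  # 0.623
```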
Table 3.
Dempster-Shafer Conflicting Example. The combination of highly conflicting data provides non-intuitive results. In this case, although A and C each have a large belief mass in one evidence set, the 0 mass for each of A and C in the other evidence set results in a scenario in which one sensor effectively “votes no” for A and the other sensor “votes no” for C. The result is that all belief mass is given to B when combined. Note that for this simple example, only single options are focal elements of the frame of discernment.
Data Set | A | B | C | (A,B) | (A,C) | (B,C) | (A,B,C) |
---|---|---|---|---|---|---|---|
Evidence 1 | 0.9 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Evidence 2 | 0.0 | 0.1 | 0.9 | 0.0 | 0.0 | 0.0 | 0.0 |
Combination | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
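The all-mass-to-B outcome in Table 3 follows directly from Dempster's normalization, as this short check shows:

```python
# Only singletons carry mass, so cross products either agree (same option)
# or conflict (different options).
e1 = {"A": 0.9, "B": 0.1, "C": 0.0}
e2 = {"A": 0.0, "B": 0.1, "C": 0.9}

agree = {o: e1[o] * e2[o] for o in e1}  # only B is non-zero: 0.1 * 0.1
conflict = sum(e1[a] * e2[b] for a in e1 for b in e2 if a != b)
print(round(conflict, 6))  # 0.99

# Normalizing the tiny agreement by the tiny non-conflict mass gives B all
# of the belief, despite both sensors assigning B only 0.1.
combined = {o: w / (1.0 - conflict) for o, w in agree.items()}
print(round(combined["B"], 6))  # 1.0
```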
Table 4.
Dempster-Shafer combination details for two identical input evidence sets with three options each. E1 and E2 are evidence sets one and two, respectively; columns correspond to the focal elements of E1 and rows to those of E2. Each matrix cell shows the destination set, the intersection of the row and column sets, to which the product of the corresponding masses is assigned; a 0 marks a conflicting (empty) intersection whose mass is removed during normalization.
E2 \ E1 | A | B | C | (A,B) | (A,C) | (B,C) | (A,B,C) |
---|---|---|---|---|---|---|---|
A | A | 0 | 0 | A | A | 0 | A |
B | 0 | B | 0 | B | 0 | B | B |
C | 0 | 0 | C | 0 | C | C | C |
(A,B) | A | B | 0 | (A,B) | A | B | (A,B) |
(A,C) | A | 0 | C | A | (A,C) | C | (A,C) |
(B,C) | 0 | B | C | B | C | (B,C) | (B,C) |
(A,B,C) | A | B | C | (A,B) | (A,C) | (B,C) | (A,B,C) |
Table 5.
Expected relationships between the cross-traffic light and the time until green. Given knowledge of the cross-traffic light at an intersection, this table details the expectations of the time until the light changes from red to green for the evaluator to continue through the intersection. Note that any ambiguous sets are removed for ease of description. Those sets can be interpolated from the relationships shown in this table.
Time until Green | Green | Yellow | Red |
---|---|---|---|
Long | 0.9 | 0.1 | 0.0 |
Medium | 0.1 | 0.8 | 0.1 |
Short | 0.0 | 0.1 | 0.9 |
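Interpreted as transition potentials, the rows of Table 5 let a belief over the cross-traffic light be propagated to a belief over the time until green. The example light belief below is a hypothetical input, not a value from the paper:

```python
# Rows: time until green; columns: cross-traffic light state (Table 5).
transition = {
    "Long":   {"Green": 0.9, "Yellow": 0.1, "Red": 0.0},
    "Medium": {"Green": 0.1, "Yellow": 0.8, "Red": 0.1},
    "Short":  {"Green": 0.0, "Yellow": 0.1, "Red": 0.9},
}

# Hypothetical estimated belief over the (unseen) cross-traffic light.
light = {"Green": 0.1, "Yellow": 0.2, "Red": 0.7}

# Propagate: weighted sum of each row by the light belief.
time_until_green = {
    t: sum(row[s] * light[s] for s in light) for t, row in transition.items()
}
print(round(time_until_green["Short"], 3))  # 0.65, dominated by "Red"
```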
Table 6.
Example of why additional weighting schemes are necessary. “Evidence 1” has significantly higher weight than “Evidence 2”. Without an included weighting scheme, the resulting combined data in “Rayleigh” and “Murphy” have a balance in potential solutions between “A” and “C” instead of heavily favoring “A”, which would be expected if only “Evidence 1” were combined through the DS combination methods. The resulting combined data in “Weighted Rayleigh” and “Weighted Murphy” favor “A”, which is expected, since “Evidence 1” favors “A”. Note that the Rayleigh [
28] method is designed to amplify decisions; thus a combination comprised of “Evidence 1” multiple times would be expected to nearly exclusively return “A”.
Data Set | Experience | A | B | C | (A,B) | (A,C) | (B,C) | (A,B,C) |
---|---|---|---|---|---|---|---|---|
Evidence 1 | 30.0 | 0.5 | 0.1 | 0.2 | 0.0 | 0.05 | 0.05 | 0.1 |
Evidence 2 | 0.5 | 0.1 | 0.05 | 0.4 | 0.05 | 0.2 | 0.1 | 0.1 |
Rayleigh | n/a | 0.47 | 6.1 × 10 | 0.24 | 2.6 × 10 | 0.28 | 1.2 × 10 | 2.2 × 10 |
Weighted Rayleigh | 30.5 | 0.91 | 3.8 × 10 | 2.2 × 10 | 1.4 × 10 | 6.9 × 10 | 1.6 × 10 | 2.2 × 10 |
Murphy | n/a | 0.30 | 0.03 | 0.30 | 0.025 | 0.13 | 0.075 | 0.10 |
Weighted Murphy | 30.5 | 0.49 | 0.011 | 0.20 | 8.2 × 10 | 5.2 × 10 | 5.1 × 10 | 0.10 |
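The weighting idea can be sketched with the averaging step used by Murphy-style rules: instead of a plain average of the evidence sets, average them weighted by accumulated experience (the subsequent self-combination step is omitted here). The focal-set labels are shorthand for the table's sets:

```python
# Evidence sets from Table 6, with their experience weights.
focal = ["A", "B", "C", "AB", "AC", "BC", "ABC"]
e1 = dict(zip(focal, [0.5, 0.1, 0.2, 0.0, 0.05, 0.05, 0.1]))   # weight 30.0
e2 = dict(zip(focal, [0.1, 0.05, 0.4, 0.05, 0.2, 0.1, 0.1]))   # weight 0.5

w1, w2 = 30.0, 0.5

# Experience-weighted average of the two mass assignments.
avg = {f: (w1 * e1[f] + w2 * e2[f]) / (w1 + w2) for f in focal}

print(round(avg["A"], 3))           # close to Evidence 1's 0.5, as the
                                    # heavier evidence dominates
print(round(sum(avg.values()), 6))  # still a valid mass assignment
```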
Table 7.
Episodic Learning Test. The following two evidence sets have distinctly different correlations for the two-node network. “Evidence 1” shows a correlation between “Option A” in the parent node and “Option C” in the child node, and “Evidence 2” shows a correlation between “Option B” in the parent node and “Option D” in the child node.
Data Set | Option_A | Option_B | (Option_A, Option_B) | Option_C | Option_D | (Option_C, Option_D) |
---|---|---|---|---|---|---|
Evidence 1 | 0.9 | 0.08 | 0.02 | 0.9 | 0.1 | 0.0 |
Evidence 2 | 0.05 | 0.9 | 0.05 | 0.1 | 0.7 | 0.2 |
Table 8.
Episodic Learning Results. The baseline without episodic learning started with unknown information (all marginal masses in the complete sets). Both evidence sets were added and combined via Murphy’s Rule [
14] into their respective nodes, and the transition potential matrix was updated after each evidence injection. The second test applied the “Evidence 1” sets to the appropriate nodes and subsequently ran the transition potential matrix update algorithm. The node marginals were then reset to the unknown state, and the “Evidence 2” sets were applied to the appropriate nodes. The transition potential matrix update algorithm was again run to incorporate the second episode.
Without Episodic | Option_A | Option_B | (Option_A, Option_B) |
---|---|---|---|
Option_C | 0.552 | 0.455 | 0.821 |
Option_D | 0.448 | 0.329 | 0.179 |
(Option_C, Option_D) | 0.0 | 0.216 | 0.0 |

With Episodic | Option_A | Option_B | (Option_A, Option_B) |
---|---|---|---|
Option_C | 0.727 | 0.0 | 0.233 |
Option_D | 0.129 | 0.703 | 0.336 |
(Option_C, Option_D) | 0.143 | 0.297 | 0.431 |
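As a consistency check on Table 8, each column of a learned transition potential matrix is itself a mass assignment over the child's focal sets and should sum to 1, up to the table's rounding:

```python
# "With Episodic" columns from Table 8 (parent focal set -> child masses).
with_episodic = {
    "Option_A": [0.727, 0.129, 0.143],
    "Option_B": [0.0, 0.703, 0.297],
    "(Option_A, Option_B)": [0.233, 0.336, 0.431],
}

# Each column should be (approximately) normalized.
for parent, column in with_episodic.items():
    print(parent, round(sum(column), 3))  # ~1.0 up to rounding
```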