3.1.3. Comparison to RL and EWA

In RL and self-tuning EWA, attractions at time *t* are a function of attractions at time *t* − 1, and attractions explicitly grow when the strategies they correspond to are valuable to the agent, a process called 'accumulation'. CBL does not explicitly accumulate attractions in this way, and it has no built-in depreciation or accumulation factor such as *φ* or *N*(*t*)/*N*(*t* − 1). However, closer inspection suggests that CBL accumulates attractions implicitly through how it handles cases in memory: as new cases enter memory, payoffs that exceed the aspiration level increase the attraction of the corresponding strategy, which functions much as explicit accumulation does in the other theories. One important difference between the case-based approach to accumulation and the RL/EWA method is dynamic re-weighting: when the current problem (information vector) changes, the *entire memory* is re-weighted by the corresponding similarity values. There is accumulation of a sort, but that accumulation is information-vector dependent. Accumulation through similarity allows CBL to re-calibrate the attractions of strategies based on information in the current and past problem sets.<sup>7</sup>
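The implicit, similarity-weighted accumulation described above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the Gaussian form of the similarity function, its weight `w`, the aspiration level of zero, and the example memory are all assumptions made for concreteness.

```python
from math import exp

def similarity(p, q, w=1.0):
    # Illustrative Gaussian similarity between two information vectors;
    # the functional form and the weight w are assumptions.
    return exp(-w * sum((a - b) ** 2 for a, b in zip(p, q)))

def attraction(problem, strategy, memory, aspiration=0.0):
    # Each case in memory is (information_vector, strategy, payoff).
    # The attraction of a strategy is the similarity-weighted sum of
    # payoffs net of the aspiration level: the *entire memory* is
    # re-weighted whenever the current `problem` changes.
    return sum(similarity(problem, q) * (payoff - aspiration)
               for q, s, payoff in memory
               if s == strategy)

# Hypothetical memory of past cases.
memory = [((0.0, 0.0), "cooperate", 3.0),
          ((1.0, 1.0), "cooperate", 1.0),
          ((0.0, 0.0), "defect", 0.0)]

# At problem (0, 0) the first case dominates the forecast; at (1, 1)
# the same memory yields different attractions, with no explicit
# accumulation step having run in between.
a_near = attraction((0.0, 0.0), "cooperate", memory)
a_far = attraction((1.0, 1.0), "cooperate", memory)
```

Because the weights depend on the current information vector, "accumulated" attraction is recomputed from the same stored cases each period, which is the information-vector-dependent accumulation noted in the text.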

Depreciation can also be modeled in a natural way in CBL: if time is a characteristic in the information vector, then cases further in the past automatically play a diminishing role in current utility forecasts as they become more dissimilar to the present.
