DecayingLFUCacheExpiry
======================

Introduction
------------

The idea is similar to LFU cache expiry:

http://en.wikipedia.org/wiki/Least_Frequently_Used

Except you add an exponential decay to the access counter: at every request, all the counters are decayed before the new access is incremented. This means the access counters represent the exponentially decaying average access count over the last T interval, where T is the time constant of the exponential decay used. This should prevent the cache-thrashing that scans cause under LRU, while also preventing the cache-sticking that page bursts cause under LFU.

The decay interval could be measured in real time or cache accesses. It's a bit simpler and may work better to use cache accesses, but using time would work better if cache locality patterns tend to depend on real time more than the cache access rate.

Note that this is similar to http://en.wikipedia.org/wiki/Page_replacement_algorithm#Aging, but more generalized.

Implementation
--------------

An inverted priority queue can be used to keep the expiry order. Before any access, all counters are decayed. On a hit, the page's entry is pulled, incremented, and pushed back into the priority queue. On a miss, the lowest count in the queue is pulled out and expired, and the newly loaded page is pushed in with a counter of 1.0. Note that decaying all the counters does not affect their order in the priority queue, since they all decay at the same rate.
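For illustration, here is a minimal runnable sketch (names and API invented here, not from the original page). Python's heapq offers no cheap way to pull an arbitrary entry out of the heap, so instead of pulling and re-pushing on a hit this sketch lazy-deletes the old entry and pushes a fresh one; since every heap entry is decayed by the same factor (the per-access decay factor described just below, with dt=1), the queue order survives the decay step, as noted above::

  import heapq

  class Entry(object):
      """One cached page's access counter; heap order is by count."""
      __slots__ = ('key', 'count', 'valid')

      def __init__(self, key, count):
          self.key, self.count, self.valid = key, count, True

      def __lt__(self, other):
          return self.count < other.count

  class DecayingLFUCache(object):
      """Minimal decaying-LFU sketch; T is the decay time constant
      measured in cache accesses."""

      def __init__(self, size, T=1000.0):
          self.size, self.T = size, T
          self.entries = {}   # key -> current Entry
          self.pages = {}     # key -> cached page data
          self.pqueue = []    # min-heap of Entry objects

      def get(self, key, load):
          # Decay every heap entry, stale ones included, so the heap
          # order is preserved exactly.
          decay = self.T / (self.T + 1.0)
          for e in self.pqueue:
              e.count *= decay
          if key in self.pages:
              # Hit: lazy-delete the old heap entry, push an updated one.
              # Stale entries linger until they surface at the heap root;
              # a production version would compact occasionally.
              old = self.entries[key]
              old.valid = False
              entry = Entry(key, old.count + 1.0)
          else:
              # Miss: expire the lowest live count, then load the page.
              if len(self.pages) >= self.size:
                  victim = heapq.heappop(self.pqueue)
                  while not victim.valid:
                      victim = heapq.heappop(self.pqueue)
                  del self.pages[victim.key]
                  del self.entries[victim.key]
              self.pages[key] = load(key)
              entry = Entry(key, 1.0)
          self.entries[key] = entry
          heapq.heappush(self.pqueue, entry)
          return self.pages[key]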

Note that the decay operation is::

  value = T/(T+dt) * value

where dt is the interval (in accesses or real time) since the last decay was calculated. If you are measuring the interval in accesses and updating every access, then dt=1. You can also just halve the value every T interval, since T/(T+T) = 1/2. Note that T/(T+dt) only approximates a true exponential decay exp(-dt/T) when dt is small compared to T, so decaying every access and halving every T interval agree only roughly.
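As a quick numeric check of that formula (an illustrative snippet)::

  def decayed(value, T, dt):
      """Decay value over an interval dt with time constant T."""
      return T / (T + dt) * value

  # One deferred step over a whole T interval exactly halves the value:
  print(decayed(1.0, T=100.0, dt=100.0))   # 0.5

  # Decaying every access compounds to ~1/e (~0.37) after T accesses,
  # so per-access decay and halving every T only roughly agree:
  value = 1.0
  for _ in range(100):
      value = decayed(value, T=100.0, dt=1.0)
  print(value)                              # ~0.37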

To avoid decaying all the counters every access, it is possible to also keep a "time" with every access counter that records when it was last accessed, and the decay calculation can be deferred until that node is updated/compared. Note that the comparison function used in the priority queue needs to calculate the decayed value for comparing. This doubles the metadata size, makes the O(log(N)) pqueue insert/remove operations a bit more expensive, but removes the O(N) count decaying step.
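A sketch of such a deferred-decay entry (illustrative names; it uses a true exponential factor exp(-dt/T) rather than T/(T+dt), since the exponential form makes comparisons come out the same no matter when the priority queue evaluates them)::

  import math

  class Clock(object):
      """Shared access counter and decay time constant."""
      def __init__(self, T):
          self.now, self.T = 0, float(T)

  class DeferredEntry(object):
      """Counter plus the "time" it was last touched; the decay is only
      computed when the entry is compared or updated."""
      __slots__ = ('key', 'count', 'time', 'clock')

      def __init__(self, key, clock):
          self.key, self.clock = key, clock
          self.count, self.time = 0.0, clock.now

      def value(self):
          dt = self.clock.now - self.time
          return self.count * math.exp(-dt / self.clock.T)

      def hit(self):
          self.count = self.value() + 1.0   # settle the deferred decay
          self.time = self.clock.now

      def __lt__(self, other):              # heapq comparison
          return self.value() < other.value()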

Alternatively, the O(N) decaying does not need to be done at every access. Instead it can be done periodically, every fixed number of accesses or time interval. Note that this means you lose "time resolution" differences between accesses within the same "decay interval", so you want to keep the interval small relative to the cache churn rate.
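For example, assuming Entry objects like those in the sketch above, the batched decay is just one bigger step covering the whole interval::

  def periodic_decay(entries, T, interval):
      """Apply one decay step covering `interval` accesses at once.
      Uniform scaling leaves the priority-queue order unchanged."""
      factor = T / (T + interval)
      for e in entries:
          e.count *= factor

  # Called once every `interval` accesses instead of every access:
  # if accesses % interval == 0: periodic_decay(pqueue, T, interval)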

The counter could be implemented as a float, or as a "fixed point" integer. Assuming the fixed point is after 8 bits, you would increment it by 1<<8 = 256 for each access, and decaying could be implemented by a right-shift every T interval. You need to be careful not to overflow the value, making sure the decay rate and updates are sufficient to keep it within the int range. Since each T interval you "shave off a bit" doing the right-shift, you need to keep T smaller than (maxint>>8)/2 accesses.
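A worked version of the fixed-point arithmetic and the overflow bound (illustrative, assuming a 31-bit signed counter)::

  FIXED_ONE = 1 << 8        # 1.0 in fixed point with 8 fraction bits

  count = 0
  count += FIXED_ONE        # each access adds 256
  count >>= 1               # halving every T interval is a right-shift

  # Overflow bound: just after a shift the counter is at most maxint/2,
  # and the next T accesses add at most T*256, so we need
  # T*256 <= maxint/2, i.e. T <= (maxint >> 8) / 2.
  maxint = 2**31 - 1
  print((maxint >> 8) // 2)  # 4194303 accesses for a 31-bit counter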

The priority queue doesn't need to be in the correct order until you need to expire an old entry. This means you can defer updating it until there is a cache miss. You could just O(N) heapify it on each cache miss, but if you are going to do that you might as well lose the priority queue and just do an O(N) find-min. Alternatively you could keep a small list of "dirty" entries that have updated values but have not yet been adjusted in the priority queue, and adjust them all on a cache miss. This avoids the O(log(N)) pqueue insert/delete operations for duplicate hits between cache misses.
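Sketched as hypothetical fragments that would slot into the DecayingLFUCache above (the pending dict of deferred increments is new here): a hit is just a dict update, and the heap is repaired in bulk on the next miss::

  def on_hit(self, key):
      """Record the hit but defer the O(log(N)) heap fix-up."""
      self.pending[key] = self.pending.get(key, 0.0) + 1.0

  def fix_dirty(self):
      """Call on a miss, before popping the minimum: fold the pending
      increments in by lazy-deleting and re-pushing each dirty entry."""
      for key, bump in self.pending.items():
          old = self.entries[key]
          old.valid = False                     # stale heap position
          entry = Entry(key, old.count + bump)
          self.entries[key] = entry
          heapq.heappush(self.pqueue, entry)
      self.pending.clear()

Note the deferred increments are not decayed while they sit in pending, which slightly overweights hits that happened well before the miss; this is the same kind of time-resolution trade-off as batching the decay.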

Tuning
------

The T decay rate could be fixed, but it might make sense to have it dynamically adjust based on the performance of the cache. The overall performance of the cache can be calculated as the average of all the access counters. A number below 1.0 suggests very bad performance, with less than 1.0 access per cache entry on average over the last T interval. A number above 1.0 suggests good cache hit rates. Note that the cache performance can be incrementally updated every cache access without having to scan all the cache entries: the average is decayed the same as each counter, on a miss it is decremented by the flushed page's counter, and 1.0 is added for the new page access.
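A sketch of the incremental update, tracking the running decayed sum of all counters (the average is just sum/entries); the attribute names are invented here::

  def track_performance(self, evicted_count=None):
      """Update the decayed sum of all counters for this access;
      pass the (already decayed) counter of a flushed page on a miss."""
      self.counter_sum *= self.T / (self.T + 1.0)  # same decay as counters
      if evicted_count is not None:
          self.counter_sum -= evicted_count        # flushed page leaves
      self.counter_sum += 1.0                      # this access's increment
      return self.counter_sum / max(len(self.pages), 1)  # the average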

Note that dynamically adjusting the T decay time constant messes a bit with deferred decay calculations, since very old pages in the cache are likely to have "decayed" at different rates. It's also possible that the order in the pqueue could be wrong. This means when you do adjust the T time constant, you need to first decay all the values using the old T, or re-heapify all the values using the new T. Note that re-heapifying means the T change is applied retroactively by varying amounts depending on the last update of each value, so decaying all the values is the better option.
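With the deferred-decay entries sketched earlier, that settling step might look like this (hypothetical method)::

  def set_time_constant(self, new_T):
      """Settle all deferred decay at the old T, then switch to new_T,
      so no counter is ever decayed at a mix of the two rates."""
      for e in self.entries.values():
          e.count = e.value()       # apply outstanding decay with old T
          e.time = self.clock.now   # decay from here on uses new_T
      self.clock.T = float(new_T)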

In python::

  class Cache(object):

      def __init__(self, size):
          self.size = size
          self.pqueue = []

etc.
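A hypothetical run of the fuller DecayingLFUCache sketch from the Implementation section, showing a frequently hit page surviving eviction::

  cache = DecayingLFUCache(size=3, T=100.0)
  load = lambda key: key.upper()        # stand-in page loader
  for key in "a b a c a d a b".split():
      cache.get(key, load)
  assert "a" in cache.pages             # hottest page was kept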