Proposal for improved output cache implementation

Topics: Core, General
Developer
Nov 27, 2014 at 10:07 PM
As discussed in the community meeting on Nov 25, the output cache logic has some issues that we'd all like to address. This is a proposed implementation change, posted here to request feedback from the developers and community. Basically we want to accomplish two things:
  • Multiple requests for the same cache key should not all try to regenerate the same content in parallel - instead one should regenerate for all of them.
  • If possible, existing cached content should be served to the requests that do NOT perform the regeneration, instead of having them block.
We have to consider the following challenges:
  • Depending on the underlying storage provider, expiring and evicting items is often done by the cache itself, not actively by Orchard. Therefore, in order to serve stale content, regeneration must begin before the cached content expires.
  • Serving from cache is done in one method, while caching rendered content is done in another. This makes it difficult to reliably hold a lock for the duration of the request (neither a using statement nor try/finally can span both methods) - what if the request fails in such a way that the second part never executes?
  • The time it takes to render a given item is unknown. Any introduction of pre-fetch or grace times will therefore be arbitrary. Too wide a margin and cached content will be regenerated too often - too narrow a margin and a number of requests will have to either block or regenerate at the same time. Ideally the time spans involved need to be configurable in output cache settings, and whether regeneration takes longer or shorter than expected it all needs to be handled gracefully.
  • Web farms. The cache storage might be distributed/synchronized across farm nodes, but thread synchronization primitives are not. Therefore, we must either use database transactions for cross-farm synchronization, or accept that multiple farm nodes will race to regenerate the same content. I believe the latter is an acceptable trade-off, and should be treated as a benign race condition.
Active pre-fetching is IMHO unnecessarily complicated.

Instead I propose a configurable grace time (exposed in output cache settings UI). It would work something like the following.

A request for a given item always starts with generating a cache key. After that there are basically 6 possible scenarios/code paths for any given request.

Scenario 1A: Request does NOT find the cache key in the cache. Request successfully acquires a lock for the cache key (lock objects are stored in a static concurrent dictionary using the cache key as a key) and proceeds to regenerate the content. When regeneration is completed request writes the newly generated content to the cache and releases the lock for the cache key. It serves the generated content as the response.
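
To make scenario 1A a bit more concrete, here is a rough sketch. It is written in Python purely for brevity (the real thing would of course be C# inside Orchard), and the names lock_for and serve_on_miss, as well as the plain dict standing in for the cache, are illustrative assumptions and not existing Orchard APIs. Note that the sketch acquires and releases the lock in a single function, which the actual two-method filter cannot do.

```python
import threading

# Hypothetical lock registry: one lock object per cache key.
# A guard lock makes "get or create" atomic, which is the job a
# concurrent dictionary would do in the proposal.
_locks = {}
_locks_guard = threading.Lock()


def lock_for(cache_key):
    """Return the lock for cache_key, creating it on first use."""
    with _locks_guard:
        return _locks.setdefault(cache_key, threading.Lock())


def serve_on_miss(cache, cache_key, render):
    """Scenario 1A: cache miss, lock acquired, regenerate and store.

    Returns None when the lock is already held by another request,
    in which case scenarios 1B/1C apply instead.
    """
    lock = lock_for(cache_key)
    if not lock.acquire(blocking=False):
        return None
    try:
        content = render()          # regenerate the content
        cache[cache_key] = content  # write it to the cache
        return content              # serve it as the response
    finally:
        lock.release()
```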

Scenario 1B: Request does NOT find the cache key in the cache. It tries to acquire a lock for the cache key, but the lock is held by another thread, which means the content is already in the process of being regenerated. Request blocks waiting for the lock, with a timeout of 20 seconds (this timeout is only a safeguard against complete deadlocks, and this timespan might also be configurable). The lock is acquired within the timeout, the request checks the cache again and this time finds the content, and serves it as the response.

Scenario 1C: Same as 1B above, but if the lock acquisition timeout expires, or if the content cannot be found in cache after lock is acquired, request proceeds to regenerate the content, writes it to the cache and serves it as the response.
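
Scenarios 1B and 1C could look roughly like the following. Again this is only a Python sketch under assumptions: serve_after_wait is a made-up name, the cache is a plain dict, and the lock is passed in by the caller. The 20 second default is the value suggested above.

```python
import threading

LOCK_TIMEOUT_SECONDS = 20.0  # safeguard value suggested in the proposal


def serve_after_wait(cache, cache_key, lock, render,
                     timeout=LOCK_TIMEOUT_SECONDS):
    """Cache miss while another request holds the lock.

    1B: the lock is acquired within the timeout and the content is now
        in the cache, so it is served from there.
    1C: the wait timed out, or the content is still missing after the
        lock was acquired, so this request regenerates it itself.
    """
    acquired = lock.acquire(timeout=timeout)
    try:
        if acquired:
            content = cache.get(cache_key)  # check the cache again
            if content is not None:
                return content              # scenario 1B
        content = render()                  # scenario 1C
        cache[cache_key] = content
        return content
    finally:
        # Only release a lock this request actually holds.
        if acquired:
            lock.release()
```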

Scenario 2: Request FINDS the cache key in the cache, and sees that the item will NOT expire within the grace time. Request serves the cached content as the response.

Scenario 3A: Request FINDS the cache key in the cache, and sees that the item will expire within the grace time. Request successfully acquires a lock for the cache key and proceeds to regenerate the content. When regeneration is completed request writes the newly generated content to the cache (overwriting any existing item that might be there because of benign race conditions) and releases the lock for the cache key. It serves the generated content as the response.

Scenario 3B: Request FINDS the cache key in the cache, and sees that the item will expire within the grace time. It tries to acquire the lock but fails, and concludes the item must therefore be in the process of being regenerated by another request already, and serves the cached content as the response.
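
The three cache-hit paths (2, 3A and 3B) boil down to a grace-time check plus a non-blocking lock attempt. A minimal Python sketch follows. CacheItem, serve_on_hit, grace_seconds and ttl_seconds are all hypothetical names, and the sketch assumes the cached item carries its own expiry time, which depends on the storage provider.

```python
import threading
import time
from dataclasses import dataclass


@dataclass
class CacheItem:
    content: str
    expires_at: float  # same clock as time.monotonic()


def serve_on_hit(cache, cache_key, lock, render,
                 grace_seconds, ttl_seconds, now=None):
    """Handle a request that finds its cache key in the cache."""
    now = time.monotonic() if now is None else now
    item = cache[cache_key]

    # Scenario 2: the item will not expire within the grace time.
    if item.expires_at - now > grace_seconds:
        return item.content

    # Scenario 3B: someone else is already regenerating, so serve
    # the existing (soon to expire) content without blocking.
    if not lock.acquire(blocking=False):
        return item.content

    # Scenario 3A: this request regenerates and overwrites the item.
    try:
        fresh = CacheItem(render(), now + ttl_seconds)
        cache[cache_key] = fresh
        return fresh.content
    finally:
        lock.release()
```

The point of the non-blocking attempt is that only one request per node pays the rendering cost, while everyone else keeps getting a response immediately.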

Since cached content is served from one method (OnActionExecuting), while caching rendered content is done in another (OnResultExecuted), it also follows that the acquisition of locks must occur in one method while the release of locks must occur in another. There is an apparent risk that the latter might not execute, for whatever reason. We have investigated this a bit:
  • Empirically it seems that wherever we can think to throw an exception, OnResultExecuted always executes. I imagine the only exception to this is crashes at the ASP.NET or IIS level, in which case the AppPool will probably recycle anyway, and all locks would be automatically released.
  • As a safeguard, lock acquisition is always done with a timeout (20 seconds seems reasonable). This allows the application to continue serving a piece of content normally, even in the unlikely event of an orphaned lock for that content. The only cost is that, once that particular item has expired from the cache, regeneration will take 20 seconds longer for as long as the lock remains orphaned, that is, until it is released or the AppPool is recycled.
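
To illustrate the split between the two methods and the effect of an orphaned lock, here is a small Python stand-in for the filter. The class OutputCacheFilter, the dict-based request and the held_lock entry are invented for this sketch and do not mirror the actual ASP.NET MVC filter signatures. It only shows the idea of remembering the lock in the first half and releasing it in the second, with the timeout as the fallback.

```python
import threading


class OutputCacheFilter:
    """Hypothetical two-method filter: acquire in one, release in the other."""

    def __init__(self, lock_timeout=20.0):
        self.cache = {}
        self.locks = {}
        self.guard = threading.Lock()
        self.lock_timeout = lock_timeout

    def _lock_for(self, key):
        with self.guard:
            return self.locks.setdefault(key, threading.Lock())

    def on_action_executing(self, request):
        """First half: serve from cache, or take the lock before rendering."""
        key = request["cache_key"]
        if key in self.cache:
            request["response"] = self.cache[key]
            return
        lock = self._lock_for(key)
        # The timeout bounds the wait if an earlier request orphaned the lock.
        if lock.acquire(timeout=self.lock_timeout):
            request["held_lock"] = lock  # remembered for the second half
            if key in self.cache:
                request["response"] = self.cache[key]

    def on_result_executed(self, request, rendered):
        """Second half: store the rendered output and release the lock."""
        lock = request.pop("held_lock", None)
        try:
            if "response" not in request:
                self.cache[request["cache_key"]] = rendered
                request["response"] = rendered
        finally:
            if lock is not None:
                lock.release()
```

If a request takes the lock and never reaches the second half, the next request for the same key waits out the timeout, regenerates without the lock, and the content is cached and served normally from then on.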