For your review: An overview of the various forms of Orchard caching

Topics: Customizing Orchard, General, Installing Orchard
Developer
Mar 16, 2013 at 5:26 PM
Edited Mar 16, 2013 at 7:43 PM
Hi guys,

For my own understanding I decided to try and grasp the various forms of Orchard caching and how they relate to one another and to the underlying platform. After spending a few hours digging through blog posts and source code, I decided to write down my conclusions.

I'd like to know, does this correspond to your understanding? Have I interpreted the caching landscape correctly? Any feedback would be very much appreciated!

First of all, please take a look at the diagram: Orchard caching architecture

Orchard has two built-in forms of caching: application caching and NHibernate second-level caching:
  1. Application caching is part of the core, and it is used by Orchard Core and all modules that have a caching need. Storage is provided by the abstraction ICacheHolder. There is a default implementation that simply stores cached data in memory.
  2. NHibernate second-level caching is enabled by the SysCache module, which configures NHibernate to use the SysCache2 provider for its second-level cache. The SysCache2 provider uses ASP.NET Application Caching under the hood.
Additionally, a third form of caching can be enabled by the Contrib.Cache module written by Sebastien. This module adds an Orchard equivalent of ASP.NET Output Caching. Storage is provided by the abstraction IOutputCacheStorageProvider. There is a default implementation shipped with the module that uses ASP.NET Application Cache under the hood.

ASP.NET provides two forms of caching: output caching and application caching. As mentioned, ASP.NET Application Caching is used by default for both NHibernate second-level caching and Orchard output caching. ASP.NET Output Caching, however, is never used in an Orchard site.

In the default caching scheme outlined above, all cached data eventually ends up in local memory, which is less than ideal if you’re running your site in a web farm. In such a scenario it is desirable to enable some sort of distributed cache. To enable distributed caching scenarios, there is the Contrib.Cache.Memcached module, also written by Sebastien. This module provides two new implementations of ICacheHolder and IOutputCacheStorageProvider respectively. The former is actually used by the latter, and effectively makes sure that all Orchard output caching and application caching is done in a Memcached service of your choice (this can be configured in the site settings for the Contrib.Cache.Memcached module).

One implementation of Memcached is Windows Azure Cache, which can be enabled in an Azure Cloud Service. Windows Azure Cache provides a Memcached shim that is installed locally in your role instances. This means you can configure the Contrib.Cache.Memcached module to work against a Memcached service running on localhost. The local Memcached shim integrates with the managed Windows Azure Cache client assembly, which in turn talks to the Windows Azure Cache service.

(As a side note, Windows Azure Cache is also wire-level compatible with Memcached, so using the local shim is optional but highly recommended because of some intricate details involving differences in how hash keys are generated in Memcached vs. Windows Azure Cache. Without the shim extra network hops are necessary for cache items to end up on the right cache cluster node, which is bad for performance. Also some of the more advanced caching features are only available when using the local shim.)

The Contrib.Cache.Memcached module does not, however, address the need to direct the NHibernate second-level cache to a distributed cache. For this purpose, there is another third-party module named Webmoco.AzureMemcached, which provides an alternative NHibernate caching provider that also uses Memcached as its underlying storage. This module is preconfigured to automatically cache to a locally running Memcached shim, but could easily be modified to cache to any Memcached service.

Does the above sound about right?
Coordinator
Mar 17, 2013 at 12:05 AM
Edited Mar 17, 2013 at 12:46 AM
\bin\orchard.exe
brain dump cache

You made a mistake, which is to think that Contrib.Cache.Memcached contains a custom implementation for ICacheHolder. You also missed something: there is another set of caching modules which will be part of 1.7 but is already available on Bitbucket and used in production by some. It's Orchard.Caching. But let me explain the whole picture ...

Here are the different levels of caching that Orchard can provide:

Application settings cache using the built in ICacheManager.

This is used to store application settings and can be invalidated based on an extensible set of parameters. By default you find expiration tokens based on time, file system and signals. This is a very powerful caching API but has one drawback: it is not meant to provide farm invalidation, because it has not been designed for this purpose and should not be used for data which is volatile. Using it for settings is totally fine, like for Content Types, module settings, ... because those values must not change on a production farm. You never create a content type in production on a farm; if you do, you have to restart all nodes one after the other.
Another reason to use it (and the reason it was built this way) is that it does not depend on memory pressure, so entries won't be removed when your system's memory consumption grows, as opposed to the ASP.NET cache. All the other cache providers do depend on memory pressure and must respect memory pressure limits.
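To illustrate the token idea described above, here is a minimal sketch in Python. It is not Orchard's code (Orchard is written in C#); the names `Signals`, `after` and `TokenCache` are made up for this illustration, and it only assumes what is said above: an entry stays cached until one of the tokens it registered (time or signal) reports that it has expired, and nothing is evicted because of memory pressure.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Token = Callable[[], bool]  # returns True while the cached value is still current


class Signals:
    """Hypothetical signal registry: named flags that code can trigger."""

    def __init__(self) -> None:
        self._versions: Dict[str, int] = {}

    def trigger(self, name: str) -> None:
        self._versions[name] = self._versions.get(name, 0) + 1

    def when(self, name: str) -> Token:
        # The token is current until the signal is triggered again.
        seen = self._versions.get(name, 0)
        return lambda: self._versions.get(name, 0) == seen


def after(seconds: float, clock: Callable[[], float] = time.monotonic) -> Token:
    # The token is current until the given number of seconds has elapsed.
    deadline = clock() + seconds
    return lambda: clock() < deadline


class TokenCache:
    """In-process cache: an entry is recomputed once any of its tokens expires."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[Any, List[Token]]] = {}

    def get(self, key: str, acquire: Callable[[Callable[[Token], None]], Any]) -> Any:
        entry = self._entries.get(key)
        if entry is not None and all(token() for token in entry[1]):
            return entry[0]
        tokens: List[Token] = []
        value = acquire(tokens.append)  # the factory registers its tokens here
        self._entries[key] = (value, tokens)
        return value
```

Note that both the entries and the tokens live in the memory of one process, which is why this style of cache has no built-in way to tell other nodes in a farm that something changed.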

2nd level NHibernate caching.

This is used to prevent recurring SQL queries to the database. Because the accessors are simple and well defined (checking a string in the dictionary) it's safe to use it on a farm as long as the data is stored in a central repository, like using Memcached. I tried to implement a simple Memcached provider for it but I failed as there is a tricky situation as you need to configure the provider (location, port), and usually the settings are best placed in the database, but it's the chicken and egg issue and you can't bootstrap it from a module. The only solution is then to have the configuration for it inside the web.config, or maybe the settings.txt when it will be extensible.

Output caching using the Contrib.Cache suite.

Here is the set of related modules:
The goal of this module is to provide output caching like ASP.NET does, and to provide cache headers management (max-age, Cache-Control, ETag). I recently extended it so that the storage mechanism for the cache data can be defined dynamically, as distributed setups were becoming more widely used. This is why there are two distributed storage providers, one based on the database and the other one based on Memcached.

Not a single Orchard website should go into production without this module. Not only does it improve responsiveness but also throughput, and finally it frees your CPU from unneeded cycles. Using the Max Age setting you also enable IIS Kernel caching plus public proxy cache, which makes your application even faster. You can get thousands of requests per second with a very small server.

Business data caching using the Orchard.Caching suite.

Because of the limitations of the ICacheManager in terms of distributed caching, I have decided that another set of modules was necessary to cache business data which has to be shared across servers. This module can set and get entries by a key only, and invalidate by name or time. This is the only requirement for storage providers in this module, which allows its usage on farms.
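As a rough sketch of how small that storage contract is, here is a Python illustration. The class and method names are hypothetical and are not the Orchard.Caching API; the only assumption taken from the text above is that entries are set and fetched by key, and removed by key or by time.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class KeyValueCache:
    """Hypothetical storage contract: put/get/remove by key, optional time-to-live."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[Any, Optional[float]]] = {}

    def put(self, key: str, value: Any, ttl_seconds: Optional[float] = None) -> None:
        expires = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._items[key] = (value, expires)

    def get(self, key: str) -> Any:
        item = self._items.get(key)
        if item is None:
            return None
        value, expires = item
        if expires is not None and self._clock() >= expires:
            # Expired by time: drop the entry and report a miss.
            del self._items[key]
            return None
        return value

    def remove(self, key: str) -> None:
        # Invalidate by name; removing a missing key is not an error.
        self._items.pop(key, None)
```

Here the dictionary stands in for the backing store. Because nothing beyond these three operations is required, the same contract could be served by an external store such as Memcached, which is what makes it usable on a farm.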

Why Memcached?

Implementing Memcached providers by default is done for a specific reason, which is that Azure Caching Services are binary compatible with it. So this implementation works by default on both custom Memcached servers and also Azure services.

NB: Anyone, feel free to copy/paste this information, and add more links or setup instructions, to the documentation.
Coordinator
Mar 17, 2013 at 1:07 AM
Here: http://docs.orchardproject.net/Documentation/Caching

Not sure why you didn't do it yourself ;) You already had the Markdown...
Coordinator
Mar 17, 2013 at 1:58 AM
Let's say I'll answer more questions like this and you will gather those answers into official doc topics.
Developer
Mar 17, 2013 at 5:41 AM
Edited Mar 17, 2013 at 5:47 AM
Wow... thank you! That's exactly the kind of overview I was trying to achieve. Maybe I should have just asked you. But then again, what's really baking my cookie is would I have gotten such an extensive response without writing an incorrect one myself first? :)

A few follow-up questions if you don't mind:
You made a mistake, which is to think that Contrib.Cache.Memcached contains a custom implementation for ICacheHolder.
Yeah I realized that one a few hours after posting... I was fooled by the fact that there is a class named *CacheHolder in this module too.
Application settings cache using the built in ICacheManager
This is used to store application settings and can be invalidated based on an extensible set of parameters. By default you find expiration tokens based on time, file system and signals. This is a very powerful caching API but has one drawback: it is not meant to provide farm invalidation, because it has not been designed for this purpose and should not be used for data which is volatile.
I'm not clear on who's the chicken and who's the egg here. ICacheManager does not support farm invalidation, because it was designed for things like module settings, because those values must never change. Because if you change them you have to restart all your nodes. Because they use ICacheManager. And ICacheManager does not support farm invalidation. And so on and so forth. Circular causality?

I'm also not clear on why farm invalidation is needed. If you use a distributed cache as your underlying storage, isn't invalidation basically a non-issue? Any changes to cached entries are immediately seen by all other nodes in the farm. So it's enough if the invalidation occurs on one node right? An expiration token triggers on one of the nodes, the entry is removed from the distributed cache, and all is good?

Assuming most modules use this API for caching their settings, I am starting to see this is perhaps a problem for farm scenarios. While I agree that content types should not be changed haphazardly, it is easy to imagine lots of module settings changes that need to happen simultaneously on all nodes across a farm, and having to redeploy your entire service for that to happen seems excessively disruptive for a simple settings change.
2nd level NHibernate caching
I tried to implement a simple Memcached provider for it but I failed as there is a tricky situation as you need to configure the provider (location, port), and usually the settings are best placed in the database, but it's the chicken and egg issue and you can't bootstrap it from a module. The only solution is then to have the configuration for it inside the web.config, or maybe the settings.txt when it will be extensible.
Good point. The Webmoco module solves this by not making it configurable at all - it simply relies on there being a local Memcached shim on a predetermined port.

Making Settings.txt extensible would be the best solution to this I think. As a side note, I have written an IShellSettingsManager implementation that reads Settings.txt values from Azure Role configuration. I did this for two reasons:
  1. The provided AzureShellSettingsManager reads the Settings.txt file from blob storage, which is cool in a way, but means you can't fully use VIP swapping between staging and production if staging and production use different databases. As soon as your staging instance is swapped into production it gets the production hostname, and therefore also the production site name and its Settings.txt. (Can be solved by using separate storage accounts I guess.)
  2. Reading Settings.txt from blob storage means role instances can't automatically pick up changes. With my implementation they do pick up the changes as soon as you change a role configuration setting through the Azure admin portal.
Output caching using the Contrib.Cache suite
Not a single Orchard website should go into production without this module.
Totally agree. It seems like a very well built module too. Are there any plans to bring it into core for 1.7? If so perhaps it should be renamed as Orchard.OutputCaching?
Business data caching using the Orchard.Caching suite
Because of the limitations of the ICacheManager in terms of distributed caching, I have decided that another set of modules was necessary to cache business data which has to be shared across servers. This module can set and get entries by a key only, and invalidate by name or time. This is the only requirement for storage providers in this module, which allows its usage on farms.
When and for what should a module use this type of caching? Module settings? Data brought in from other sources besides the database (since the database is already cached via NHibernate)? Will any of the core modules be modified to use it for 1.7?
Developer
Mar 17, 2013 at 5:44 AM
BertrandLeRoy wrote:
Here: http://docs.orchardproject.net/Documentation/Caching
Superb! Would you mind also adding a link to it from the start page TOC?
Coordinator
Mar 17, 2013 at 6:13 AM
Sure, done. Please feel free to update the new topic with any changes that may be useful.
Coordinator
Mar 18, 2013 at 6:18 PM
chicken and egg explanation:
If you use the Memcached module you need to configure it by setting the IP and the port (127.0.0.1 11211). This configuration goes into the database because it is tenant-specific, and we need to be able to change it from the dashboard. When Orchard starts, it configures the NHibernate database providers, and if Memcached is enabled, it will be used. To load the settings it has to run a query, which means initializing the database configuration, but wait, half of that configuration (the Memcached settings) is in the database. So using Memcached as a 2nd level cache provider is hard in Orchard. The only solution right now would be to use custom web.config settings. It's ugly in Orchard terms but it will work. Maybe the module could look for the settings in web.config first, then in the database settings if it can't find them.
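The "web.config first, then database" lookup could look roughly like the Python sketch below. The setting keys (`Memcached.Host`, `Memcached.Port`, `Host`, `Port`) and the function name are invented for this illustration and are not taken from any real module. The database settings are passed in as a callable so that nothing is queried when the file-based settings are complete, which is the case that sidesteps the bootstrap problem.

```python
from typing import Callable, Mapping, Optional, Tuple

Endpoint = Tuple[str, int]


def resolve_memcached_endpoint(
    app_settings: Mapping[str, str],
    load_db_settings: Callable[[], Optional[Mapping[str, str]]],
) -> Optional[Endpoint]:
    """Prefer file-based settings; fall back to the database only if they are missing."""
    host = app_settings.get("Memcached.Host")  # hypothetical key
    port = app_settings.get("Memcached.Port")  # hypothetical key
    if host and port:
        return host, int(port)

    # Fallback: this is only viable once the database can be reached
    # without the cache provider already being configured.
    db_settings = load_db_settings()
    if db_settings and db_settings.get("Host") and db_settings.get("Port"):
        return db_settings["Host"], int(db_settings["Port"])
    return None
```

With complete file-based settings the database loader is never invoked; with none at all the function returns `None`, and the caller would presumably leave the second-level cache disabled.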

farm invalidation:
With the cache API, if a token has expired it recalculates the value. How does the other token on the other node know that the value has been invalidated?

Orchard.Caching:
A typical example of "business" data: you are calling a web service which takes a long time to compute its result, and you want to distribute the cached value.
Developer
Mar 18, 2013 at 9:45 PM
Sebastien,

You explained the wrong chicken and egg problem - the NHibernate problem I fully understand, and as I noted, the Webmoco module works around this by not making it configurable at all. What I questioned was the circular causality you described for the ICacheManager API.

Farm invalidation: because, if you use a distributed cache, there is only ever one value. The memory is shared. If one node recalculates the value, that's enough. So my point is, if you use a distributed cache as the underlying store for ICacheManager, farm invalidation is not necessary.
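To make the two positions concrete, here is a toy Python simulation. It is purely illustrative: `Node` and `simulate` are made-up names, a plain dictionary stands in for the cache storage, and it does not model how ICacheManager actually holds its tokens. It compares per-node storage with a single shared store when only node A is told about a change.

```python
from typing import Callable, Dict, Tuple


class Node:
    """One web server; `store` is whatever backs its cache."""

    def __init__(self, store: Dict[str, str]) -> None:
        self.store = store

    def get(self, key: str, compute: Callable[[], str]) -> str:
        if key not in self.store:
            self.store[key] = compute()
        return self.store[key]

    def invalidate(self, key: str) -> None:
        self.store.pop(key, None)


def simulate(shared: bool) -> Tuple[str, str]:
    """Change a setting, invalidate on node A only, then read from both nodes."""
    database = {"site-name": "old"}
    read = lambda: database["site-name"]

    common: Dict[str, str] = {}
    node_a = Node(common if shared else {})
    node_b = Node(common if shared else {})

    # Both nodes warm their caches with the original value.
    node_a.get("site-name", read)
    node_b.get("site-name", read)

    # The setting changes, and only node A learns about it.
    database["site-name"] = "new"
    node_a.invalidate("site-name")

    return node_a.get("site-name", read), node_b.get("site-name", read)
```

`simulate(shared=False)` returns `("new", "old")`: node B keeps serving the stale value, which is the farm invalidation problem. `simulate(shared=True)` returns `("new", "new")`, which is the case argued above where a single invalidation is enough.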
May 15, 2013 at 4:54 PM
Edited May 15, 2013 at 5:39 PM
Has anybody implemented the Memcached extension on Azure Cache Service?
Sebastien, if I understand what you are explaining, I install your https://bitbucket.org/sebastienros/orchard.caching.memcached module but use the Azure Cache API in place of Memcached?
Could you elaborate with some instructions? :)