Synchronization Issues for Cloud deployment with multiple instances

Topics: Core, General, Installing Orchard
Jan 15, 2014 at 10:36 AM
I deployed a multi-instance installation of Orchard using Orchard.Azure.sln. I have the following modules enabled:

Windows Azure Media Storage
Windows Azure Database Cache
Windows Azure Output Cache

Content is being correctly stored to an Azure SQL database and media to Azure BLOB storage.
Everything works fine when only one instance is running. However, when I scale up to two instances I get synchronization issues. For example, if I create a new content type, it is displayed only intermittently in the Dashboard, depending on which instance is hit.

The only way I have found to rectify this is to recycle the application pools on both instances - not really a workable solution.

I could also scale down to a single instance to add new content types and then scale up again - a better solution but not suitable in a production environment.

The Windows Azure Database Cache isn't behaving how I expected - or am I missing some configuration? This module implements an NHibernate second-level cache provider targeting Windows Azure Cache. In the Orchard Core/Framework, is there "first-level" caching going on?

I would be grateful for any ideas how to fix this issue.
Jan 15, 2014 at 9:30 PM
I too have been tormented by this issue with multi-instance deployments.

I'm very interested in trying to understand and fix it. I'm the one who wrote the new Windows Azure cache implementations, but while I have a fairly deep understanding of the Windows Azure Cache service, I have very little insight into NHibernate and how its caching model works.

I'm fairly confident that the Windows Azure cache provider does its thing properly, but I am less than convinced that Orchard/NHibernate actually invalidates or updates the cache whenever something is changed at the DB layer. I'm also highly suspicious that there might be more layers of caching going on within Orchard (above the actual second-level NHibernate caching) for certain things, and that those other layers of caching are probably instance-local only.

I've mostly been seeing these issues when enabling/disabling features. The DB is updated (and presumably the NHibernate second-level cache also) but the other instances don't seem to pick up the change, which would indicate Orchard caches feature state in a higher level cache somewhere.

Perhaps someone with a deeper understanding of other forms of Orchard caching might shed some light?
Jan 18, 2014 at 7:11 PM
Edited Jan 18, 2014 at 7:14 PM
When I was using SQL Azure with Orchard, I noticed that the second-level NHibernate cache was causing problems.
Sebastien has updated the NHibernate libraries used by Orchard to use the Transient Library, which is better adapted to SQL Azure (it automatically manages the retries SQL Azure asks for), but incidentally I noticed that the NH-SQL Azure library included with Orchard is missing one of its dependencies (see Issues, where I reported this)... a never-ending story...
Because of all these NHibernate problems I stopped using SQL Azure with Orchard, and now keep it for EF or direct ADO, where it is very performant and resilient.

Concerning Azure, I also noticed that some small parts of the Orchard code are not adapted to Azure Web Services, such as this one, which I think is a problem on Azure.
Feb 2, 2014 at 12:10 PM
@kevinboldy: I did some digging and it didn't take long to figure out why this is happening. Feature states are read through IExtensionManager and the default implementation relies heavily on the ICacheManager caching framework, which is always instance-local and basically just uses local RAM for storage. Obviously this is a problem in a multi-instance scenario.
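The effect of an instance-local cache can be reproduced in a few lines. This is only a toy model in Python, not Orchard code: `SharedDatabase`, `WebInstance`, and their methods are hypothetical names standing in for the shared Azure SQL database and for a web role instance that keeps cached data in its own RAM.

```python
class SharedDatabase:
    """Hypothetical stand-in for the database shared by all instances."""

    def __init__(self):
        self.content_types = {"Page"}


class WebInstance:
    """Hypothetical web role instance with a cache held in its own memory."""

    def __init__(self, db):
        self.db = db
        self._cache = None  # instance-local: other instances never see it

    def list_content_types(self):
        # Read through the local cache; hit the database only on a miss.
        if self._cache is None:
            self._cache = set(self.db.content_types)
        return self._cache

    def create_content_type(self, name):
        self.db.content_types.add(name)
        self._cache = None  # clears the cache of THIS instance only


db = SharedDatabase()
instance_a, instance_b = WebInstance(db), WebInstance(db)

# Warm both local caches, then make a change through instance A.
instance_a.list_content_types()
instance_b.list_content_types()
instance_a.create_content_type("Event")

seen_by_a = instance_a.list_content_types()
seen_by_b = instance_b.list_content_types()
```

After the change, `seen_by_a` contains the new type while `seen_by_b` still holds the old cached set, which matches the "depends on which instance is hit" behaviour described at the top of the thread.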

If I remember correctly from some discussions with Sebastien a while back, the ICacheManager framework for some reason cannot use any distributed cache implementation because of its notion of events or notifications or something like that. Therefore I think the only solution would be to introduce a new implementation of IExtensionManager that instead directly uses Windows Azure Cache, and configure this implementation for use only in Orchard.Azure.Web (i.e. in a cloud service deployment). The issue obviously also applies to Azure Web Sites, but there we can't really preconfigure it since the deployment pipeline is shared with all other hosting environments too.
Feb 2, 2014 at 12:14 PM
You'll see a solution for multi-server setup and ICacheManager, I've done it :-).
Feb 2, 2014 at 12:16 PM
Tell me more please! Skype if you prefer.
Feb 2, 2014 at 12:31 PM
Edited Feb 2, 2014 at 12:34 PM
Actually, on a second look, it seems I was mistaken. The set of loaded features is cached, but the set of enabled features comes from the ShellDescriptor, which ShellDescriptorManager reads directly from the DB and which is never cached anywhere. I'm now wondering whether this might instead be caused by the second-level NHibernate cache not being updated properly when the ShellDescriptorRecord is changed and saved...
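To make that hypothesis concrete, here is a toy Python model of a shared (distributed) cache sitting in front of a database. It is not NHibernate or Orchard code, and whether NHibernate actually behaves this way is exactly the open question; `DescriptorStore` and the `invalidate_on_write` flag are hypothetical names used only for illustration.

```python
class DescriptorStore:
    """Hypothetical repository reading through a cache shared by all instances."""

    def __init__(self, db, shared_cache, invalidate_on_write):
        self.db = db
        self.shared_cache = shared_cache
        self.invalidate_on_write = invalidate_on_write

    def load(self, key):
        # Read-through: populate the shared cache on a miss.
        if key not in self.shared_cache:
            self.shared_cache[key] = self.db[key]
        return self.shared_cache[key]

    def save(self, key, value):
        self.db[key] = value
        if self.invalidate_on_write:
            # Evict the stale entry so the next load goes back to the DB.
            self.shared_cache.pop(key, None)


def run(invalidate_on_write):
    db = {"enabled_features": ("Core",)}
    shared_cache = {}
    store = DescriptorStore(db, shared_cache, invalidate_on_write)
    store.load("enabled_features")                   # warms the shared cache
    store.save("enabled_features", ("Core", "Blog"))
    return store.load("enabled_features")


stale = run(invalidate_on_write=False)
fresh = run(invalidate_on_write=True)
```

If writes do not evict or update the cached entry, every instance reading through the shared cache keeps seeing the old record (`stale`), even though the database row has changed; with eviction on write the next read is correct (`fresh`).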
Feb 2, 2014 at 1:41 PM
In one or two weeks we'll release an open source module that propagates such local changes through a server farm, without modifying or overriding the existing services that use them. We're currently testing it.
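The general idea of propagating local changes through a farm can be sketched as follows. This is a generic illustration in Python and makes no claim about how the announced module is implemented; `FarmBus` and `FarmInstance` are hypothetical names, and the in-process bus stands in for whatever transport a real farm would use.

```python
class FarmBus:
    """Hypothetical in-process stand-in for a channel reaching every instance."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def publish(self, sender, key):
        for handler in self._handlers:
            handler(sender, key)


class FarmInstance:
    """Hypothetical instance whose local cache listens for remote changes."""

    def __init__(self, name, db, bus):
        self.name = name
        self.db = db
        self.bus = bus
        self._cache = {}
        bus.subscribe(self._on_changed)

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self.db[key]
        return self._cache[key]

    def set(self, key, value):
        self.db[key] = value
        self._cache.pop(key, None)         # local invalidation, as before
        self.bus.publish(self.name, key)   # tell the rest of the farm

    def _on_changed(self, sender, key):
        # Drop the local entry when another instance reports a change.
        if sender != self.name:
            self._cache.pop(key, None)


db = {"content_types": ("Page",)}
bus = FarmBus()
node_a = FarmInstance("a", db, bus)
node_b = FarmInstance("b", db, bus)

node_a.get("content_types")
node_b.get("content_types")                        # both caches are warm
node_a.set("content_types", ("Page", "Event"))
seen_by_b = node_b.get("content_types")
```

Because instance B drops its cached entry when A announces the change, its next read goes back to the database and `seen_by_b` includes the new content type. The services that read the cache are untouched; only the invalidation signal is made farm-wide.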
Feb 2, 2014 at 6:59 PM
Edited Feb 2, 2014 at 7:01 PM
From what I understand of this black hole, features are not related to new items.
Features are related to modules and what they contain, and the cache that uses them seems OK, since any change in module definitions is (or could be) monitored.
Content items are related to the 'Content Items filters bus', which is entirely local: it is maintained in a local dictionary for each ContentItem. This is the problem in web farms and multi-role Azure PaaS deployments.
And without any serious schema or documentation about this...
Feb 3, 2014 at 8:06 AM
Sorry CSADNT but, as is so often the case, I don't understand what you're talking about or how it's related to this thread.
Feb 3, 2014 at 11:53 AM
Sorry Decorum if you can't understand. kevinboldy was complaining about problems with new content items being created and not displayed correctly in a web farm, not about new features being enabled or disabled. You are too focused on Azure and Shell features.
Feb 28, 2014 at 3:56 PM
I am running into multi-server caching issues like crazy. Piedone, is there any chance you have a release or source code enlistment for the work you are doing on multi-server setup and ICacheManager handling in a server farm? I was just looking at implementing the "LockingCacheManager" configuration from your HelpfulLibraries module for the most critical areas, where caching is causing big issues and I need a single instance of the cache, but a global solution sounds much more appealing.
Feb 28, 2014 at 4:13 PM
@jao28: you just have to wait until Tuesday :-).
Feb 28, 2014 at 4:17 PM
Oh, that is mean, but it will definitely make me attend the next Orchard meeting. That is the "problem" (and fun) with having more clients: things just get more complex and you need to be able to scale over multiple servers / instances. Looking forward to it!
Mar 11, 2014 at 10:52 PM
And here's the module I hinted at: Distributed Events.