Orchard Performance Question

Topics: General
Feb 2, 2015 at 11:12 PM
Hey team,

I had a question about when it makes sense to add a separate reverse proxy in front of Orchard. Recently a client's traffic scaled up from around 5k users a day to 30k, spiking at times to around 20 requests a second against the Orchard site. I have been running with Orchard.OutputCaching and DB-level caching enabled, and that still could not keep up. Today I turned on the IIS kernel-level cache as well to get some more performance.

My question is: if the kernel-level cache doesn't work, what is the typical next step? The server is a quad core with 8GB RAM. Would I add another server, or should I add a reverse proxy (nginx) to serve cached versions of pages more quickly during high-traffic times? Any advice would be great.
Feb 2, 2015 at 11:19 PM
A separate reverse proxy will always be better than the Orchard cache. There are features you'd miss, like automatic content refresh, but that's a limited scenario.
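For the record, a minimal sketch of what such a proxy in front of Orchard could look like, assuming nginx with proxy_cache (the zone name, paths, ports, and TTLs here are placeholders, not recommendations):

```nginx
# Shared cache: up to 1 GB on disk, entries expire after 10 minutes idle.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=orchard:10m
                 max_size=1g inactive=10m;

server {
    listen 80;
    server_name example.com;               # placeholder

    location / {
        proxy_pass http://127.0.0.1:8080;  # IIS/Orchard behind the proxy
        proxy_cache orchard;
        proxy_cache_valid 200 5m;          # cache successful pages for 5 min
        # Keep serving stale pages while the backend restarts or errors,
        # which also helps with the "can't recover under load" problem.
        proxy_cache_use_stale error timeout updating http_500 http_502;
    }
}
```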

At the same time you might want to check the latest Orchard.Cache source code on 1.x, which has been improved for concurrency scenarios; Daniel did a great job on this. It should work as a drop-in replacement for the existing module.
Feb 2, 2015 at 11:20 PM
But 20 rps is really low. That is weird.
Feb 2, 2015 at 11:27 PM
You think the kernel cache plus the Orchard Output Cache should handle 20 requests per second?
Feb 2, 2015 at 11:30 PM
And by requests I mean like 20 people hitting the homepage per second. Not just general hits across the board.
Feb 3, 2015 at 12:41 AM
On my local box it supports hundreds of rps, with tens of users. So yes, definitely.
You can try the Apache ab command-line tool.
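For anyone without ab handy, its basic `-n`/`-c` behaviour can be approximated in a few lines of Python. This sketch spins up a throwaway local server as a stand-in target (point `url` at the real site instead) and reports requests per second:

```python
import http.server
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Throwaway local server standing in for the site under test.
server = http.server.ThreadingHTTPServer(
    ("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

N, CONCURRENCY = 200, 10  # roughly equivalent to: ab -n 200 -c 10

def hit(_):
    with urllib.request.urlopen(url) as resp:
        return resp.status

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    statuses = list(pool.map(hit, range(N)))
elapsed = time.perf_counter() - start

rps = N / elapsed
print("requests: %d, failed: %d, rps: %.0f"
      % (N, sum(s != 200 for s in statuses), rps))
server.shutdown()
```

Against a localhost stand-in the numbers are meaningless, of course; the point is the mechanics of firing concurrent requests and measuring throughput.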
Feb 3, 2015 at 12:46 AM
That is with kernel cache on though, right? Not just the in-memory output cache.

Feb 5, 2015 at 12:32 PM
Just an update on this; hopefully I can get some more insight from people who have used Orchard in higher-traffic scenarios. I am asking for assistance because I have not built many sites with this amount of traffic and need to understand what is "normal" and what is not.

The machine is getting about 5k visitors in an hour, which is about 83 users a minute at peak times. In a scenario without any caching, I think we all agree that would not work at all. I had SysCache on for DB-level caching and the OutputCache module on as well to cache pages in memory. This still was not keeping up with that traffic; the CPU was at 100% during peak times. Also, if any sort of error occurred during those times, the site could not recover, because it was getting hit with too many requests while starting up.

The solution I have in place now is the OutputCache value at 600 seconds (10 minutes) and the kernel cache value at 300 seconds (5 minutes). This has helped greatly. The CPU is now down to a normal level and handling the load, because IIS is serving the requests out of the kernel cache and sending the max-age header. I also added the Warmup module for the homepage and other popular pages, so that while the site is starting up it can serve static versions of those pages to visitors. The only downside of this more aggressive caching is that I needed to exclude my form pages from caching in the configuration, and some localization things are giving me a little trouble. But I think I can work around these with minor effort.
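One way to sanity-check that the max-age header is actually going out is to fetch a page and parse Cache-Control. This sketch uses a throwaway local handler in place of the real site (the 600-second value just mirrors the OutputCache setting above):

```python
import http.server
import threading
import urllib.request

class CachedPageHandler(http.server.BaseHTTPRequestHandler):
    """Stand-in for a page served with output caching enabled."""
    def do_GET(self):
        body = b"<html>cached page</html>"
        self.send_response(200)
        self.send_header("Cache-Control", "public, max-age=600")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the console quiet
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), CachedPageHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

with urllib.request.urlopen(
        "http://127.0.0.1:%d/" % server.server_address[1]) as resp:
    cache_control = resp.headers.get("Cache-Control", "")

# Pull the numeric max-age value out of the header, if present.
max_age = next(
    (int(d.split("=", 1)[1]) for d in cache_control.split(",")
     if d.strip().startswith("max-age=")),
    None)
print("Cache-Control:", cache_control, "-> max-age:", max_age)
server.shutdown()
```

The same check against the live site (or with `curl -I`) is a quick way to confirm form pages really are excluded: they should come back without the max-age directive.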

My questions are:
  1. Do 5k visitors in an hour warrant this setup? It seems like a requirement for any single-server hosting model to keep up with that load.
  2. Would the recommended next step be to split the load by adding another web server with some load balancing or add a reverse proxy like NGINX to the configuration?
  3. It would be great to just understand some general guidelines for optimizing Orchard in these situations. I know the modules added, the number of widgets, and navigation complexity all contribute to response time, but for a standard corporate site using these components, what can we expect?
I am happy to write a formal blog post on all this so we can have this documented out there. Thanks everyone!
Feb 5, 2015 at 1:44 PM
Could you give us a link to said site?
Feb 5, 2015 at 1:46 PM
sebastienros wrote:
On my local box it supports hundreds of rps, with tens of users. So yes, definitely.
You can try apache ab command line tool.
If your site consists of just that single page you're testing, then maybe it is a valid test.

For anything else, ab is mostly a nice toy ;)

That said, if you 'fail hard' at a single URL using ab, then you DO have a problem ;)
Feb 5, 2015 at 2:21 PM
Sorry, I don't want to put the link out there. I hope you understand.

I can understand it supporting hundreds of RPS with 20-30 users at a time, but I am asking whether that is with kernel cache on or not. When Sebastien ran those tests, did he have kernel cache enabled?
Feb 5, 2015 at 3:23 PM
We're using a custom caching solution for tacx.com (separate from kernel caching) and can handle a good number of RPS; it (after lots of core hacks) 'survived' a flood of 500+ users (as reported by Google Analytics at the time).

In short: we spent a lot of time improving the performance. If your client has the time and money for it, maybe you can do the same?
Feb 7, 2015 at 10:45 AM
What I would do is some load testing to figure out what the maximums really are, so you can then plan appropriately. It is important to get real data on what is happening, so as a starting point I would do the following:

  1. Get a beefy VM in the cloud, maybe a few (so you have enough RAM on the client side, plus the bandwidth to send to your site).
  2. Install the PRTG web stress tool (it's freeware).
  3. Configure performance counters on the host you are about to test, for whatever you want to monitor: RAM, CPU, etc.
  4. Enable the Mini Profiler for Orchard and modify the code so it stores to a database. Be aware that this will have a performance impact in and of itself.
  5. Carry out a ramp test for 15 minutes.
  6. Review the data.
  7. Change your site config and retest until you are happy.
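The ramp stage of those steps can be sketched roughly in Python: increase concurrency in stages against the target and record throughput at each stage. This uses a throwaway local server as a stand-in target, and the stage sizes are illustrative:

```python
import http.server
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Throwaway target; point `url` at the real site for an actual test.
server = http.server.ThreadingHTTPServer(
    ("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

def hit(_):
    with urllib.request.urlopen(url) as resp:
        resp.read()

results = {}  # concurrency level -> requests per second
for concurrency in (1, 5, 10, 20):       # ramp stages (illustrative)
    requests = concurrency * 20          # fixed work per stage
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(hit, range(requests)))
    results[concurrency] = requests / (time.perf_counter() - start)

for c, rps in results.items():
    print("concurrency %2d -> %.0f rps" % (c, rps))
server.shutdown()
```

The stage where throughput stops climbing (or latency blows up) is the breaking point you then line up against the performance counters you captured on the host.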

This is a poor man's load test, which is what I have done in the past. Ideally you would have Visual Studio Ultimate and run load tests via Visual Studio Online, which uses Azure VMs as the load-test clients, but if you don't have the budget for that, the above should give some insight.

It's a tricky business, but the idea is to break your site so you know what the breaking point looks like in your monitoring, and obviously what the bottlenecks are. Sometimes you can do something about those bottlenecks and other times you can't, at which point throwing tin at the problem is the best course of action!