Orchard on Azure

Topics: Customizing Orchard, General
Sep 14, 2011 at 2:44 PM

Hi,

We are planning to host an Orchard CMS-based solution on Azure. The expectation is to exceed 150GB of content over a 12-month period, even after moving images/videos to BLOB storage. The known limitation of SQL Azure is 50GB. Is there anything on the Orchard roadmap that would meet this requirement?

Any pointers or guidelines to achieve this scenario would help.

Thanks.

Coordinator
Sep 14, 2011 at 5:41 PM

If this is images and video, you should be able to write a modified file system abstraction (Orchard has this abstraction built in precisely to be able to run on different storage back ends such as Azure blob storage), starting from the Azure blob implementation that we ship with, and do some sharding across different blob stores. So it will take a little work, but it should not be that hard. The media module relies on that file system abstraction already.
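
The key property such a sharded implementation needs is that the same media path always resolves to the same blob store. Here is a minimal sketch of that routing idea (Python for brevity; the account names are made up, and a real module would implement Orchard's C# storage abstraction and read accounts from configuration):

```python
import hashlib

# Hypothetical list of Azure blob storage account names; in a real Orchard
# module these would come from the site's settings, not be hard-coded.
ACCOUNTS = ["mediastore0", "mediastore1", "mediastore2"]

def shard_for_path(path: str) -> str:
    """Pick a storage account for a media path using a stable hash.

    The same path always maps to the same account, so reads and writes
    for a given file consistently hit the same shard.
    """
    digest = hashlib.md5(path.encode("utf-8")).hexdigest()
    return ACCOUNTS[int(digest, 16) % len(ACCOUNTS)]
```

Adding a new account changes the mapping for existing paths, so in practice you would either fix the account list up front or use a more elaborate scheme such as consistent hashing.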

Sep 15, 2011 at 4:39 AM

Yes, we know this. We are already using Azure BLOB storage for images/videos. Excluding images/videos, we still expect 150GB of content (blogs, forums, comments, chapters, etc.), all of which targets SQL Azure. This is where we need a solution, since the maximum size of a SQL Azure database is only 50GB.

Coordinator
Sep 15, 2011 at 7:33 AM

Oh, I see. Well, I'm surprised that there would be such a limitation on SQL Azure. Are you sure? Can't it be upgraded? In any case, that's hardly an Orchard issue, is it?

Sep 15, 2011 at 11:32 AM

Though Orchard can be deployed on Windows Azure, it seems it does not have an Azure-friendly design. I think there should be some changes in the underlying design for scalability purposes. If you have any thoughts on this, please share them here. Thanks

Coordinator
Sep 15, 2011 at 4:10 PM

I don't think it's very fair to say that. We have built a lot of components that deal specifically with Azure, such as that file system abstraction. That SQL Azure does not support >50GB databases has nothing to do with Orchard. If anything, it's Azure that is not friendly to large sites, not Orchard that is not friendly to Azure. Did you try reaching out to the Azure people to find out if larger databases can be negotiated on a case by case basis? I can get you in contact with them.

Coordinator
Sep 15, 2011 at 4:39 PM

Quick explanation after some research: this is just a logical limit, set as a safety measure, and you can change it.

http://msdn.microsoft.com/en-us/library/windowsazure/ee336245.aspx#dcasl

SQL Azure Database provides two database editions: Web Edition and Business Edition. Web Edition databases can grow up to a size of 5 GB and Business Edition databases can grow up to a size of 50 GB. The MAXSIZE is specified when the database is first created and can later be changed using ALTER DATABASE. MAXSIZE provides the ability to limit the size of the database. If the size of the database reaches its MAXSIZE, you will receive an error code 40544. When this occurs, you cannot insert or update data, or create new objects, such as tables, stored procedures, views, and functions. However, you can still read and delete data, truncate tables, drop tables and indexes, and rebuild indexes. If you remove some data to free storage space, there can be as much as a fifteen-minute delay before you can insert new data.
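
The change mentioned above is a single ALTER DATABASE statement; for example (the database name here is hypothetical, and 50GB is only available on Business Edition):

```sql
-- Raise the logical size cap documented above (Business Edition, up to 50 GB)
ALTER DATABASE MyOrchardDb MODIFY (EDITION = 'business', MAXSIZE = 50GB)
```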


Sep 16, 2011 at 9:19 AM

150GB seems like an awful lot of blog/comments/chapters anyway, if you're not including any of the assets which would be part of them (images/videos, etc.). Are you sure you're going to have that much text content?

If you really will (in which case, wow, I guess I'd be interested in knowing what's generating content at that rate), then you could look at sharding across multiple SQL Azure DBs. As Bertrand says, that's not built into Orchard (unsurprisingly!), but it could probably be done with some modifications/additions to the storage abstraction.

This might help you get a feel for what would be needed: http://social.technet.microsoft.com/wiki/contents/articles/how-to-shard-with-sql-azure.aspx
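
In outline, sharding of this kind means consulting a shard map before opening a database connection. A conceptual sketch (Python for brevity; the ranges and connection strings are hypothetical, and in Orchard the lookup would sit in front of the NHibernate session factory):

```python
# Hypothetical shard map: content item id ranges -> SQL Azure connections.
# Each tuple is (low_id_inclusive, high_id_exclusive, connection_string).
SHARDS = [
    (0, 1_000_000, "Server=tcp:shard0.example.net;Database=Orchard0"),
    (1_000_000, 2_000_000, "Server=tcp:shard1.example.net;Database=Orchard1"),
]

def connection_for_id(content_item_id: int) -> str:
    """Return the connection string of the shard that owns this id."""
    for low, high, conn in SHARDS:
        if low <= content_item_id < high:
            return conn
    raise ValueError(f"no shard covers id {content_item_id}")
```

The hard part in a real system is not the lookup but everything around it: cross-shard queries, transactions, and keeping ids unique across shards, which is what the linked article discusses.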

Sep 16, 2011 at 12:56 PM
Edited Sep 16, 2011 at 12:57 PM

The effort you guys have taken to move Orchard onto Azure and open up greater possibilities is definitely appreciated. There is no doubt about it. As you said, it is only a limitation of SQL Azure.

What we are looking for is to overcome this limitation with a sharding kind of model; however, I'm still hoping for some guidance from you on implementing this.

We are also looking for help from Microsoft to increase the SQL Azure size limit on a special-case basis; that is under discussion as well. Thanks for the pointers.

I have another question about moving to Azure, this time regarding the Lucene search implementation. It seems the search index is maintained on the file system. If we move to Azure, would it be maintained in BLOB storage by default? Or what would be the best solution for this?

Coordinator
Sep 16, 2011 at 3:58 PM

The search index is kept on each node, and each node keeps it up to date.

It works pretty well, but in case you don't trust this architecture, you can always replace it with another one; it's all defined in the Orchard.Indexing module.

Coordinator
Sep 16, 2011 at 4:01 PM

Sebastien: but what happens when there is a change notification on one node? Do we propagate the notification now?

@mrpraba: please contact me at bleroy at microsoft and I'll put you in contact with Azure folks. For implementing sharding in Orchard, you would probably have to go pretty deep into the NHibernate stuff. Sébastien may have some pointers.

Coordinator
Sep 16, 2011 at 4:30 PM

Yes, there is an IndexingTask table, and each node has a cursor saved locally to keep up to date with this table. This logic is completely isolated from the event bus, the cache, etc.
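
In outline, the per-node logic Sébastien describes could look like this (a conceptual Python sketch, not Orchard's actual implementation; the shared table is modeled as an ordered list of rows):

```python
def catch_up(tasks, cursor):
    """Process new rows from a shared indexing-task table.

    tasks: list of (task_id, payload) rows, ordered by task_id, as read
           from the shared table.
    cursor: highest task_id this node has already indexed (stored locally).
    Returns (payloads to index on this node, advanced cursor).
    """
    processed = []
    for task_id, payload in tasks:
        if task_id > cursor:  # skip rows this node has already seen
            processed.append(payload)
            cursor = task_id
    return processed, cursor
```

Because each node only advances its own cursor, a freshly started node begins with a cursor of zero and simply replays the whole table to rebuild its local index.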

Yes, we should have such a mechanism for general purposes, including this indexing stuff.

Sep 21, 2011 at 3:44 AM

Thanks guys, this helps me take it forward.

@Sebastienros: Assuming we keep the search index on each node, what would happen if a new node is instantiated and a user tries to search from this node? Can't we keep the search index in a central location and let the nodes get the search data from there?

Coordinator
Sep 21, 2011 at 4:57 AM

This issue would only be momentary. After a maximum of one minute, the index would be built.

Though it might be interesting to have a module to keep everything centralized, especially on Azure. This could use cached blobs, for instance, or Table storage. It might imply some cost, though, so it should only be an option. Microsoft Research did it previously; using Google's cache I could see the code, which was MS-PL ... it's no longer public, but it shows it's doable. And there might be some discussions on the web as well about using Lucene on Azure.

Nov 8, 2011 at 1:30 AM

Microsoft announced at DevConnections (Oct 2011) that the SQL Azure limit would grow to 150GB instead of 50GB. I would still look at sharding.