Orchard 1.7 Content Deployment feature team

Topics: Announcements
Coordinator
Nov 6, 2012 at 8:12 PM

This thread is to discuss a content deployment feature for Orchard 1.7.

This feature would be built on the import-export infrastructure but would automate the process of moving contents from one Orchard instance to another.

Nov 7, 2012 at 12:37 PM

I'd vote for improving import/export in general so it can be used for large data sets. Currently _we_ cannot use export/import at all: it blows up Orchard because of the 70k+ content items (incl. users) that we have.

I read before this is a design error and hard to fix.

If that is indeed (still) the case, maybe we can split up the import/export in tasks?

This way we could have a web/console-based way to see the progress of the import/export and cancel it if wanted. This will not resolve the fact that it'll be damn slow (if the core issue remains) but at least, given enough time, it'll work for large data sets without blowing up.

Nov 7, 2012 at 1:21 PM
Edited Nov 7, 2012 at 4:33 PM

I agree that import is becoming very hard to use, due to performance, once you have even a few thousand content items. Is it possible to improve this? I like the idea of this deployment feature but I worry it might not be usable if it inherits the performance issue of the existing content importing feature. 

EDIT: I have encountered slow performance on Importing even a single menu into a site that already has a lot of content, so it might be possible that even using Import/Export to publish one content item would have problems on sites with a lot of content. I dunno if the recipe I ran to update the MenuParts hit some kind of special case that caused slowness. I think I made a post in this message board about it when it happened. 

Developer
Nov 7, 2012 at 3:55 PM

Take advantage of WebApi and publish versions between environments.

Coordinator
Nov 8, 2012 at 4:29 AM

No, it's not a "design error", it's necessary constraints and missing optimizations. In order to reliably import contents, you need an identity mechanism that is resilient to a move to a different machine, and that handles both collisions and re-imports of the same items. It also needs to handle relationships. It makes it tricky to import large numbers of items because the logic to generate ids is not database-based. Currently, it happens in memory, which is why it breaks down with large numbers of items. The necessarily transactional nature of the whole thing doesn't help either. One possible optimization is to build an index of ids in the database during import so we can still look up items without having them all in memory. We've wanted to do that for a number of versions but never found time for it. I agree it needs to be done. Import performance should be linear or close to it.
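The database-backed id index mentioned above could look roughly like the sketch below. It is written in Python with SQLite for brevity rather than in Orchard's C#, and every name in it (`IdentityIndex`, `identity_index`, `import_items`) is hypothetical, not part of Orchard. It only shows the idea of looking identities up in a table instead of holding them all in memory.

```python
import sqlite3


class IdentityIndex:
    """Hypothetical index mapping identity strings to content item ids."""

    def __init__(self, connection):
        self.conn = connection
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS identity_index ("
            "identity TEXT PRIMARY KEY, item_id INTEGER NOT NULL)"
        )

    def register(self, identity, item_id):
        # INSERT OR REPLACE keeps a re-import of the same identity idempotent.
        self.conn.execute(
            "INSERT OR REPLACE INTO identity_index (identity, item_id) "
            "VALUES (?, ?)",
            (identity, item_id),
        )

    def lookup(self, identity):
        # One indexed query per identity; nothing is cached in memory.
        row = self.conn.execute(
            "SELECT item_id FROM identity_index WHERE identity = ?",
            (identity,),
        ).fetchone()
        return row[0] if row else None


def import_items(index, items):
    """Import (identity, payload) pairs and return (created, updated) counts."""
    created = updated = 0
    max_id = index.conn.execute(
        "SELECT COALESCE(MAX(item_id), 0) FROM identity_index"
    ).fetchone()[0]
    next_id = max_id + 1
    for identity, _payload in items:
        if index.lookup(identity) is None:
            index.register(identity, next_id)  # new item
            next_id += 1
            created += 1
        else:
            updated += 1  # existing item: a real import would update it here
    return created, updated
```

With a primary key on the identity column, each lookup is a single indexed query, so memory use stays flat as the number of items grows.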

Nov 8, 2012 at 5:18 AM

My use case for this feature may not be exactly the norm, but when I read the feature description it sounded almost exactly like what I have been wrestling with for the last few months. I have half a dozen CMS instances which share various content from a central CMS repository (simply another Orchard instance). Automated subscriptions or manual imports can be run to import content from one CMS and publish it in another using the projection module queries exposing data recipes and the standard import processes.

The problem is that often there will only be a few content items added / updated in one import session and, as the system currently works, potentially all identities for content items in the target Orchard instance will be loaded for a single item import. For me this has been the performance pain point. I have worked around this so far by extending the logic in ImportContentSession and allowing modules that use identities to also provide a loader that knows how to query directly for a given content identity (or batch of identities). This maintains the ability for modules to create arbitrary identities while avoiding the need to load every identity before locating a content item. It can still fall back to the default linear search option if a suitable provider is not available. This is working quite well and has reduced the import time for smaller data sets dramatically.
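The loader-with-fallback approach described above can be sketched as follows. This is Python rather than Orchard's C#, and it is not the actual ImportContentSession code: the class names, the `can_handle`/`load` methods and the `Identifier=<guid>` string format are all assumptions made for illustration.

```python
class IdentifierLoader:
    """Hypothetical loader that resolves 'Identifier=<guid>' identities directly."""

    def __init__(self, by_guid):
        # The dict stands in for an indexed database query.
        self._by_guid = by_guid

    def can_handle(self, identity):
        return identity.startswith("Identifier=")

    def load(self, identity):
        return self._by_guid.get(identity.split("=", 1)[1])


class ImportSession:
    """Resolve identities through loaders first, then fall back to a full scan."""

    def __init__(self, scan_source, loaders=()):
        # scan_source: callable returning an iterable of (identity, item_id).
        self._scan_source = scan_source
        self._loaders = list(loaders)
        self.full_scans = 0  # counts how often the slow path was taken

    def get(self, identity):
        for loader in self._loaders:
            if loader.can_handle(identity):
                # A loader that understands this identity is authoritative.
                return loader.load(identity)
        # No suitable loader: fall back to the linear search over all items.
        self.full_scans += 1
        for known_identity, item_id in self._scan_source():
            if known_identity == identity:
                return item_id
        return None
```

In this sketch an identity that no loader understands still resolves through the scan, so modules remain free to define arbitrary identities.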

This is a feature I have been working on / needed extensively and it would be great to be involved in getting these types of capabilities built into Orchard.

Coordinator
Nov 8, 2012 at 6:12 AM

Interesting. Thanks for sharing.

Coordinator
Nov 8, 2012 at 5:40 PM

@damoclarke: This is a very nice idea, and we would then not need the identity index, which would be harder to implement because we start from an existing code base, and keeping indexes up to date is not insignificant work.

We also need to ensure that a content item can actually be removed from memory so that big imports don't run into memory limits.

Something to keep in mind is that we can't decide by ourselves to split the import files, or use separate tasks, because a content item might refer to one in another file. One could think about analyzing the file, but each driver is responsible for storing identities, so it can't be done. It can be done manually though, so this could be a solution if people want to handle big data to import.

 

Dec 28, 2012 at 7:20 AM

I think the "content deployment" feature should also take into consideration changes made to each record by looking at its last modified date. Otherwise it's not a "complete" deployment process if you only look at IDs or GUIDs. Right now, if I modify the content of a BodyPart record and you only check the ID to decide whether to push an UPDATE to the database, I don't think it will update.
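As a rough illustration of the last-modified check suggested above (Python, with hypothetical names; nothing here is Orchard's actual API), a deployment could compare timestamps per identity and decide whether to create, update or skip each item:

```python
from datetime import datetime


def plan_deployment(source, target):
    """Decide an action per identity.

    source and target map identity -> last modified datetime.
    Returns a dict mapping each source identity to 'create', 'update' or 'skip'.
    """
    plan = {}
    for identity, modified in source.items():
        if identity not in target:
            plan[identity] = "create"  # unknown on the target site
        elif modified > target[identity]:
            plan[identity] = "update"  # source copy is newer
        else:
            plan[identity] = "skip"  # target is already up to date
    return plan
```

This assumes both sites record comparable timestamps (for example, both in UTC); matching on identity alone cannot tell an edited item from an unchanged one.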

I see this feature more like an automated publish from one site to another. In that case, you would need to make a Schema and Data compare on each database and then publish also the changes made in the code/modules to make a complete synchronisation of both websites.

One thing I do know is that using GUIDs instead of ints for primary keys makes this process a lot easier, since a new item created on the development website will never get the same ID as one on the production website. The drawback of GUIDs is that they slow down database indexing.
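The key-collision point can be shown with a small sketch (Python; the `Site` class is purely hypothetical). Two sites that each hand out their own auto-incrementing integers produce the same keys, while randomly generated GUIDs are, for practical purposes, distinct:

```python
import uuid


class Site:
    """Hypothetical site that hands out its own keys for new content items."""

    def __init__(self):
        self._next_int = 1

    def new_int_key(self):
        # Auto-increment style key, local to this site.
        key = self._next_int
        self._next_int += 1
        return key

    def new_guid_key(self):
        # Random GUID; a collision between sites is astronomically unlikely.
        return uuid.uuid4()


def overlapping_keys(keys_a, keys_b):
    """Return the keys that both sites generated."""
    return set(keys_a) & set(keys_b)
```

Three new items on each of two fresh sites would share the integer keys 1, 2 and 3, whereas their GUID keys would not overlap.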

I don't know how the import/export module works right now, and I don't know how far you guys want to go with this feature, but complete "content staging" from one site to another would be great!

Feb 6, 2013 at 4:06 AM
I have spent a lot of time working with the Import/Export and recipe features and thought I might share the work I have done in case it is of interest to others. Having been through the forums I know that many others have come across similar issues that I have and it may be useful.

I have created a fork with the changes I have made so far to fill gaps in the Recipe / ImportExport features as well as a new Subscriptions module that allows content to be imported from one CMS instance to another. I would have liked to implement this all as a standalone module, but there were too many limitations with the current Recipes / Import infrastructure, particularly the way that content identities are handled.

If there is any interest in some or all of these features I would be happy to work on getting them included into the project. I could then submit the subscription module to the gallery. Any feedback on my approach or features is more than welcome. A summary of the features:

Wng.PubSub module
  • Using the new WebApi capabilities, metadata and content items can be imported directly from one CMS instance to another using an interface similar to the current Export UI, except that it displays content types etc. from the source Orchard instance instead of the local one.
  • Select from the list of content types or a preconfigured Query with filters (using the standard Orchard.Projections queries)
  • Additional tracking of date/time when content items are published or deleted
  • New custom export / import step to allow unpublished / deleted content status changes to be imported.
  • Incrementally import new content as it is published (or unpublished / deleted) on the source CMS i.e. data subscriptions
  • Imports run on demand, as a repeating schedule or download as recipe for manual import
  • View subscription history - Success / Failure and journal messages
Orchard Core / Orchard Module Changes
  • Direct lookup of content identities in ImportContentSession with fallback to the existing identity scan. Uses a new ContentManagement.IContentIdentityLoader interface to allow module developers to supply identity lookup capabilities - example implementation IdentityPartHander or these could be bundled together in a separate feature.
  • Optional batching of data imports by applying a BatchSize="xxx" attribute to the 'Data' import step. This allows for shorter transactions. Single Data step is broken into multiple steps and pulls in dependencies from other batches within the recipe step if required. Dependencies are processed within the same transaction as the requesting items.
  • Added an optional priority to ContentIdentity. This is important for moving content between two CMSs where e.g. the alias could change but the Identifier guid is unique and should be used as the primary identity, otherwise duplicates would be created as the current implementation is a combined key only i.e. guid + alias must match.
  • Fixed a bug in DefaultContentManager.Import where import of existing content items / updated items created multiple drafts and did not publish.
  • Added events for recipe execution start, completion and steps to provide hooks for actions related to recipe processing.
  • ImportExport - Displays the RecipeJournal after import so that results of an import can be viewed - Previously just displayed 'Import has been completed' whether it is successful or not. Included failure message in recipe journal as this was fairly sparse on information.
  • Extended IImportExportService with overloads to export an arbitrary list of content items, e.g. the results of an Orchard.Projections query
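The batching behaviour listed above (a Data step split into batches, with dependencies pulled into the batch of the item that needs them) can be sketched like this. It is a Python illustration under stated assumptions, not the fork's implementation: `plan_batches` and its input shape are hypothetical, and `batch_size` here counts requested items, with pulled-in dependencies added on top.

```python
def plan_batches(items, batch_size):
    """Split recipe items into batches, pulling dependencies in with their users.

    items: list of (identity, [dependency identities]) in recipe order.
    Returns a list of batches (lists of identities). Each item appears exactly
    once, and a dependency is placed before the first item that needs it,
    inside the same batch (i.e. the same transaction).
    """
    deps = dict(items)
    done = set()

    def add(identity, batch):
        # Skip items already scheduled or not part of this recipe.
        if identity in done or identity not in deps:
            return
        done.add(identity)
        for dep in deps[identity]:
            add(dep, batch)  # dependencies go in before the dependent item
        batch.append(identity)

    batches, current, requested = [], [], 0
    for identity, _ in items:
        if identity in done:
            continue  # already pulled into an earlier batch as a dependency
        add(identity, current)
        requested += 1
        if requested == batch_size:
            batches.append(current)
            current, requested = [], 0
    if current:
        batches.append(current)
    return batches
```

For example, with a batch size of 2, a page that references a menu defined later in the recipe brings that menu into its own batch, so both are processed in the same transaction.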
If anyone is interested my fork is located at http://orchard.codeplex.com/SourceControl/network/forks/damoclarke/ImportExportSubscriptions?branch=1.x

I'm fairly new to open source projects so I'm not entirely sure of the process for submitting features. If there is interest in just a subset of these features I can submit them as individual issues / pull requests.
Coordinator
Feb 6, 2013 at 5:43 PM
Can we talk someday, I'd really like to hear about what you did, and how to integrate it.
Your thing looks seriously cool. Skype me at sebastienros. And you could also attend the weekly meeting if you are able to, so we can see how it works. I know everyone would love it, but we need to be cautious not to add too many implementation details to the core.
Coordinator
Feb 6, 2013 at 6:51 PM
Edited Feb 6, 2013 at 6:55 PM
This looks spectacular. Can't wait to have the time to try it. +1 on can you demo that at next week's meeting? It's at 12PM Pacific Time at http://orchardproject.net/meeting.
Feb 6, 2013 at 11:23 PM
Would be great to give a demo of what I have at next week's meeting. @sebastienros I have sent you my skype details.
Feb 7, 2013 at 6:48 AM
Nice. Looking forward to seeing this at the next meeting!
Jul 15, 2013 at 12:37 PM
I know that content deployment is on the cards for Orchard 1.8 but with the imminent release of 1.7 the module I have been working on is almost complete. I think it has some great features for managing content between multiple sites. You can:
  • Link to one or more Orchard CMS sites
  • Push individual content items to a remote target on demand
  • Automatically push content when modified or published using Orchard Workflows
  • Queue changed content to be deployed later in a batch
  • Set up custom import or export subscriptions filtering by Orchard Queries, selected content types and change status
  • Preview recipe to be deployed by a subscription
  • View history of all recipe deployments, view import messages and download recipe files
  • Provides extensibility points to add custom deployment targets e.g. FTP
The implemented functionality is complete and compatible with 1.7 but needs a little more testing, so I haven't released it to the gallery. You can check out the source and screenshots/overview at the project's site:

https://orcharddeployment.codeplex.com/documentation

It would be great to get feedback on the module or contribute to the main project effort.
Jul 15, 2013 at 2:27 PM
Nifty! Thanks for sharing.
Coordinator
Jul 15, 2013 at 3:53 PM
You're back ! Ping me on skype ...
Jun 9 at 7:27 PM
I installed this module in an Orchard 1.8 solution with a few modifications (changed target framework to 4.5, upgraded DLLs).

When I go to "Sources and Targets" and click on "Remote Orchard Deployment", it just redirects me to the 404 error page.

Description: HTTP 404. The resource you are looking for (or one of its dependencies) could have been removed, had its name changed, or is temporarily unavailable. Please review the following URL and make sure that it is spelled correctly.

Requested URL: /OrchardLocal/Contents/DeploymentConfiguration/Create/RemoteOrchardDeployment
Coordinator
Mon at 7:32 PM
The module is not ready. Please don't use this unless you want to work on it, in which case please coordinate with me.