Including Data in Media Part Exports

Topics: Administration, Customizing Orchard
Jan 11, 2014 at 10:21 PM
The new media library is wonderful. Something that puzzles me is that an export of Media items doesn't include the media item itself, only the metadata.

Adding support to export a base64 encoded copy of the media item in the export would enable very simple migrations between storage providers - i.e., you're using the default file system provider and want to switch to Azure based storage, just as an example.
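To make the idea concrete, here is a rough sketch of what embedding a base64 copy in an XML export could look like. It is written in Python for brevity rather than Orchard's C#, and the `MediaItem` and `Data` element names are invented for illustration; they are not part of any existing Orchard export schema.

```python
import base64
import xml.etree.ElementTree as ET


def export_media_item(file_name: str, content: bytes) -> str:
    """Serialize one media item, embedding the file bytes as base64 text."""
    # "MediaItem" and "Data" are hypothetical element names.
    item = ET.Element("MediaItem", {"FileName": file_name})
    data = ET.SubElement(item, "Data", {"Encoding": "base64"})
    data.text = base64.b64encode(content).decode("ascii")
    return ET.tostring(item, encoding="unicode")


def import_media_item(xml_text: str) -> tuple[str, bytes]:
    """Read the file name and the original bytes back out of the export."""
    item = ET.fromstring(xml_text)
    data = item.find("Data")
    return item.get("FileName"), base64.b64decode(data.text or "")


original = b"\x89PNG\r\n\x1a\n fake image bytes"
exported = export_media_item("logo.png", original)
name, payload = import_media_item(exported)
```

Because the bytes travel inside the document, the importing side can hand them to whatever storage provider is active there, which is the migration scenario described above.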

I'm considering implementing this myself, but would ideally like to make it optional - i.e., export an item, but check a box to include "heavy" content or not in the export. Given the architecture of import/export, we're really talking about embellishing ExportContext with settings that would flow through to the individual drivers managing the export.
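A minimal sketch of how such an option might flow from the context down to a driver, again in Python rather than C#. `ExportOptions`, `include_binary_content`, and the shape of `ExportContext` here are assumptions made for the example; this is not Orchard's actual API.

```python
import base64
from dataclasses import dataclass, field


@dataclass
class ExportOptions:
    # Hypothetical setting backing the "include heavy content" checkbox.
    include_binary_content: bool = False


@dataclass
class ExportContext:
    # Loosely modelled on the ExportContext mentioned above; shape is assumed.
    options: ExportOptions = field(default_factory=ExportOptions)
    data: dict = field(default_factory=dict)


class MediaPartDriver:
    def exporting(self, part: dict, context: ExportContext) -> None:
        # Metadata is always exported.
        context.data["FileName"] = part["file_name"]
        # The binary payload is exported only when the option is switched on.
        if context.options.include_binary_content:
            encoded = base64.b64encode(part["content"]).decode("ascii")
            context.data["Data"] = encoded


part = {"file_name": "logo.png", "content": b"fake image bytes"}
light = ExportContext()
heavy = ExportContext(ExportOptions(include_binary_content=True))

driver = MediaPartDriver()
driver.exporting(part, light)
driver.exporting(part, heavy)
```

Drivers that ignore the option keep behaving as they do today, so existing export code would not need to change.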

I'd be happy to implement this, unless someone thinks this is stupid.

Jan 12, 2014 at 11:41 AM
Edited Jan 12, 2014 at 11:42 AM
Interesting idea. I also thought about how to manage media import/export. I think this is something that we could well do, as it would enable some scenarios. However, I definitely think this should be in a separate feature so that it can be toggled on or off.

However, not all files in the Media folder are from Media content items. Thus, besides implementing your idea, we could also do something with exporting the contents of the Media folder in a general way (this is quite an old request). Here, however, the issue is two-fold:
  • We have to present the user with a way of downloading the media files in a single package (e.g. a zip file).
  • We have to think about not accessing files directly but through IStorageProvider; files can reside anywhere (e.g. also in Azure Blob storage, on a remote server).
One big problem here is that we'd need to create a zip file somehow, but this is a resource-intensive task and it gets more resource intensive as the size of the site grows. If done on the server side it would have a huge impact on the webserver (i.e. it's not a solution that would really scale). Building on the idea written in the linked discussion we could do the following to transfer the work onto the client side:
  • The "export every media file" feature just generates a list with the public URL of every file in the storage. This is still not a trivial task with a huge amount of media, but it is reasonable (and with an amount so large that even this becomes a problem, exporting media from the admin UI is probably not a good idea anyway).
  • A client-side javascript program then would take this list, download every file and generate a zip package (like this or this).
Something similar would also be needed for importing.
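The server-side half of this could be sketched roughly as follows. The `StorageProvider` interface and its three methods are assumptions standing in for whatever IStorageProvider actually exposes, and `InMemoryStorage` exists only to make the example runnable.

```python
from typing import Iterator, Protocol


class StorageProvider(Protocol):
    # Hypothetical interface; the real IStorageProvider may differ.
    def list_files(self, folder: str) -> list[str]: ...
    def list_folders(self, folder: str) -> list[str]: ...
    def get_public_url(self, path: str) -> str: ...


class InMemoryStorage:
    """Stand-in provider: maps a folder path to (subfolders, files)."""

    def __init__(self, folders: dict, base_url: str) -> None:
        self._folders = folders
        self._base_url = base_url.rstrip("/")

    def list_folders(self, folder: str) -> list[str]:
        return self._folders[folder][0]

    def list_files(self, folder: str) -> list[str]:
        return self._folders[folder][1]

    def get_public_url(self, path: str) -> str:
        return f"{self._base_url}/{path}"


def media_manifest(storage: StorageProvider, folder: str = "") -> Iterator[str]:
    """Yield the public URL of every file, walking folders recursively."""
    for name in storage.list_files(folder):
        yield storage.get_public_url(f"{folder}/{name}" if folder else name)
    for sub in storage.list_folders(folder):
        yield from media_manifest(storage, f"{folder}/{sub}" if folder else sub)


storage = InMemoryStorage(
    {"": (["images"], ["readme.txt"]), "images": ([], ["a.png", "b.png"])},
    "https://example.com/Media",
)
manifest = list(media_manifest(storage))
```

Since the walk only ever talks to the provider, it works the same whether the files sit on the local file system or in remote blob storage.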

What do you think?
Jan 13, 2014 at 2:30 AM
I agree with Zoltan: I would not include the media in the XML file. It would be grossly inefficient (imagine a 200MB video file, which is not big, inside an XML file). On the other hand, when we adapt import/export to be a more generic content deployment feature, moving media around is going to be a must, especially now that media are content items.
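To put a number on the inefficiency: base64 turns every 3 bytes into 4 characters, so the encoded text is about a third larger than the file. A quick check in Python, assuming "200MB" means 200 × 1024 × 1024 bytes:

```python
import base64


def base64_length(byte_count: int) -> int:
    """Length of the padded base64 text for `byte_count` raw bytes."""
    return 4 * ((byte_count + 2) // 3)


# Sanity-check the formula against the real encoder on small inputs.
for n in (0, 1, 2, 3, 1000):
    assert base64_length(n) == len(base64.b64encode(bytes(n)))

video_bytes = 200 * 1024 * 1024            # assumed size of the video
encoded_bytes = base64_length(video_bytes)  # 279,620,268 characters
overhead = encoded_bytes / video_bytes      # about 1.33
```

That is roughly 267 MB of text for the one file, before counting whatever the XML parser needs to hold it in memory.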
Jan 13, 2014 at 3:07 AM
I was trying to find a quick fix to a glaring hole. I know encoding blobs in XML isn't necessarily efficient, but it's expedient in providing a solution of some kind that doesn't involve a lot of throwaway code to do something as seemingly straightforward as switching storage providers.

Ideally, we'd define a compressed format (I'm thinking a Zip file) that takes the spirit of something like OOXML in structuring content; I'm imagining a solution whereby the Zip file is composed of folders for each part / type, and each folder contains an XML document for each individual part or type instance. That would neatly solve the containerization problem of converting bytes to base64 for safe transport in XML: store the referenced blob next to the XML export of the part / type itself.
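A rough sketch of that package layout, in Python for brevity. The `Type/Id.xml` plus `Type/Id/FileName` naming scheme is just one possible arrangement, invented here for illustration.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def write_package(items: list[dict]) -> bytes:
    """Write one folder per type, one XML file per item, blob stored alongside."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for item in items:
            # Metadata goes into a small XML document (attribute names assumed).
            meta = ET.Element(
                item["type"], {"Id": item["id"], "FileName": item["file_name"]}
            )
            package.writestr(
                f"{item['type']}/{item['id']}.xml",
                ET.tostring(meta, encoding="unicode"),
            )
            # The blob is stored as raw bytes next to it; no base64 needed.
            package.writestr(
                f"{item['type']}/{item['id']}/{item['file_name']}", item["content"]
            )
    return buffer.getvalue()


package_bytes = write_package(
    [{"type": "Image", "id": "42", "file_name": "logo.png", "content": b"\x89PNG fake"}]
)

with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
    names = package.namelist()
    blob = package.read("Image/42/logo.png")
```

The XML stays small enough to load in full, while the binary entries can be streamed in and out of the archive one at a time.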

Couple that with passing options to the existing Export / Import infrastructure exposed via drivers, and you could do a couple of things: preserve backwards compatibility with existing Export / Import code in drivers, and move to a new format while giving each driver the option to export its BLOBs or not.

I don't like any approach that separates media import/export from content import/export - they are the same thing, must be deployed together, and must be versioned together. While the overhead of compression on the server side is worrisome to a point, any site with enough media files, doing imports and exports frequently enough to hit high compute consumption, is probably running on large enough hardware that they don't care - or they perform the exports on nodes dedicated to that purpose (this topology is very common in the SharePoint world, for example).

Client side ZIP creation is neat, and I hadn't ever seen those libs before; it's very cool, but I don't think it's an effective solution. For one, you'll burn a lot of bandwidth sending the uncompressed bits to the client for compression; if we're really just worried about containerization, we can always produce a ZIP file with a crappy compression ratio and save on the compute costs of compressing things at all. Just a thought.
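For what it's worth, ZIP supports a "stored" method that containerizes files without compressing them at all. A small Python comparison of the two modes (the file name and contents are made up):

```python
import io
import zipfile


def build_zip(files: dict, compression: int) -> bytes:
    """Pack `files` (name -> bytes) into a ZIP using the given method."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


files = {"Media/notes.txt": b"orchard " * 10_000}

# ZIP_STORED copies the bytes as-is: pure containerization, no compression work.
stored = build_zip(files, zipfile.ZIP_STORED)
# ZIP_DEFLATED spends CPU time to shrink the archive.
deflated = build_zip(files, zipfile.ZIP_DEFLATED)

with zipfile.ZipFile(io.BytesIO(stored)) as archive:
    round_trip = archive.read("Media/notes.txt")
```

Most media formats (JPEG, PNG, MP4) are already compressed, so storing them uncompressed typically costs little in archive size.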
Jan 13, 2014 at 3:22 AM
Interesting suggestions. I agree.
Jan 13, 2014 at 3:54 AM
Bertrand, how do I move forward within the framework of the Orchard project with this concept?
Jan 13, 2014 at 4:14 AM
It would probably be best to discuss this on Tuesday during the community meeting (noon, Pacific time). Open a bug in the issue tracker, fork the code, and when you're done, send a pull request.
Jan 13, 2014 at 4:24 PM
Quick thoughts. Yes, being able to import/export media assets is necessary. I don't think it should be in the XML file because that could blow up the server, as the document has to be loaded into memory. So the only viable solution is a binary solution as you described. But wait, this is exactly what is designed for the Deployment module that is waiting for a taker. On top of that, it can import/export themes, modules, and static assets.
Jan 13, 2014 at 4:29 PM
I'm not too familiar with future plans, so if I'm proposing something duplicative, my apologies.

The funny thing is I was actually thinking that this feature idea would form a great basis for a content deployment mechanism; the remaining component would essentially be the transport mechanism.

What version is the deployment piece targeted for? I'd like to help out here, but I recognize that I'm an unknown to you guys, so dropping deployment in my lap would be a pretty big gamble. How can we coordinate, and how can I participate?


Jan 13, 2014 at 5:59 PM
Sebastien, when you say the Deployment module is waiting for a taker, are you referring to development or testing?
Jan 27, 2014 at 9:56 AM
I opened an issue to track this: