Migrating Wordpress blog posts to Orchard

Topics: Administration, General
Sep 30, 2012 at 1:14 PM

Hi,

I've been using Orchard for my website for quite a while now and I'm loving it. On a subdomain of my site (blog.repsaj.nl) I've got a WordPress installation serving up my blog. I'd really like to migrate those posts to my Orchard installation so I have everything in a single look & feel, as well as in one place when it comes to administering everything.

But... just moving the posts isn't going to be enough. I don't want to break all the links made to my blog on various forums and the like, so an important requirement is that those links keep working. I'm not sure how Google handles redirects for content; does a redirect affect your page ranking? If not, then a redirect is ok. If so, the content should be served on the same URLs.

Has anyone ever done this? Are there tools available, or do I need to go into both APIs and do it myself? I am a decent developer, so I don't expect too much trouble doing it myself; but if I can save myself the time and trouble...

Any help would be appreciated!

Coordinator
Sep 30, 2012 at 6:35 PM

Permanent redirects are perfectly understood by search engines. They are good practice too. You can specify them in web.config, or you can use the URL Rewrite module from Sébastien Ros.

Oct 6, 2012 at 9:51 AM

Hi Bertrand,

Sounds good. So basically I'd have to extract the posts from WordPress, insert them into Orchard and create a redirect from each old WordPress URL to the new Orchard one. Sounds doable as well! But as WordPress is PHP-based, wouldn't that mean there is no web.config, as there won't be any ASP.NET processing either? On the other hand, after migrating everything there is no need for WordPress any more, so I could just use an empty virtual directory with just a web.config for the redirects. Thanks, I'll go and try out some stuff.
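
For the empty-directory idea, here is a minimal sketch of what a redirect-only web.config could contain, generated with a few lines of Python. It assumes the IIS URL Rewrite module is installed on the server and that the old paths stay unchanged on the new host. The host name `www.example.com`, the rule name and the function name are placeholders, not values from this thread.

```python
import xml.etree.ElementTree as ET


def build_redirect_config(new_host):
    """Return web.config XML that permanently redirects every path to new_host."""
    configuration = ET.Element("configuration")
    web_server = ET.SubElement(configuration, "system.webServer")
    rewrite = ET.SubElement(web_server, "rewrite")
    rules = ET.SubElement(rewrite, "rules")
    rule = ET.SubElement(rules, "rule", name="Old blog to Orchard", stopProcessing="true")
    # Capture the whole requested path so it can be reused in the target URL.
    ET.SubElement(rule, "match", url="(.*)")
    # {R:1} is the URL Rewrite back-reference to the captured path.
    ET.SubElement(
        rule,
        "action",
        type="Redirect",
        url="http://" + new_host + "/{R:1}",
        redirectType="Permanent",
    )
    if hasattr(ET, "indent"):  # pretty-printing is only available on Python 3.9+
        ET.indent(configuration)
    return ET.tostring(configuration, encoding="unicode")


if __name__ == "__main__":
    # Placeholder host; replace with the real Orchard site.
    print(build_redirect_config("www.example.com"))
```

A single catch-all rule is enough when the paths are identical on both hosts. If individual posts end up on different paths, one rule per post would be needed instead.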

Coordinator
Oct 10, 2012 at 1:14 AM

Ah yes, sorry, for some reason this didn't click. Yes, you'd have to do permanent redirects on the WordPress side of course. Which, if your WP is in Apache, probably means setting up mod_rewrite.

Oct 23, 2012 at 5:48 PM

Sorry for the delay, didn't have the time. But I'm now on it again!

So I decided to write a little console app for the migration. I've opted to export the WordPress blog in XML format (default functionality). Using LINQ to XML it's easy to create a custom format for each post. But then what? What's the right way to get those posts into Orchard? I noticed the importexport module, which reads XML, which would mean I'd have to convert the format. That's fine, but where's the documentation on what the format should look like?

I was kind of hoping there's just an InsertBlogPost-like API call available for use. But I couldn't find such a thing. Doesn't it exist, or did I look in the wrong place? Please advise once more on what way to go.
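
As an illustration of the extraction step, here is a rough Python sketch that reads published posts out of a WordPress export (the thread's console app would do the same with LINQ to XML). The namespace URIs are the ones a 1.0 export declares; newer exports use a different version suffix, so check the root element of your own file. The function name and the sample data are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespaces declared by a WordPress 1.0 export file (assumption: yours may
# say 1.1 or 1.2 instead, in which case adjust the "wp" URI).
NS = {
    "wp": "http://wordpress.org/export/1.0/",
    "content": "http://purl.org/rss/1.0/modules/content/",
}

# Tiny hand-written sample standing in for a real export file.
SAMPLE = """<rss xmlns:wp="http://wordpress.org/export/1.0/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>Hello</title>
      <wp:post_name>hello</wp:post_name>
      <wp:post_date_gmt>2012-10-30 21:57:00</wp:post_date_gmt>
      <wp:status>publish</wp:status>
      <content:encoded>Body text</content:encoded>
    </item>
    <item>
      <title>Draft</title>
      <wp:post_name>draft</wp:post_name>
      <wp:status>draft</wp:status>
    </item>
  </channel>
</rss>"""


def read_published_posts(wxr_xml):
    """Return a list of dicts, one per item whose wp:status is 'publish'."""
    root = ET.fromstring(wxr_xml)
    posts = []
    for item in root.iterfind("./channel/item"):
        if item.findtext("wp:status", default="", namespaces=NS) != "publish":
            continue  # skip drafts and other unpublished items
        posts.append({
            "title": item.findtext("title", default=""),
            "slug": item.findtext("wp:post_name", default="", namespaces=NS),
            "date_gmt": item.findtext("wp:post_date_gmt", default="", namespaces=NS),
            "body": item.findtext("content:encoded", default="", namespaces=NS),
        })
    return posts


if __name__ == "__main__":
    for post in read_published_posts(SAMPLE):
        print(post["slug"], post["date_gmt"])
```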

Developer
Oct 23, 2012 at 9:31 PM

So much easier than that.

1. Export your WordPress blog as a WXR (WordPress eXtended RSS) file from the WordPress admin panel

2. Clone the tip of this repo: https://orchardimportexport.codeplex.com/

3. Enable the "Import Export for external schema module"

4. Go to import under the Blogs heading on the admin menu.

5. Choose File, select WordPress, leave everything else at the defaults, and away you go.

Done!

Oct 25, 2012 at 9:20 AM
Edited Oct 25, 2012 at 9:24 AM

I'm stuck on step 3. Enable the "Import Export for external schema module": where do I have to do that? When I import the WordPress file in the import/export section, it says my recipe has been imported, but the blog section of my site remains empty. So I'm guessing it didn't import anything, which makes sense. Now to find that one option...

Edit: sorry, I missed step 2 as well. What exactly do you mean by that? </n00b>

Developer
Oct 25, 2012 at 9:24 AM

Open up orchard in the Admin screens, then go to Features on the left hand admin menu, you should now see it?

Make sure that the folder you have cloned orchardimportexport into is named Orchard.ImportExport

Oct 25, 2012 at 9:36 AM
Edited Oct 25, 2012 at 9:41 AM

Ok, so I activated the module (which is named "Import / Export external schemas", by the way). There's no Import link under the Blogs heading in admin.

Developer
Oct 25, 2012 at 1:18 PM

Ah yes, a naming bug I fixed locally but forgot to push. It should actually be called 'External Import/Export' (I have now pushed the fix).

There should be an Import link right under Blogs. You have the taxonomy module installed, yeah?

Oct 25, 2012 at 1:58 PM

Ok, I cloned the latest bits and it's now called "External Import/Export". I disabled / enabled the module just to be sure, but there's no link under Blogs. Only Manage, New Post and New Blog. I am running an outdated Orchard version (v.1.4.2.0) I suppose, can that have anything to do with it?

Developer
Oct 25, 2012 at 2:34 PM

That might be it you know. I haven't tested it on anything before 1.5.1.

Any way you can upgrade? It may be that certain constructs and signatures have changed.

Oct 26, 2012 at 10:47 AM

Yeah seems like a good idea anyway. But upgrading is also new to me, so I'll have to find out what to do first. I'll let you know later, thanks.

Oct 28, 2012 at 1:48 PM

Ok, so I upgraded to 1.5.1.0. Doesn't seem to help, there's still no menu item. Tried disabling and enabling again, nothing. Any ideas?

Nov 2, 2012 at 6:45 AM

*bump*

Developer
Nov 2, 2012 at 7:43 PM

Is there anything suspicious in the log files?

Nov 6, 2012 at 6:58 PM

I disabled / enabled the module and checked the error log in app_data, it's clean. Wouldn't know what else there is to check?

Developer
Nov 6, 2012 at 7:18 PM

Does it work when you try it on a new Orchard installation?

Nov 8, 2012 at 7:00 PM

Just tried. Downloaded the latest Orchard version and copied the modules from my own installation into the new directory. Didn't work.

Then I deleted the module directories, downloaded taxonomies again and downloaded a fresh zip of the importexport module. Installed them and activated the module. No errors, but no link either.

Just to be sure: there's supposed to be a link in the left navigation menu underneath the "Blog" menu option; right!?

Nov 10, 2012 at 2:25 PM

I checked the code. Noticed it's also supposed to install a custom permission set (Import blog / Export blog), right? So I checked to see if those permissions were assigned, but they're nowhere to be found. So the missing link could be caused by missing permissions, which are in turn caused by the permissions not being there at all.

Don't know if that's it, but perhaps it helps. I can also start a new thread on the codeplex site of the module if you want me to?

Nov 11, 2012 at 12:31 PM

Tried navigating to /Admin/Blogs/Import manually. Got this error:

 

An unhandled exception has occurred and the request was terminated. Please refresh the page. If the error persists, go back

The parameters dictionary contains a null entry for parameter 'blogId' of non-nullable type 'System.Int32' for method 'System.Web.Mvc.ActionResult Item(Int32, Orchard.UI.Navigation.PagerParameters)' in 'Orchard.Blogs.Controllers.BlogAdminController'. An optional parameter must be a reference type, a nullable type, or be declared as an optional parameter. Parameter name: parameters

System.ArgumentException: The parameters dictionary contains a null entry for parameter 'blogId' of non-nullable type 'System.Int32' for method 'System.Web.Mvc.ActionResult Item(Int32, Orchard.UI.Navigation.PagerParameters)' in 'Orchard.Blogs.Controllers.BlogAdminController'. An optional parameter must be a reference type, a nullable type, or be declared as an optional parameter. Parameter name: parameters
   at System.Web.Mvc.ActionDescriptor.ExtractParameterFromDictionary(ParameterInfo parameterInfo, IDictionary`2 parameters, MethodInfo methodInfo)
   at System.Web.Mvc.ReflectedActionDescriptor.<>c__DisplayClass1.<Execute>b__0(ParameterInfo parameterInfo)
   at System.Linq.Enumerable.WhereSelectArrayIterator`2.MoveNext()
   at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
   at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)
   at System.Web.Mvc.ReflectedActionDescriptor.Execute(ControllerContext controllerContext, IDictionary`2 parameters)
   at System.Web.Mvc.ControllerActionInvoker.InvokeActionMethod(ControllerContext controllerContext, ActionDescriptor actionDescriptor, IDictionary`2 parameters)
   at System.Web.Mvc.ControllerActionInvoker.<>c__DisplayClass15.<InvokeActionMethodWithFilters>b__12()
   at System.Web.Mvc.ControllerActionInvoker.InvokeActionMethodFilter(IActionFilter filter, ActionExecutingContext preContext, Func`1 continuation)
   at System.Web.Mvc.ControllerActionInvoker.<>c__DisplayClass15.<>c__DisplayClass17.<InvokeActionMethodWithFilters>b__14()
   at System.Web.Mvc.ControllerActionInvoker.InvokeActionMethodFilter(IActionFilter filter, ActionExecutingContext preContext, Func`1 continuation)
   at System.Web.Mvc.ControllerActionInvoker.<>c__DisplayClass15.<>c__DisplayClass17.<InvokeActionMethodWithFilters>b__14()
   at System.Web.Mvc.ControllerActionInvoker.InvokeActionMethodFilter(IActionFilter filter, ActionExecutingContext preContext, Func`1 continuation)

 

It complains about a missing blogId. Tried adding /19 (blog id) to the URL, and ?blogId=19, didn't fix it. Also in the controller code I can't find a blogId parameter, so the fault seems to be somewhere else.

Developer
Nov 11, 2012 at 12:36 PM

Did you download the latest from here? https://orchardimportexport.codeplex.com/ (REPO: https://hg.codeplex.com/orchardimportexport)

Is the module in the correct folder? It needs to be in a folder named "Contrib.ImportExport".

Can you check that?

Nov 11, 2012 at 12:49 PM

Ah, hold up! I didn't know the folder name mattered. I just downloaded the repo and got a folder named orchardimportexport, so I assumed that would be ok. After renaming the folder, suddenly there are links appearing and stuff :) Thanks man, I feel kinda dumb that it was such a stupid little thing.

Nov 11, 2012 at 1:06 PM

Ok so I now ran the import, took a while but it finished without errors.

All the blog posts were imported, it seems: title, text and comments. Some observations:

  • For all posts, the mark-up was completely lost. Even paragraphs and blank lines, all gone.
  • The comments appear in the wrong order, which is weird because the datetime of each comment does match the source file. One post, for instance, lists: Oct 30 2012 at 9:57 PM, Oct 31 2012 at 5:51 PM, Oct 31 2012 at 9:58 AM. Obviously, the second entry should have been the last one.
  • The URL magic is very cool. The new Orchard blog posts still use /index.php/date/title as the URL, which exactly matches what it was on the old site. That makes it quite easy to generate some redirect rules (needed because the main URL changes in this migration as well).

Any ideas on the first two items?
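
To make the expected comment order concrete, here is a small Python check that parses the three displayed timestamps and sorts them chronologically. It only illustrates the order the comments should appear in; it says nothing about how the module itself orders them. The format string is an assumption based on how the dates are displayed in the post.

```python
from datetime import datetime

# Display format used in the post, e.g. "Oct 30 2012 at 9:57 PM".
DISPLAY_FORMAT = "%b %d %Y at %I:%M %p"


def sort_by_timestamp(stamps):
    """Return the timestamp strings in chronological order."""
    return sorted(stamps, key=lambda s: datetime.strptime(s, DISPLAY_FORMAT))


if __name__ == "__main__":
    imported_order = [
        "Oct 30 2012 at 9:57 PM",
        "Oct 31 2012 at 5:51 PM",
        "Oct 31 2012 at 9:58 AM",
    ]
    for stamp in sort_by_timestamp(imported_order):
        print(stamp)
```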

 

 

Nov 11, 2012 at 1:42 PM

Additional remark: when you're importing a big file, the action times out after about 30 seconds.

Exception Details: System.Web.HttpException: Request timed out.

Developer
Nov 12, 2012 at 2:46 AM

You can increase the timeout value using web.config: http://stackoverflow.com/questions/7241046/system-web-httpexception-request-timed-out

Nov 13, 2012 at 6:24 AM

Thanks, but that's not really the main issue. I noticed the import succeeds anyway, whether it times out or not. The content issues are far more relevant.

Nov 21, 2012 at 7:07 PM

*bump*

Developer
Nov 22, 2012 at 2:51 PM

So this:

For all posts, the mark-up was completely lost. Even paragraphs and blank lines, all gone.

is probably down to the DataCleaner class misbehaving. When you run your post through the import, this class is called and it transforms your data, stripping out things like empty tags, br tags and Microsoft Word markup. What I think we should do is add a setting for that. Give me two and I will push a fix.

Nick

Developer
Nov 22, 2012 at 3:04 PM

Okay I have pushed to the repo.

On the screen you will see two new fields: 1. Clean data, which is now off by default. 2. Fix URLs, which is also off by default.

Let me know if that fixes your content issues

Nick

Nov 24, 2012 at 12:52 PM

Hi Nick,

I ran another import with the checkboxes turned off; that should have fixed it, right? I didn't notice any difference in outcome. Another thing I noticed is that the setting for selecting an existing blog doesn't seem to do anything. I selected an existing blog instance before importing, but it still created a new one with my Wordpress blog name.

Dec 4, 2012 at 6:24 AM

*bump*

Dec 17, 2012 at 8:28 AM

Guys? Sorry I'm a bit impatient :)

Developer
Dec 17, 2012 at 9:01 AM

Hey jsiegmund,

Any chance you could email me the file? Jetski5822 at gmail dot com

This way I can run it locally and see if there are any edge cases I have missed.

Cheers, Nick

Dec 20, 2012 at 6:37 PM

Hi Nick, did you receive my mail?

Jan 5, 2013 at 10:45 AM

*bump*

Jan 21, 2013 at 3:39 PM

Not gonna happen anymore? Still need a good way to import :(

Developer
Jan 21, 2013 at 4:18 PM
Hi jsiegmund, Sorry I have been really busy with other things. I did see that you sent me the Zip file, I have just been snowed under. I promise I will take a look tonight when I get home from work.

Nick


Jan 24, 2013 at 1:56 AM

I've done lots of WordPress to Orchard migrations. Here's my basic method:

1) Set up Orchard the way you want (custom content types and such)
2) Insert some dummy data, export it, and look at the XML file generated. That same XML schema can be used to import content.
3) Export your WordPress data, which is also XML, but in a different schema.
4) Use an XSLT transformation to convert the WP export data into the Orchard import schema.
5) Import the new file.

It needs to be tweaked every time, because WP versions differ and the content types in your Orchard instance are unique (if you want them to be, which I highly recommend). That said, I can dig up one of my XSLT files and post a general framework for one.

Jan 24, 2013 at 2:01 AM

Here is a blog to blog XSLT. Notice it is importing to custom textfields I added to that content type.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:wp="http://wordpress.org/export/1.0/"
 xmlns:msxml="urn:schemas-microsoft-com:xslt" 
 xmlns:vb="#VBCustomScript"
 xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <xsl:output method="xml" indent="yes"/>

 <!-- VB Script Formatting Functions -->
 <msxml:script language="VBScript" implements-prefix="vb">  
 <![CDATA[ 
 function gmtToUtc(str) 
   gmtToUtc = Split(str)(0) & "T" & Split(str)(1) & "Z"
 end function
 ]]> 
 </msxml:script>

<!-- A Template to Join Nodes Into a List -->
<xsl:template name="join">
    <xsl:param name="list" />
    <xsl:param name="separator"/>

    <xsl:for-each select="$list">
        <xsl:value-of select="." />
        <xsl:if test="position() != last()">
            <xsl:value-of select="$separator" />
        </xsl:if>
    </xsl:for-each>
</xsl:template>
    
<!-- Top Level Template -->
<xsl:template match="/">
<Orchard>
  <Recipe>
    <Name>Migrate WordPress Blog to Orchard Blog</Name>
    <Author>Planet Telex</Author>
  </Recipe>
  <Data>
    <!-- Select every published item (the XPath does not filter on post type) -->
    <xsl:apply-templates select="//channel/item[wp:status='publish']"/>
  </Data>
</Orchard>
</xsl:template>
  
 <!-- Blog Post Template -->
 <xsl:template match="item">
  <xsl:variable name="slug" select="wp:post_name" />
  <xsl:variable name="date" select="vb:gmtToUtc(string(wp:post_date_gmt))" />
 
  <BlogPost Id="/alias=blog\/{$slug}" Status="Published">
   <AutoroutePart Alias="blog/{$slug}" UseCustomPattern="false" />
   <TitlePart>
    <xsl:attribute name="Title">
        <xsl:value-of select="title" />
    </xsl:attribute>
   </TitlePart>
   <CommonPart Owner="/User.UserName=admin" Container="/alias=blog" CreatedUtc="{$date}" ModifiedUtc="{$date}" PublishedUtc="{$date}" />
   <BodyPart>
    <xsl:attribute name="Text">
     <xsl:value-of select="content:encoded/text()" />
    </xsl:attribute>
   </BodyPart>
   <TagsPart>
    <xsl:attribute name="Tags">
       <xsl:call-template name="join">
          <xsl:with-param name="list" select="category[@domain='tag']/@nicename" />
          <xsl:with-param name="separator" select="','" />
       </xsl:call-template>
    </xsl:attribute>
   </TagsPart>
   <xsl:apply-templates select="wp:postmeta[wp:meta_key='_aioseop_keywords'][last()]"/>
   <xsl:apply-templates select="wp:postmeta[wp:meta_key='_aioseop_description'][last()]"/>
   <xsl:apply-templates select="wp:postmeta[wp:meta_key='_aioseop_title'][last()]"/>
  </BlogPost>
 </xsl:template>

<!-- Keywords Template -->
<xsl:template match="wp:postmeta[wp:meta_key='_aioseop_keywords']">
    <TextField.SeoKeywords>
        <xsl:attribute name="Text">
            <xsl:value-of select="wp:meta_value" />
        </xsl:attribute>
    </TextField.SeoKeywords>
</xsl:template>

<!-- SEO Description Template -->
<xsl:template match="wp:postmeta[wp:meta_key='_aioseop_description']">
    <TextField.SeoDescription>
        <xsl:attribute name="Text">
            <xsl:value-of select="wp:meta_value" />
        </xsl:attribute>
    </TextField.SeoDescription>
</xsl:template>

<!-- Page Title Template -->
<xsl:template match="wp:postmeta[wp:meta_key='_aioseop_title']">
    <TextField.PageTitle>
        <xsl:attribute name="Text">
            <xsl:value-of select="wp:meta_value" />
        </xsl:attribute>
    </TextField.PageTitle>
</xsl:template>
    
</xsl:stylesheet>
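
For readers who would rather script the conversion than run XSLT, the core of the stylesheet above can be sketched in Python. The element and attribute names are copied from the stylesheet; whether they match your own Orchard export is something to verify against a dummy export first. Tags and the custom SEO text fields are left out, and the function names, the `posts` dict layout and the default `blog`/`admin` values are assumptions for the example.

```python
import xml.etree.ElementTree as ET


def gmt_to_utc(value):
    """Mirror the stylesheet's gmtToUtc: 'date time' becomes 'dateTtimeZ'."""
    parts = value.split()
    return parts[0] + "T" + parts[1] + "Z"


def build_recipe(posts, blog_alias="blog", owner="admin"):
    """Build an Orchard import recipe (as an XML string) from post dicts.

    Each post is a dict with the keys 'title', 'slug', 'date_gmt' and 'body'.
    """
    orchard = ET.Element("Orchard")
    recipe = ET.SubElement(orchard, "Recipe")
    ET.SubElement(recipe, "Name").text = "Migrate WordPress Blog to Orchard Blog"
    data = ET.SubElement(orchard, "Data")
    for post in posts:
        slug = post["slug"]
        date = gmt_to_utc(post["date_gmt"])
        # The identity uses an escaped slash, as in the stylesheet.
        item = ET.SubElement(
            data, "BlogPost", Id="/alias=" + blog_alias + "\\/" + slug, Status="Published"
        )
        ET.SubElement(
            item, "AutoroutePart", Alias=blog_alias + "/" + slug, UseCustomPattern="false"
        )
        ET.SubElement(item, "TitlePart", Title=post["title"])
        ET.SubElement(
            item,
            "CommonPart",
            Owner="/User.UserName=" + owner,
            Container="/alias=" + blog_alias,
            CreatedUtc=date,
            ModifiedUtc=date,
            PublishedUtc=date,
        )
        # ElementTree escapes markup and line breaks inside the attribute value.
        ET.SubElement(item, "BodyPart", Text=post["body"])
    return ET.tostring(orchard, encoding="unicode")


if __name__ == "__main__":
    sample = [{"title": "Hello", "slug": "hello",
               "date_gmt": "2012-10-30 21:57:00", "body": "<p>Hi</p>"}]
    print(build_recipe(sample))
```
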
Developer
Jan 26, 2013 at 11:57 AM

@Jasper

Sorry about the super long delay. I have tracked the problem down. It's not an issue with the import/export module.

It's an issue with TinyMCE. When you import your text, the module takes the string and doesn't touch it.

However, when your text is displayed through the front end or through the TinyMCE editor, the line breaks are not understood. In your case they are \n\n.

I am not sure what encoding this is, but it would probably work if it was \r\n...

May I also point out that your text has no paragraphs in it, only line breaks.

 

@PlanetTelex

No need to do any of that. My module will take care of that for you automagically.

Jan 26, 2013 at 8:12 PM

automagically; new favourite word

Jan 27, 2013 at 8:40 PM
Edited Jan 27, 2013 at 8:40 PM

One crappy thing I've noticed when importing WordPress sites into Orchard is that WordPress actually saves the line break characters typed into its interface. It has rendering logic that converts those into paragraph and line break tags, but the underlying data saved contains plain line breaks. On export they get converted to their character codes, and they import as line breaks. I fix that when importing from WordPress by replacing the line breaks in the export with opening and closing tags.
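
A minimal Python sketch of that fix, assuming the post body is plain text with blank lines between paragraphs. It is a simplified stand-in for WordPress's own rendering logic, not a port of it: it does not look for block-level HTML that is already in the body, so posts that mix markup and bare line breaks may need extra care. The function name is made up for the example.

```python
import re


def linebreaks_to_html(text):
    """Wrap blank-line-separated blocks in <p> tags and mark single breaks with <br />."""
    # Normalise Windows and old Mac line endings to \n first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # One or more blank lines separate paragraphs.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    # A single newline inside a paragraph becomes a <br />.
    return "\n".join(
        "<p>" + p.replace("\n", "<br />\n") + "</p>" for p in paragraphs
    )


if __name__ == "__main__":
    print(linebreaks_to_html("First line.\n\nSecond\nline."))
```

Running this over each post body before building the import file gives the editor and the front end real paragraph markup to work with.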

Feb 5, 2013 at 5:47 AM

Maybe I'll try a fresh download tonight, but until now I'm still getting the same results.

Mar 2, 2013 at 1:50 PM

Bump...

Mar 2, 2013 at 3:39 PM

So today I upgraded to the latest Orchard version and downloaded a new copy of the module source code. I copied it into the Contrib.ImportExport folder and enabled the module. The links are gone again; disabling / enabling the module does nothing. This time I've got the folder name right, unless it's changed in the meantime of course.

Maybe I'll put some effort into the XSLT approach tomorrow; it sounds viable. And I've got some XSLT experience, so doable as well.

Mar 7, 2013 at 5:21 AM

Which, if your WP is in Apache, probably means setting up mod_rewrite. You'd have to do permanent redirects on the WordPress side of course.

Mar 7, 2013 at 9:19 AM

The WP and Orchard server is the same one, IIS.

Mar 29, 2013 at 10:55 PM

Tried the module and I'm seeing the same behavior described by jsiegmund: can't find the "Import" link under the "Blog" menu!