Transactions in long-running commands

Topics: Customizing Orchard, General, Troubleshooting, Writing modules
Sep 27, 2011 at 11:59 AM

I'm trying to create a command that will run for a long time. It harvests an external web service and needs to insert or update content items. It uses a service that is injected into the command.

I also want a progress record in the database that I can show in the admin UI using a custom controller.

I wanted to start with this progress bit first, but I ran into transaction problems. The whole Orchard command runs in one big transaction. If I kill orchard.exe, the transaction is rolled back automatically, as expected. However, my command is going to run for a long time, so I need to control my own transactions somehow. Also, while the command is running, the record is locked, and queries that hit the record either wait a very long time or time out.

I've tried the code below, but I still see the record lock (SQL Server Express 2008 R2, observed through Management Studio), and if I abort the command, the progress record is not committed.

private readonly ISessionLocator _sessionLocator;
private readonly IRepository<HarvesterProgressRecord> _progressRepository;

public void FullHarvest(IHarvestContext context)
{
  var progress = new HarvesterProgressRecord { Type = HarvestType.Full, Started = DateTime.Now };
  using (ISession session = _sessionLocator.For(typeof(HarvesterProgressRecord)))
  {
    // Create the progress record in its own short transaction.
    using (var tx = session.BeginTransaction(IsolationLevel.ReadUncommitted))
    {
      _progressRepository.Create(progress);
      tx.Commit();
    }

    for (int i = 0; i < 20; i++)
    {
      Thread.Sleep(2000);
      context.Output.WriteLine("Doing: {0}", i * 100);
      // Update the progress record in another short transaction.
      using (var tx = session.BeginTransaction(IsolationLevel.ReadUncommitted))
      {
        progress.CurrentItem += 100;
        _progressRepository.Flush();
        tx.Commit();
      }
    }
  }
}


I'm looking for a way to control the transaction myself (the method above is called from a command implementing DefaultOrchardCommandHandler), or to tell Orchard that I have a sort of delegate/work unit that needs to run on its own.

I don't want to run the work in parallel, BTW, so I don't feel comfortable using a scheduled task.


Please give me some advice on how to approach this task.

Developer
Sep 27, 2011 at 7:18 PM

Why don't you wrap the whole method in one big Suppress scope? That way you can suppress the Orchard transaction.

Sep 27, 2011 at 8:41 PM

I've tried this suggestion:

public void FullHarvest(IHarvestContext context)
{
  var progress = new HarvesterProgressRecord { Type = HarvestType.Full, Started = DateTime.Now };
  using (var scope = new TransactionScope(TransactionScopeOption.Suppress))
  {
    _progressRepository.Create(progress);

    for (int i = 0; i < 20; i++)
    {
      Thread.Sleep(2000);
      context.Output.WriteLine("Doing: {0}", i * 100);
      progress.CurrentItem += 100;
      _progressRepository.Flush();
    }
  }
}
which is less complex, but I still experience the locking. Moreover, I get the following exception:

Exception Details: System.InvalidOperationException: TransactionScope nested incorrectly.

Stack Trace:

[InvalidOperationException: TransactionScope nested incorrectly.]
   at System.Transactions.TransactionScope.Dispose()
   at Harvester.Services.HarvesterService.FullHarvest(IHarvestContext context) in c:\inetpub\wwwroot\cms\Orchard.Web\Modules\Harvester\Services\HarvesterService.cs:line 63
   at Harvester.Commands.HarvesterCommands.FullHarvest() in c:\inetpub\wwwroot\cms\Orchard.Web\Modules\Harvester\Commands\HarvesterCommands.cs:line 33

Another unwanted effect is that the ProgressRecord is not committed!


I've tried several things, like creating a new WorkContextScope and letting the new scope resolve a new ISession. I was able to lose the locking and get the commit to work. But the nested TransactionScope exception always came up (though from different code than the exception above):

public void FullHarvest(IHarvestContext context)
{
  var workContextScope = _workContextAccessor.CreateWorkContextScope();

  IRepository<HarvesterProgressRecord> _progressRepository = workContextScope.Resolve<IRepository<HarvesterProgressRecord>>();
  ISessionLocator _sessionLocator = workContextScope.Resolve<ISessionLocator>();

  var progress = new HarvesterProgressRecord { Type = HarvestType.Full, Started = DateTime.Now };
  using (ISession session = _sessionLocator.For(typeof(HarvesterProgressRecord)))
  {
    using (var scope = new TransactionScope(TransactionScopeOption.Suppress))
    {
      _progressRepository.Create(progress);

      for (int i = 0; i < 20; i++)
      {
        Thread.Sleep(2000);
        context.Output.WriteLine("Doing: {0}", i * 100);
        progress.CurrentItem += 100;
        _progressRepository.Flush();
      }
    }
  }
}

This gives me the following exception:

[InvalidOperationException: TransactionScope nested incorrectly.]
   at System.Transactions.TransactionScope.Dispose()
   at Orchard.Data.TransactionManager.System.IDisposable.Dispose() in C:\inetpub\wwwroot\cms\Orchard\Data\TransactionManager.cs:line 47
   at Autofac.Core.Disposer.Dispose(Boolean disposing)
   at Autofac.Util.Disposable.Dispose()
   at Autofac.Core.Lifetime.LifetimeScope.Dispose(Boolean disposing)
   at Autofac.Util.Disposable.Dispose()
   at Orchard.Environment.WorkContextAccessor.ThreadStaticScopeImplementation.<>c__DisplayClass8.<.ctor>b__5() in C:\inetpub\wwwroot\cms\Orchard\Environment\WorkContextAccessor.cs:line 106
   at Orchard.Environment.WorkContextAccessor.ThreadStaticScopeImplementation.System.IDisposable.Dispose() in C:\inetpub\wwwroot\cms\Orchard\Environment\WorkContextAccessor.cs:line 111
   at Orchard.Commands.CommandHostAgent.RunCommand(TextReader input, TextWriter output, String tenant, String[] args, Dictionary`2 switches) in C:\inetpub\wwwroot\cms\Orchard\Commands\CommandHostAgent.cs:line 88

I'm getting the idea that I'm just doing things wrong here, but I don't know how to do it differently.

Dec 15, 2011 at 1:57 AM

I'm pretty late to the party, but I ran into the same issue. I worked up a really ugly solution... I don't like it, but I can't think of a better way to do this without changing the way transactions are exposed. Here's what I ended up doing:

const int batchSize = 250;
var i = 0;
var transManager = _orchardServices.TransactionManager;           
while (i * batchSize < elements.Count())
{    
    //start a new transaction if necessary
    transManager.Demand();
    var batch = elements.Skip(i++ * batchSize).Take(batchSize);                
    foreach (var element in batch)
    {
        //Do the batching here...
    }

    //This is really dirty, but I can't find another way to break this into discrete transactions
    var fi = transManager.GetType().GetField("_scope", BindingFlags.NonPublic | BindingFlags.Instance);
    var scope = (TransactionScope)(fi.GetValue(transManager));
    scope.Complete();
    scope.Dispose();
    fi.SetValue(transManager, null);  
}

Developer
Dec 15, 2011 at 8:47 AM

I had a similar issue with Tasks (in the .NET sense). In this module's Tasks Libraries feature (Libs/Tasks) I implemented something that seems to work, but I'm not entirely sure it will in all cases. The solution is that for every detached task like this one, a new WorkContextScope should be created; and since this new WorkContext would lack the information it normally contains, the outer (or upper) WorkContext's data should be copied over.

I also experimented with TransactionScope. I came to the conclusion that with tasks it's not necessary to do anything with transactions. In the end I found that with some settings a new, nested TransactionScope does seem to work (if I remember correctly, this was with TransactionScopeOption.Suppress, as mentioned earlier here), but IRepository and ContentManager stopped flushing properly (changed data was not persisted).
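In code, the detached-task approach looks roughly like this (a minimal sketch only; CreateWorkContextScope and Resolve are the Orchard calls used earlier in this thread, while the record type and method name are illustrative):

```csharp
// Sketch: run work in its own WorkContextScope so it doesn't share the
// ambient command transaction. HarvesterProgressRecord is illustrative.
public void RunDetached(IWorkContextAccessor workContextAccessor)
{
    using (var scope = workContextAccessor.CreateWorkContextScope())
    {
        // Resolve fresh services from the new scope instead of reusing
        // the ones injected into the outer (command) scope.
        var repository = scope.Resolve<IRepository<HarvesterProgressRecord>>();

        repository.Create(new HarvesterProgressRecord {
            Type = HarvestType.Full,
            Started = DateTime.Now
        });

        // Disposing the scope ends its lifetime; the scope's own
        // transaction is completed at that point.
    }
}
```

Whether the outer WorkContext's data has to be copied over (as described above) depends on what the work item actually needs from it.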

Feb 28, 2012 at 8:10 AM
Edited Feb 28, 2012 at 8:33 AM
ldhertert wrote:

I'm pretty late to the party, but ran into the same issue.  I worked up a really ugly solution...I don't like it but since I can't think of a better way to do this without changing the way in which transactions are exposed.  Here's what I ended up doing:

 

const int batchSize = 250;
var i = 0;
var transManager = _orchardServices.TransactionManager;           
while (i * batchSize < elements.Count())
{    
    //start a new transaction if necessary
    transManager.Demand();
    var batch = elements.Skip(i++ * batchSize).Take(batchSize);                
    foreach (var element in batch)
    {
        //Do the batching here...
    }

    //This is really dirty, but I can't find another way to break this into discrete transactions
    var fi = transManager.GetType().GetField("_scope", BindingFlags.NonPublic | BindingFlags.Instance);
    var scope = (TransactionScope)(fi.GetValue(transManager));
    scope.Complete();
    scope.Dispose();
    fi.SetValue(transManager, null);  
}

 

Will the 'default' Orchard transaction still work, or does this code kill it?

Also, isn't there a more proper (read: official) way to do something like this?

edit:

I have two needs for this:

1) Importing a lot of old data.

2) Reverting the changes made to a single object (in a separate transaction) whenever I need to, while the 'default' transaction should still be committed.

Mar 11, 2012 at 4:25 AM

Having trouble similar to this now. A recipe with a lot of data ran fine in dev on my local machine, but it times out after 10 minutes when I deploy to a server. How can I set the transaction timeout for orchard.web/bin/orchard.exe? The web.config sets it to 30 minutes, but that doesn't affect the command-line interface.

Mar 12, 2012 at 10:08 AM
Edited Mar 12, 2012 at 10:12 AM

Any 'official' feedback on this?

I'd simply like to add records in 'batches' through IRepository (they're not parts) and I keep running into walls :/

edit: I think I got further. I ran into a 'session closed' or 'TransactionScope already disposed' exception, but I solved it by first demanding a regular transaction (at the transaction manager) before creating my new one!
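In code, that order of operations looks roughly like this (a sketch; Demand() is the ITransactionManager call shown in ldhertert's snippet above, while the repository and batch are illustrative):

```csharp
// Sketch: demand the ambient Orchard transaction first, then open a
// suppressed scope so the batch work runs outside of it.
_orchardServices.TransactionManager.Demand(); // ensure Orchard's transaction exists

using (var scope = new TransactionScope(TransactionScopeOption.Suppress))
{
    foreach (var record in batch)      // 'batch' is illustrative
    {
        _repository.Create(record);    // runs outside the ambient transaction
    }
    scope.Complete();
}
```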

Coordinator
Mar 12, 2012 at 4:43 PM

Can you try calling _contentManager.Flush() and then _contentManager.Clear() in your batch loop?

The timeout after 10 minutes might disappear after this change.
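Combined with the Demand() call from the snippet earlier in this thread, that would look something like this (a sketch; the elements collection and batch size are illustrative):

```csharp
// Sketch: process items in batches; flush and clear the content manager
// after each batch so the NHibernate session doesn't grow without bound.
const int batchSize = 250;
var processed = 0;
while (processed < elements.Count)
{
    _orchardServices.TransactionManager.Demand(); // start a transaction if needed

    foreach (var element in elements.Skip(processed).Take(batchSize))
    {
        // ... handle one element ...
    }
    processed += batchSize;

    _contentManager.Flush();  // push pending changes to the database
    _contentManager.Clear();  // evict tracked entities to free memory
}
```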

Mar 12, 2012 at 5:04 PM

For now I just broke my recipe up into three smaller ones, and that worked. I'll try Flush() the next time I'm loading data in a migration. Or did you mean to try that from the driver's Importing() method?

Jul 18, 2012 at 6:53 PM

I'll bring this one back :)

I have a command that runs for well over 10 minutes, adding files and folders to Orchard as content items and building Alias/Autoroutes for them.

I have tried everything above to make it work, but after 10 minutes it fails. I can't seem to suppress the ambient transaction (or create a new one); using a new WorkContext doesn't help, and using Flush/Clear also has no effect.

Anyone have another suggestion?

Jul 18, 2012 at 7:45 PM

Afaik (I don't have the code here) I solved it by running everything in a transaction scope in 'batches' (like one per X minutes or so) and, before completing each transaction, flushing/clearing the content manager.

It 'could' also be that you have to demand a transaction at the start; I'm not sure. I got a couple of Orchard command-line commands working that could run for an hour (if needed).

Aug 1, 2012 at 4:19 PM
Edited Aug 1, 2012 at 4:20 PM

I've figured it out thanks to a response by Bertrand in this thread: http://orchard.codeplex.com/discussions/373031

You inject the IProcessingEngine and schedule the work in batches, where at the end of each batch you schedule another batch if there is more to do.

The code to study is the RecipeScheduler in the Recipes module.

I am using a pretty generic worker-and-queue setup that queues Action<T>s and could be used to do almost anything this way. I still need to make my code a bit more general-purpose (it contains some things that are specific to my usage), but once I have done that I will post the code.
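From memory, the RecipeScheduler pattern is roughly the following (a sketch only, so treat the signatures as approximate; the handler and message names are illustrative):

```csharp
// Sketch of the RecipeScheduler pattern: each batch runs as its own
// processing-engine task (with its own shell context and transaction)
// and schedules the next batch before returning.
public class BatchScheduler : IBatchSchedulerEventHandler // illustrative IEventHandler
{
    private readonly IProcessingEngine _processingEngine;
    private readonly ShellSettings _shellSettings;
    private readonly IShellDescriptorManager _shellDescriptorManager;

    public BatchScheduler(IProcessingEngine processingEngine,
        ShellSettings shellSettings,
        IShellDescriptorManager shellDescriptorManager)
    {
        _processingEngine = processingEngine;
        _shellSettings = shellSettings;
        _shellDescriptorManager = shellDescriptorManager;
    }

    public void ScheduleBatch(int batchIndex)
    {
        _processingEngine.AddTask(
            _shellSettings,
            _shellDescriptorManager.GetShellDescriptor(),
            "IBatchSchedulerEventHandler.ExecuteBatch", // event-bus message name
            new Dictionary<string, object> { { "batchIndex", batchIndex } });
    }

    public void ExecuteBatch(int batchIndex)
    {
        // ... process batch number batchIndex here ...
        var moreWorkRemains = false; // illustrative: check your own work source
        if (moreWorkRemains)
            ScheduleBatch(batchIndex + 1); // next batch gets a fresh task
    }
}
```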

Aug 6, 2012 at 6:57 PM
Edited Aug 6, 2012 at 6:58 PM


http://gallery.orchardproject.net/List/Modules/Orchard.Module.Worker

Aug 14, 2012 at 7:00 AM

Hi Folks

Well, so where is the happy ending of this thrilling story?

I simply need to import some 2,000 content items with the ImportExport module.

I use the console "import" command, not the web UI.

The process is so slow that the 10-minute transaction timeout is actually not enough.

I am able to process only 400 items in a batch, provided that the status is not "Published"; otherwise it's 10 times fewer (not more than 50). What would happen if I needed to import tens of thousands?!

If it is possible to use http://gallery.orchardproject.net/List/Modules/Orchard.Module.Worker in ImportExport, please let me know. I will appreciate any clue or hint....


Aug 17, 2012 at 6:32 AM

"What would happen if I needed to import tens of  thousands?!"

It would fail baaadly if you have plenty of content.

Currently it needs to process ALL existing content when importing new content to check for duplicates...


Aug 17, 2012 at 12:45 PM
Edited Aug 17, 2012 at 12:55 PM

The Worker Service isn't going to help with the Import/Export process, but it would help if you were building a custom command to perform your import. Here is how the Worker Service works:

 

public class SampleCommands : DefaultOrchardCommandHandler {
    public IWorkerService _worker { get; set; }
    public IWorkQueue _queue { get; set; }
    public SampleCommands(IWorkerService worker, IWorkQueue queue) {
        _worker = worker;
        _queue = queue;
    }

    [CommandName("DoSomething LongTime")]
    public string LongRunningOperation() {
        // Repeat as many times as necessary:
        _queue.Queue((w, p) => { /* do something here */ });

        _worker.Begin(_queue);
        return String.Format("Processing {0} items...", _queue.Count);
    }
}

When queueing work using the Queue method, you provide an action with two input parameters, in this case w and p. w is the IWorkerService itself: it has properties for interacting further with the queue (such as queueing more work at the end of an operation), as well as IOrchardServices, Logger and Localizer. As for p: the Queue method has a params parameter for passing arguments into your queued action, and p is those arguments. If you pass in a single object as a parameter, p will be that object; if you pass in many objects, p will be an array of parameters. I.e.:

 

_queue.Queue((w,p) => w.Services.ContentManager.Get(p.Id), new { Id = 1 });

 

OR

 

_queue.Queue((w,p) => w.Services.ContentManager.GetMany(p, VersionOptions.Latest, QueryHints.Empty), 1,3,5,7,9);

 

Also of note: the IWorkQueue has a BatchSize property where you can control the batch size; I would recommend leaving it at 25. I have built a command that created approximately 10,000 file and folder content items (complete with autoroutes and aliases) in our Orchard-based DMS using this service, so I can tell you it will work for what you're trying to do IF you're willing to write a little code.

There is one caveat to this approach that is important enough to mention all by itself:

Queued work items run in a separate shell context. If you have your work item call a method on your command class, it will work just fine, and you can interact with other methods and properties. But don't try to interact with injected services or previously selected content items, because it will fail or behave VERY badly. This is the reason the queue makes it easy to provide several parameters to work items: here, "you can't take it with you".
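To make the caveat concrete (the captured service, item and Id below are illustrative):

```csharp
// BAD: captures a service injected into the command's own shell context;
// inside the queued work item this will fail or misbehave.
_queue.Queue((w, p) => _contentManager.Publish(selectedItem));

// GOOD: pass plain data as parameters and go through w, the
// IWorkerService bound to the work item's own context.
_queue.Queue(
    (w, p) => w.Services.ContentManager.Publish(
        w.Services.ContentManager.Get(p.Id)),
    new { Id = 42 });
```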

Sep 17, 2012 at 3:32 PM

I too am having problems trying to insert many (c. 25,000) records using our own CSV uploader.

First, some background. We are developing a site for a pub company which operates hundreds of pubs. Each pub has a list of events (sports on TV, quizzes, etc.) which can be associated with it. We call these PubEvents. There is a special part for pubs (PubPart) which has a PubEventContainer record to implement the one-to-many relationship between pubs and pub events.

The total number of records to be inserted each month is in the region of 10^5.

We have successfully implemented an upload feature which allows up to a couple of hundred records to be inserted, but once we tried this with a larger number we hit the transaction-timeout problem.

Our uploaders all work as follows:

  • foreach line in uploaded file...
  • read in next line; this includes a "client-friendly" ID of the pub (ClientId) which the new event is for
  • use orchardServices.ContentManager to find the pub part with the given ClientId
  • find the container record for the pub part
  • amend or create a new PubEventRecord using the IRepository<PubEventRecord>
  • update PubPart as CommonPart to set the last edited date, etc.
  • next


I have tried wrapping the whole operation in a "using (var scope = new TransactionScope(TransactionScopeOption.Suppress)) { ..." block, but this failed to complete properly.

I also tried Flush() inside my loop, with and without the Suppress option, to no avail. Is WorkContext important here, i.e. do we need a new one?

I had a look at the Worker module, but was unsure how I could use it to fit my needs. How do I fire off the commands in the first place once a user has uploaded a CSV file to the site?

I would be grateful if anyone could tell me how we should do this in an "Orchard-like" way. I really don't want to have to resort to running SQL against the database tables without using the framework.

 

    public class PubEventUploader : IUploader
    {
        private readonly IOrchardServices orchardServices;
        private readonly IClock clock;

        private readonly IRepository<PubEventRecord> pubEventRecordRepository;
        private readonly IRepository<PubEventContainerPartRecord> pubEventContainerPartRecordRepository;

        public PubEventUploader(IOrchardServices orchardServices, IClock clock, IRepository<PubEventContainerPartRecord> pubEventContainerPartRecordRepository, IRepository<PubEventRecord> pubEventRecordRepository)
        {
            this.orchardServices = orchardServices;
            this.clock = clock;
            this.pubEventContainerPartRecordRepository = pubEventContainerPartRecordRepository;
            this.pubEventRecordRepository = pubEventRecordRepository;
        }

        //this method processes one line of the file at a time
        public IEnumerable<string> UploadData(string[] header, string[] data)
        {
            var messages = new List<string>();

            var clientId = data[1];

            var currentPub = orchardServices.ContentManager
                .Query<PubPart, PubRecord>()
                .List()
                .FirstOrDefault(pub => pub.ClientId.Equals(clientId, StringComparison.InvariantCultureIgnoreCase));

            if(PropertyService.DeleteRecord(data)) //checks if first item in "data" is the string "true", meaning remove this record.
            {
                if(currentPub != null)
                {
                    var containerRecord = pubEventContainerPartRecordRepository.Table.FirstOrDefault(p => p.Id.Equals(currentPub.Id));
                    if (containerRecord != null)
                    {
                        var eventRecord = containerRecord.Events.FirstOrDefault(e => MatchesPubEvent(e, data));
                        containerRecord.Events.Remove(eventRecord);
                        pubEventContainerPartRecordRepository.Update(containerRecord);
                    }
                    messages.Add(String.Format("Removed PubEvent from pub {0}.", currentPub.ClientId));
                }
            }
            else
            {
                if (currentPub != null)
                {
                    //get container record for pub
                    var containerRecord = pubEventContainerPartRecordRepository.Table.FirstOrDefault(p => p.Id.Equals(currentPub.Id));
                    //add event to container record
                    UpdatePubEvent(messages, header, data, containerRecord);
                    //update pub part as Common part
                    var commonPart = currentPub.ContentItem.As<CommonPart>();
                    var now = clock.UtcNow;
                    commonPart.ModifiedUtc = now;
                    commonPart.CreatedUtc = now;
                    messages.Add(String.Format("Updated PubEvent for pub {0}.", currentPub.ClientId));
                }
                else
                {
                    messages.Add(String.Format("Pub not found with ClientId {0}.", clientId));
                }
            }
            return messages;
        }

        private void UpdatePubEvent(ICollection<string> messages, string[] header, string[] data, PubEventContainerPartRecord containerRecord)
        {
            var record = pubEventRecordRepository.Table
                .Where(per => per.PubEventContainerPartRecord.Id == containerRecord.Id)
                .AsEnumerable()
                .FirstOrDefault(per => MatchesPubEvent(per, data));

            if(record == null)
            {
                record = new PubEventRecord { PubEventContainerPartRecord = containerRecord, EventDate = DateTime.Now};
                pubEventRecordRepository.Create(record);
                messages.Add("Created new PubEventRecord.");
            }
            else
            {
                messages.Add("Updated PubEventRecord.");
            }
            PropertyService.SetProperties(record, header, data); //uses reflection to set properties on record.
        }

        //check if an event already exists and we are just updating the descriptive text
        private bool MatchesPubEvent(PubEventRecord eventRecord, IList<string> data)
        {
            var eventDate = DateTime.Parse(data[2]);
            var eventStartTime = data[3];
            var eventEndTime = data[4];
            var eventCategory = data[5];
            var eventType = data[6];
            var eventCompetition = data[7];
            var eventInstance = data[8];

            return eventRecord.EventDate == eventDate
                && eventRecord.EventStartTime.Equals(eventStartTime, StringComparison.InvariantCultureIgnoreCase)
                && eventRecord.EventEndTime.Equals(eventEndTime, StringComparison.InvariantCultureIgnoreCase)
                && eventRecord.EventCategory.Equals(eventCategory, StringComparison.InvariantCultureIgnoreCase)
                && eventRecord.EventType.Equals(eventType, StringComparison.InvariantCultureIgnoreCase)
                && eventRecord.EventCompetition.Equals(eventCompetition, StringComparison.InvariantCultureIgnoreCase)
                && eventRecord.EventInstance.Equals(eventInstance, StringComparison.InvariantCultureIgnoreCase);
        }

    }

Coordinator
Sep 17, 2012 at 4:16 PM

It's also a problem in import (see http://orchard.codeplex.com/workitem/19028): breaking one big transaction into smaller ones is not easy. I don't know how to do it, but this bug may be the opportunity to find out. Maybe Sébastien has some ideas about this?

Apr 10, 2013 at 4:05 AM
I couldn't find an adequate solution to this, so I coded one up:

https://orchard.codeplex.com/discussions/439720

I hope it helps.

Cheers
Andrew
Developer
Apr 10, 2013 at 7:32 AM
Seems interesting!
Apr 10, 2013 at 7:50 AM
We have an action queue service that ensures you have a valid work context scope / transaction to work with per queue, powered by Smart Thread Pool (see http://www.codeproject.com/Articles/7933/Smart-Thread-Pool).

You just enqueue an action on it and it'll be executed asap.
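Stripped down, the idea is something like this (a sketch assuming SmartThreadPool's QueueWorkItem(Action) overload; the class and method names are illustrative and our real implementation does more):

```csharp
// Sketch of an action-queue service: each queued action runs on a
// SmartThreadPool thread inside its own WorkContextScope, so it gets
// its own transaction.
public class ActionQueueService
{
    private readonly SmartThreadPool _threadPool = new SmartThreadPool();
    private readonly IWorkContextAccessor _workContextAccessor;

    public ActionQueueService(IWorkContextAccessor workContextAccessor)
    {
        _workContextAccessor = workContextAccessor;
    }

    public void Enqueue(Action<IWorkContextScope> action)
    {
        _threadPool.QueueWorkItem(() =>
        {
            using (var scope = _workContextAccessor.CreateWorkContextScope())
            {
                action(scope); // resolve services from 'scope', not from outside
            }
        });
    }
}
```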

What you are doing seems more complex than what we are doing though.